111 results for “topic:pydata”
Parallel computing with task scheduling
cuDF - GPU DataFrame Library
STUMPY is a powerful and scalable Python library for modern time series analysis
Koalas: pandas API on Apache Spark
Extract data from a wide range of Internet sources into a pandas DataFrame.
A distributed task scheduler for Dask
Clean APIs for data cleaning. Python implementation of R package Janitor
A clean, three-column Sphinx theme with Bootstrap for the PyData community
High Performance Data Processing in Python
PyData, The Complete Works of
Scalable genetics toolkit
RFC document, tooling and other content related to the array API standard
Resources for Advancing into Analytics: From Excel to R and Python by George Mount (O'Reilly Media, 2021)
A consistent table management library in Python
Python library for GraphBLAS: high-performance sparse linear algebra for scalable graph analytics
Notebooks for the Seattle PyData 2017 talk on Scattertext
Social network analysis code examples for PyCon 2019 talk
Machine learning with scikit-learn tutorial at PyData Chicago 2016
Introduction to Machine Learning with Time Series at PyData Festival Amsterdam 2020
vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
Graph algorithms written in GraphBLAS
Repo for my talk at the PyData Berlin 2017 conference
Data and tooling to compare the API surfaces of various array libraries.
Introduction to sktime at PyData Global 2021
Simple Python GIS Web Services
WORK UNDER RESTRUCTURING
A `select` accessor for easier subsetting of pandas DataFrames and Series
Accompanying notebook and sources to "A Guide to Pseudolabelling: How to get a Kaggle medal with only one model" (Dec. 2020 PyData Boston-Cambridge Keynote)
This is the code and presentation for my PyData2017 talk "Reverse Image Search Using Out-of-the-box Machine Learning Libraries"
Histograms with task scheduling.