Win Vector LLC
WinVector
Expert data science training and consulting.
Top Repositories
Example R scripts and data for "Practical Data Science with R" 1st edition by Nina Zumel and John Mount (Manning Publications)
vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under choice of GPL-2 or GPL-3 license.
Various examples for different articles
Code, Data, and Examples for Practical Data Science with R 2nd edition (Nina Zumel and John Mount) https://github.com/WinVector/PDSwR2
Wrap R for Sweet R Code
vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
Repositories
Various examples for different articles
Example R scripts and data for "Practical Data Science with R" 1st edition by Nina Zumel and John Mount (Manning Publications)
All material for "Modeling big data with R, sparklyr, and Apache Spark" Strata Hadoop 2017.
Tools to convert Jupyter notebooks to and from Python .py files, and to render them.
Code, Data, and Examples for Practical Data Science with R 2nd edition (Nina Zumel and John Mount) https://github.com/WinVector/PDSwR2
Higher order fluid or coordinatized data transforms in R. Distributed under choice of GPL-2 or GPL-3 license.
Pre-packaged plots in R
vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under choice of GPL-2 or GPL-3 license.
Data Wrangling and Query Generating Operators for R. Distributed under choice of GPL-2 or GPL-3 license.
Improved Standard Evaluation Interfaces for Common Data Manipulation Tasks
Wrap R for Sweet R Code
Patches for using dplyr with Databases and Big Data
Example automatic differentiation code in Scala
Win Vector technical articles and example code
Codd method-chained SQL generator and Pandas data processing in Python.
Working an example of supervised machine learning in Python
Examples of fast grouped row-wise operations in R (no C, C++, data.table, or dplyr used).
Example code for Lesson on Response Campaign planning
Experimental logistic regression code supporting multiple result categories, many levels of categorical modeling variables, good optimization, L2 regularization and more.
Support materials for WinVector talk
Add-ins and keyboard shortcuts for building calculation pipelines in R
Viewable pages from WinVector LLC; view at: http://winvector.github.io
Concise formatting of significances in R (GPL3 license).
Dynamic Programming implemented in Rcpp. Includes example partition and out of sample fitting applications.
Implement the rquery piped query algebra in R using data.table. Distributed under choice of GPL-2 or GPL-3 license.
Simple example of how to use an embedding plus sphering/whitening transform to measure difference in distribution.
Win Vector LLC Python data science teaching tools (graphs and data manipulation)
Example of how to build a simple R package
Example of a neural net model, with regularization on y-conditional activation patterns