A living, ever-changing collection of useful technologies, libraries and platforms for data science.
Fast.ai is a machine learning and deep learning library designed by Jeremy Howard and built on PyTorch. Its aim is to let developers without a maths background build world-class ML/DL models in the shortest time possible.
Accompanying the library is the fast.ai MOOC, the best ML/DL course I have completed to date.
The fast.ai blog is also an invaluable resource.
A simple and flexible progress bar for Jupyter Notebook and the console, developed by the fast.ai team.
Waterfall charts for visualising how marginal value contributions add up from a starting value (bias).
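To make the idea concrete, here is a minimal sketch of the arithmetic behind a waterfall chart: each bar starts where the previous running total ended and has a height equal to its marginal contribution. The function name `waterfall_steps` and the example features are hypothetical and are not taken from any plotting package.

```python
def waterfall_steps(start, contributions):
    """Return (label, bottom, height, running_total) tuples, one per bar."""
    # The first bar shows the starting value (bias) rising from zero.
    steps = [("start", 0.0, start, start)]
    running = start
    for label, delta in contributions:
        # Each contribution bar floats: it begins at the current running total.
        steps.append((label, running, delta, running + delta))
        running += delta
    # The last bar shows the final total, again rising from zero.
    steps.append(("total", 0.0, running, running))
    return steps


# Made-up contributions, chosen to be exactly representable as floats.
contributions = [("age", 0.5), ("income", -0.25), ("tenure", 0.125)]
steps = waterfall_steps(1.0, contributions)

for label, bottom, height, total in steps:
    print(f"{label:>8}: bottom={bottom:.3f} height={height:+.3f} total={total:.3f}")
```

The `bottom` and `height` values can be passed straight to a bar-plotting call that accepts a per-bar baseline; a negative height draws the bar downwards from its baseline.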
TPOT (Tree-based Pipeline Optimization Tool) is an automated machine learning tool that optimizes machine learning pipelines using genetic programming.
TPOT is built on top of Scikit-learn and automates the most tedious parts of machine learning, such as feature selection, model selection and feature construction, by exploring thousands of possible pipelines to find the best one for the data. It then provides the Python code for the best pipeline it found, so you can explore and tweak it manually.
Follow the installation instructions: Installation Guide
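The search idea can be illustrated with a toy evolutionary loop. This is a simplified sketch and not TPOT's implementation: it uses plain mutation and selection over three fixed pipeline stages rather than genetic programming over pipeline trees, and the search space and scores below are invented stand-ins for real Scikit-learn operators and cross-validated accuracy.

```python
import random

# Hypothetical search space: one choice per pipeline stage.
SEARCH_SPACE = {
    "scaler": ["none", "standard", "minmax"],
    "selector": ["none", "k_best", "variance_threshold"],
    "model": ["logistic_regression", "random_forest", "gradient_boosting"],
}

# Made-up scores standing in for a cross-validated metric.
STAGE_SCORES = {
    "none": 0.0, "standard": 0.03, "minmax": 0.02,
    "k_best": 0.04, "variance_threshold": 0.01,
    "logistic_regression": 0.70, "random_forest": 0.78,
    "gradient_boosting": 0.80,
}


def random_pipeline(rng):
    """Pick one option at random for every stage."""
    return {stage: rng.choice(options) for stage, options in SEARCH_SPACE.items()}


def fitness(pipeline):
    """Toy fitness: sum of the invented per-stage scores."""
    return sum(STAGE_SCORES[choice] for choice in pipeline.values())


def mutate(pipeline, rng):
    """Copy a pipeline and re-draw the choice for one random stage."""
    child = dict(pipeline)
    stage = rng.choice(list(SEARCH_SPACE))
    child[stage] = rng.choice(SEARCH_SPACE[stage])
    return child


def evolve(generations=10, population_size=8, seed=0):
    """Keep the better half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [random_pipeline(rng) for _ in range(population_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        history.append(fitness(survivors[0]))
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness), history


best, history = evolve()
print("best pipeline:", best)
print("best fitness per generation:", [round(score, 2) for score in history])
```

Because the survivors are carried over unchanged, the best score can never get worse from one generation to the next. In a real search, the fitness function would train and cross-validate each candidate pipeline, which is where most of the run time goes.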
Featuretools is a framework to perform automated feature engineering. It excels at transforming temporal and relational datasets into feature matrices for machine learning.
Then follow the quickstart guide: 5 minute Quick Start
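As a rough illustration of what turning a relational dataset into a feature matrix means, the sketch below aggregates a child table (transactions) onto its parent table (customers) by hand. The tables, the `aggregate_features` helper and its parameters are hypothetical; the feature names only imitate the `SUM(transactions.amount)` naming style, and Featuretools itself automates and stacks this kind of aggregation across many tables.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parent and child tables linked by customer_id.
customers = [{"customer_id": 1}, {"customer_id": 2}]
transactions = [
    {"transaction_id": 10, "customer_id": 1, "amount": 20.0},
    {"transaction_id": 11, "customer_id": 1, "amount": 30.0},
    {"transaction_id": 12, "customer_id": 2, "amount": 5.0},
]


def aggregate_features(parents, children, child_name, key, value):
    """Build one feature row per parent by aggregating its child rows."""
    # Group the child values by the foreign key that points at the parent.
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[value])

    matrix = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        matrix.append({
            key: parent[key],
            f"COUNT({child_name})": len(values),
            f"SUM({child_name}.{value})": sum(values),
            # Mean and max are undefined for parents with no child rows.
            f"MEAN({child_name}.{value})": mean(values) if values else None,
            f"MAX({child_name}.{value})": max(values) if values else None,
        })
    return matrix


feature_matrix = aggregate_features(
    customers, transactions, "transactions", "customer_id", "amount"
)
for row in feature_matrix:
    print(row)
```

Each row of `feature_matrix` describes one customer, which is the shape most machine learning models expect as input.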
A great Python package to visually display the extent of missing values in a dataset.
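The kind of information such a display conveys can be sketched in plain Python: the share of missing values per column, plus a row-by-row map of which cells are present. The dataset and both helper functions below are made up for illustration and are not part of the package.

```python
# Hypothetical dataset; None marks a missing value.
rows = [
    {"age": 34, "income": 52000, "city": "Leeds"},
    {"age": None, "income": 61000, "city": None},
    {"age": 29, "income": None, "city": "York"},
    {"age": None, "income": 48000, "city": "Bath"},
]
columns = ["age", "income", "city"]


def missing_fraction(rows, columns):
    """Fraction of rows in which each column is missing."""
    return {
        column: sum(row.get(column) is None for row in rows) / len(rows)
        for column in columns
    }


def nullity_matrix(rows, columns):
    """One string per row: '#' for a present value, '.' for a missing one."""
    return [
        "".join("#" if row.get(column) is not None else "." for column in columns)
        for row in rows
    ]


fractions = missing_fraction(rows, columns)
matrix = nullity_matrix(rows, columns)

for column, fraction in fractions.items():
    print(f"{column:>6}: {fraction:.0%} missing")
print("\n".join(matrix))
```

A graphical version of the same matrix makes patterns easier to spot on larger datasets, for example columns that tend to be missing together.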