Data Science books and podcasts

March 1, 2016


Here are some books and podcasts that I liked or found to be good references.

Books

STATISTICAL LEARNING      
An Introduction to Statistical Learning with Applications in R. Free book, nice introduction to the topic.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Hastie, Tibshirani and Friedman. Free book, advanced.
Statistical Models, by A. C. Davison (http://www.cambridge.org/us/academic/subjects/statistics-probability/statistical-theory-and-methods/statistical-models). Recommended by Joe Blitzstein in CS109.
Statistical Models: Theory and Practice, by David A. Freedman. Recommended by Joe Blitzstein in CS109.
MACHINE LEARNING      
Machine Learning: A Probabilistic Perspective, by Kevin P. Murphy. Like a bible.
Pattern Recognition and Machine Learning, by Bishop. Emphasis on the Bayesian approach.
CONCEPTUAL      
The Art of Data Science: A Guide for Anyone Who Works with Data, by Roger D. Peng and Elizabeth Matsui. A conceptual but practical review of the data analysis process.
Data Science for Business, by Provost and Fawcett.
PROBABILITY      
Introduction to Probability, by Blitzstein and Hwang.
BAYESIAN      
Probabilistic Graphical Models, by Daphne Koller and Nir Friedman. A good companion to the Coursera course on probabilistic graphical models.
Mastering Probabilistic Graphical Models Using Python, by Ankan and Panda.
PYTHON      
Python for Data Analysis, by Wes McKinney. A classic for data analysis with pandas. He announced on Twitter that he has begun writing the second edition of the book!
R LANGUAGE
R Programming for Data Science, by Roger Peng. A very basic introduction, the companion to their Coursera Data Science course.
FREAK      
Data Science at the Command Line, by Janssens.
Think Complexity, by Downey. Free and interesting book.
Data Analysis with Open Source Tools, by Janert. I really liked the topics the author covers and how he covers them, maybe because we are both physicists! ;)
SCIENTIFIC      
Effective Computation in Physics. The book and course I wish I had had back in grad school.
Learning IPython for Interactive Computing and Data Visualization, by Rossant. A necessary introduction to the IPython/Jupyter notebook.
PENDING      
Python Machine Learning, by Sebastian Raschka. Very promising!
Machine Learning for Hackers, by Drew Conway and John Myles White. Based on R.
Probabilistic Programming & Bayesian Methods for Hackers, by Davidson-Pilon. Distributed as IPython notebooks, and based on Python and PyMC.

Podcasts

Talking Machines. Amazing hosts, Katherine Gorman and Ryan Adams; I really like their work.
Linear Digressions, by Katie Malone and Ben Jaffe. Also a very good program, with good talks. Linked to Udacity.
The O’Reilly Data Show Podcast, by Ben Lorica. Interesting talks with state-of-the-art professionals, leaning more towards architectures and frameworks than theory.
Data Stories, by Enrico and Moritz. Entertaining talks, mostly about graphical representations.

Online books

Deep Learning, by Ian Goodfellow, Yoshua Bengio and Aaron Courville. This promises to be the bible of deep learning!
Learn Python the Hard Way, by Zed Shaw
Statistical foundations of machine learning, by Gianluca and Taieb
Scipy Lecture Notes
Forecasting: principles and practice, by Hyndman and Athanasopoulos
Electric load forecasting: fundamentals and best practices, by Hong and Dickey