Introduction to Statistical Learning, Python Edition: Free Book

The highly anticipated Python edition of Introduction to Statistical Learning is here. And you can read it for free! Here’s everything you need to know about the book.




 

For years, Introduction to Statistical Learning with Applications in R, better known as ISLR, has been cherished—by both machine learning beginners and practitioners alike—as one of the best machine learning textbooks. 

Now that the Python edition of the book, Introduction to Statistical Learning with Applications in Python—or ISL with Python—is here, the community is all the more excited! 

 

ISL with Python is Here. Great! But Why?

 

Glad you asked.

If you’ve been in the machine learning space for a while, chances are you’ve already heard, read, or used the R version of the book before. And you know what you liked best about it. But here’s my story. 

The summer before I started grad school, I decided to teach myself machine learning. I was lucky to stumble across ISLR early in my machine learning journey. The authors of ISLR do a great job at breaking down complex machine learning algorithms in an easy-to-follow manner—along with the required mathematical foundations—without overwhelming the learners. This is an aspect of the book I enjoyed.

The code examples and labs in ISLR, however, are in R. Sadly enough, I did not know R back then, but was comfortable programming in Python. So I had two options. 
 


 

I could teach myself R. Or I could use other resources—tutorials and documentation—to build models in Python. Like most other Pythonistas, I chose the second option (yeah, the more familiar route, I know).

While R is great for statistical analysis, Python is a good first language if you’re just starting out on your data journey. 

But this isn’t a problem anymore, because the new Python edition lets you code along and build machine learning models in Python. You no longer have to pick up a new programming language to follow along.

Story time’s up! Let’s take a closer look at the contents of the book.

 

Contents of ISL with Python

 

In terms of content, the Python edition closely follows the R edition, with the material adapted for Python as you would expect. The book also includes a Python programming crash course section that covers the basics.

The book covers a lot of ground. From the foundations of statistical learning through supervised and unsupervised learning algorithms to deep learning and more, it is organized into the following chapters:

  • Statistical learning 
  • Linear regression 
  • Classification 
  • Resampling methods 
  • Linear model selection and regularization 
  • Moving beyond linearity
  • Tree-based methods 
  • Support vector machines
  • Deep learning (covers vanilla neural networks to ConvNets and recurrent neural networks)
  • Survival analysis and censored data
  • Unsupervised learning
  • Multiple testing (a deep dive into hypothesis testing) 

 

The ISLP Python Package

 

The book uses datasets sourced from publicly available repositories such as the UCI Machine Learning repository and other similar resources. Some examples include datasets on bike sharing, credit card default, fund management, and crime rates.

Learning to collect data, whether through web scraping or by importing it from other sources, is an important part of a data science project.

However, for a learner who’s unfamiliar with the data collection step, it can add friction when they want to use the book to get the hang of both the theory and the hands-on sections.

To facilitate a smooth learning experience, the book comes with an accompanying ISLP package:

  • The ISLP package is available for all major platforms: Linux, Windows, and MacOS.
  • You can install ISLP using pip, preferably in a virtual environment on your machine: pip install islp

The ISLP package has comprehensive documentation and comes with data loading utilities. When you work with a particular dataset, the docs page gives you ready-to-access information on the features in the dataset, the number of records, and starter code to load the data into a pandas DataFrame.
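As a rough illustration of what such starter code can look like, here is a minimal sketch that loads one of the bundled datasets. It assumes the ISLP package is installed and that its load_data function and the "Bikeshare" dataset name match your installed version. The helper names islp_available and load_islp_dataset are made up for this example.

```python
import importlib.util


def islp_available() -> bool:
    """Return True if the ISLP package can be imported in this environment."""
    return importlib.util.find_spec("ISLP") is not None


def load_islp_dataset(name: str):
    """Load a dataset bundled with ISLP as a pandas DataFrame.

    Hypothetical helper for this example; it simply wraps ISLP's load_data.
    """
    from ISLP import load_data  # imported lazily so the availability check can run first

    return load_data(name)


if islp_available():
    # Assumption: "Bikeshare" is the name of the bike sharing dataset in ISLP
    bikeshare = load_islp_dataset("Bikeshare")
    print(bikeshare.shape)             # (number of records, number of features)
    print(bikeshare.columns.tolist())  # feature names
    print(bikeshare.head())            # first few rows
else:
    print("ISLP is not installed. Run `pip install ISLP` first.")
```

The lazy import keeps the snippet from crashing when the package is missing. In a notebook where ISLP is already installed, calling load_data directly should be enough.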

It also has helper functions and functionality to create higher-order features like polynomial and spline features.

 

Generating polynomial features | Image from ISLP docs

 

For a more complete learning experience, you can also read in the data from its original sources and perform feature engineering without using the ISLP package.

When you’re building models, you can try a scikit-learn-only implementation, and use PyTorch or Keras for the deep learning sections.

 

So Who’s This Book For Again?

 

Data Science and Machine Learning Beginners: If you are a beginner who prefers a self-taught route to learn machine learning, this book is a great learning resource.

ML Practitioners: As a machine learning practitioner, you’ll have experience building machine learning models. But going back to the basics, such as hypothesis testing and the fundamentals of other algorithms, can be helpful.

Educators: The theory and the labs together make this book a great companion for a first course in machine learning. Most universities and data science bootcamps these days teach machine learning. So if you are an educator who is teaching or looking to teach a machine learning course, this is a great course textbook to consider.

 

Wrapping Up

 

And that's a wrap. Introduction to Statistical Learning with Python has been one of the most exciting releases of this summer.

You can head over to statlearning.com and start reading the Python edition. While the soft copy is free to read, the paperback on Amazon sold out on the very first day. We're excited to see you make the most of the book, so start reading it today. Happy learning!
 
 
Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more.