What’s Not in the Book
This book is primarily for ML engineers in the enterprise, not for ML scientists in academia or industry research labs.
We purposefully don't discuss areas of active research. You will find very little here, for example, on machine learning model architecture (bidirectional encoders, the attention mechanism, or short-circuit layers), because we assume that you will be using a pre-built model architecture (such as ResNet-50 or GRUCell), not writing your own image classification or recurrent neural network.
Here are some concrete examples of areas that we intentionally stay away from because we believe that these topics are more appropriate for college courses and ML researchers:
ML algorithms -- We do not cover the differences between random forests and neural networks, for example. These are covered in introductory machine learning textbooks.
Building blocks -- We do not cover different types of gradient descent optimizers or activation functions. We recommend using Adam and ReLU; in our experience, the potential performance gains from choosing differently here tend to be minor.
ML model architectures -- If you are doing image classification, we recommend that you use an off-the-shelf model like ResNet or whatever the latest hotness is at the time you are reading this. Leave the design of new image classification or text classification models to researchers who specialize in this problem.
Model layers -- You won't find convolutional neural networks or recurrent neural networks in this book. They are doubly disqualified: first, for being building blocks, and second, for being something you can use off the shelf.
Custom training loops -- Just calling model.fit() in Keras fits the needs of practitioners (see the sketch after this list).
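To make this concrete, here is a minimal sketch, in TensorFlow's Keras API, of the style of code the book assumes: an off-the-shelf ResNet-50 backbone, the recommended Adam optimizer and ReLU activation, and a plain model.fit() call instead of a custom training loop. The class count, image size, and the train_ds and val_ds dataset names are placeholders chosen for illustration, not values from the book.

```python
import tensorflow as tf

# Placeholder values for illustration only.
NUM_CLASSES = 5
IMAGE_SIZE = (224, 224)

# Off-the-shelf backbone: ResNet-50 from keras.applications,
# used as-is rather than designing a new architecture.
base = tf.keras.applications.ResNet50(
    include_top=False,
    weights=None,               # or "imagenet" for pretrained weights
    input_shape=IMAGE_SIZE + (3,),
    pooling="avg",
)

# A small classification head, using the recommended ReLU activation.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

# The recommended Adam optimizer, with Keras defaults.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# No custom training loop: model.fit() handles batching, gradient updates,
# and metric reporting. train_ds and val_ds are assumed to be tf.data
# datasets of (image, label) pairs prepared elsewhere.
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```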
In this book, we have tried to include only common patterns of the kind that machine learning engineers in enterprises will employ in their day-to-day work.
As an analogy, consider data structures. While a college course on data structures will delve into their implementations, and a researcher in the field must learn how to formally represent their mathematical properties, the practitioner can be more pragmatic. An enterprise software developer simply needs to know how to work effectively with arrays, linked lists, maps, sets, and trees. It is for that kind of pragmatic practitioner in machine learning that this book is written.