Performant Deep Learning at Scale with Apache Spark & Apache SystemML
Laying a Foundation
- Data—which consists of multiple “examples.” In health data collection, for instance, each patient would be an example. Each example has multiple “features”: the variables measured for that example, e.g., a patient’s demographics or vital signs. Each example also has one or more “labels”: the target values to be predicted for that example.
- Model—which is constructed or selected by the data scientist to fit a specific problem. In this context, a model is a mathematical function that maps an example to a particular label. Neural networks (NNs) are the baseline in deep learning, and they represent a class of models rather than a single specific model.
- Loss—a mathematical measure of how well the model fits the data; a lower loss means a better fit.
- Optimizer—which minimizes the loss by iteratively adjusting the model’s parameters to better fit the data.
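The four ingredients above can be sketched as a minimal training loop. This is an illustrative sketch in plain Python, not SystemML DML; the single-feature linear model, squared-error loss, and gradient-descent learning rate are all assumptions chosen for clarity:

```python
# Data: examples with one feature each, plus a label per example.
features = [1.0, 2.0, 3.0, 4.0]
labels = [2.0, 4.0, 6.0, 8.0]  # underlying mapping is y = 2x

# Model: a single-parameter linear model, y_hat = w * x.
w = 0.0

def predict(w, x):
    return w * x

# Loss: mean squared error between predictions and labels.
def loss(w):
    return sum((predict(w, x) - y) ** 2
               for x, y in zip(features, labels)) / len(features)

# Optimizer: gradient descent, nudging w in the direction
# that reduces the loss at each step.
learning_rate = 0.01
for step in range(200):
    grad = sum(2 * (predict(w, x) - y) * x
               for x, y in zip(features, labels)) / len(features)
    w -= learning_rate * grad

print(round(w, 3))  # converges toward 2.0
```

The same division of labor applies in a deep learning setting; the model simply becomes a neural network with many parameters, and the optimizer updates all of them from the gradients of the loss.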
The Importance of Declarative Machine Learning (DML)
- When projects require larger data sets.
- When working with a limited amount of data but still unable to achieve a good model fit.
- When the data in the domain is better suited to a large cluster than to a single laptop.