Silicon Valley Code Camp : October 3rd and 4th 2015 session
A Survey of Machine Learning Techniques Using Spark.ml 1.5.0
This session will provide an overview of several core algorithms and common machine learning operations, using the Pipelines architecture preferred in the later releases of Spark ml/mllib. This session will focus on the *Scala* APIs.
About This Session
Recent releases of the Spark machine learning libraries have shifted focus from the individual-algorithms approach of the spark.mllib package to the data-driven pipelines approach of spark.ml. We will look at how to structure the ML process (data loading, modeling, prediction, and analysis and distribution of results) using the latest spark.ml APIs.
Note: this year's session will focus only on the Scala APIs.
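As a rough illustration of the pipeline style described above, here is a minimal sketch in Scala against the Spark 1.5 spark.ml API. It is not taken from the session materials: the object name PipelineSketch, the toy data, the column names, the local master setting, and the choice of Tokenizer, HashingTF, and LogisticRegression as stages are all assumptions made for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.sql.SQLContext

// Hypothetical example object; names and data are illustrative only.
object PipelineSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally with two threads.
    val conf = new SparkConf().setAppName("PipelineSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Data loading: a tiny labeled DataFrame standing in for a real data set.
    val training = sqlContext.createDataFrame(Seq(
      (0L, "a b c d e spark", 1.0),
      (1L, "b d", 0.0),
      (2L, "spark f g h", 1.0),
      (3L, "hadoop mapreduce", 0.0)
    )).toDF("id", "text", "label")

    // Feature extraction stages: split text into words, then hash words into a feature vector.
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val hashingTF = new HashingTF()
      .setNumFeatures(1000)
      .setInputCol(tokenizer.getOutputCol)
      .setOutputCol("features")

    // Modeling stage: logistic regression reads the default "features" and "label" columns.
    val lr = new LogisticRegression().setMaxIter(10).setRegParam(0.01)

    // The Pipeline chains the stages; fit() runs them in order and returns a PipelineModel.
    val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))
    val model = pipeline.fit(training)

    // Prediction: the fitted model applies the same stages to unlabeled data.
    val test = sqlContext.createDataFrame(Seq(
      (4L, "spark i j k"),
      (5L, "l m n")
    )).toDF("id", "text")

    // Results analysis: inspect the predicted class and class probabilities.
    model.transform(test).select("id", "text", "probability", "prediction").show()

    sc.stop()
  }
}
```

The point of the sketch is the shape rather than the particular algorithm: each stage reads and writes DataFrame columns, so swapping the classifier for another estimator should only require changing one stage.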
We will touch on one or more of the algorithms in the following areas:
Dimensionality Reduction / Feature extraction
Clustering
Classification and Regression
Depending on time available we may also touch on the following topics: