Spark library for running machine learning algorithms. Supports a range of algorithms, including iterative ones, covering classification, regression, decision trees, recommendation, clustering and topic modelling. As of Spark 2.0, the primary API is based on DataFrames (Spark SQL), with the original RDD-based API in maintenance mode only. First introduced in Spark 0.8, having been developed in collaboration with the UC Berkeley MLbase project, and still under active development.
Type: Sub-Project
Parent Project: Apache Spark
Last Updated: August 2017