Streaming MLlib

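MLlib provides streaming variants of the following machine learning algorithms: k-means, linear regression, and logistic regression.

Each variant updates its model as new batches of streaming data arrive and can use the latest model to predict on streaming data.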

The streaming algorithms belong to spark.mllib (the older RDD-based API).

Streaming k-means

org.apache.spark.mllib.clustering.StreamingKMeans
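
As a rough illustration of how this class is typically wired up, the sketch below configures a StreamingKMeans estimator and attaches it to two DStreams. The object name StreamingKMeansSketch, the helper names newModel and wire, and the values chosen for k, the dimension, and the decay factor are assumptions made for this example only. The caller is assumed to supply the DStreams from an existing StreamingContext.

```scala
import org.apache.spark.mllib.clustering.StreamingKMeans
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper object; not part of Spark.
object StreamingKMeansSketch {

  // Illustrative settings: 2 clusters over 3-dimensional points.
  def newModel(): StreamingKMeans =
    new StreamingKMeans()
      .setK(2)                  // set k before the centers are generated
      .setDecayFactor(1.0)      // 1.0 weighs all past batches equally
      .setRandomCenters(3, 0.0) // dimension 3, initial weight 0.0

  // Registers training on one stream and prediction on another.
  // Returns (label, predicted cluster index) pairs.
  def wire(
      model: StreamingKMeans,
      training: DStream[Vector],
      test: DStream[LabeledPoint]): DStream[(Double, Int)] = {
    model.trainOn(training)
    model.predictOnValues(test.map(lp => (lp.label, lp.features)))
  }
}
```

Nothing is computed until the owning StreamingContext is started; trainOn and predictOnValues only register the work to be done on each batch.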

Streaming Linear Regression

org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
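
A minimal sketch of using this class follows, assuming 3 features per example and training and prediction streams that the caller has already created. The object name StreamingLinearRegressionSketch, the helper names, and the step size and iteration count are illustrative assumptions, not recommended settings.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper object; not part of Spark.
object StreamingLinearRegressionSketch {

  // Assumed number of features in every incoming example.
  val numFeatures = 3

  // The initial weights are set explicitly here, before any training
  // is registered, so that the estimator starts from a known model.
  def newModel(): StreamingLinearRegressionWithSGD =
    new StreamingLinearRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(numFeatures))
      .setStepSize(0.1)
      .setNumIterations(50)

  // Updates the model on every training batch and predicts a value
  // for every feature vector in the second stream.
  def wire(
      model: StreamingLinearRegressionWithSGD,
      training: DStream[LabeledPoint],
      features: DStream[Vector]): DStream[Double] = {
    model.trainOn(training)
    model.predictOn(features)
  }
}
```

The current weights can be inspected at any time with latestModel().weights, which is one way to watch the model change as batches arrive.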

Streaming Logistic Regression

org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
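
The sketch below shows one possible way to pair true labels with predictions from this class, which can help when tracking accuracy over time. The object name StreamingLogisticRegressionSketch, the helper names, and the hyperparameter values are assumptions for illustration. The labeled stream is assumed to come from an existing StreamingContext.

```scala
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper object; not part of Spark.
object StreamingLogisticRegressionSketch {

  // Starts from all-zero weights; step size and iteration count
  // are illustrative values only.
  def newModel(numFeatures: Int): StreamingLogisticRegressionWithSGD =
    new StreamingLogisticRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(numFeatures))
      .setStepSize(0.5)
      .setNumIterations(10)

  // Trains on the labeled stream and returns (true label, predicted label)
  // pairs computed with the latest model for each batch.
  def labelAndPrediction(
      model: StreamingLogisticRegressionWithSGD,
      labeled: DStream[LabeledPoint]): DStream[(Double, Double)] = {
    model.trainOn(labeled)
    model.predictOnValues(labeled.map(lp => (lp.label, lp.features)))
  }
}
```

Using the label as the key in predictOnValues keeps each prediction next to the value it should match, so a downstream step can count agreements per batch.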