Saturday, December 16, 2017

Spark Classes and Resources

There is a lot of material available for Spark MLlib (the RDD-based API), although this API may be deprecated in the next release, 2.3.
https://cognitiveclass.ai/courses/spark-mllib/

Spark ML is the DataFrame-based API. There are fewer training resources for it than for core Spark; look for MOOCs on edX, DataCamp, and Udemy.

Spark ML training sessions from Strata are available as full videos on safaribooksonline.com, along with a few more Safari titles from various authors and publishers.

Great resource for anything Spark
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-mllib/spark-mllib-pipelines.html

https://mapr.com/training/certification/mcsd/

Topic-centric list of high-quality open datasets
https://github.com/caesar0301/awesome-public-datasets

Subscribe to the Spark user mailing list, or search its archives.

http://apache-spark-user-list.1001560.n3.nabble.com/template/NamlServlet.jtp?macro=search_page&node=1&query=spark+ml&days=0&sort=date
https://spark.apache.org/community.html

Databricks was founded by the original creators of Spark and is its largest contributor.
https://databricks.com/training/courses/apache-spark-for-machine-learning-and-data-science

UC Berkeley, Hortonworks, IBM, and Cloudera are the other top contributors to Spark.

Berkeley, the granddaddy of MLlib, has some courses.
http://mlbase.org/

Hortonworks
https://hortonworks.com/apache/spark/

IBM
https://www.ibm.com/ca-en/marketplace/spark-as-a-service

Cloudera
https://university.cloudera.com/instructor-led-training/introduction-to-machine-learning-with-spark-ml-and-mllib (paid)

Deep Learning
https://github.com/databricks/spark-deep-learning

Databricks repos
https://github.com/databricks

Spark Roadmap
http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-ml-roadmap-2-3-0-and-beyond-td22892.html#a22972


Certification search on GitHub
https://github.com/search?l=Markdown&q=spark+ml+certification&type=Code&utf8=%E2%9C%93

Apache Spark Meetups
https://spark.apache.org/community.html
