
Differences

This shows you the differences between two versions of the page.

start [2019/07/16 13:56]
deadline
start [2019/12/04 14:10]
deadline added Python Zeppelin notebook
Line 6: Line 6:
 ====Course Descriptions and Links====
  
-Click on the course name for availability and further information. For best results, courses should be taken in the recommended order (shown below).  Courses 1 and 2 can be taken out of order. Course 3 builds on course 1 and 2. Course 4 builds on course 3, 2, and 1.
+Click on the course name for availability and further information. For best results, courses should be taken in the recommended order (shown below).  Courses 1 and 2 can be taken out of order. Course 3 builds on courses 1 and 2. Course 4 builds-on and assumes competence with topics in courses 3, 2, and 1.
 + 
 +**NOTE:** If the link does not lead you to the class, it has not yet been scheduled. Check back at a future date. Also two new courses in the series are coming in the new year (including Kafka coverage 
 +and Data Engineering).
  
 | 1 | [[https://www.safaribooksonline.com/search/?query=Apache%20Hadoop%2C%20Spark%20and%20Big%20Data%20Foundations&field=title|Apache Hadoop, Spark and Big Data Foundations]] - A great introduction to the Hadoop Big Data Ecosystem. A non-programming introduction to Hadoop, Spark, HDFS, and MapReduce. (3 hours-1 day)|{{wiki:foundations-course.png}}|
Line 19: Line 22:
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hands_On_Hadoop_Spark-V1.5.tgz|Class Notes]] (tgz format)
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hands_On_Hadoop_Spark-V1.5.zip|Class Notes]] (zip format)
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Using-Python-Zeppelin.json| Example Zeppelin notebook]]
  
 ===Class Notes for Practical Linux Command Line for Data Engineers and Analysts===
Line 26: Line 30:
  
 ====Zeppelin Notebook for Scalable Data Science with Hadoop and Spark===
-(Updated 14-July-2019)
+(Updated 20-Aug-2019)
  
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-Analytics-V1.json|Scalable-Analytics.json]]
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-Analytics.json|Scalable-Analytics.json]]
  
 ----
Line 35: Line 39:
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/DOS-Linux-HDFS-cheatsheet.pdf|DOS to Linux/HDFS Cheat-sheet]]
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/ericg_vi-editor.bw.pdf|vi (visual editor) Cheat-sheet]]
 +  * [[https://www.cs.colostate.edu/helpdocs/vi.html|Additional help with vi]]
  
 ----
Line 40: Line 45:
 ====Linux Hadoop Minimal (LHM) Virtual Machine Sandbox====
  
-(Current Version 0.42, 03-June-2019) Used for //Hands-on//, //Command Line//, and //Scalable Data Science// courses above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below.
+(Current Version 0.42, 03-June-2019) **Not ready for Scalable Data Science with Hadoop and Spark (soon)**
 + 
 +Used for //Hands-on//, //Command Line//, and //Scalable Data Science// courses above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below.
   * [[Linux Hadoop Minimal Installation Instructions]] (Read First)
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-0.42.MD5.txt|Linux Hadoop Minimal MD5]]
start.txt · Last modified: 2024/01/29 21:19 by deadline