start

Differences

This shows you the differences between two versions of the page.

start [2019/06/12 17:06]
deadline
start [2019/12/04 14:15] (current)
deadline removed notebook
Line 6: Line 6:
 ====Course Descriptions and Links==== ====Course Descriptions and Links====
  
-Click on the course name for availability and further information. For best results, courses should be taken in the recommended order (shown below). Courses 1 and 2 can be taken out of order. Course 3 builds on course 1 and 2. Course 4 builds on course 3, 2, and 1.  +Click on the course name for availability and further information. For best results, courses should be taken in the recommended order (shown below). Courses 1 and 2 can be taken out of order. Course 3 builds on courses 1 and 2. Course 4 builds on and assumes competence with topics in courses 3, 2, and 1.  
 + 
 +**NOTE:** If the link does not lead you to the class, it has not yet been scheduled. Check back at a future date. Also, two new courses in the series are coming in the new year (including Kafka coverage 
 +and Data Engineering).
  
 | 1 | [[https://​www.safaribooksonline.com/​search/?​query=Apache%20Hadoop%2C%20Spark%20and%20Big%20Data%20Foundations&​field=title|Apache Hadoop, Spark and Big Data Foundations]] - A great introduction to the Hadoop Big Data Ecosystem. A non-programming introduction to Hadoop, Spark, HDFS, and MapReduce. (3 hours-1 day)|{{wiki:​foundations-course.png}}| | 1 | [[https://​www.safaribooksonline.com/​search/?​query=Apache%20Hadoop%2C%20Spark%20and%20Big%20Data%20Foundations&​field=title|Apache Hadoop, Spark and Big Data Foundations]] - A great introduction to the Hadoop Big Data Ecosystem. A non-programming introduction to Hadoop, Spark, HDFS, and MapReduce. (3 hours-1 day)|{{wiki:​foundations-course.png}}|
Line 24: Line 27:
   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Linux-Command-Line-V1.0.tgz|Class Notes]] (tgz format)   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Linux-Command-Line-V1.0.tgz|Class Notes]] (tgz format)
   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Linux-Command-Line-V1.0.zip|Class Notes]] (zip format)   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Linux-Command-Line-V1.0.zip|Class Notes]] (zip format)
 +
 +====Zeppelin Notebook for Scalable Data Science with Hadoop and Spark====
 +(Updated 20-Aug-2019)
 +
 +  * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Scalable-Analytics.json|Scalable-Analytics.json]]
  
 ---- ----
Line 30: Line 38:
   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​DOS-Linux-HDFS-cheatsheet.pdf|DOS to Linux/HDFS Cheat-sheet]]   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​DOS-Linux-HDFS-cheatsheet.pdf|DOS to Linux/HDFS Cheat-sheet]]
   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​ericg_vi-editor.bw.pdf|vi (visual editor) Cheat-sheet]]   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​ericg_vi-editor.bw.pdf|vi (visual editor) Cheat-sheet]]
 +  * [[https://​www.cs.colostate.edu/​helpdocs/​vi.html|Additional help with vi]]
  
 ---- ----
  
-====Linux Hadoop Minimal Virtual Machine====+====Linux Hadoop Minimal ​(LHM) Virtual Machine ​Sandbox====
  
-(Current Version 0.42, 03-June-2019) Used for "Hands-on" and "Command ​line" ​courses above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals:​ LiveLessons//​ video mentioned below.+(Current Version 0.42, 03-June-2019) ​**Not ready for Scalable Data Science with Hadoop and Spark (soon)** 
 + 
 +Used for //Hands-on//, //Command ​Line//, and //Scalable Data Science// ​courses above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals:​ LiveLessons//​ video mentioned below.
   * [[Linux Hadoop Minimal Installation Instructions]] (Read First) ​   * [[Linux Hadoop Minimal Installation Instructions]] (Read First) ​
   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Linux-Hadoop-Minimal-0.42.MD5.txt|Linux Hadoop Minimal MD5]]   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Linux-Hadoop-Minimal-0.42.MD5.txt|Linux Hadoop Minimal MD5]]
   * Linux Hadoop Minimal Virtual Machine OVA file [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Linux-Hadoop-Minimal-0.42.ova|US]] [[http://​134.209.239.225/​download/​Linux-Hadoop-Minimal-0.42.ova|Europe]] (3.3G)   * Linux Hadoop Minimal Virtual Machine OVA file [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​Linux-Hadoop-Minimal-0.42.ova|US]] [[http://​134.209.239.225/​download/​Linux-Hadoop-Minimal-0.42.ova|Europe]] (3.3G)
   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​old|Old Versions]]   * [[https://​www.clustermonkey.net/​download/​Hands-on_Hadoop_Spark/​old|Old Versions]]
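The MD5 file listed above can be used to check that the OVA download is intact. The following is a minimal sketch in Python, not part of the course material. It assumes the MD5 text file contains the digest as a 32-character hexadecimal string somewhere in its contents (for example, in the style of `md5sum` output); the function names are hypothetical and chosen only for illustration.

```python
import hashlib
import re
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a multi-gigabyte OVA does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_md5_in_text(text):
    """Return the first 32-character hex string in text (lowercased), or None.
    Assumption: the published MD5 file stores the digest in this form."""
    match = re.search(r"\b[0-9a-fA-F]{32}\b", text)
    return match.group(0).lower() if match else None


def verify_download(ova_path, md5_txt_path):
    """Return True if the file's MD5 matches the digest found in the text file."""
    expected = first_md5_in_text(Path(md5_txt_path).read_text())
    if expected is None:
        raise ValueError("no MD5 digest found in " + str(md5_txt_path))
    return md5_of_file(ova_path) == expected
```

With both files downloaded to the current directory, `verify_download("Linux-Hadoop-Minimal-0.42.ova", "Linux-Hadoop-Minimal-0.42.MD5.txt")` should return `True` for an intact download. A `False` result suggests the OVA should be downloaded again.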
 +
 +----
 +
 +====Cloudera-Hortonworks HDP Sandbox====
 +
 +The Cloudera-Hortonworks HDP Sandbox is a full-featured Hadoop/Spark virtual machine that runs under Docker, VirtualBox, or VMware. Please see [[https://www.cloudera.com/downloads/hortonworks-sandbox.html|Cloudera/Hortonworks HDP Sandbox]] for more information. Because of the number of applications it includes, the HDP Sandbox can require substantial resources to run. 
 +
 +----
 +
 +====Zeppelin Web Notebook====
 +For those taking the //Scalable Data Science// course, a 30-day web-based Zeppelin Notebook is available from [[https://www.basement-supercomputing.com|Basement Supercomputing]]. Please use the [[Sign Up Form]] to get access to the notebook. 
  
 ---- ----
start.1560359180.txt.gz · Last modified: 2019/06/12 17:06 by deadline