This shows you the differences between two versions of the page.
start [2019/06/11 21:02] deadline |
start [2019/12/04 14:15] (current) deadline removed notebook |
||
---|---|---|---|
Line 6: | Line 6: | ||
====Course Descriptions and Links==== | ====Course Descriptions and Links==== | ||
- | Click on the course name for availability and further information. For best results, courses should be taken in the recommended order (shown below). Courses 1 and 2 can be taken out of order. Course 3 builds on course 1 and 2. Course 4 builds on course 3, 2, and 1. | + | Click on the course name for availability and further information. For best results, courses should be taken in the recommended order (shown below). Courses 1 and 2 can be taken out of order. Course 3 builds on courses 1 and 2. Course 4 builds on and assumes competence with the topics in courses 3, 2, and 1. |
+ | |||
+ | **NOTE:** If the link does not lead you to the class, it has not yet been scheduled. Check back at a future date. Also, two new courses in the series are coming in the new year (including Kafka coverage | ||
+ | and Data Engineering). | ||
| 1 | [[https://www.safaribooksonline.com/search/?query=Apache%20Hadoop%2C%20Spark%20and%20Big%20Data%20Foundations&field=title|Apache Hadoop, Spark and Big Data Foundations]] - A great introduction to the Hadoop Big Data Ecosystem. A non-programming introduction to Hadoop, Spark, HDFS, and MapReduce. (3 hours-1 day)|{{wiki:foundations-course.png}}| | | 1 | [[https://www.safaribooksonline.com/search/?query=Apache%20Hadoop%2C%20Spark%20and%20Big%20Data%20Foundations&field=title|Apache Hadoop, Spark and Big Data Foundations]] - A great introduction to the Hadoop Big Data Ecosystem. A non-programming introduction to Hadoop, Spark, HDFS, and MapReduce. (3 hours-1 day)|{{wiki:foundations-course.png}}| | ||
Line 24: | Line 27: | ||
* [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Command-Line-V1.0.tgz|Class Notes]] (tgz format) | * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Command-Line-V1.0.tgz|Class Notes]] (tgz format) | ||
* [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Command-Line-V1.0.zip|Class Notes]] (zip format) | * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Command-Line-V1.0.zip|Class Notes]] (zip format) | ||
+ | |||
+ | ====Zeppelin Notebook for Scalable Data Science with Hadoop and Spark==== | ||
+ | (Updated 20-Aug-2019) | ||
+ | |||
+ | * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-Analytics.json|Scalable-Analytics.json]] | ||
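Before importing the downloaded notebook into Zeppelin, it can be handy to check what it contains. The following is a minimal sketch, assuming the file follows the usual Zeppelin note export layout (a top-level ''paragraphs'' list whose entries may carry ''title'' and ''text'' fields). The helper names ''load_note'' and ''summarize_note'' are hypothetical and are not part of the course materials.

```python
import json
import sys


def load_note(path):
    """Read a Zeppelin note export (JSON) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_note(note):
    """Return (index, title, first line of text) for each paragraph.

    Assumes the note has a top-level "paragraphs" list; missing
    fields are reported as empty strings rather than raising.
    """
    summary = []
    for index, paragraph in enumerate(note.get("paragraphs", [])):
        text = (paragraph.get("text") or "").strip()
        first_line = text.splitlines()[0] if text else ""
        summary.append((index, paragraph.get("title", ""), first_line))
    return summary


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python inspect_note.py Scalable-Analytics.json
    for row in summarize_note(load_note(sys.argv[1])):
        print(row)
```

The first line of each paragraph is usually the interpreter directive, so the summary gives a quick view of which interpreters the notebook expects.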
---- | ---- | ||
Line 30: | Line 38: | ||
* [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/DOS-Linux-HDFS-cheatsheet.pdf|DOS to Linux/HDFS Cheat-sheet]] | * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/DOS-Linux-HDFS-cheatsheet.pdf|DOS to Linux/HDFS Cheat-sheet]] | ||
* [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/ericg_vi-editor.bw.pdf|vi (visual editor) Cheat-sheet]] | * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/ericg_vi-editor.bw.pdf|vi (visual editor) Cheat-sheet]] | ||
+ | * [[https://www.cs.colostate.edu/helpdocs/vi.html|Additional help with vi]] | ||
---- | ---- | ||
- | ====Linux Hadoop Minimal Virtual Machine==== | ||
- | (Current Version 0.42, 03-June-2019) Used for "Hands-on" and "Command line" courses above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below. | + | ====Linux Hadoop Minimal (LHM) Virtual Machine Sandbox==== |
+ | |||
+ | (Current Version 0.42, 03-June-2019) **Not yet ready for //Scalable Data Science with Hadoop and Spark// (coming soon)** | ||
+ | |||
+ | Used for //Hands-on//, //Command Line//, and //Scalable Data Science// courses above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below. | ||
* [[Linux Hadoop Minimal Installation Instructions]] (Read First) | * [[Linux Hadoop Minimal Installation Instructions]] (Read First) | ||
* [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-0.42.MD5.txt|Linux Hadoop Minimal MD5]] | * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-0.42.MD5.txt|Linux Hadoop Minimal MD5]] | ||
- | * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-0.42.ova|Linux Hadoop Minimal Virtual Machine OVA file]] (3.3G in size) | + | * Linux Hadoop Minimal Virtual Machine OVA file [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-0.42.ova|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-0.42.ova|Europe]] (3.3G) |
* [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/old|Old Versions]] | * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/old|Old Versions]] | ||
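Because the OVA file is large (3.3G), it is worth comparing the download against the published MD5 file before importing it. The following is a minimal sketch, assuming the MD5 text file contains the 32-character hexadecimal digest as a whitespace-separated token; the exact layout of that file is not shown on this page. The helper names are hypothetical.

```python
import hashlib
import sys
from pathlib import Path

HEX_DIGITS = set("0123456789abcdef")


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_text):
    """Return the first 32-hex-digit token found in the text (assumed format)."""
    for token in md5_text.split():
        token = token.lower()
        if len(token) == 32 and set(token) <= HEX_DIGITS:
            return token
    raise ValueError("no MD5 digest found in text")


def verify(ova_path, md5_path):
    """True if the file's MD5 matches the digest listed in the MD5 file."""
    return md5_of_file(ova_path) == expected_md5(Path(md5_path).read_text())


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python verify_ova.py Linux-Hadoop-Minimal-0.42.ova Linux-Hadoop-Minimal-0.42.MD5.txt
    print("OK" if verify(sys.argv[1], sys.argv[2]) else "MISMATCH")
```

Reading in chunks keeps memory use flat regardless of file size. A mismatch usually means an incomplete or corrupted download, in which case fetching the file again is the simplest fix.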
---- | ---- | ||
+ | |||
+ | ====Cloudera-Hortonworks HDP Sandbox==== | ||
+ | |||
+ | The Cloudera-Hortonworks HDP Sandbox is a full-featured Hadoop/Spark virtual machine that runs under Docker, VirtualBox, or VMware. Please see [[https://www.cloudera.com/downloads/hortonworks-sandbox.html|Cloudera/Hortonworks HDP Sandbox]] for more information. Because of the number of applications it includes, the HDP Sandbox can require substantial resources to run. | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ====Zeppelin Web Notebook==== | ||
+ | For those taking the //Scalable Data Science// course, a 30-day web-based Zeppelin Notebook is available from [[https://www.basement-supercomputing.com|Basement Supercomputing]]. Please use the [[Sign Up Form]] to get access to the notebook. | ||
+ | |||
+ | ---- | ||
+ | |||
====Other Resources for all Classes==== | ====Other Resources for all Classes==== | ||
* Book: [[https://www.clustermonkey.net/Hadoop2-Quick-Start-Guide/| Hadoop® 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop® 2 Ecosystem]] | * Book: [[https://www.clustermonkey.net/Hadoop2-Quick-Start-Guide/| Hadoop® 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop® 2 Ecosystem]] | ||
* Video Tutorial: [[https://www.safaribooksonline.com/library/view/hadoop-and-spark/9780134770871|Hadoop® and Spark Fundamentals: LiveLessons]] | * Video Tutorial: [[https://www.safaribooksonline.com/library/view/hadoop-and-spark/9780134770871|Hadoop® and Spark Fundamentals: LiveLessons]] | ||
* Book: [[https://www.clustermonkey.net/Practical-Data-Science-with-Hadoop-and-Spark|Practical Data Science with Hadoop® and Spark: Designing and Building Effective Analytics at Scale]] | * Book: [[https://www.clustermonkey.net/Practical-Data-Science-with-Hadoop-and-Spark|Practical Data Science with Hadoop® and Spark: Designing and Building Effective Analytics at Scale]] | ||
- | |||
---- | ---- |