Differences

This shows you the differences between two versions of the page.

start [2025/02/15 20:51]
deadline [Linux Hadoop Minimal (LHM) Virtual Machine Sandbox]
start [2026/02/09 21:32] (current)
deadline [About the Presenter]
Line 29: Line 29:
  
 Contact: ''deadline''(you know what goes here)''eadline''(and here)''org''\\  Contact: ''deadline''(you know what goes here)''eadline''(and here)''org''\\ 
-Mast: @thedeadline@mast.hpc.social \\  + 
-Twitter: @thedeadline+  * Mast: @thedeadline@mast.hpc.social \\  
 +  Twitter: @thedeadline 
 +  * BlueSky: @thedeadline.bsky.social
  
 ---- ----
Line 73: Line 75:
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Getting-Started-Kafka-V2.1.tgz|Class Notes]] (tgz format)   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Getting-Started-Kafka-V2.1.tgz|Class Notes]] (tgz format)
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Getting-Started-Kafka-V2.1.zip|Class Notes]] (zip format)   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Getting-Started-Kafka-V2.1.zip|Class Notes]] (zip format)
 +  * Additional [[https://www.clustermonkey.net/download/Eadline/Lehigh/Week-01/Install-KafkaEsque-Local-Mac-M.pdf|note]] for running KafkaEsque on Apple M-based systems (Linux virtual machines running under UTM)
  
 ====Class Notes for Kafka Methods and Administration ==== ====Class Notes for Kafka Methods and Administration ====
Line 113: Line 116:
 Used for //Hands-on//, //Command Line//, and //Scalable Data Science// trainings above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below. Used for //Hands-on//, //Command Line//, and //Scalable Data Science// trainings above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below.
  
-===VERSION 2-8.1: (Current)=== +===VERSION 3.0-beta-2: (Current)===  
 +(Updated Jan-25-2024)
  
 +[[Linux Hadoop Minimal Installation Instructions VERSION 3]]  
  
 +Contents: Rocky Linux 9.7: Python 3.9.25, R 4.5.2, Hadoop 3.3.6, Hive 4.0.1, Apache Spark 3.5.6, Derby 10.14.2.0, Zeppelin 0.11.2, Sqoop 1.4.7, Kafka 3.4.1, HBase 2.6.2, NiFi 1.17.0, KafkaEsque. **Used in all classes, trainings, and workshops after January 1, 2026.**
 +
 +===VERSION 2-8.1: (Previous, no longer supported)=== 
 (Updated Jan-25-2024) (Updated Jan-25-2024)
-CentOS Linux 7.6, Anaconda 3:Python 3.7.4, R 3.6.0, Hadoop 3.3.0, Hive 3.1.2, Apache Spark 2.4.5, Derby 10.14.2.0, Zeppelin 0.8.2, Sqoop 1.4.7, Kafka 2.5.0, HBase 2.4.10, NiFi 1.17.0, KafkaEsque. **Used in all current trainings.** 
  
-[[Linux Hadoop Minimal Installation Instructions VERSION 2]] (Read First) +[[Linux Hadoop Minimal Installation Instructions VERSION 2]]  
  
-==For VirtualBox X86 PC, Mac, Linux Machines== +Contents: CentOS Linux 7.6, Anaconda 3:Python 3.7.4, R 3.6.0, Hadoop 3.3.0, Hive 3.1.2, Apache Spark 2.4.5, Derby 10.14.2.0, Zeppelin 0.8.2, Sqoop 1.4.7, Kafka 2.5.0, HBase 2.4.10, NiFi 1.17.0, KafkaEsque. **Used in all classes, trainings, and workshops prior to January 1, 2026.**
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-V2.0-8.1.ova.MD5.txt|Linux Hadoop Minimal V2.0-8.1 MD5]] +
-  * Linux Hadoop Minimal Virtual Machine V2.0-8.1 OVA file [[http://161.35.229.207/download/Linux-Hadoop-Minimal-V2.0-8.1.ova|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-V2.0-8.1.ova|Europe]] (13.0G) **NOTE:** Chrome may prevent //http// downloads, right click the link, choose "Save Link As" then click "Keep" next to the blue discard box at the bottom of the browser. +
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hadoop-Minimal-Install-Notes-V2.0-8.1.tgz|Hadoop Minimal Build Notes x86 Virtual Box]] (tgz format)
  
-==For UTM Apple Mac M Machines== 
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-V2.0-M8.2.utm.zip.MD5.txt|Linux Hadoop Minimal V2.0-M8.2.zip MD5]] 
-  * Linux Hadoop Minimal Virtual Machine V2.0-8.1 UTM file [[http://161.35.229.207/download/Linux-Hadoop-Minimal-V2.0-M8.2.utm.zip|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-V2.0-M8.2.utm.zip|Europe]] (8.0G) **NOTE:** Chrome may prevent //http// downloads, right click the link, choose "Save Link As" then click "Keep" next to the blue discard box at the bottom of the browser. 
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hadoop-Minimal-Install-Notes-V2.0-M8.1.tgz|Hadoop Minimal Build Notes Mac UTM]] (tgz format) 
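
The download entries above pair each VM image with a published MD5 file. A minimal Python sketch of checking a downloaded image against that file might look like the following. It assumes the MD5 text file contains the digest as a standalone 32-character hex string (as ''md5sum'' output does); the function names are illustrative and are not part of the course materials.

```python
import hashlib
import re
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte image never sits in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(checksum_text):
    """Extract the first 32-character hex token from the checksum text.

    Assumption: the published MD5 file holds the digest as a standalone
    32-hex-digit token; the real file layout is not shown in the source.
    """
    match = re.search(r"\b[0-9a-fA-F]{32}\b", checksum_text)
    if match is None:
        raise ValueError("no MD5 digest found in checksum text")
    return match.group(0).lower()


def verify_download(image_path, checksum_path):
    """Return True when the image's MD5 matches the published digest."""
    published = expected_md5(Path(checksum_path).read_text())
    return md5_of_file(image_path) == published
```

For example, ''verify_download("Linux-Hadoop-Minimal-V2.0-8.1.ova", "Linux-Hadoop-Minimal-V2.0-8.1.ova.MD5.txt")'' should return ''True'' for an intact download, assuming both files were saved locally under those names.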
  
  
----- 
  
-====Cloudera-Hortonworks HDP Sandbox==== 
  
-The Cloudera-Hortonworks HDP Sandbox is a full-featured Hadoop/Spark virtual machine that runs under Docker, VirtualBox, or VMWare. Please see [[https://www.cloudera.com/downloads/hortonworks-sandbox.html|Cloudera/Hortonworks HDP Sandbox]] for more information. Because of the number of applications it includes, the HDP Sandbox can require substantial resources to run. 
  
 ---- ----
-/* 
-====Zeppelin Web Notebook==== 
-For those taking the //Scalable Data Science// training, a 30-day web-based Zeppelin Notebook is available from [[https://www.basement-supercomputing.com|Basement Supercomputing]]. Please use the [[Sign Up Form]] to get access to the notebook. 
  
----- +
-*/+
 ====Other Resources for all Classes==== ====Other Resources for all Classes====
   * Book: [[https://www.clustermonkey.net/Hadoop2-Quick-Start-Guide/| Hadoop® 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop® 2 Ecosystem]]   * Book: [[https://www.clustermonkey.net/Hadoop2-Quick-Start-Guide/| Hadoop® 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop® 2 Ecosystem]]
   * Book: [[https://www.clustermonkey.net/Practical-Data-Science-with-Hadoop-and-Spark|Practical Data Science with Hadoop® and Spark: Designing and Building Effective Analytics at Scale]]     * Book: [[https://www.clustermonkey.net/Practical-Data-Science-with-Hadoop-and-Spark|Practical Data Science with Hadoop® and Spark: Designing and Building Effective Analytics at Scale]]  
   *  Video Tutorial: [[https://www.oreilly.com/videos/data-engineering-foundations/9780137440580|Data Engineering Foundations Part 1: LiveLessons: Using Spark, Hive, and Hadoop® Tools]]     *  Video Tutorial: [[https://www.oreilly.com/videos/data-engineering-foundations/9780137440580|Data Engineering Foundations Part 1: LiveLessons: Using Spark, Hive, and Hadoop® Tools]]  
-  *  Video Tutorial (**NEW**): [[https://www.informit.com/store/data-engineering-foundations-part-2-building-data-pipelines-9780138086992|Data Engineering Foundations Part 2: Building Data Pipelines with Kafka and Nifi ]]+  *  Video Tutorial: [[https://www.informit.com/store/data-engineering-foundations-part-2-building-data-pipelines-9780138086992|Data Engineering Foundations Part 2: Building Data Pipelines with Kafka and Nifi ]] 
 +  * Video Tutorial (**NEW**): [[https://www.oreilly.com/library/view/kafka-essentials-livelessons/9780138176761/|Kafka Essentials LiveLessons: A Quick-Start for Building Effective Data Pipelines ]] 
  
 ---- ----
start.1739652706.txt.gz · Last modified: 2025/02/15 20:51 by deadline