start

Differences

This shows you the differences between two versions of the page.

start [2022/12/08 16:15]
deadline [Welcome to the Effective Data Pipelines Series]
start [2024/01/29 21:19] (current)
deadline [Linux Hadoop Minimal (LHM) Virtual Machine Sandbox]
Line 28: Line 28:
 These classes include more in-depth treatment and practical application of the scalable computing tools and examples I cover in the one-day trainings. Consider enrolling in one of these excellent Data Science programs.

-Contact: ''deadline''(you know what goes here)''eadline''(and here)''org''
+Contact: ''deadline''(you know what goes here)''eadline''(and here)''org''\\
+Mast: @thedeadline@mast.hpc.social \\
+Twitter: @thedeadline
 ----

-====Class Notes for Linux Command Line Quick Start====
-(Updated 14-Jul-2021
-  * [[First Steps for Linux Command Line Training]]
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-CL-Quick-Start-V1.0.tgz|Class Notes]] (tgz format)
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-CL-Quick-Start-V1.0.zip|Class Notes]] (zip format)
+====Class Notes for Bash Programming Quick-start====
+(Updated 19-Apr-2023)
+  * [[First Steps for Bash Programming Training]]
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Bash-Programming-Quick-start-v1.tgz|Class Notes]] (tgz format)
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Bash-Programming-Quick-start-v1.zip|Class Notes]] (zip format)
+
  
  
Line 57: Line 62:
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Up_Running_Kubernetes-V1.3.zip|Class Notes]] (zip format)

+/*
 ====Class Notes for Implementing an Edge Computing Apache Kafka Inference Engine====
 (Updated 14-Apr-2021)
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Edge-Kafka-Keras-V2.0.tgz|Class Notes]] (tgz format)
   * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Edge-Kafka-Keras-V2.0.zip|Class Notes]] (zip format)
+*/
 ====Class Notes for Getting Started with Kafka ====
 (Updated **09-Aug-2022** - fixes typos)
Line 69: Line 75:
  
 ====Class Notes for Kafka Methods and Administration ====
-(Update **06-Oct-2022** - some code and typo fixes)
+(Update **20-Mar-2023** - some code and typo fixes)
   * [[First Steps for Kafka Methods and Administration]]
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Kafka-Methods-and-Administration-V1.1.tgz|Class Notes]] (tgz format)
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Kafka-Methods-and-Administration-V1.1.zip|Class Notes]] (zip format)
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Kafka-Methods-and-Administration-V1.2.tgz|Class Notes]] (tgz format)
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Kafka-Methods-and-Administration-V1.2.zip|Class Notes]] (zip format
+
+====Class Notes for Scalable PySpark for Data Science ====
+(Update **07-Jan-2024**)
+  * [[First Steps for Scalable PySpark for Data Science]]
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-PySpark-v1.tgz|Class Notes]] (tgz format)
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-PySpark-v1.zip|Class Notes]] (zip format)
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Zeppelin-Notebooks/Scalable_PySpark_with_CSV_Files_and_Hive_Tables.json| PySpark for Data Science Zeppelin Notebook]] (Right Click, Save Link As ...)

 === Old Notes ====
Line 100: Line 113:
 Used for //Hands-on//, //Command Line//, and //Scalable Data Science// trainings above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below.

-===VERSION 2-beta8: (Current)===
-=== IMPORTANT: VirtualBox will not work on the new Apple M1 based systems ====
+===VERSION 2-8.1: (Current)===
+

-(Updated Aug-08-2022)
+
+(Updated Jan-25-2024)
 CentOS Linux 7.6, Anaconda 3:Python 3.7.4, R 3.6.0, Hadoop 3.3.0, Hive 3.1.2, Apache Spark 2.4.5, Derby 10.14.2.0, Zeppelin 0.8.2, Sqoop 1.4.7, Kafka 2.5.0, HBase 2.4.10, NiFi 1.17.0, KafkaEsque. **Used in all current trainings.**

-  * [[Linux Hadoop Minimal Installation Instructions VERSION 2]] (Read First)
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-V2.0-beta8.MD5.txt|Linux Hadoop Minimal V2.0-beta8 MD5]]
-  * Linux Hadoop Minimal Virtual Machine V2.0-beta8 OVA file [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-V2.0-beta8.ova|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-V2.0-beta8.ova|Europe]] (11.0G) **NOTE:** Chrome may prevent //http// downloads, right click the link, choose "Save Link As" then click "Keep" next to the blue discard box at the bottom of the browser.
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hadoop-Minimal-Install-Notes-V2-beta8.tgz|Hadoop Minimal Build Notes (tgz format)]]
-
+[[Linux Hadoop Minimal Installation Instructions VERSION 2]] (Read First)
+
+==For VirtualBox X86 PC, Mac, Linux Machines==
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-V2.0-8.1.ova.MD5.txt|Linux Hadoop Minimal V2.0-8.1MD5]]
+  * Linux Hadoop Minimal Virtual Machine V2.0-8.1 OVA file [[http://161.35.229.207/download/Linux-Hadoop-Minimal-V2.0-8.1.ova|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-V2.0-8.1.ova|Europe]] (13.0G) **NOTE:** Chrome may prevent //http// downloads, right click the link, choose "Save Link As" then click "Keep" next to the blue discard box at the bottom of the browser.
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hadoop-Minimal-Install-Notes-V2.0-8.1.tgz|Hadoop Minimal Build Notes x86 Virtual Box]] (tgz format)

-[[OLD VERSION 0.42]] (Deprecated)
+==For UTM Apple Mac M Machines==
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-V2.0-M8.1.utm.zip.MD5.txt|Linux Hadoop Minimal V2.0-M8.1.zip MD5]]
+  * Linux Hadoop Minimal Virtual Machine V2.0-8.1 UTM file [[http://161.35.229.207/download/Linux-Hadoop-Minimal-V2.0-M8.1.utm.zip|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-V2.0-M8.1.utm.zip|Europe]] (8.0G) **NOTE:** Chrome may prevent //http// downloads, right click the link, choose "Save Link As" then click "Keep" next to the blue discard box at the bottom of the browser.
+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hadoop-Minimal-Install-Notes-V2.0-M8.1.tgz|Hadoop Minimal Build Notes Mac UTM]] (tgz format)
  
  
Line 142: Line 159:
 ----

-**Unless otherwise noted, all training content, notes, and examples (c) Douglas Eadline 2019, 2020, 2022 All rights reserved.**
+**Unless otherwise noted, all training content, notes, and examples (c) Douglas Eadline 2019-2024 All rights reserved.**
  
  
start.1670516103.txt.gz · Last modified: 2022/12/08 16:15 by deadline