start

Differences

This shows you the differences between two versions of the page.

start [2020/01/22 21:14]
deadline description updates
start [2024/01/29 21:19] (current)
deadline [Linux Hadoop Minimal (LHM) Virtual Machine Sandbox]
Line 1: Line 1:
 =====Welcome to the Effective Data Pipelines Series ===== =====Welcome to the Effective Data Pipelines Series =====
-(previously Scalable Analytics with Apache Hadoop and Spark) 
  
-**The six essential courses on the path to +This page provides many of the resources for books, videos and on-line trainings.
-scalable data science pipelines nirvana--or at least a good start**+
  
-====Course Descriptions and Links====+You can find more information on all current [[https://www.safaribooksonline.com/search/?query=eadline| video and book titles and upcoming on-line trainings]] from O'Reilly. 
 +====Training Descriptions====
  
-Click on the course name for availability and further information. New courses are being added. For best results, courses should be taken in the recommended order (shown below).  Courses 1 and (2&3) can be taken out of order. Course 4 builds on courses 1 and (2&3). Course 5 builds-on and assumes competence with topics in courses 4, (3&2), and 1. +Many of the trainings are run on a regular basis. Check the [[https://www.safaribooksonline.com/search/?query=eadline|O'Reilly]] site for upcoming live events. Where possible, trainings are demonstrated using the [[start#Linux Hadoop Minimal (LHM) Virtual Machine Sandbox|freely available virtual machine]]. To facilitate continued exploration using the virtual machine, training notes (text files) are available below.
  
-**NOTE:** If the link does not lead you to the class, it has not yet been scheduled. Check back at a future date. Also two new courses in the series are coming in the new year (including Kafka coverage +  * **Apache Hadoop, Spark, and Kafka Foundations** - (POPULAR) A great introduction to the Hadoop Big Data Ecosystem with Spark and Kafka. A non-programming introduction to Hadoop, Spark, HDFS, MapReduce, and Kafka. After completing the workshop, attendees will have a workable understanding of the Hadoop/Spark/Kafka technical value proposition and a solid background for the following trainings in the Effective Data Pipelines Series. (3 hours-1 day) 
-and Data Engineering).+  *  **Bash Programming Quick-start for Data Science** - Quickly learn the essentials of using the Linux command line for Data Science at scale. Download/upload files, run applications, monitor resources, edit files, write scripts, and navigate the Linux command line interface used on almost all modern analytics clusters. Students can download and run examples on the "Linux Hadoop Minimal" virtual machine, see below. (4 hours-1 day) 
 +  * **Hands-on Introduction to Apache Hadoop, Spark, and Kafka Programming** - (POPULAR) A hands-on introduction to using Hadoop, Hive, Sqoop, Spark, Kafka and Zeppelin notebooks. Students can download and run examples on the "Linux Hadoop Minimal" virtual machine, see below. (6 hours-2 days) 
 +  * **Getting Started with Kafka** - (POPULAR) Apache Kafka is designed to manage data flow by decoupling the data source from the destination. Kafka can provide a robust data buffer or broker that can help create and manage data pipelines. In this training, the basic Kafka data broker design and operation are explained and illustrated using both the command line and a GUI.  
 +  * **Kafka Methods and Administration** - Additional Kafka features that go beyond those presented in Getting Started with Kafka will be addressed. These topics include writing to databases and HDFS, producer and consumer options, working with Kafka Connect, and Kafka installation and administration.    
 +  * **Up and Running with Kubernetes** - Kubernetes can be considered a container operating system where application resource and storage needs are matched to an underlying cluster environment (either virtual or real). This course provides background for users coupled with a practical hands-on introduction to Kubernetes. 
 +  * **Data Engineering at Scale with Apache Hadoop and Spark** - As part of the Effective Data Pipelines series, this training provides background and examples on data "munging" or transforming raw data into a form that can be used with analytical modeling libraries. Also referred to as data wrangling, transformation, or ETL, these techniques are often performed "at scale" on a real cluster using Hadoop and Spark. (3 hours-1 day) 
 +  * **Scalable Analytics with Apache Hadoop, Spark, and Kafka** - A complete data science investigation requires different tools and strategies. In this training, learn how to apply Hadoop, Spark, and Kafka tools to predict airline delays. All programming will be done using Hadoop, Spark, and Kafka with the Zeppelin web notebook on a four node cluster. The notebook will be made available for download so students can reproduce the examples. (3 hours-1 day)
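The "decoupling the data source from the destination" idea in the Getting Started with Kafka description can be sketched in a few lines of plain Python. This is only an illustration of the buffer/broker concept and does not use Kafka or any Kafka client API: the standard-library ''queue.Queue'' stands in for the broker, and the names ''producer'', ''consumer'', ''run_pipeline'', and ''max_buffered'' are hypothetical, chosen for this example.

```python
import queue
import threading


def producer(buffer, records):
    """Data source: writes records to the buffer and knows nothing about the reader."""
    for record in records:
        buffer.put(record)  # blocks only when the buffer is full
    buffer.put(None)  # sentinel marking the end of the stream


def consumer(buffer, sink):
    """Data destination: reads from the buffer at its own pace."""
    while True:
        record = buffer.get()
        if record is None:  # sentinel seen, stop reading
            break
        sink.append(record)


def run_pipeline(records, max_buffered=2):
    """Connect a source and a destination through a bounded in-memory buffer."""
    buffer = queue.Queue(maxsize=max_buffered)  # stand-in for the broker
    sink = []
    source_thread = threading.Thread(target=producer, args=(buffer, records))
    dest_thread = threading.Thread(target=consumer, args=(buffer, sink))
    source_thread.start()
    dest_thread.start()
    source_thread.join()
    dest_thread.join()
    return sink
```

Calling ''run_pipeline(["a", "b", "c"])'' moves the three records from the source to the destination in order. The point of the sketch is that neither side calls the other directly; a real broker adds persistence, multiple readers, and replay on top of this basic shape.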
  
-| 1 | [[https://www.oreilly.com/search/?query=Apache%20Hadoop%2C%20Spark%2C%20and%20Kafka%20Foundations%3A%20Effective%20Data%20Pipelines&extended_publisher_data=true&highlight=true&include_assessments=false&include_case_studies=true&include_courses=true&include_orioles=true&include_playlists=true&include_collections=true&include_notebooks=true&is_academic_institution_account=false&source=user&formats=live%20online%20training&sort=relevance&facet_json=true&page=0|Apache Hadoop, Spark, and Kafka Foundations: Effective Data Pipelines]] - A great introduction to the Hadoop Big Data Ecosystem with Spark and Kafka. A non-programming introduction to Hadoop, Spark, HDFS, MapReduce, and Kafka. After completing the workshop attendees will gain a workable understanding of the Hadoop/Spark/Kafka technical value proposition and provide a solid background for following courses in the Effective Data Pipelines Series (3 hours-1 day)|{{:wiki:oreilly-logo-foundations-dp.png?400}}| +---- 
-| 2 |[[https://www.oreilly.com/search/?query=Douglas%20Eadline&extended_publisher_data=true&highlight=true&include_assessments=false&include_case_studies=true&include_courses=true&include_orioles=true&include_playlists=true&include_collections=true&include_notebooks=true&is_academic_institution_account=false&source=user&formats=live%20online%20training&sort=relevance&facet_json=true&page=0| Beginning Linux Command Line for Data Engineers and Analysts: Effective Data Pipelines]] - Quickly learn the essentials of using the Linux command line on Hadoop/Spark clusters. Download/upload files, run applications, monitor resources, and navigate the Linux command line interface used on almost all modern analytics clusters. Students can download and run examples on the "Linux Hadoop Minimal" virtual machine, see below. (3 hours-1 day)|{{:wiki:oreilly-begin-command-line-dp-logo.png?400}}| + 
-|3|[[https://www.oreilly.com/search/?query=Intermediate%20Linux%20Command%20Line%20for%20Data%20Engineers%20and%20Analysts%3A%20Effective%20Data%20Pipelines&extended_publisher_data=true&highlight=true&include_assessments=false&include_case_studies=true&include_courses=true&include_orioles=true&include_playlists=true&include_collections=true&include_notebooks=true&is_academic_institution_account=false&source=user&formats=live%20online%20training&sort=relevance&facet_json=true&page=0| Intermediate Linux Command Line for Data Engineers and Analysts: Effective Data Pipelines]] - This course is a continuation of Beginning Linux Command Line for Data Engineers and Analysts covering more advanced topics. Coverage includes: Linux Analytics, Moving Data into Hadoop HDFS, Running Command Line Analytics Tools, Bash Scripting Basics, and Creating Bash Scripts|{{:wiki:oreilly-inter-command-line-dp-logo.png?400}}| +==== About the Presenter ==== 
-| 4 |[[https://www.safaribooksonline.com/search/?query=Hands-on%20Introduction%20to%20Apache%20Hadoop%20Spark%20and%20Kafka%20Programming&field=title|Hands-on Introduction to Apache Hadoop, Spark, and Kafka Programming]] - A hands-on introduction to using Hadoop, Pig, Hive, Sqoop, Spark and Zeppelin notebooks. Students can download and run examples on the "Linux Hadoop Minimal" virtual machine, see below. (6 hours-2 days)|{{wiki: hands-on-course.png}}| +**Douglas Eadline** began his career as an Analytical Chemist with an interest in computer methods. Starting with the first Linux Cluster Beowulf How-to document, Doug has written instructional documents covering many aspects of Linux High Performance Computing (HPC) and scalable data analytics computing. Currently, Doug serves as editor of the //ClusterMonkey.net// website and was previously editor of //ClusterWorld Magazine// and senior HPC Editor for //Linux Magazine//. He is also a writer and consultant to the scalable HPC/Analytics industry. His recent video tutorials and books include the Hadoop Fundamentals LiveLessons (Addison Wesley) video, Hadoop 2 Quick Start Guide (Addison Wesley), High Performance Computing for Dummies (Wiley), and Practical Data Science with Hadoop and Spark (Co-author, Addison Wesley). Doug also designs [[https://www.limulus-computing.com|high performance desk-side clusters]] for both HPC and data analytics. 
-| 5 |[[https://www.safaribooksonline.com/search/?query=Hands-on%20Introduction%20to%20Apache%20Hadoop%20and%20Spark%20Programming&field=title|Data Engineering at Scale with Apache Hadoop and Spark]] - As part of the Effective Data Pipelines series, this course provides background and examples on data "munging" or transforming raw data into a form that can be used with analytical modeling libraries. Also referred to as data wrangling, transformation, or ETL these techniques are often performed "at scale" on a real cluster using Hadoop and Spark. (3 hours-1 day)|{{wiki: hands-on-course.png}}| + 
-| 6| [[https://www.oreilly.com/search/?query=Scalable%20Data%20Science%20with%20Hadoop%20and%20Spark%20Eadline|Scalable Data Science with Hadoop and Spark]] - Learn How to Apply Hadoop and Spark tools to Predict Airline Delays. All programming will be done using Hadoop and Spark with the Zeppelin web notebook on a four node cluster. The notebook will be made available for download so student can reproduce the examples. (3 hours-1 day)|{{wiki: scalable-DS-course.png}}|+In addition to the on-line trainings, Doug also teaches graduate-level courses as part of two Masters in Data Science programs: 
 +  * **DS575 Big Data Techniques** as part of an on-line [[https://www.juniata.edu/academics/graduate-programs/data-science.php|Masters in Data Science]] from Juniata College.  
 +  * **DSCI 411 Data Management for Big Data** as part of an in-person and online [[https://engineering.lehigh.edu/academics/graduate/masters/data-science|Masters in Data Science]] from Lehigh University.
  
 +These classes include more in-depth treatment and practical application of the scalable computing tools and examples I cover in the one-day trainings. Consider enrolling in one of these excellent Data Science programs.
  
 +Contact: ''deadline''(you know what goes here)''eadline''(and here)''org''\\ 
 +Mast: @thedeadline@mast.hpc.social \\ 
 +Twitter: @thedeadline
  
 ---- ----
 +
 +====Class Notes for Bash Programming Quick-start====
 +(Updated 19-Apr-2023)
 +  * [[First Steps for Bash Programming Training]]  
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Bash-Programming-Quick-start-v1.tgz|Class Notes]] (tgz format)
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Bash-Programming-Quick-start-v1.zip|Class Notes]] (zip format)
 +
 +
 +
  
 ====Class Notes for Hands-on Introduction to Apache Hadoop and Spark Programming==== ====Class Notes for Hands-on Introduction to Apache Hadoop and Spark Programming====
-(Updated 03-June-2019) +(Updated 13-Oct-2021) 
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hands_On_Hadoop_Spark-V1.5.tgz|Class Notes]] (tgz format) +  * [[First Steps for Hands-on Class]]  
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hands_On_Hadoop_Spark-V1.5.zip|Class Notes]] (zip format)+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hands_On_Hadoop_Spark-V1.5.4.tgz|Class Notes]] (tgz format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hands_On_Hadoop_Spark-V1.5.4.zip|Class Notes]] (zip format)
  
-===Class Notes for Beginning Linux Command Line for Data Engineers and Analysts=== +====Class Notes for Data Engineering at Scale with Apache Hadoop and Spark==== 
-(Updated 22-Jan-2020) +(Updated 17-Dec-2020)
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Begin-Linux-Command-Line-V1.0.tgz|Class Notes]] (tgz format) +
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Begin-Linux-Command-Line-V1.0.zip|Class Notes]] (zip format)+
  
-====Zeppelin Notebook for Scalable Data Science with Hadoop and Spark=== +  * [[First Steps for Data Engineering Class]]   
-(Updated 20-Aug-2019)+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Data-Engineering-at-Scale-V1.1.tgz|Class Notes]] (tgz format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Data-Engineering-at-Scale-V1.1.zip|Class Notes]] (zip format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-Data-Engineering.json|Data Engineering at Scale Zeppelin Notebook]] (Right Click, Save Link As ...)
  
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-Analytics.json|Scalable-Analytics.json]]+====Class Notes for Up and Running with Kubernetes==== 
 +(Updated 07-Sep-2022) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Up_Running_Kubernetes-V1.3.tgz|Class Notes]] (tgz format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Up_Running_Kubernetes-V1.3.zip|Class Notes]] (zip format) 
 + 
 +/* 
 +====Class Notes for Implementing an Edge Computing Apache Kafka Inference Engine==== 
 +(Updated 14-Apr-2021) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Edge-Kafka-Keras-V2.0.tgz|Class Notes]] (tgz format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Edge-Kafka-Keras-V2.0.zip|Class Notes]] (zip format) 
 +*/ 
 +====Class Notes for Getting Started with Kafka ==== 
 +(Updated **09-Aug-2022** - fixes typos)  
 +  * [[First Steps for Getting Started with Kafka]]  
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Getting-Started-Kafka-V2.1.tgz|Class Notes]] (tgz format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Getting-Started-Kafka-V2.1.zip|Class Notes]] (zip format) 
 + 
 +====Class Notes for Kafka Methods and Administration ==== 
 +(Updated **20-Mar-2023** - some code and typo fixes)  
 +  * [[First Steps for Kafka Methods and Administration]] 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Kafka-Methods-and-Administration-V1.2.tgz|Class Notes]] (tgz format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Kafka-Methods-and-Administration-V1.2.zip|Class Notes]] (zip format) 
 + 
 +====Class Notes for Scalable PySpark for Data Science ==== 
 +(Updated **07-Jan-2024**)  
 +  * [[First Steps for Scalable PySpark for Data Science]] 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-PySpark-v1.tgz|Class Notes]] (tgz format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-PySpark-v1.zip|Class Notes]] (zip format) 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Zeppelin-Notebooks/Scalable_PySpark_with_CSV_Files_and_Hive_Tables.json| PySpark for Data Science Zeppelin Notebook]] (Right Click, Save Link As ...) 
 + 
 +==== Old Notes ==== 
 + 
 +[[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Old-Notes|Old Notes Files can be found here.]] 
 + 
 +====Zeppelin Notebook for Scalable Data Science with Hadoop and Spark==== 
 +(Updated 15-Sep-2021) 
 + 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-Analytics-V2.1.json|Scalable-Analytics-V2.1.json]] New version that uses Hive, Python, and PySpark 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Scalable-Analytics.json|Scalable-Analytics.json]] Old Version that uses Pig, Python, and PySpark
  
 ---- ----
  
-====DOS to Linux and Hadoop HDFS Help:==== +====Supporting Documents (Cheat Sheets)==== 
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/DOS-Linux-HDFS-cheatsheet.pdf|DOS to Linux/HDFS Cheat-sheet]] +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Supporting-Docs/DOS-Linux-HDFS-cheatsheet.pdf|DOS to Linux/HDFS Cheat Sheet]] 
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/ericg_vi-editor.bw.pdf|vi (visual editor) Cheat-sheet]]+  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Supporting-Docs/ericg_vi-editor.bw.pdf|vi (visual editor) Cheat Sheet]]
   * [[https://www.cs.colostate.edu/helpdocs/vi.html|Additional help with vi]]   * [[https://www.cs.colostate.edu/helpdocs/vi.html|Additional help with vi]]
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Supporting-Docs/Hortonworks.CheatSheet.SQLtoHive.pdf|Hortonworks SQL to Hive Cheat Sheet]]
 +  * [[https://tldp.org/LDP/intro-linux/intro-linux.pdf|Introduction to Linux]] 
 +  * [[https://files.fosswire.com/2007/08/fwunixref.pdf|Linux Command Cheat Sheet]]
  
 ---- ----
Line 49: Line 111:
 ====Linux Hadoop Minimal (LHM) Virtual Machine Sandbox==== ====Linux Hadoop Minimal (LHM) Virtual Machine Sandbox====
  
-(Current Version 0.42, 03-June-2019) **Not ready for Scalable Data Science with Hadoop and Spark (soon)**+Used for //Hands-on//, //Command Line//, and //Scalable Data Science// trainings above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below. 
 + 
 +===VERSION 2-8.1: (Current)===  
 + 
 + 
 +(Updated Jan-25-2024) 
 +CentOS Linux 7.6, Anaconda 3:Python 3.7.4, R 3.6.0, Hadoop 3.3.0, Hive 3.1.2, Apache Spark 2.4.5, Derby 10.14.2.0, Zeppelin 0.8.2, Sqoop 1.4.7, Kafka 2.5.0, HBase 2.4.10, NiFi 1.17.0, KafkaEsque. **Used in all current trainings.** 
 + 
 +[[Linux Hadoop Minimal Installation Instructions VERSION 2]] (Read First)  
 + 
 +==For VirtualBox X86 PC, Mac, Linux Machines== 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-V2.0-8.1.ova.MD5.txt|Linux Hadoop Minimal V2.0-8.1 MD5]] 
 +  * Linux Hadoop Minimal Virtual Machine V2.0-8.1 OVA file [[http://161.35.229.207/download/Linux-Hadoop-Minimal-V2.0-8.1.ova|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-V2.0-8.1.ova|Europe]] (13.0G) **NOTE:** Chrome may block //http// downloads. If it does, right-click the link, choose "Save Link As", then click "Keep" next to the blue discard box at the bottom of the browser. 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hadoop-Minimal-Install-Notes-V2.0-8.1.tgz|Hadoop Minimal Build Notes x86 Virtual Box]] (tgz format) 
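Because the virtual machine image is large (13.0G) and is served over plain //http//, it is worth comparing the download against the published MD5 file before importing it. The following is a minimal Python sketch of that check, not part of the official installation instructions. It assumes the MD5 text file follows the common ''md5sum'' layout, with the hex digest as the first whitespace-separated token; that layout is an assumption, as are the helper names ''md5_of_file'', ''expected_digest_from_text'', and ''verify_download''.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks so a multi-gigabyte
    image never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest_from_text(md5_text):
    """Extract the digest from the contents of the MD5 text file.

    Assumption: the digest is the first whitespace-separated token,
    as in md5sum output ("<digest>  <file name>").
    """
    return md5_text.split()[0].lower()


def verify_download(image_path, md5_text):
    """True when the downloaded image matches the published digest."""
    return md5_of_file(image_path) == expected_digest_from_text(md5_text)
```

A typical use would be to read the downloaded MD5 text file into a string and call ''verify_download'' with the path to the downloaded OVA file. A ''False'' result usually means the download was truncated and should be repeated.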
 + 
 +==For UTM Apple Mac M Machines== 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-V2.0-M8.1.utm.zip.MD5.txt|Linux Hadoop Minimal V2.0-M8.1.zip MD5]] 
 +  * Linux Hadoop Minimal Virtual Machine V2.0-8.1 UTM file [[http://161.35.229.207/download/Linux-Hadoop-Minimal-V2.0-M8.1.utm.zip|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-V2.0-M8.1.utm.zip|Europe]] (8.0G) **NOTE:** Chrome may block //http// downloads. If it does, right-click the link, choose "Save Link As", then click "Keep" next to the blue discard box at the bottom of the browser. 
 +  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Hadoop-Minimal-Install-Notes-V2.0-M8.1.tgz|Hadoop Minimal Build Notes Mac UTM]] (tgz format)
  
-Used for //Hands-on//, //Command Line//, and //Scalable Data Science// courses above. Note: This VM can also be used for the //Hadoop and Spark Fundamentals: LiveLessons// video mentioned below. 
-  * [[Linux Hadoop Minimal Installation Instructions]] (Read First)  
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-0.42.MD5.txt|Linux Hadoop Minimal MD5]] 
-  * Linux Hadoop Minimal Virtual Machine OVA file [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Linux-Hadoop-Minimal-0.42.ova|US]] [[http://134.209.239.225/download/Linux-Hadoop-Minimal-0.42.ova|Europe]] (3.3G) 
-  * [[https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/old|Old Versions]] 
  
 ---- ----
Line 64: Line 139:
  
 ---- ----
 +/*
 ====Zeppelin Web Notebook==== ====Zeppelin Web Notebook====
-For those taking the //Scalable Data Science// course a 30-day web-based Zeppelin Notebook is available from [[https://www.basement-supercomputing.com|Basement Supercomputing]]. Please use the [[Sign Up Form]] to get access to the notebook. +For those taking the //Scalable Data Science// training a 30-day web-based Zeppelin Notebook is available from [[https://www.basement-supercomputing.com|Basement Supercomputing]]. Please use the [[Sign Up Form]] to get access to the notebook. 
  
 ---- ----
 +*/
 ====Other Resources for all Classes==== ====Other Resources for all Classes====
   * Book: [[https://www.clustermonkey.net/Hadoop2-Quick-Start-Guide/| Hadoop® 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop® 2 Ecosystem]]   * Book: [[https://www.clustermonkey.net/Hadoop2-Quick-Start-Guide/| Hadoop® 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop® 2 Ecosystem]]
-  * Video Tutorial: [[https://www.safaribooksonline.com/library/view/hadoop-and-spark/9780134770871|Hadoop® and Spark Fundamentals: LiveLessons]] +  * Book: [[https://www.clustermonkey.net/Practical-Data-Science-with-Hadoop-and-Spark|Practical Data Science with Hadoop® and Spark: Designing and Building Effective Analytics at Scale]]   
-  * Book: [[https://www.clustermonkey.net/Practical-Data-Science-with-Hadoop-and-Spark|Practical Data Science with Hadoop® and Spark: Designing and Building Effective Analytics at Scale]]+  *  Video Tutorial: [[https://www.oreilly.com/videos/data-engineering-foundations/9780137440580|Data Engineering Foundations Part 1: LiveLessons: Using Spark, Hive, and Hadoop® Tools]]   
 +  *  Video Tutorial (**NEW**): [[https://www.informit.com/store/data-engineering-foundations-part-2-building-data-pipelines-9780138086992|Data Engineering Foundations Part 2: Building Data Pipelines with Kafka and NiFi]]
  
 ---- ----
Line 79: Line 155:
 ====Contact==== ====Contact====
  
-For further questions or help with the Linux Hadoop Minimal Virtual Machine please email [[http://scr.im/4502|d...@b...g.com]]+For further questions or help with the Linux Hadoop Minimal Virtual Machine please email: ''deadline''(you know what goes here)''eadline''(and here)''org''
  
 ---- ----
  
-**Unless otherwise noted, all course content, notes, and examples (c) Copyright Basement Supercomputing 2019. All rights reserved.** +**Unless otherwise noted, all training content, notes, and examples (c) Douglas Eadline 2019-2024. All rights reserved.**
  
  
start.1579727666.txt.gz · Last modified: 2020/01/22 21:14 by deadline