Differences

This shows you the differences between two versions of the page.

start [2019/08/23 17:58]
deadline added additional vi link
start [2020/01/03 17:22]
deadline link updates
Line 1: Line 1:
-=====Welcome to Scalable Analytics with Apache Hadoop and Spark=====+=====Welcome to Effective Data Pipelines Series ===== 
 +(previously Scalable Analytics with Apache Hadoop and Spark)
  
-**(The four essential courses on the path to +**(The six essential courses on the path to 
-scalable data science nirvana--or at least a good start)**+scalable data science pipelines nirvana--or at least a good start)**
  
 ====Course Descriptions and Links==== ====Course Descriptions and Links====
Line 8: Line 9:
 Click on the course name for availability and further information. For best results, courses should be taken in the recommended order (shown below).  Courses 1 and 2 can be taken out of order. Course 3 builds on courses 1 and 2. Course 4 builds on and assumes competence with topics in courses 3, 2, and 1.  Click on the course name for availability and further information. For best results, courses should be taken in the recommended order (shown below).  Courses 1 and 2 can be taken out of order. Course 3 builds on courses 1 and 2. Course 4 builds on and assumes competence with topics in courses 3, 2, and 1. 
  
-| 1 | [[https://www.safaribooksonline.com/search/?query=Apache%20Hadoop%2C%20Spark%20and%20Big%20Data%20Foundations&field=title|Apache Hadoop, Spark and Big Data Foundations]] - A great introduction to the Hadoop Big Data Ecosystem. A non-programming introduction to Hadoop, Spark, HDFS, and MapReduce. (3 hours-1 day)|{{wiki:foundations-course.png}}|+**NOTE:** If the link does not lead you to the class, it has not yet been scheduled. Check back at a future date. Also two new courses in the series are coming in the new year (including Kafka coverage 
 +and Data Engineering). 
 + 
 +| 1 | [[https://www.oreilly.com/search/?query=Apache%20Hadoop%2C%20Spark%2C%20and%20Kafka%20Foundations%3A%20Effective%20Data%20Pipelines&extended_publisher_data=true&highlight=true&include_assessments=false&include_case_studies=true&include_courses=true&include_orioles=true&include_playlists=true&include_collections=true&include_notebooks=true&is_academic_institution_account=false&source=user&formats=live%20online%20training&sort=relevance&facet_json=true&page=0|Apache Hadoop, Spark, and Kafka Foundations: Effective Data Pipelines]] - A great introduction to the Hadoop Big Data Ecosystem with Spark and Kafka. A non-programming introduction to Hadoop, Spark, HDFS, MapReduce, and Kafka. (3 hours-1 day)|{{wiki:foundations-course.png}}|
 | 2 |[[https://www.oreilly.com/search/?query=Practical%20Linux%20Command%20Line%20for%20Data%20Engineers%20and%20Analysts%20Eadline| Practical Linux Command Line for Data Engineers and Analysts]] - Quickly learn the essentials of using the Linux command line on Hadoop/Spark clusters. Move files, run applications, write scripts and navigate the Linux command line interface used on almost all modern analytics clusters. Students can download and run examples on the "Linux Hadoop Minimal" virtual machine, see below. (3 hours-1 day)|{{wiki: command-line-course.png}}| | 2 |[[https://www.oreilly.com/search/?query=Practical%20Linux%20Command%20Line%20for%20Data%20Engineers%20and%20Analysts%20Eadline| Practical Linux Command Line for Data Engineers and Analysts]] - Quickly learn the essentials of using the Linux command line on Hadoop/Spark clusters. Move files, run applications, write scripts and navigate the Linux command line interface used on almost all modern analytics clusters. Students can download and run examples on the "Linux Hadoop Minimal" virtual machine, see below. (3 hours-1 day)|{{wiki: command-line-course.png}}|
 | 3 |[[https://www.safaribooksonline.com/search/?query=Hands-on%20Introduction%20to%20Apache%20Hadoop%20and%20Spark%20Programming&field=title|Hands-on Introduction to Apache Hadoop and Spark Programming]] - A hands-on introduction to using Hadoop, Pig, Hive, Sqoop, Spark and Zeppelin notebooks. Students can download and run examples on the "Linux Hadoop Minimal" virtual machine, see below. (6 hours-2 days)|{{wiki: hands-on-course.png}}| | 3 |[[https://www.safaribooksonline.com/search/?query=Hands-on%20Introduction%20to%20Apache%20Hadoop%20and%20Spark%20Programming&field=title|Hands-on Introduction to Apache Hadoop and Spark Programming]] - A hands-on introduction to using Hadoop, Pig, Hive, Sqoop, Spark and Zeppelin notebooks. Students can download and run examples on the "Linux Hadoop Minimal" virtual machine, see below. (6 hours-2 days)|{{wiki: hands-on-course.png}}|
start.txt · Last modified: 2024/01/29 21:19 by deadline