Welcome to the Hadoop® 2 Quick-Start Guide resource page, where you can ask questions about the book and its examples.

Hadoop® 2 Quick-Start Guide

Learn the Essentials of Big Data Computing in the Apache Hadoop® 2 Ecosystem

With Hadoop 2.x and YARN, Hadoop moves beyond MapReduce to become practical for virtually any type of data processing. Hadoop 2.x and the Data Lake concept represent a radical shift away from conventional approaches to data usage and storage. Hadoop 2.x installations offer unmatched scalability and breakthrough extensibility that supports new and existing Big Data analytics processing methods and models.

Hadoop® 2 Quick-Start Guide is the first easy, accessible guide to Apache Hadoop 2.x, YARN, and the modern Hadoop ecosystem. Building on his unsurpassed experience teaching Hadoop and Big Data, author Douglas Eadline covers all the basics you need to know to install and use Hadoop 2 on personal computers or servers, and to navigate the powerful technologies that complement it.

Eadline concisely introduces and explains every key Hadoop 2 concept, tool, and service, illustrating each with a simple "beginning-to-end" example and identifying trustworthy, up-to-date resources for learning more. This guide is ideal if you want to learn about Hadoop 2 without getting mired in technical details. Eadline will bring you up to speed quickly, whether you're a user, admin, devops specialist, programmer, architect, analyst, or data scientist.

NEW: Check out these online training classes to get up to speed quickly:

  • Practical Linux Command Line for Data Engineers and Analysts - Quickly learn the essentials of using the Linux command line on Hadoop/Spark clusters. Move files, run applications, write scripts, and navigate the Linux command line interface used on almost all modern analytics clusters. (3 hours, 1 day)
  • Apache Hadoop, Spark and Big Data Foundations - A great introduction to the Hadoop Big Data ecosystem. A non-programming introduction to Hadoop, Spark, HDFS, and MapReduce. (3 hours, 1 day)
  • Hands-on Introduction to Apache Hadoop and Spark Programming - A hands-on introduction to using Hadoop, Pig, Hive, Sqoop, Flume, Spark, and Zeppelin notebooks. All examples are provided in the course notes. Students can download and run the examples on a "Hadoop Minimal" virtual machine designed for use on a desktop or laptop. (6 hours, 2 days)

All classes provide ample time for interactive questions, and each course can be streamed at any time.

To learn about data analytics with Hadoop, check out the latest book, Practical Data Science with Hadoop and Spark: Designing and Building Effective Analytics at Scale.
