The following steps explain how to load and start the Linux Hadoop Minimal Virtual Machine (LHM-VM) and download the course notes files. A full and expanded explanation is provided as part of the class. The following steps are a “quick start.”
If you are using Linux or Mac, a terminal application is available that includes an “ssh client.”
If you are using Windows, you will need an “ssh client.” Either PuTTY or MobaXterm will work, and both are available at no cost. (MobaXterm is recommended.)
See the Linux Hadoop Minimal Installation Instructions for how to start the Linux Hadoop Minimal Virtual Machine (LHM-VM).
Open a terminal (using PuTTY or MobaXterm on Windows) and enter the following to log in to the LHM-VM as user “hands-on” (password “minimal”):
ssh hands-on@127.0.0.1 -p 2222
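If the login is refused, it can help to check whether anything is answering on port 2222 before retrying. The following is a minimal sketch, not part of the course materials: `port_open` is a hypothetical helper name, and it assumes a bash shell built with `/dev/tcp` redirection support (common on Linux and Mac, but not guaranteed).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the course materials.
# Succeeds if a TCP connection to host:port can be opened.
# Assumes bash with /dev/tcp redirection support.
port_open() {
    local host="$1" port="$2"
    # The connection is opened on file descriptor 3 inside a subshell,
    # so it is closed again as soon as the subshell exits.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example use with the address and port from the login command:
#   if port_open 127.0.0.1 2222; then
#       ssh hands-on@127.0.0.1 -p 2222
#   else
#       echo "Port 2222 is not answering; is the LHM-VM running?" >&2
#   fi
```

A failed check usually means the virtual machine has not finished booting or is not running.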
Once you are logged in to the LHM-VM, you should see the following prompt string:
[hands-on@localhost ~]$
The [hands-on@localhost ~] will not be shown in the rest of the class documentation. A $ will indicate the prompt string for input.
To download the Data Engineering at Scale class notes into the LHM-VM, pull down and extract the course files (from inside the LHM-VM) as shown below:
$ wget --no-check-certificate https://www.clustermonkey.net/download/Hands-on_Hadoop_Spark/Data-Engineering-at-Scale-V1.0.tgz
$ tar xvzf Data-Engineering-at-Scale-V1.0.tgz
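An interrupted download leaves a truncated archive that tar cannot extract. As a minimal sketch, the hypothetical helper `safe_extract` below (not part of the course materials) lists the archive first and extracts it only if the listing succeeds. It assumes a tar that accepts the `z` option, as the tar command above does.

```shell
# Hypothetical helper, not part of the course materials.
# Extracts a .tgz archive only if tar can read it from start to end.
safe_extract() {
    archive="$1"
    if [ ! -f "$archive" ]; then
        echo "missing: $archive" >&2
        return 1
    fi
    # Listing the contents reads the whole compressed stream,
    # so a truncated or damaged download fails here.
    if ! tar tzf "$archive" >/dev/null 2>&1; then
        echo "corrupt or incomplete: $archive" >&2
        return 2
    fi
    tar xzf "$archive"
}

# Example use inside the LHM-VM:
#   safe_extract Data-Engineering-at-Scale-V1.0.tgz
```

If the helper reports a corrupt archive, deleting the .tgz file and running wget again is the simplest fix.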
If the file extracted correctly you should see:
$ ls
Data-Engineering-at-Scale-V1.0  Data-Engineering-at-Scale-V1.0.tgz
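The same check can be scripted. This is a sketch only: `check_course_files` is a hypothetical name, and it assumes the archive unpacks into a directory named after it, as the listing suggests.

```shell
# Hypothetical helper, not part of the course materials.
# Assumes the archive unpacks into a directory named after it.
check_course_files() {
    base="${1:-Data-Engineering-at-Scale-V1.0}"
    if [ -d "$base" ] && [ -f "$base.tgz" ]; then
        echo "OK: $base and $base.tgz are present"
    else
        echo "Missing: expected $base and $base.tgz in $(pwd)" >&2
        return 1
    fi
}

# Example use, from the directory where the download was made:
#   check_course_files
```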
These steps will be performed as part of the class.