— |
first_steps_for_scalable_pyspark_for_data_science [2024/01/07 23:08] (current) deadline created |
||
---|---|---|---|
+ | ====== Scalable PySpark for Data Science ====== | ||
+ | The following steps explain how to load and start the Linux Hadoop Minimal Virtual Machine (LHM-VM) and download the course notes files. They are a "quick start"; a full, expanded explanation is provided as part of the class. | ||
+ | |||
+ | If you are using Linux or Mac, a terminal application is available that includes an "ssh client." | ||
+ | |||
+ | If you are using Windows, you will need an "ssh client." | ||
+ | below will work. Both are available at no cost (MobaXterm is recommended). | ||
+ | |||
+ | - [[http:// | ||
+ | - [[http:// | ||
+ | |||
+ | See [[Linux Hadoop Minimal Installation Instructions]] for instructions on how to start the Linux Hadoop Minimal Virtual Machine (LHM-VM). | ||
+ | |||
+ | ==== When the VM is Started ==== | ||
+ | |||
+ | Open a terminal (using '' | ||
+ | graphical tool. | ||
+ | < | ||
+ | ssh hands-on@127.0.0.1 -p 2222 | ||
+ | </ | ||
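If you connect often, the host, port, and user from the command above can be stored in an SSH client configuration entry. The following is a minimal sketch, not part of the course materials: the alias ''lhm-vm'' and the helper name ''lhm_ssh_stanza'' are hypothetical, and the values are copied from the ''ssh'' command shown above.

```shell
#!/bin/sh
# Print an SSH client config stanza for the LHM-VM.
# "lhm-vm" is a hypothetical alias chosen for illustration; the address,
# port, and user are the ones used in the ssh command above.
lhm_ssh_stanza() {
    cat <<'EOF'
Host lhm-vm
    HostName 127.0.0.1
    Port 2222
    User hands-on
EOF
}

# Show the stanza. To use it, append it to your SSH config file:
#   lhm_ssh_stanza >> ~/.ssh/config
lhm_ssh_stanza
```

With the entry in place, ''ssh lhm-vm'' should be equivalent to the longer command. This applies to command-line OpenSSH clients; graphical tools keep their own session settings.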
+ | |||
+ | Once you are logged in to the LHM-VM, you should see the following prompt string: | ||
+ | < | ||
+ | [hands-on@localhost ~]$ | ||
+ | </ | ||
+ | |||
+ | The '' | ||
+ | |||
+ | To download the **Scalable PySpark for Data Science** class notes into the LHM-VM, fetch and extract the course files (from inside the LHM-VM) as shown below: | ||
+ | < | ||
+ | $ wget --no-check-certificate https:// | ||
+ | $ tar xvzf Scalable-PySpark-v1.tgz </ | ||
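An interrupted download can leave a truncated archive, so it may help to check the file before extracting it. The sketch below assumes the archive has already been fetched with ''wget''; the helper name ''extract_notes'' is hypothetical and is not part of the course files.

```shell
#!/bin/sh
# Check that a downloaded .tgz is a readable archive, then extract it.
# extract_notes is a hypothetical helper name used for illustration.
#   $1 = path to the archive
#   $2 = destination directory (optional, defaults to the current directory)
extract_notes() {
    archive="$1"
    dest="${2:-.}"

    if [ ! -f "$archive" ]; then
        echo "missing archive: $archive" >&2
        return 1
    fi

    # "tar tzf" lists the contents without writing any files; it fails
    # when the file is not a complete gzip-compressed tar archive.
    if ! tar tzf "$archive" >/dev/null 2>&1; then
        echo "corrupt or incomplete archive: $archive" >&2
        return 1
    fi

    # Create the destination if needed, then extract into it.
    mkdir -p "$dest" && tar xzf "$archive" -C "$dest"
}

# Usage inside the LHM-VM, after wget has finished:
#   extract_notes Scalable-PySpark-v1.tgz
```

If the check fails, downloading the file again is usually the simplest fix.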
+ | |||
+ | These steps will be performed as part of the class. |