Apache Hadoop, Spark, and Big Data

Big Data Analytics

Apache Hadoop is a platform for managing large amounts of data. It includes many tools and applications under one framework.

Written by Douglas Eadline; Published: 31 March 2015

From the elephant in the room department

Talk to most people about Apache™ Hadoop® and the conversation will quickly turn to using the MapReduce algorithm. MapReduce works quite well as a processing model for many types of problems. In particular, when multiple mapping processes are used to span terabytes of data, the power of a scalable Hadoop cluster becomes evident. In Hadoop version 1, the MapReduce process was one of two core components. The other component was the Hadoop Distributed File System (HDFS). Once data is stored and replicated in HDFS, the MapReduce process can move computational processes to the server on which specific data resides. The result is a very fast and parallel computational approach to problems with large amounts of data. But MapReduce is not the whole story.
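To make the processing model concrete, here is a minimal single-machine sketch of the map, shuffle, and reduce steps in plain Python. It does not use Hadoop or any of its APIs. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample `blocks` are hypothetical, chosen only for illustration. Each string in `blocks` stands in for a chunk of data stored on a different server, and each call to `map_phase` stands in for a mapper running next to that chunk.

```python
from collections import defaultdict
from itertools import chain


def map_phase(block):
    """Mapper: emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reducer: collapse the list of values for each key into a single sum."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks):
    """Run the three steps over every block and return the word totals."""
    # On a real cluster the mappers would run in parallel, one per block.
    mapped = chain.from_iterable(map_phase(block) for block in blocks)
    return reduce_phase(shuffle(mapped))


# Hypothetical input: three small "blocks" standing in for distributed data.
blocks = [
    "big data big cluster",
    "data moves to compute",
    "compute moves to data",
]
counts = word_count(blocks)
print(counts)
```

Because each mapper only reads its own block, the map step can be spread across as many servers as there are blocks, which is where the scaling described above comes from.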

This work is licensed under CC BY-NC-SA 4.0

©2005-2023 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.