Hadoop Appliance
- Website : http://www.hadoopappliance.com
Setting up Hadoop clusters and operating them properly involves many difficulties and costs. Installing operating systems and required software on many machines is highly repetitive, as is running jobs through a command-line interface. Hadoop Appliance addresses these problems with a web-based UI that includes various management functions, along with extensions to Hadoop's original functions.
Overview
- Installation and management of a cluster
- Setup and management of multiple Hadoop clusters
- Web-based UI for functions that Hadoop offers
- Extensions to Hadoop itself
- Hadoop HA (High-Availability)
Features
Server Provisioning
- Automatic server provisioning : supports fast, parallel deployment via network booting, with logging
- DHCP-based IP management : mapping IP addresses to MAC addresses or configuring individual hosts
- Internal DNS : provides internal domain names without modifying hosts files, to increase security
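The IP-MAC mapping and internal naming described above can be illustrated with a minimal sketch. This is not Hadoop Appliance's actual implementation: the function name `plan_hosts`, the subnet, the `node` prefix, and the `cluster.internal` domain are all hypothetical, chosen only to show how a fixed address and an internal name might be derived for each machine.

```python
import ipaddress


def plan_hosts(macs, subnet="10.0.0.0/24", prefix="node", domain="cluster.internal"):
    """Assign each MAC address a fixed IP and an internal host name.

    All defaults are illustrative assumptions, not values used by the product.
    """
    usable = ipaddress.ip_network(subnet).hosts()  # usable addresses, in order
    plan = []
    for index, (mac, ip) in enumerate(zip(macs, usable), start=1):
        plan.append({
            "mac": mac.lower(),                       # normalise for lookups
            "ip": str(ip),
            "fqdn": f"{prefix}{index:02d}.{domain}",  # e.g. node01.cluster.internal
        })
    if len(plan) < len(macs):
        raise ValueError("subnet too small for the number of machines")
    return plan


if __name__ == "__main__":
    for entry in plan_hosts(["AA:BB:CC:00:00:01", "AA:BB:CC:00:00:02"]):
        print(entry["mac"], entry["ip"], entry["fqdn"])
```

A table like this could feed both the DHCP reservations and the internal DNS records, so that nodes resolve each other by name without per-machine hosts files.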
Cluster Dashboard
- Whole cluster dashboard : provides information about the whole cluster
- Individual Hadoop cluster dashboard : provides status information for each node, MapReduce jobs, and HDFS disk usage, as well as the NameNode and TaskTrackers
MapReduce Job Management
- Web-based job management : submitting jar files and viewing logs on the web
- Setting input/output paths via HDFS browser
- Undo of submitted jobs
HDFS Browser
- Web-based file/directory browsing
- Viewing file contents on the web
- Web-based uploading and downloading of files
Hadoop NameNode HA
- Detection of NameNode failures via heartbeat and S.M.A.R.T. data from the HDD
- Automatic IP address switch to the standby node when a failure is detected
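The heartbeat half of this failover scheme can be sketched as follows. This is an assumption-laden illustration, not the product's code: the class name `HeartbeatMonitor`, the timeout value, and the `on_failure` callback (standing in for the IP switch to the standby node) are hypothetical, and the S.M.A.R.T. check is not modelled.

```python
class HeartbeatMonitor:
    """Declare the active node failed when no heartbeat arrives within timeout_s."""

    def __init__(self, timeout_s, on_failure):
        self.timeout_s = timeout_s
        self.on_failure = on_failure   # hypothetical hook, e.g. move the service IP
        self.last_seen = None          # timestamp of the most recent heartbeat
        self.failed_over = False       # ensures failover is triggered only once

    def beat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""
        self.last_seen = now

    def check(self, now):
        """Return True if this call triggered a failover."""
        if self.failed_over or self.last_seen is None:
            return False
        if now - self.last_seen > self.timeout_s:
            self.failed_over = True
            self.on_failure()
            return True
        return False


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=5, on_failure=lambda: print("switching to standby"))
    monitor.beat(now=0)
    monitor.check(now=3)   # within the timeout, nothing happens
    monitor.check(now=9)   # heartbeat overdue, failover callback runs
```

Passing timestamps in explicitly keeps the sketch deterministic; a real monitor would read a clock and run `check` on a timer.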
Resource & Server Monitoring
- Real-time system monitoring : includes change analysis using RRD
- Hadoop monitoring : collects Hadoop status and metrics, integrated with the Hadoop management features
- Graphs : generates graphs of analysis results for various metrics, with a web UI for viewing them
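To show the idea behind RRD-style storage, here is a minimal sketch of a round-robin archive: a fixed number of slots, each holding the average of several raw samples, with the oldest slot overwritten once the archive is full. The class `RoundRobinArchive` and its parameters are hypothetical and do not reflect the RRD file format or how Hadoop Appliance stores its metrics.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive of averaged samples; old rows are overwritten."""

    def __init__(self, slots, samples_per_slot):
        self.rows = deque(maxlen=slots)        # oldest row drops off automatically
        self.samples_per_slot = samples_per_slot
        self.pending = []                      # raw samples not yet consolidated

    def update(self, value):
        """Add one raw sample; consolidate into a row when enough have arrived."""
        self.pending.append(value)
        if len(self.pending) == self.samples_per_slot:
            self.rows.append(sum(self.pending) / len(self.pending))
            self.pending = []

    def series(self):
        """Return the stored averages, oldest first, ready for graphing."""
        return list(self.rows)


if __name__ == "__main__":
    archive = RoundRobinArchive(slots=2, samples_per_slot=2)
    for cpu_percent in [1, 3, 5, 7, 9, 11]:
        archive.update(cpu_percent)
    print(archive.series())
```

Because the archive never grows, storage stays constant no matter how long monitoring runs, which is the main appeal of this design for long-lived metrics.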
System Architecture
Effects
- Reduced cluster deployment time through automatic server provisioning and cluster setup
- More efficient management with a single system that handles both nodes and Hadoop
- Increased productivity through convenient MapReduce job management and the HDFS browser
- Improved accessibility with web-based UI