Hadoop Distributions Comparison

A comparison of Hadoop distributions, focusing on the software packages each one includes, so that readers can judge whether a distribution contains the software needed to meet a given requirement or business case.

| Component | Cloudera | Hortonworks (HW) | MapR |
|---|---|---|---|
| Compute Cluster Manager | YARN | YARN; Slider | YARN; Myriad |
| Hadoop Compatible Filesystem | HDFS; RecordService | HDFS | MapR-FS |
| NoSQL Datastore | HBase; Accumulo | HBase; Accumulo | MapR-DB; HBase |
| SQL Datastore | Kudu + Impala | Phoenix; Hawq | |
| Streaming Data Store | Kafka | Kafka | MapR-ES |
| Batch Analytics | Hive (on Spark); MapReduce; Pig; Spark | Hive (on Tez); MapReduce; Pig; Spark; Hawq | Hive; MapReduce; Pig; Spark |
| Streaming Analytics | Spark Streaming | Spark Streaming; Storm | Spark Streaming; Storm |
| Graph Analytics | Spark GraphX | Spark GraphX | Spark GraphX |
| Query Engine | Impala | Hive (LLAP) | Drill; Impala |
| Machine Learning | Mahout; Spark MLlib | Mahout; Spark MLlib | Mahout; Spark MLlib |
| Analytical Search | Solr | Solr | Solr (available as an add-on pack) |
| Data Ingestion | Sqoop; Flume | Sqoop; Flume | Sqoop; Flume |
| Data Flow Management | | Falcon | |
| Workflow Management | Oozie | Oozie | Oozie |
| Security | Sentry; RecordService | Knox; Ranger | Sentry |
| Management | Cloudera Manager | Ambari | MapR Installer; MapR Control System; MapR Monitoring |
| Metadata | Cloudera Navigator | Atlas | |
| Cloud Management | Cloudera Director | Cloudbreak | |
| User Interfaces | Hue | Zeppelin; Hue | Hue |
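
As an illustration of using the table to check a distribution against a requirement, the lookup can be sketched in Python. This is a minimal sketch, not part of the original comparison: only four rows are transcribed by hand, the "HW" column is assumed to mean Hortonworks, and the names `TABLE`, `DISTRIBUTIONS` and `distributions_with` are hypothetical.

```python
# Hand-transcribed subset of the comparison table (four rows only).
# Assumption: the "HW" column refers to Hortonworks.
DISTRIBUTIONS = ("Cloudera", "Hortonworks", "MapR")

TABLE = {
    "Streaming Analytics": {
        "Cloudera": {"Spark Streaming"},
        "Hortonworks": {"Spark Streaming", "Storm"},
        "MapR": {"Spark Streaming", "Storm"},
    },
    "Query Engine": {
        "Cloudera": {"Impala"},
        "Hortonworks": {"Hive (LLAP)"},
        "MapR": {"Drill", "Impala"},
    },
    "Security": {
        "Cloudera": {"Sentry", "RecordService"},
        "Hortonworks": {"Knox", "Ranger"},
        "MapR": {"Sentry"},
    },
    "Data Flow Management": {
        "Cloudera": set(),
        "Hortonworks": {"Falcon"},
        "MapR": set(),
    },
}


def distributions_with(requirements, table=TABLE):
    """Return the distributions that list every required package.

    `requirements` maps a component category (a table row) to the
    package name that must appear in that row for a distribution.
    """
    candidates = set(DISTRIBUTIONS)
    for component, package in requirements.items():
        row = table[component]
        # Keep only distributions whose cell in this row lists the package.
        candidates &= {dist for dist, packages in row.items() if package in packages}
    return sorted(candidates)


if __name__ == "__main__":
    needs = {"Streaming Analytics": "Storm", "Query Engine": "Impala"}
    print(distributions_with(needs))  # ['MapR']
```

Sets are used for the cells so that a requirement is a simple membership test; an empty cell in the table becomes an empty set, which never matches.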