Solution for managing data on a CDH Hadoop cluster. Automatically extracts metadata relating to HDFS, Hive, Impala, MapReduce, Oozie, Pig, S3, Spark, Sqoop and YARN, covering both data structures (databases, tables and columns) and jobs (data transformations). Extraction is based on activity within the cluster rather than static analysis of code. The metadata can be searched, filtered and viewed, with supporting features including lineage diagrams showing how data moves through the system, a Data Stewardship dashboard of key data management information (statistics on the data held in the cluster and the activity relating to it), analytics on the data held in HDFS, and a full audit capability covering all activity on the cluster.

Allows custom metadata to be added to objects, including descriptions, key-value pairs and tags, with the option to define metadata namespaces and data types / value constraints (managed metadata). Custom attributes can be pre-set (via job properties for MapReduce jobs and JSON .navigator files for HDFS files), and data lifecycle management policies can be defined, allowing actions to be specified based on metadata (e.g. archiving any files that haven't been accessed for six months).

Web based, with a full user security model, plus a REST API and Java SDK for integrating external applications with metadata held in Navigator. Initial 1.0 release was in February 2013.
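As an illustration of the lifecycle policy example given in the description (archiving files that haven't been accessed for six months), the selection logic can be sketched as below. This is a minimal sketch only: the `FileRecord` type, the `files_to_archive` function, the sample paths and the 182-day approximation of "six months" are all hypothetical and are not part of Navigator's policy engine or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    # Hypothetical record type; not Navigator's actual metadata schema.
    path: str
    last_accessed: datetime


def files_to_archive(records, now, max_idle=timedelta(days=182)):
    """Return the paths of files whose last access is older than max_idle.

    max_idle defaults to 182 days, an assumed approximation of six months.
    """
    cutoff = now - max_idle
    return [r.path for r in records if r.last_accessed < cutoff]


# Illustrative data only.
records = [
    FileRecord("/data/raw/2018/events.csv", datetime(2018, 6, 1)),
    FileRecord("/data/raw/2019/events.csv", datetime(2019, 3, 15)),
]
stale = files_to_archive(records, now=datetime(2019, 4, 1))
print(stale)
```

In a real deployment the condition and the resulting action would be defined as a policy within Navigator itself; the sketch only shows the kind of metadata-driven rule that such a policy expresses.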
Vendors: Cloudera
Type: Commercial
Last Updated: April 2019 - v6.2
| Version | Release date | Release links | Release comment |
|---------|--------------|----------------------------|-----------------|
| 2.10 | 2017-04-18 | See CDH 5.11 release links | CDH 5.11 |
| 2.11 | 2017-07-13 | See CDH 5.12 release links | CDH 5.12 |
| 2.12 | 2017-10-12 | See CDH 5.13 release links | CDH 5.13 |
| 2.13 | 2018-01-26 | See CDH 5.14 release links | CDH 5.14 |
| 2.14 | 2018-06-15 | See CDH 5.15 release links | CDH 5.15 |
| 2.15 | 2018-11-28 | Release notes | CDH 5.16 |
| 6.0 | 2018-08-30 | Release notes | CDH 6.0 |
| 6.1 | 2018-12-18 | Release notes | CDH 6.1 |
| 6.2 | 2019-03-29 | Release notes | CDH 6.2 |