Hortonworks DataPlane

An extensible platform for managing data ecosystems, with capabilities delivered through pluggable applications. Supports the registration and management of DataPlane applications and the registration of Ambari-managed clusters that are then accessible to these applications. Supports role-based access control, with LDAP integration for users and groups and support for app-specific roles. Runs on Docker, with state held in an external database, and integrates with Knox (for SSO and access to clusters). Future services referenced include Cloudbreak and IBM DSX. The stated plan is for this to be a cloud service; however, this is not yet generally available, and the documentation currently details installation steps for a local machine. First released in November 2017.

Technology Information

Other Names: Hortonworks DataPlane Service
Vendors: Hortonworks
Type: Commercial Open Source
Last Updated: September 2018 - 1.2

Sub-projects

Hortonworks DataPlane > Data Analytics Studio: A DataPlane application for running Hive queries, managing Hive tables, and diagnosing Hive query performance issues. Supports a query editor (with autocomplete, a visual explain plan, performance improvement recommendations, saved queries and results downloading), a query search tool (with pre-defined queries for expensive, long-running, non-optimised and failed queries, a range of filters and saved searches), a database management tool (supporting searching, browsing, interrogation, creation and modification of databases, tables, partitions and columns, as well as uploading of data from local storage or HDFS) and table impact reporting (showing reads, writes, projections, aggregations, filters and joins by table and column, with support for dynamic heatmaps overlaid on entity relationship diagrams). Requires an Ambari management pack (the DAS engine) to be installed on all clusters.
Hortonworks DataPlane > Data Lifecycle Manager: A DataPlane application for replicating HDFS and Hive data between two clusters, along with any associated metadata and security policies. Clusters already registered with DataPlane can be paired, at which point replication policies can be defined, which result in replication jobs running at the selected interval. Supports replication between HDFS and cloud object storage (with some caveats around replication of security policies), replication of encrypted HDFS data, TLS encryption of replication streams, one-to-many replication, Atlas metadata replication, and reporting on and management of replication jobs and HDFS snapshottable directories. Jobs are executed by DLM Engine processes on the appropriate cluster. Stated future plans include support for automatic tiering of data between clusters and point-in-time backup and restore.
Hortonworks DataPlane > Data Steward Studio: A DataPlane application for viewing and understanding data assets, with supported data assets currently limited to Hive tables on clusters with Atlas and Ranger installed. Supports viewing metadata associated with data assets (including properties, lineage, security policies and audit logs), profiling of data (performed by a background Spark process, with support for data summarisation, identifying sensitive/personal data and profiling user access to data), grouping of data assets into asset collections, tagging and rating of data assets and collections, and dashboard views of metadata by cluster and collection.
Hortonworks DataPlane > Streams Messaging Manager: A DataPlane application for monitoring Apache Kafka clusters. Provides an overview of producers, topics (and their partitions), brokers and consumer groups, showing key statistics and the connections between them, with the ability to propagate filters based on these connections. Also provides detail views, profiles and historic graphs for each producer, topic, broker and consumer group, with the ability to link out to Atlas to see end-to-end lineage and to Ambari Grafana for detailed metrics. Metrics and status information are also provided over a REST API, with a REST Server Agent running on each cluster being monitored.
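
The Data Lifecycle Manager workflow described above (clusters are registered, then paired, then given replication policies that produce jobs at a selected interval) can be illustrated with a small toy model. This is a sketch only: the class and method names (`Cluster`, `Registry`, `ReplicationPolicy`, `run_times`) and the example dataset path are hypothetical and do not correspond to any actual DataPlane or DLM API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Cluster:
    """A registered cluster, identified by name (hypothetical model)."""
    name: str


@dataclass
class ReplicationPolicy:
    """A policy that replicates one dataset from source to target."""
    source: Cluster
    target: Cluster
    dataset: str          # illustrative only, e.g. an HDFS path
    interval: timedelta   # how often a replication job should run

    def run_times(self, start: datetime, count: int) -> list:
        """Return the first `count` scheduled job start times."""
        return [start + i * self.interval for i in range(count)]


@dataclass
class Registry:
    """Tracks registered clusters and which of them are paired."""
    clusters: set = field(default_factory=set)
    pairs: set = field(default_factory=set)

    def register(self, cluster: Cluster) -> None:
        self.clusters.add(cluster)

    def pair(self, a: Cluster, b: Cluster) -> None:
        # Only clusters that are already registered can be paired.
        if a not in self.clusters or b not in self.clusters:
            raise ValueError("both clusters must be registered before pairing")
        self.pairs.add(frozenset((a, b)))

    def create_policy(self, source: Cluster, target: Cluster,
                      dataset: str, interval: timedelta) -> ReplicationPolicy:
        # Policies can only be defined between paired clusters.
        if frozenset((source, target)) not in self.pairs:
            raise ValueError("clusters must be paired before defining a policy")
        return ReplicationPolicy(source, target, dataset, interval)


if __name__ == "__main__":
    registry = Registry()
    primary, backup = Cluster("primary"), Cluster("backup")
    registry.register(primary)
    registry.register(backup)
    registry.pair(primary, backup)
    policy = registry.create_policy(primary, backup, "/data/sales",
                                    timedelta(hours=6))
    for when in policy.run_times(datetime(2018, 1, 1), 3):
        print(when.isoformat())
```

The model enforces the same ordering the description implies: a policy cannot be created until both clusters are registered and paired. One-to-many replication would correspond to creating several policies with the same source and different targets.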

Release History

Version | Release date | Release links | Release comment
1.0 | 2017-11-01 | announcement |
1.1 | 2018-05-21 | release notes | Significantly expanded docs
1.2 | 2018-08-25 | release notes | Support for HDP 3.0 and HDF 3.2

News

Blog Posts