The Mid Week News - 27/09/2017

It’s news time again, and there are big announcements from Cloudera and Hortonworks this week…

Technology updates (details are on the relevant technology pages):

Cloudera/Hortonworks technology news:

  • The big news this week is the simultaneous product announcements from Hortonworks and Cloudera. They look like they might offer similar capabilities, but I think they’re probably trying to solve subtly different problems - we’ll revisit these in a few weeks once there’s more information available and do some technology summaries.
  • Cloudera SDX (Shared Data Experience, coming in CDH 5.13) appears to be trying to bring to the cloud the “one” data platform experience that you get with an on-premises CDH cluster, specifically a persistent shared storage layer with shared metadata, security and governance and a range of workloads on top. That looks different in the cloud - you probably don’t want a persistent Cloudera cluster that you’re paying for by the hour even if you’re not using it - so SDX gives you a shared storage layer using cloud object storage, a shared metadata and management layer, and then the ability to run compute workloads in isolated transient workload clusters managed through Cloudera Altus. In short, it’s the original sales pitch of a single shared Hadoop data platform, re-imagined for the cloud. More details via a Cloudera VISION blog post and a Cloudera Engineering blog post.
  • Hortonworks Data Plane is again all about shared metadata, security and data management, but this time across a range of different data platforms - Hadoop, relational databases and your EDW, either on-premises or in the cloud, and for data in motion or at rest. It’s open source and extensible for adding new services, with data lifecycle management being first up, allowing you to replicate, back up & restore and tier your data across your data platforms. It’s another cloud service (because obviously), and they talk about it as a Global Data Management Platform. More details via a Hortonworks blog post.

Other technology news:

  • MapR-DB 6.0 has been announced and will be available in Q4 2017. There have been a bunch of changes in the MapR stack over the last couple of months that I’ve not been keeping up to date with (the introduction of MapR XD for starters), so we’ll loop back round in a couple of weeks to refresh our MapR information.
  • Hortonworks are crowing about the increase in Hive performance in HDP 2.6 and its support for the full suite of 99 TPC-DS queries
  • Part 1 on the Apache Kudu consistency model
  • Looks like Hortonworks are proud of the fact they run Docker containers on YARN
  • An introduction from InfluxData on InfluxDB and the TICK stack