Right, I’ve been slack in getting this out there, which means we’ve built up a nasty backlog, but it’s time to talk about what’s changed since we originally wrote some of our technology summaries.
Firstly, there are new releases of the Hortonworks Data Platform and CDH. Version 2.6 of HDP brings two main features - Hive LLAP, which lets Hive target the real-time interactive query space, and Hive ACID Merges, which allow data to be transactionally loaded into Hive. Version 5.11 of CDH brings Navigator lineage support for Spark, the integration of Kudu with Kerberos, improvements to S3 support and support for Azure Data Lake Store.
A large number of projects have also seen new releases. Ordinarily I’d like to provide some sort of commentary on these, but given I’ve built up such a backlog we’ll just list them off this week. Each technology page includes a link to the relevant release announcement or details if you’re interested. So, in no particular order, the technologies that have seen new releases are: Apache Ambari, Apache Apex, Apache Atlas, Apache Bigtop, Apache Calcite, Apache Crunch, Apache Hadoop, Apache HBase, Apache Ignite, Apache Knox, Apache Mahout, Apache Parquet C++, Apache Phoenix, Apache Ranger, Apache Solr and Apache Storm.
In terms of other technology news:
- Sqoop2 has been deprecated by Cloudera as of CDH 5.9, and will be removed from CDH in version 6, which suggests that all is not well in Sqoop2 land.
- Hadoop 3.0 is now into its second alpha release. Summary is here, with some thoughts from Hortonworks and Cloudera
- Apache Ranger has graduated to a top level Apache project. There’s an InfoQ write-up
- Gobblin has been donated to the Apache Software Foundation by LinkedIn - link
We’ll try to do this weekly going forward - let’s just hope keeping up to date with everything doesn’t prove to be unsustainable! And next week we’ll have a look at some of the interesting blog posts I’ve been accumulating.