A configuration driven in-memory transformation pipeline that can be embedded into any Java code base, with specific support for Flume, MapReduce, HBase, Spark and Solr. Supports multiple different file types including CSV, Avro, JSON, Parquet, RCFile, SequenceFile, ProtoBuf and XML plus gzip, bzip2, tar zip and jar files. Also supports a number of transformation steps out of the box, including integration with Apache Tika for reading common file formats.
Type Sub-Project Parent Project Kite Last Updated January 2017
Is packaged by Cloudera CDH