DataKitchen DataOps Platform

A platform that enables the adoption of DataOps practices by data engineering, data science, and analytics teams. These DataOps practices combine ideas from Agile development, DevOps, statistical process control, data science model deployment, and test data automation through a series of steps in a collaborative workflow.

Within DataKitchen's product, users start with "Kitchens." Each Kitchen represents a place to do work: a production environment, a development sandbox, and so on. A Kitchen is a collection of the data, data stores, tools (ETL, data science, visualization), the code or configuration used by those tools, a git branch, and the necessary servers and software. These collections can be created, merged, or shut down. A Kitchen can also be an existing environment already available in the organization. Kitchens can be individual or shared across groups.

When working in a Kitchen, team members create and run "Recipes." Each Recipe is a directed graph of steps. A Recipe represents the workflow pipeline used to deliver analytics: acquire data, transform data, call a machine learning model, and visualize data. A Recipe uses the tools that DataKitchen's customers already own.

Tests are embedded in each Recipe and run as it executes. They check values, ranges, distribution, frequency, implied and enforced integrity, and other business-based conditions on the data or processing. "Order" metadata is created that is not just about lineage and descriptors of the data and jobs, but also includes statistics such as wall-clock time, processing requirements, test data outputs, and more. Alerts are delivered if selected tests fail. Data and process errors that go undetected erode business users' trust. DataKitchen's customers see a meaningful reduction in the number of data errors, incorrect results, and late deliveries.

Another DataKitchen goal is to reduce the time it takes to move changes from development into production.
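The Recipe, embedded tests, and Order metadata described above can be illustrated with a minimal Python sketch. This is not DataKitchen's API or file format. Every name here (`RECIPE`, `STEPS`, `TESTS`, `run_recipe`) is hypothetical, the steps are toy stand-ins for real tools, and the metadata is limited to wall-clock time and test results. It assumes Python 3.9+ for `graphlib`.

```python
import time
from graphlib import TopologicalSorter

# Hypothetical Recipe: each step maps to the set of steps it depends on.
RECIPE = {
    "acquire": set(),
    "transform": {"acquire"},
    "model": {"transform"},
    "visualize": {"model"},
}

# Toy step implementations; a real Recipe would call existing tools instead.
def acquire(ctx):
    ctx["rows"] = [3, 5, 8, 13]

def transform(ctx):
    ctx["scaled"] = [r * 10 for r in ctx["rows"]]

def model(ctx):
    ctx["score"] = sum(ctx["scaled"]) / len(ctx["scaled"])

def visualize(ctx):
    ctx["report"] = f"mean score: {ctx['score']:.1f}"

STEPS = {"acquire": acquire, "transform": transform,
         "model": model, "visualize": visualize}

# Tests embedded per step: (test name, predicate over the shared context).
TESTS = {
    "acquire": [("row_count_at_least_1", lambda ctx: len(ctx["rows"]) >= 1)],
    "transform": [("values_in_range",
                   lambda ctx: all(0 <= v <= 1000 for v in ctx["scaled"]))],
}

def run_recipe(recipe, steps, tests):
    """Run steps in dependency order, recording per-step metadata."""
    ctx = {}
    order = {"steps": [], "alerts": []}
    for name in TopologicalSorter(recipe).static_order():
        start = time.perf_counter()
        steps[name](ctx)
        elapsed = time.perf_counter() - start
        # Evaluate the tests attached to this step, if any.
        results = {t: bool(check(ctx)) for t, check in tests.get(name, [])}
        order["steps"].append({"step": name, "seconds": elapsed, "tests": results})
        # A failed test becomes an alert entry.
        order["alerts"] += [f"{name}:{t}" for t, ok in results.items() if not ok]
    return ctx, order

ctx, order = run_recipe(RECIPE, STEPS, TESTS)
```

After the run, `order["steps"]` holds one record per step with its elapsed time and test results, and `order["alerts"]` lists any failed tests. Narrowing the range in `values_in_range` is a quick way to see an alert appear.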
When business customers request changes, DataKitchen can continually and automatically deploy those changes. A challenge is to make sure that those changes do not cause regressions, functional errors, or performance problems in production. The embedded Recipe tests serve a dual role in resolving this problem. In production environments, those tests provide surveillance and alerts, but in development, those tests make sure that any change in code or configuration does not cause a problem further down the pipeline.

DataKitchen allows users on different teams and in different locations to collaborate. One method is to use "Ingredients" to create sharable, reusable component services. Multiple Recipes can call the same Ingredient, and Ingredients have a standard, function-like API.

The platform was designed and implemented for secure multi-tenant, multi-cloud, and multi-environment deployments. Finally, users can interact with DataKitchen through a user interface, a command line, or APIs. This is important because DataKitchen can accept data and metadata from other processes you may already have in place. The DataKitchen DataOps Platform is a commercial product, available as a managed service with optional on-premises agent installation, and was first released in 2014.
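As a rough illustration of a function-like Ingredient and of the dual role of embedded tests, the sketch below wraps a reusable operation together with its test. It is an assumption-laden toy, not DataKitchen's interface. The `Ingredient` class, its fields, and the `environment` argument are all hypothetical, and the choice to raise in development but only report in production is one possible reading of the behavior described above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class Ingredient:
    """Hypothetical reusable component with a function-like interface."""
    name: str
    run: Callable[[list], list]
    test: Callable[[list], bool]

    def __call__(self, values: list, environment: str) -> tuple[list, Optional[str]]:
        result = self.run(values)
        if self.test(result):
            return result, None
        message = f"{self.name}: test failed"
        if environment == "development":
            # Assumed policy: in development a failing test blocks the change.
            raise ValueError(message)
        # Assumed policy: in production the failure is reported as an alert.
        return result, message

# One Ingredient definition, shared by any number of hypothetical Recipes.
dedupe = Ingredient(
    name="dedupe",
    run=lambda values: sorted(set(values)),
    test=lambda result: len(result) > 0,
)

sales_result, sales_alert = dedupe([3.0, 1.0, 3.0], environment="production")
empty_result, empty_alert = dedupe([], environment="production")
```

Here the same test guards both environments: the empty input produces an alert string in production, while calling `dedupe([], environment="development")` raises `ValueError` and would stop the change before it is promoted.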

Technology Information

Last Updated: June 2019