New capability combines the interactivity of data warehouses with the scale and flexibility of data pipelines for a seamless development experience
PALO ALTO, Calif. – August 27, 2019 – Ascend, provider of the world’s first Autonomous Dataflow Service, today released a technical preview of Queryable Dataflows, a new capability to accelerate end-to-end data development that brings the interactivity of data warehouses to the scale of data pipelines. For the first time, data engineers can directly query incremental stages of any Dataflow without changing tools or disrupting the development process.
Powered by the Dataflow Control Plane, interactive queries are not only optimized for any scale of data but can also be immediately productionized as new Dataflow stages with a single click. With the ability to explore, profile, and prototype now seamlessly integrated within the data development process, data engineers can build and productionize pipelines faster than ever before.
“For too long, data engineers have had to trade off the scale and performance of carefully tuned pipelines with the interactivity of warehouses. Making incremental stages of data pipelines queryable is challenging, requiring advanced layers of caching, persistence, lineage tracking, and data invalidation. As a result, engineering teams invest huge amounts of time determining which stages of pipelines to persist and where – what belongs back in the data lake and what belongs in a warehouse,” said Sean Knapp, CEO and founder of Ascend. “With the introduction of Queryable Dataflows, data engineers are freed from these unnecessary tradeoffs and can directly interact with data from any stage of any Dataflow as if it were a data warehouse. By breaking through the boundaries between pipelines and warehouses, we both ease pipeline development as well as more seamlessly integrate the data development lifecycle.”
In today’s data architecture, pipelines are the backbone connecting dynamic data from disparate systems to data consumers, including the data analysts and scientists building new applications and machine learning models. Unfortunately for the data engineers constructing these pipelines, understanding the data profile, discovering and exploring new datasets, and staging the correct transformations based on the current structure have been extraordinarily challenging and time-consuming. With no ability to interactively query or checkpoint result sets throughout the development process, building pipelines has become increasingly slow, as data engineers are forced to run their queries against raw data in data lakes or derived data in data warehouses. In the fast-moving world of big data and AI, pipeline development needs to become more interactive throughout the entire data lifecycle.
Ascend’s new Queryable Dataflows capability brings the speed and flexibility of interactive queries directly to pipeline development. Now data engineers can efficiently explore and profile even the largest raw datasets incrementally as they build. This enables them not only to construct new pipelines faster but also to checkpoint as they go, ensuring accurate results before landing the resulting data in downstream systems. Additionally, interactive queries can immediately be productionized as stages within Dataflows, eliminating recoding and reprocessing.
Along with dramatically accelerating pipeline development and productionization, Queryable Dataflows hybridizes the pipeline and warehouse worlds to better optimize downstream systems for operational analytics and reporting. Pipelines can now handle staging and exploration use cases, offloading them from warehouses to reduce costs and improve performance. Data analysts and scientists can also connect directly to pipeline stages using their preferred BI tools and notebooks for fast, on-demand access, without first having to load data into a warehouse.