The DataAware Podcast

DataAware is not just a podcast — it’s a think tank for data professionals. From Data Engineers and Analytics Engineers to Chief Data Officers, we cater to those who don’t just follow trends but forge new paths in data engineering. Our episodes offer a blend of current insights and visionary discussions, focusing on innovations and out-of-the-box ideas shaping the future of data.

Join us for conversations that go beyond the now. DataAware is where today’s data professionals meet tomorrow’s industry breakthroughs. Be part of a community that leads, innovates, and transforms the world of data.

Subscribe to DataAware On:


Apple Podcast

Google Play

Spotify Podcast

Explore DataAware Podcast Episodes

Are data pipelines a new security concern for LLM projects? Join Sean and Paul as they unpack the recent AI safety and security guidelines unveiled by the US and UK governments, endorsed by 16 other countries and dozens of top AI companies in Silicon Valley.
Join Sean and Paul as they talk about how an intelligent data pipeline controller brings traditional developer techniques to the world of data engineering. Learn about Sean’s framework for deciding how frequently you should refresh your pipelines and see if streaming and batch data processing are converging. Plus, what is the definition of real-time? All this and more in this week’s episode!
Join Sean and Paul as they unpack the trends from Ascend’s annual DataAware Pulse Survey. Learn why executives and individual contributors disagree on strategy so often... and why many in the data team want to drive automation but struggle to achieve it. All that and a full recap of the 2023 Big Data London event in this episode!
Sean and Paul talk about the three eras that data engineering teams move through as their data processing matures. We unpack the kinds of metadata required at each stage, and how realistic it is to build a system that processes data in incremental packets instead of full reductions.
Sean and Paul talk about the impact that Generative AI will have on data engineering. They explore similarities between LLMs and data pipeline automation models, and review common components of an automation stack.
Sean and Paul talk about whether spreadsheets will become self-aware now that Python functions are available natively in Excel, and what the definition of Data Pipeline Automation is given that so many people say they want it.
Sean and Paul unpack the latest research on the benefits of automation and why platform consolidation happens so often during market downturns. They explore how data engineers can improve their productivity with data by 700% and save at least $156k on their data stack costs at the same time. Tune in for more!
The one where Paul and Sean talk about why Sean founded Ascend, how automation has disrupted so many parts of the developer stack in the past, and why it's coming for data pipelines next!
In this episode, Sean and Leslie talk all things data lineage with Ascend solution engineer Jon Saltzman—from its importance at every step of the data journey to how data organizations go about ensuring their data is "certified fresh and organic" or, rather, easily traceable to where it's been and who has touched it.
In this episode, we chat with Ascend Field Data Engineer Shayaan Saiyed about one thing data engineers can't function without—data transformations, including best practices for iterating and troubleshooting and how data teams can be goal-oriented when starting to think about their ETL pipeline.
In this episode, we go back to the foundations of data engineering and data pipelines with a deep dive into data ingest.
In this episode, we discuss the concept of accidental ransomware: the situation in which a team is slowed down by the burden of managing and maintaining commoditized software.