Ep 22 – Generative AI in Data Engineering

About this Episode

Sean and Paul talk about the impact that Generative AI will have on data engineering. They explore similarities between LLMs and data pipeline automation models, and review common components of an automation stack.

Transcript

Paul Lacey: Hey everyone. Welcome back to the program. I’m Paul Lacey and this is Data Aware with Ascend, the podcast that talks about all things related to data engineering. I’m joined once again by the founder and CEO of Ascend, Sean Knapp. Sean, welcome.

Sean Knapp: Hey everybody.

Paul Lacey: Hey, great to have you, Sean. So we left with a bit of a cliffhanger ending last time, where we were talking about, oh, wouldn’t it be great if we got into the components of a data automation stack, a data pipeline automation stack? And I promised the audience that we’re not always going to leave you with one of these cliffhangers, although these conversations are part of a series where we’re unpacking what it means to do data engineering in the modern world and what the future of data pipelines looks like.

Sean Knapp: You make it sound like we have a master plan.

Paul Lacey: I’d like to think so, but yeah, no, exactly. It was a little more episode-by-episode than that.

Sean Knapp: For all the listeners, this was pitched to me as, “Hey Sean, do you want to show up and talk about some stuff on a weekly basis?” And I said, “That sounds fantastic. Do I have to prep at all?” And Paul said, “You’re probably good.” That’s it. I’m in.

Paul Lacey: Yeah, no, it is a bit of a stroll through the garden, if you will, of all things related to data engineering. And yeah, the dirty secret is there’s no master plan. We’re just making it up as we go. But with a couple of guys with a lot of experience and domain expertise, I think we can hit on the high points as we go along. And Sean, one of the high points that we wanted to pick up on, just because it’s super fresh and I know everybody’s interested in generative AI right now, is that there’s a lot of conversation happening in the industry, in tech publications and whatnot, about generative AI and how it applies to various industries, and in particular data management. As a preface for everybody that’s listening: we just had an amazing session yesterday with a bunch of folks from the industry who are running data teams and data programs, and we all came together to enjoy some coffee and some chocolates and just hang out and have a coffee break, essentially, talking about what generative AI is, how it’s going to impact teams, and what people are looking to do with it.

And it was a great conversation. So what I’d love if we could do for a second here, Sean, is maybe unpack some of the takeaways from that so that people can benefit from some of the conversations that we had there too. I know it’s a private setting that we’re not going to name names or really bring up any particular examples in relation to companies and whatnot that were on the call. But just general high level takeaway, Sean, what did you find was most interesting from our conversation?

Sean Knapp: So I think there were a couple. Clearly there’s a lot of interest in generative AI, and pretty uniformly across the group there was a really interesting conversation around this: “Hey, the more senior we get inside of our organizations, the more magical GenAI is, and it’s going to solve all these problems, and we’re going to have a sense that a business user can just ask a question and get all the right data and so on.” And on the practitioner side it was, “Hey, maybe it’s going to get there at some point, but we’re just figuring out how this thing works.” And one of the people on the conversation had this great quote, which is, “If I ask GenAI what one plus one is enough times, I’m confident I’m going to get the wrong answer.” I’m going to paraphrase.

But I think that’s the really important part, especially in data. I think in data and analytics and machine learning we’re still in this very early stage, and it’s amazingly exciting, and there are already things we can absolutely put GenAI to work on. I think the general sentiment was everybody’s safe. It’s not coming for your job, so don’t worry about that. It does have the ability to give us significant leverage and remove what generally tend to be the highest-toil parts of our work. So when it came to data engineering in particular, there was a lot of excitement around: can we get it to do the things that we and our teams generally don’t like to do anyway? Document code, suggest test cases, analyze code, help you translate between platforms, things like that. And so I think that’s where, in the very immediate term, we see a lot of interest as we start to work our way towards something bigger.

I think the closing comment on it too was that we all know it’s going to be very relevant in our future, profoundly relevant. And so in many ways there’s the tactical part of how we get our teams using it today for additional leverage, but there’s also just the cultural element as an organization, which is the race. Some people would say the race has already started. Some people would say the race is already well underway. And I think from a cultural perspective, we’re just getting started, and you have to get your team ready and embrace it, because the rate of change is going to continue to accelerate. So a lot of it is just laying the groundwork and the foundation so that people are truly interested in using it day to day.

Paul Lacey: It was a very interesting conversation, and what you said just there, Sean, reminded me of your own research, where you looked at how quickly the power of AI is doubling compared to computational power in general. Everyone knows Moore’s law: the number of transistors on a chip doubles every 18 months, and we’ve gone a long time with that holding true. But what is it with AI? What’s that doubling frequency?

Sean Knapp: Yeah, some research out of, I believe, Google, PwC, Stanford, and a few others highlighted that AI computational power, so not just the number of transistors but also the efficiency of the algorithms and so on, is doubling every 3.4 months. And so when we play that out over the course of two years, that’s slightly over seven cycles. And for all the math nerds, 2 to the 7th is 128; the math actually shakes out to about a 133X increase in computational power just over the course of two years. And this was looking back, I think, over the last eight to ten years, if I recall from this research, and forecasting that we’re going to continue to see this. We hear this from a lot of the generative AI leaders in the space too: they’re forecasting we’ll continue to see this growth rate for at least the next five to ten years, if not longer.
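
[Editor’s note: To make the arithmetic concrete, here is the calculation Sean describes, as a quick sketch:]

```python
# A 3.4-month doubling period, compounded over two years.
doubling_period_months = 3.4
months = 24

cycles = months / doubling_period_months  # ~7.06 doubling cycles
growth = 2 ** cycles                      # ~133x increase

print(f"{cycles:.2f} cycles -> {growth:.0f}x")  # 7.06 cycles -> 133x
```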

Paul Lacey: Interesting. Yeah, so essentially it’s something that’s coming quickly, and it’s something that’s going to come even more quickly. I forget who the quote is attributed to, but when someone asked, “How did you go bankrupt?” they said, “Slowly, and then suddenly.” And that seems like a lot of what we can expect from the generative AI movement as well: we’ll see things happen slowly and then suddenly, once we start hitting the knee of that curve a little bit.

Sean Knapp: Yeah, I think we’ve seen the introduction of a lot of raw power, and I think now we’re going to see major step functions, obviously, in the next generations, when OpenAI releases GPT-5, for example, which we can assume is well under development. But at the same time, we’re also going to find a lot of material changes as we get hyper-verticalized approaches to generative AI. And so I think that will probably soften the rate of change. It will still be very fast, but it won’t be these massive, earth-shattering step functions. It will just be this breakneck rate of change that’s smoothed out by verticalization in different fields.

Paul Lacey: One of the things that stood out to me from the conversation was, I think, one of the folks on the call saying, “My executives seem to think that everything is applicable to generative AI. They think you can have generative AI replace …” They were speaking a little bit tongue in cheek, but replace the data science team. Just ask ChatGPT a question, and it’s going to tell you the answer. How many customers do we have? How many customers will we have in three years? Oh yeah, here it is. It was interesting. We also looked at some data from a recent survey that we did at Ascend, which we’re getting ready to publish more publicly, and it showed a gap between the expectations of the executives and their direct reports, the directors and leaders in a firm, versus the individual contributors and the team leads on the teams that are actually in the trenches doing the work.

And somebody pointed out, and I thought this was very insightful, that those are the people who are actually the closest to the technology, and they understand better than anybody else how the technology works. So if you look at this data, what it’s actually saying is that the people who don’t understand it think it’s a panacea, and they think, “Oh, I can just throw this at everything.” And the people who really understand it are the most skeptical of all and say, “This probably is not going to do nearly the amount of stuff that you think it’s going to do.” When we look at how the models work and how the predictive algorithms work, it’s not really intelligence. It’s just guessing the next word in the sentence, and it just so happens that it’s got such a large data set to work from that it gets it relatively right fairly often.

But like you said, it sometimes gets it wrong. It can hallucinate. You can ask it what one plus one is, and it’s like, “Oh, three.” And if you don’t have somebody there watching it who knows that one plus one equals two, they might just be like, “Oh, one plus one equals three. Let’s take that. Let’s plug that into a model.” And before you know it, your business is completely off the rails, because you’re smashing into Mars because you didn’t realize that one plus one is two, not three.

Sean Knapp: I think that’s one of the really interesting things about GenAI today: it’s changing so fast, and it came onto the scene, I shouldn’t say came onto the scene, it hit most people’s radar, with such force that we’re finding all these extremes. We’re finding the “it’s going to solve everything for us” camp. And maybe it can. Maybe at some point in our lifetime it will be absolutely amazing, and I tend to be pretty bullish on that belief structure. But I still think there’s a confusion between the thing it can eventually solve for and the thing it can solve for today. And on the other extreme, I’ve had many experiences with folks who have looked at GenAI and quickly dismissed it, either because they tried an experiment with something and it didn’t work, or because they don’t like the reality that a machine may be really good at this stuff, and are taking more of a head-in-the-sand approach.

And I don’t think either is great. I think they’re very natural responses, but ultimately, it’s going to be really good at a lot of stuff. And whether you’re an executive or you’re hands on a keyboard, the more you can truly leverage GenAI, very rationally, for where it stands today, what its strengths are, and what its gaps are, the better position you’re in. I’m a big fan of using GenAI. I use ChatGPT on a bunch of marketing and media and messaging. I use GitHub Copilot X, we were talking about this earlier. I was up until one in the morning coding last night, and so I’m very thankful to have Copilot with me when I’m coding. I think there are already great uses of these technologies today, and I think they’re going to continue to stack and add more value and give us more leverage.

Paul Lacey: And I think we definitely saw that as we looked into some of the use cases yesterday, and we heard people at different levels of the organization talk through some use cases. For those that are listening and trying to think, “Okay, great, where should I plug this into my stack?”, we heard that it’s great at documenting code, it’s great at documenting what code does, but don’t think it’s going to document why the code is doing what it’s doing. “In my organization, the definition of a customer is blah, blah, blah because of these reasons.” ChatGPT, or whatever LLM you’re using, is never going to be able to read the code and say, “Hey, this is why they’re doing this really weird join.”

Sean Knapp: I feel like, at least given the technology today, getting it to explain why the code is doing something seems really dangerous. It’s inviting a hallucination, as there’s no world in which it has the backdrop of knowledge and learning and training to actually answer that. It would purely be a guess, or a hallucination in this case. That might actually be fun to experiment with, not because I would expect any reasonable result, but mostly just for pure comedic effect. I think this goes back to why humans still play such a key part in the value chain. And this is one of the comments that I made earlier, which is that I’m so excited, personally, for GenAI, because I feel that we get to use maybe 10% of our day, like that old saying that you use 10% of your brain, but I do think it’s true that our average clock speed as we go through the day feels massively throttled down. Because whether we’re in meetings, or having amazing conversations like we are here, or we’re writing data pipelines, we’re generally limited by the laws of physics, and the very specific ones are: how fast can your mouth move, and how fast can your hands type?

And anything that can remove those limitations, such that your brain can run at a much higher clock speed and execute on the ideas that you have, I think is amazing. If we can get to a future state where we all get to run at a much higher clock speed and aren’t throttled back by these limitations, I just think about the things we can create, or the extra time you can take on vacation. Either way, it all sounds amazing if we can push down that path aggressively.

Paul Lacey: And I think the thing that gets us the leverage in that respect is, as you’re saying, being able to have it generate stuff as we specify what needs to be generated. So leveling up from that hands-on-keyboard mode: I’m actually designing the system, I’m architecting the system and letting it be built according to specifications that I’m setting. And then I’m QAing the build of that system, I’m dropping in and making sure that it is actually doing the things that I’m asking it to do, so I have to know what it’s doing well enough to stay ahead of it, but I’m still in the architect seat. And that maps directly to Ascend’s view when it comes to declarative programming of data pipelines. There used to be a world where you had to specify the what and the when and the how of every single step of your data pipeline, and we talked about this last week.

You have these orchestration technologies that exist purely for this reason: to specify what needs to get done, when, and in what order. And you build up this massive system around this, and then all of a sudden you realize that you’re stuck. Every time you want to make a change, it feels like taking a Jenga block out of a giant tower and hoping nothing falls, because it’s a massive thing that you could never possibly completely understand from end to end. No single person could. And yet this is how we have to do things in a traditional imperative world. Versus the declarative world, which is, “Hey, I’m just going to tell you where I want to go.” I’m going to define the model. This is the model, this is how I want things to be set up. And you figure out when and how often and whatever, to make sure that every single new piece of data that comes in is compliant with this model. And then all I have to do is make sure that the logic stays good, and I know that everything else is going to stay good underneath the hood. So it’s a very compatible way of thinking about things.
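
[Editor’s note: As a rough illustration of the contrast Paul describes, here is a minimal sketch. The structures are hypothetical stand-ins, not any particular orchestrator’s API: the imperative version spells out the trigger and the order of every step, while the declarative version only states the desired model and leaves the when and how to the engine.]

```python
# Imperative: you pick the trigger, sequence the steps, and own the "when".
# (Toy in-memory stand-ins for an orchestrator, for illustration only.)
def ingest():      return [{"id": 1, "name": "Acme"}]
def transform(rs): return [{**r, "name": r["name"].upper()} for r in rs]

imperative_dag = {
    "schedule": "0 * * * *",                                  # you choose the cadence
    "tasks": [("ingest", ingest), ("transform", transform)],  # you fix the order
}

# Declarative: you state what should exist; the engine decides when to run,
# what to (re)run, and how to keep this model true as new data arrives.
customers_model = {
    "name": "customers",
    "inputs": ["salesforce.accounts"],
    "sql": "SELECT id, upper(name) AS name FROM salesforce.accounts",
}
```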

Sean Knapp: I wholeheartedly agree, and this is one of the things we have been observing too. To me it’s this whole exercise of having to untrain a lot of the habits we’ve learned, whether it’s generative AI or moving to declarative, automation-forward approaches to pipelines. When we think of how that thought process goes, our brains always start with an outcome. Otherwise, why would we be doing anything? We’re trying to effect some outcome in this very task-oriented world that we all exist in, and data engineering is no different. We quickly rush through that outcome-identification stage and immediately start doing the things and building the things and designing a pipeline to do all the little incremental steps, for example. And what I think we’re finding in this era of automation and AI is that machines are good at the tasks. The question is how we retrain our brains to pop back up, not just quickly articulating the outcome to ourselves at a subconscious or slightly conscious level, but actually zooming back out and spending more time thinking through the outcome, the data model, the performance characteristics of the data, what the data itself means.

And what happens when we start to do that is we get a lot more clarity around why we’re doing what we’re doing and why it matters. And moreover, it feeds these systems that are so good at automation. Just like what we see with generative AI: when you embrace creativity and curiosity and properly articulate the outcome you’re looking for, you give that underlying system so much power to really go and work magic on your behalf. Automation for data pipelines is the same thing. It really comes down to: do you have a technology that allows you to articulate that sufficiently, and can you slow down and get out of the details enough that you can actually architect, as opposed to just implementing and plugging pieces together? It’s a very similar thought process to me, which is probably why I’ve been in declarative, highly automated data pipelines for, God, eight-plus years, plus almost another decade before that doing data engineering too. So of course I’m very excited about this.

Paul Lacey: And when you see the two things converge, it’s very exciting too. I know everybody would love to get to that panacea where you can say, “Hey, get my Salesforce data in, and then normalize it with my marketing automation system and create me a customer data model.” We’re not there yet, but you can imagine that if we had a data pipeline automation system that a ChatGPT could plug into and just say, “Hey, here’s the model code. Go ahead and run the pipelines to build these models for me, please. And I don’t really care how it runs. I don’t want to have to define all the orchestration and all the DAGs and all that stuff. I’m just going to write code. You run the pipelines and get the data in the shape it needs to be in.” It starts to feel more realistic, doesn’t it?

Sean Knapp: Yep, it absolutely does.

Paul Lacey: So then, what are the layers of the stack? I know this is what we promised we would talk about today, Sean, and we’re coming to it towards the end of the program. So say someone were going to create this system that we’re imagining, where you can literally just write the models and have the entire system automated around that model code, where the DAGs are self-generating, and where a lot of the pieces will be familiar from existing technologies that let you do parts of this. But to do it end to end, from ingesting the data from wherever it’s coming from on a regular basis, to reverse-ETLing the data out to where it needs to be after all these models are run and kept up to date, what does that look like? What are the basic building blocks, in your mind, of that kind of a stack?

Sean Knapp: Yeah, I think we have a lot of the things right from a … When I think of the verbs of data engineering and the things I have to do, we’ve got to ingest some data, transform it, deliver it, et cetera. I think we have a good understanding in the industry of that. I think some of the confusion in the space has been around this notion of control. I used to have this diagram I would give in tech talks around the differences between imperative and declarative from a technical perspective, and a lot of it comes down to which system is controlling what. In an imperative system, oftentimes we have a scheduling system, and it really is up top. Not because it’s where everybody spends all their time, they’re usually in a code editor, but because the control starts from there. The scheduling system, on a timer, a trigger, a sensor, whatever you want to call it, basically says, “I need to go do something, let me run this DAG.” And they all work the same way.

And the challenge then becomes that it goes and runs code, and runs code without an understanding of the side effects of that code, of what the code’s doing. It just runs the code. And the code that it’s running is usually the code that’s been created and designed inside of the build layer itself, the tools that we use. When I think about the components of a declarative system, a highly automated system, it actually inverts this a bit. You’re still doing the same actions around ingesting, transforming, et cetera, but how you build and how you think about it is different. You put the developer at the top of the stack, with their mental model of data first, data as product, and you put in, rather than a scheduling system, a control layer that understands the connectivity between all of the models and the outcomes and the entities that the developer is creating, and what actually gets pushed down to the underlying engine to store and process and move data around.

And that’s why I think what it really boils down to is that you need a control plane that sits between the developer model and the execution layer, one that better understands the linkage between code and data, and that tracks all the metadata that flows through the system. Every time you check for new data, every time you ingest a new piece of data, every time you initiate processing of some new data that goes somewhere else, the key component really becomes that middle control layer that sits between the build layer and the underlying engine. The next natural extension of this, and I know we have fancy diagrams of this on our website too, is that once you have this highly automated control layer, you can actually expose all these other operational things to a user, because the control layer needs all of that metadata to do its part of the equation.

And so connecting into data reliability and data quality services, providing visibility into resource usage and costing, greater observability into the performance of your pipelines themselves, everything that we traditionally see on an operations plane becomes a natural extension and a view into all the metadata that the control plane itself is able to collect. And I think that’s why, in the traditional realm, we end up spending all of our time just trying to glue these things together, because we’re writing all this code to protect against side effects. Whereas in the declarative world, we actually get greater visibility and greater control, because we’re sitting at the top of the stack.

Paul Lacey: Makes sense. So as I’m trying to build a mental model of this control layer, this controller of some kind that you’re describing, it feels like a scheduler on steroids. Is that a good way to describe it? It’s watching everything. It’s running things at different parts of the DAG at different times. How does that work?

Sean Knapp: In many ways I would describe the control plane as a scheduler battling heavy anxiety. That’s the best way I could describe it. And it’s funny, I’ll even reference Kubernetes as a comparable technology here, but we usually call these things schedulers when we start writing the code, because that’s our frame of reference. Ascend has a scheduler, Kubernetes has a scheduler. What’s interesting is how greatly they differ from the classic notion of a scheduler for data pipelines. A classic scheduler for a data pipeline only keeps track of tasks; it runs on that timer or trigger and then progresses through its DAG, and if it hits an error, it stops. But it’s unaware of the side effects of most of the tasks that it’s performing, and it’s not monitoring the state of the system to dynamically generate tasks.

I think that’s the key difference. A scheduler is just going to keep running your thing however you wrote it to, and if you want it to do anything dynamic in nature, you’re going to have to write all the conditional logic, the state management, et cetera, to make it do that. When we think about a control plane and the scheduling logic inside of the control plane, the reason I say it’s in a constant state of high anxiety is the job of a scheduler in a declarative, domain-specific context, which is what a control plane is. The job is to monitor the entire existing state of the ecosystem, all the data, all the pipelines, just like how Kubernetes monitors all the containers and deployments and resources, et cetera. And the scheduler is given all of these rules, these models: “Hey, the data has to be in this place, having run through this code that matches this SHA, et cetera, et cetera.”

And so the job of a scheduler is not just to go and say, “Hey, run the code in this order, in this sequence, and if something breaks, stop or do something else.” The job is, every time a task is performed, to understand the side effects of that task, monitor the entire state of all of the pipelines, understand the impact of those tasks, and then dynamically generate the next appropriate task to continue to fulfill the model of what the developer wanted the pipeline to do. And so this is why I say the scheduler in many ways is very high anxiety: it’s literally just, look at the state of the world. Does the state of the world match what the state of the world is supposed to be? If not, go find the next task, and do it again and again and again. That’s really what a declarative control plane does, and that’s the whole power of it: its sheer ability to dynamically respond is unmatched, and simply not possible in an imperative construct, where everything comes down to lines of code. You would have to write a lot of code to make an imperative system even fractionally resemble the benefits of a declarative one.
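
[Editor’s note: A toy version of that “high-anxiety” reconciliation loop might look something like the sketch below. The names are illustrative, not Ascend’s or Kubernetes’ actual code; it just shows the shape of the idea: observe the world, diff it against the declared model, generate the next tasks, repeat.]

```python
import time

def control_loop(model, observe, plan, execute):
    """Toy declarative control loop: keep the world converged on the model."""
    while True:
        world = observe()            # full state: data, partitions, lineage
        diffs = model.diff(world)    # where does reality diverge from the model?
        if not diffs:
            time.sleep(5)            # converged; keep watching for new data
            continue
        for task in plan(diffs):     # dynamically generate the next tasks
            effects = execute(task)  # run it, capturing side effects as metadata
            world.record(effects)    # fold side effects back into observed state
```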

Paul Lacey: So what I’m hearing is it’s the Xanax for your data pipelines, or maybe the Xanax for your data engineers.

Sean Knapp: I would say, yeah, the data plane takes care of the anxiety part, or rather, the control plane takes control of the anxiety parts. Hopefully the data engineers don’t have to worry about that, so they have a lower-anxiety job. That would be nice.

Paul Lacey: Yeah, so I think it’s starting to make a little bit more sense when you talk about the connection between the data and the code, because this controller now needs to know not just: did this pipeline run? Did it run successfully? When was the timestamp of the last successful run? This controller also needs to know: does the data look like it should after it has gone through this particular step of the pipeline, or not? And that’s how it can detect issues. That’s one of the things that’s really interesting about the Ascend platform: you can actually pause and unpause pipelines in the middle of a pipeline, and it will restart from that point forward. It doesn’t have to go all the way back to the beginning and completely rerun and reprocess everything when the data is already in the right shape and form. And it also has to know, if I get a new packet of data that comes into the system, say it comes in through a connector that’s processing a new partition of the dataset, it needs to flow that partition of the dataset through the entire network of pipelines, but only that partition of the dataset.

It doesn’t need to reprocess all of the data in all of the partitions across each stage of the pipeline, and it needs to understand which partition is coming through that actually needs to be reprocessed, versus all these other partitions that are okay, so don’t mess with those, right?
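
[Editor’s note: A minimal sketch of the partition-level propagation Paul is describing, under the assumption of a simple dataset-to-transform graph; the names and structures are illustrative:]

```python
def propagate(new_partition, downstream_of, process):
    """Toy incremental propagation: only the affected partition flows
    downstream; partitions that didn't change are never reprocessed."""
    frontier = [new_partition]
    while frontier:
        part = frontier.pop()
        for transform in downstream_of(part["dataset"]):
            out = process(transform, part)  # recompute just this slice
            frontier.append(out)            # and push only its output onward
```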

Sean Knapp: Exactly. And as a developer, we’re freed from all that complexity, having to figure out, well, what do we do with late arriving data? All that just goes away.

Paul Lacey: You keep saying metadata, and I know metadata means so many things to so many people. It’s almost as obscure a term as automation nowadays. It’s just like, “Yeah, of course you need metadata.” “Oh yeah, of course. Let’s get a little bit more metadata in there, sprinkle that little pixie dust over it, and it should work better.” But it feels like this system needs to know so much about these data sets. How is it cataloging all this stuff?

Sean Knapp: Well, I think that’s why I love the saying that metadata is the new big data, which is really true. You need to collect so much of it. If you want to automate how a pipeline behaves, you have to collect so much metadata for every partition of data that comes in: the distribution of the data in the columns, the amount of resources that were required to ingest it, where that partition of data went, and the entire lineage of it. If the upstreams change, does it need to propagate and mutate the downstream data? If you get a GDPR data deletion request, you propagate those deletes through at least until the personal data is gone. And so, if you just think of all the things you would do as a data engineer, you would go through and inspect the pipeline and track: well, what percent of my data is late arriving? How many resources? How much memory do I need to give to the JVM of my job? Et cetera, et cetera.

All these things are the same things that an automated engine needs to track, and even more. And we get all this benefit in doing so, all the way out to: if you actually know how many resources were consumed per job, you can start to do even smarter distribution of workloads across warehouses or clusters based off of historical norms. And then you start to do things like anomaly detection around resource consumption. So once you start to track all this metadata, the whole world opens up to automation, which is very, very exciting. And that’s why I think we’re still just on the cusp, in the early days, I would say, of really harvesting a lot of these amazing benefits.
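
[Editor’s note: As a sketch of the kind of per-partition metadata involved, with illustrative fields, and the sort of resource-anomaly check Sean mentions that it enables:]

```python
from dataclasses import dataclass

@dataclass
class PartitionMetadata:
    """Illustrative record an automated engine might keep per partition."""
    dataset: str
    partition_key: str
    row_count: int
    column_stats: dict          # e.g. value distributions per column
    cpu_seconds: float          # resources consumed producing this partition
    peak_memory_mb: float
    upstream_partitions: list   # lineage: what this partition was derived from
    code_fingerprint: str       # hash of the code that produced it

def looks_anomalous(current: PartitionMetadata, history: list) -> bool:
    """Naive anomaly check: flag runs well outside historical resource norms."""
    if not history:
        return False
    avg = sum(m.cpu_seconds for m in history) / len(history)
    return current.cpu_seconds > 3 * avg
```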

Paul Lacey: So maybe a closing thought here, Sean, for those that are listening and thinking, “Oh, that sounds cool. I’d like to build me one of those.” Where would they start? With Ascend, we’ve been working at this for quite some time. We’ve built up a tremendous amount of internal expertise and intelligence around this. How would you do the build-versus-buy comparison on this kind of technology?

Sean Knapp: Yeah, don’t do it. It’s a trap. One, I would always say it’s hard work. A number of teams have tried this for a very long time. There are so many companies I’ve gone into who are incredibly interested in Ascend, and when we show them all the power of the declarative model, they’re like, “Oh my God, this is the thing I’ve wanted to build forever and just never had the time or resources or ability to focus on.” Because usually you see these really advanced data engineering teams who are actually trying to do it, but in a company where that’s not the company’s job. You’re building a data platform that’s servicing the needs of the company, and you can do some small open source projects here and there, but you can’t really build a whole platform like this that’s that powerful.

I don’t want to scare folks. You should totally try. And if you get tired of that, always remember Ascend is hiring. So that’s another good thing. If you want to go build one of these yourself, feel free to reach out to our recruiting team. And if that’s not your jam but you’re really interested in it, we’ve written white papers on the model, around how we actually track everything, and I have a bunch of talks on this too. We use a lot of heavy SHA calculations, where we fingerprint code and link it to data. There are really fancy Merkle trees underpinning every component, so we can do really rapid SHA calculation and integrity checking on the order of hundreds of millions of partitions, and quickly determine whether something is truly up to date or not.

So there are a lot of really fancy things you can do. But the start, I would say, is: one, read the white paper, it’s pretty good. And two, take that step back and just think, as a data engineer, what are all the questions you have to answer on a daily basis when you’re writing a new pipeline? Then go think about all the metadata you would have to collect, and think about how you would build a highly general system to answer those questions. And you’re probably off to a pretty good start. It’s a few years of development effort or more, but then you’ll be off to the races.
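
[Editor’s note: A minimal sketch of the fingerprinting idea Sean mentions, assuming a Merkle-style scheme in which a partition’s identity is derived from the hash of its producing code plus the fingerprints of its inputs. The scheme here is illustrative, not Ascend’s exact design.]

```python
import hashlib

def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def partition_fingerprint(code: str, input_fingerprints: list) -> str:
    """Merkle-style fingerprint: hash the producing code, then fold in the
    fingerprints of all upstream inputs. Any upstream change alters this
    hash, so "is it up to date?" becomes a cheap hash comparison rather
    than a scan of the data itself."""
    node = sha(code.encode())
    for parent in sorted(input_fingerprints):
        node = sha((node + parent).encode())
    return node

# A partition is up to date when its stored fingerprint matches the one
# recomputed from the current code and current upstream fingerprints.
stored = partition_fingerprint("SELECT 1", ["abc123"])
recomputed = partition_fingerprint("SELECT 1", ["abc123"])
assert stored == recomputed
```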

Paul Lacey: And just in time for you to move on to your next role and do it all over again.

Sean Knapp: Did I mention Ascend? We’re hiring, so …

Paul Lacey: You did. And that’s a great plug there, Sean. Folks can hit up ascend.io/careers, I think, and see what we’re looking for at the moment. That’s great. And phenomenal advice there, Sean, too, so I appreciate that. Yeah, we are, and that’s our goal with this program: not just to talk specifically about Ascend’s technology, but to help folks that are trying to automate their data pipelines figure out how to do that, and how to do that at scale. Well, thanks for joining for this one, Sean. This has been a really fun conversation, and I can’t wait to have more of these in the near future. So for those of you that are tuning in, please feel free to subscribe on whatever platform you like to follow your podcasts on, and we’ll continue to give you more advice on how you can achieve leverage in your data engineering roles. Look out for more of that.

Sean Knapp: Awesome.

Paul Lacey: Thanks a lot, Sean.

Sean Knapp: Thanks Paul.