Discover the top 3 data mesh challenges and effective strategies to overcome them for successful data mesh implementation.
If you work with data, you'll have come across the term data mesh by now. This decentralized but interconnected approach to structuring data has become increasingly popular since Zhamak Dehghani coined the term in 2019.
However, while data meshes have significant advantages for scaling up your data operations, the approach comes with its fair share of challenges: Michele Goetz, VP and Principal Analyst at Forrester, calls it the "data mesh blind side."
In this article, we'll break down the main challenges of moving to a data mesh architecture and briefly explore how to tackle them.
First, a quick primer on the data mesh concept. Data mesh is a way of thinking about and organizing your data to move data ownership and accountability closer to the end users.
The core tenet of the data mesh is to distribute responsibility and governance of your data across different business "domains". This is the opposite of having a single, monolithic data architecture managed by a centralized data team.
These domains are loosely coupled teams that own and manage data and pipelines in their business unit. For example, human resources data might be owned and managed by the human resources domain on a particular data platform, while the sales data domain is managed by another team on a different platform.
The term "mesh" comes from the fact that all these data domains remain interconnected despite potentially running on different technology stacks. There's no return to the old days of siloed data warehouses. Rather, the goal is to move ownership and responsibility of the data closer to the subject matter experts who understand it, while maintaining a unified data catalog of data products that can be referenced by any other builders in the organization.
The end result? A more scalable, secure, and speedy distributed architecture that helps you build and maintain interdependent datasets at scale.
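To make the idea of a unified catalog a little more concrete, here is a minimal Python sketch. It is an illustration only: the DataProduct and Catalog classes, the platform names, and the contact addresses are all hypothetical and do not come from any particular product. It shows two domains on different platforms registering their data products in one shared index that any builder can search.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """A dataset published by a domain for others to consume (hypothetical model)."""
    name: str
    domain: str    # owning business domain, e.g. "hr" or "sales"
    platform: str  # where the data physically lives (placeholder values below)
    owner: str     # accountable contact inside the domain


@dataclass
class Catalog:
    """One organization-wide index of data products across all domains."""
    _products: dict = field(default_factory=dict)

    def register(self, product: DataProduct) -> None:
        # Names must be unique so consumers can reference a product unambiguously.
        if product.name in self._products:
            raise ValueError(f"data product {product.name!r} is already registered")
        self._products[product.name] = product

    def find(self, name: str) -> DataProduct:
        return self._products[name]

    def by_domain(self, domain: str) -> list:
        return [p for p in self._products.values() if p.domain == domain]


# Each domain keeps its own platform, but both publish to the same catalog.
catalog = Catalog()
catalog.register(DataProduct("employee_headcount", "hr", "platform_a", "hr-data@example.com"))
catalog.register(DataProduct("closed_deals", "sales", "platform_b", "sales-data@example.com"))
```

The point of the sketch is the split in responsibilities: ownership and platform choice stay with each domain, while discovery goes through a single shared catalog.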
There's no such thing as frictionless business transformation (no matter what the vendors tell you!). Data meshes are no exception. If you're used to operating with a centralized data platform, there will be some challenges when moving to a data mesh.
Here are a few of the most common issues with data meshes:
Or, to put it more bluntly: office politics. As with any major change to the way you do business, it's only going to work if everyone buys in. Here are a few of the hurdles you'll need to jump over:
Matteo Vasirani, a Senior Manager of Data Science at developer platform GitHub, suggests that the key to securing buy-in from the various stakeholders is to "show them the carrot," that is, the positive outcomes that will result from moving to a mesh framework.
Vasirani found that incentivizing data users with "what's in it for them" is more effective and realistic than simply attempting to mandate the move from the top (the "stick").
Here are some of the key benefits of moving to a data mesh architecture that are likely to appeal to end users and business leaders alike:
To quote Sharad Varshney, CEO of OvalEdge, a data governance provider:
"Data mesh architecture delivers a scalable, affordable solution that enables you to work with more trusted data, more quickly, whilst taking pressure away from your IT and Data Teams. This enables them to focus on more business-critical tasks."
In other words, a data mesh makes it easier and faster for the end user to get their hands on the data they need to make business decisions—and who wouldn't want that?
When you move to a data mesh, you're assigning responsibility for the quality of the data to the data domain owners, instead of a centralized data team. That means that your data quality is dependent on multiple teams who may not know each other, and who don't necessarily share priorities or even a common set of terminology.
Without taking this into consideration before implementing a data mesh, you risk running into quality control issues, cautions Vasirani. "A data producer might change something and then you suddenly notice a dashboard is failing downstream, and then you'll need to reverse engineer the issue." Essentially, you're risking scaling up your problems along with your data architecture.
To avoid degrading the quality of your data when you move to the mesh, you'll need an execution model that defines what downstream consumers can expect from any data product they consume. The industry has started calling these data contracts, but regardless of the terminology the concept has existed for decades. Just look at the amount of documentation and governance that surrounds an API.
The key to success with this approach is the addition of a product management overlay to each data domain. It is the PM's responsibility to understand the use case that each shared dataset was created to address, and to ensure that future changes do not compromise the guarantees needed to fulfill these use cases. A new use case will often require creating an additional data product rather than modifying an existing one and breaking its contract.
You may also want to consider training to make sure that everyone involved in working with or inputting data understands the consequences of unplanned changes.
Finally, you'll need to be sure that the domain data owners are incentivized to keep the quality of their data high, both by reminding them of the positive business outcomes and possibly by assigning data-quality KPIs to data product owners.
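As a rough illustration of a data contract, the Python sketch below compares a proposed version of a contract against the current one and reports anything that would break existing consumers. Everything in it is an assumption made for the example: the DataContract class, the breaking_changes function, the column names, and the rule that removing or retyping a column is breaking while adding one is safe. Real contracts usually cover more than schema, such as freshness or value ranges.

```python
from dataclasses import dataclass


@dataclass
class DataContract:
    """What consumers may rely on for one data product (hypothetical model)."""
    product: str
    version: int
    columns: dict  # column name -> type name, e.g. {"deal_id": "string"}


def breaking_changes(current: DataContract, proposed: DataContract) -> list:
    """List the ways `proposed` would break consumers of `current`.

    Assumed rule for this sketch: removing a column or changing its type
    is breaking; adding a new column is safe.
    """
    problems = []
    for name, col_type in current.columns.items():
        if name not in proposed.columns:
            problems.append(f"column {name!r} was removed")
        elif proposed.columns[name] != col_type:
            problems.append(
                f"column {name!r} changed type from {col_type} to {proposed.columns[name]}"
            )
    return problems


current = DataContract("closed_deals", 1, {"deal_id": "string", "amount": "decimal"})

# Adding a column keeps every existing guarantee intact.
additive = DataContract(
    "closed_deals", 2, {"deal_id": "string", "amount": "decimal", "region": "string"}
)

# Changing a column's type is the kind of change that breaks a downstream dashboard.
retyped = DataContract("closed_deals", 2, {"deal_id": "string", "amount": "string"})

print(breaking_changes(current, additive))
print(breaking_changes(current, retyped))
```

A check like this could run before a domain publishes a change. If it reports problems, that is a signal to create a new data product for the new use case rather than alter the existing one.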
A data mesh is not a substitute for a central, unified data fabric or cloud data platform. Introducing ownership of the data at the domain level does not mean moving back to completely siloed datasets without understanding the big picture.
Data siloing is a real risk when implementing a data mesh architecture, especially if you're building on home-grown technology that wasn't built with a mesh in mind. Some have proposed requiring every domain to use a siloed slice of the current monolithic infrastructure. But this can be challenging if many parts of the business are already using their own specialized cloud services. Let's face it, the multi-cloud world is not going away anytime soon.
To be successful, your data mesh must follow these key principles:
A unified platform for intelligent data pipelines can help create a solid foundation for the transition to a data mesh and make it easier to achieve these key principles.
If you consolidate all of your data pipelines into a single platform, you can:
There are many benefits to a data mesh approach, but before you move towards implementation you'll need a plan in place to address the main challenges:
Ascend is building the first multi-cloud live data-sharing framework for data mesh. Our intelligent data pipelines detect and respond to changes, ensure data quality, and efficiently use cloud resources, so you can spend less time implementing your data mesh architecture and more time reaping the benefits. Ready to take a look? Set up a free demo and see how our data pipeline automation can accelerate your transition to data mesh.
Additional Reading and Resources