Unlocking Business Value and Proving the Value of Data Teams
The Power of Operationalising Analytical Systems
For the last few months I’ve been busy onboarding in my new role and wrapping my mind around many strategic aspects of data engineering. Two topics (of many) that I’ve been giving some thought are “unlocking business value” and “operationalising analytical data”, and the two are in fact tightly connected. In addition, several recent posts and events helped push me to finally write this post:
In the last few weeks I have read a number of posts about data teams having to prove their value, justify their existence and stay relevant for the business. I think the authors are absolutely correct, especially since the economic conditions and priorities (revenue → profit) have shifted and the industry is under pressure from the expectations and developments in GenAI.
At the same time I’ve seen the topic of productionising your data raised several times (check out the Databricks keynote, for example), driven partly by the GenAI wave.
I’ve had several discussions with colleagues about whether and how data teams should support product teams that want to feed analytical data into operational systems and use cases.
I was part of a discussion about operationalising analytical data with peer data professionals from tech companies in Stockholm at Twirl’s data breakfast event.
Lastly, I got a demo of Synq.io from Petr Janda, showing how they help data teams operationalise analytical data: identifying and managing the issues and incidents that arise in their systems, resolving them, and keeping stakeholders updated on the status along the way. (I know the pain, and it is common among data teams.)
Engineering vs Design vs Data
If we agree that data teams need to carry their own weight and make it clear to the organisation that they do, how much weight do you need to deliver? Obviously the value you need to prove to justify the investment in data and analytics depends on how much the organisation has invested in it. A good start is to understand the profile of your tech, product and data functions.
Mikkel Dengsoe has done a terrific job comparing FTE ratios between design, software engineering and data roles in European tech companies and categorising them into three clusters. If your company is categorised as a data-first company, the pressure to prove your value is even higher than for others, but it could also be that you have already solved the equation. If you haven’t already calculated your FTE ratio, I suggest you do and see where you are positioned.
In addition to the cost of employees, the cost of the analytical system is often non-negligible; I’ve seen examples where it is on par with the operational system. Hence, practising FinOps definitely complements operationalising analytical data in making sure the investment in data and analytics delivers a good ROI.
Operationalise your data flywheels
All the factors above combined put pressure on your data capabilities to prove their value by not only providing insights for human decision makers, but also enabling operational use cases that improve the product experience or process efficiency, and doing so in a scalable way.
As you know, I like the mental model of data flywheels, where learning, decisions and improvements can be made by humans and/or computers. When there are mostly humans in the loop, it is usually hard for companies to recognise and attribute the value of an improved process or customer experience back to a decision based on an insight from a report or dashboard served by your data and analytics system.
However, putting a computer in the loop and feeding operational systems with analytical data often makes the value more visible to the organisation, and that value is created continuously and automatically. Another positive effect is that data producers suddenly become data consumers, which helps raise awareness and ownership of data quality. Once the workflow is up and running and generating value, any issue in the analytical system will be felt immediately and remind the organisation of the key part data teams play in the value created by the data flywheel.
But creating value isn’t only about how the data is operationalised; it is also about how it is served. If the cost of enabling the data flywheel is higher than the value of the improved product or process, nothing is accomplished, so make sure to do it in a lean and scalable fashion.
But why are data teams reluctant to support operational use cases?
Data teams are often reluctant to support operational use cases for several reasons; a few that come to mind:
Burnt by past experiences: Previous attempts at operationalising data may have led to failures or significant challenges, making teams wary of similar initiatives.
Data quality concerns: There may be a lack of trust in the quality and stability of the data, which can lead to hesitance in using it for critical operational purposes.
Resource constraints: Supporting operational use cases often requires additional resources, such as staffing for on-call duties and more robust infrastructure, which teams might lack.
Alert fatigue: The fear of a "Cambrian explosion" of alerts, issues, and incidents can overwhelm data teams, especially if they lack the proper tools and processes to manage and remediate these problems efficiently.
Accountability and ownership: There can be a reluctance to take on the responsibility and accountability for operational data, which often requires stringent service level agreements (SLAs) and continuous monitoring.
Complexity and risk: Operationalising data can introduce complexity and risk into data systems, which can be daunting without proper processes and mature systems in place.
Stakeholder expectations: Managing and meeting the expectations of stakeholders while ensuring timely updates and resolutions can be challenging, adding to the reluctance.
(Hot take) In addition to the above, I think there may be a cultural aspect as well. If you ask data professionals (me included) what is important to them when considering a job, they often answer that they want to make a difference, have an impact, create value for the business, feel the pulse, etc. But when all is said and done, they are also quite comfortable and enjoy working with systems and processes that aren’t considered business critical.
Embrace ownership and support operational use cases to demonstrate direct value
So how do you embrace operational use cases? Do it in a controlled manner, starting with use cases that don’t require SLAs your data platform can’t guarantee; for example, if the data team isn’t on-call, the operational systems consuming the data may have to cope with days of stale data. You also need to establish clear ownership, data contracts, and (declarative) observability to ensure reliability and transparency:
Controlled introduction of operational use cases. Begin with operational use cases that have less stringent SLA requirements to manage risk and build confidence. Many data teams don’t have on-call coverage outside office hours; gradually expand to more critical use cases as staffing, processes and systems mature.
Establish clear ownership. Assign clear ownership for each operational use case to ensure accountability and consistent management.
Data contracts. Implement data contracts to set clear expectations and prevent breaking changes in data activation jobs and models.
Lineage. Maintain detailed lineage of data activation jobs and models to track data flow and dependencies.
Push over pull. Preferably publish (push) data to data output ports for operational systems to consume rather than having external systems pull data directly from the data warehouse.
Observability. Set up observability for all operational use cases to monitor performance and adherence to SLAs. Provide timely and transparent alerts to data consumers and keep them updated about the status of the issue/incident.
Iterative improvement. Continuously assess and improve processes based on feedback and performance metrics, and progressively scale support for increasingly critical use cases, ensuring stability and reliability at each step.
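To make the data-contract and push-over-pull points above concrete, here is a minimal Python sketch of a data activation job that validates rows against an agreed contract before publishing them to a data output port. All names, fields and the publish step are hypothetical illustrations, not a real API.

```python
# Hypothetical data contract for a data output port: the field names and
# types below are illustrative assumptions, not a real agreed schema.
CONTRACT = {
    "customer_id": str,
    "churn_risk": float,
}

def validate_against_contract(rows):
    """Split rows into those that satisfy the contract and those that
    would break the downstream operational consumer."""
    valid, rejected = [], []
    for row in rows:
        ok = set(row) == set(CONTRACT) and all(
            isinstance(row[field], expected) for field, expected in CONTRACT.items()
        )
        (valid if ok else rejected).append(row)
    return valid, rejected

def push_to_output_port(rows):
    """Placeholder for publishing to a data output port (e.g. a queue or
    API owned by the data team), rather than letting operational systems
    pull directly from the warehouse."""
    return len(rows)  # pretend all rows were published

rows = [
    {"customer_id": "c-1", "churn_risk": 0.82},
    {"customer_id": "c-2", "churn_risk": "high"},  # violates the contract
]
valid, rejected = validate_against_contract(rows)
published = push_to_output_port(valid)
```

The key design choice is that the data team owns the port and the contract check, so a breaking change is caught (and alerted on) before it reaches the operational system.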
Some successful examples of operationalising analytical data that I’ve seen in my career:
Forecasting the required staffing of warehouse workers to avoid over- or under-provisioning, saving millions while ensuring all orders are picked, resulting in a good customer experience. In fact, the forecasting model was one of the simpler ones we had, but the outcome was extraordinary. It doesn’t have to be complex to be valuable; you just need to connect the dots.
Feeding the predictive generation of purchase proposals, reducing waste while ensuring products are always in stock, resulting in a better customer experience, higher revenue and better customer retention. It also reduced the manual and repetitive work for the supply purchasing department. Warning: we set very clear expectations early on (cope with 3–5 days of potential downtime/stale data), but as stakeholders realised the value and the proposals became such an integrated part of their workflow, expectations drifted from what was agreed; when we had issues in the analytical systems, pressure and frustration were channelled to us the very same day (early morning).
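The staffing example above shows that a simple model can carry real value. Purely as an illustration (the numbers, window and picking capacity are invented, not the model we actually used), a naive moving-average staffing forecast might look like:

```python
ORDERS_PER_WORKER_PER_DAY = 120  # assumed picking capacity, illustrative only

def forecast_orders(history, window=3):
    """Naive moving-average forecast of tomorrow's order volume."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def workers_needed(forecast, capacity=ORDERS_PER_WORKER_PER_DAY):
    """Round up so every forecast order can be picked."""
    return -(-int(forecast) // capacity)  # ceiling division

# Invented daily order volumes for the last six days
daily_orders = [1100, 1250, 1180, 1320, 1290, 1350]
expected = forecast_orders(daily_orders)  # mean of the last 3 days
staff = workers_needed(expected)
```

The point is not the model itself but the activation: feeding `staff` into the workforce-planning system is what turns an insight into a continuously value-creating flywheel.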
The alternative is worse - shadow IT
The alternative to embracing and supporting the operationalisation of analytical data is worse, because it will happen anyway, but without your control and awareness. For example:
A team will extract data via a Google Sheet that is then consumed by another system.
A team will use a service account that was created for one purpose but is suddenly used for many others.
These shadow IT solutions will be more or less invisible to the data teams, and without knowledge of the downstream dependencies there will eventually be breaking changes or data issues that the data teams will be asked to help resolve.
Your experience, tips and practices?
Now I’m curious to hear your experiences of operationalising analytical data and the different ways of not only creating business value but also making it apparent to the rest of the organisation. I’d be very grateful if you’d share your experience, tips and practices in the comments.
Do you agree or disagree with operationalising analytical data? Why?
What are good and bad practices?
Can you share good and bad examples?
Summary
Data teams must embrace operational use cases to prove their value and enhance their business impact. By starting with less critical use cases, establishing clear ownership, and having systems and processes in place to detect errors throughout the data stack, identify affected stakeholders, and manage and resolve incidents, teams can manage risk and build confidence. Without taking control, ad-hoc solutions will emerge, leading to even greater issues. Operationalising analytical data is essential for sustained relevance and efficiency in an evolving business landscape.
The next post will be a slightly more technical, but still lightweight, example of how to enable data activation.