The Data Maturity Model: Where Most Organisations Actually Are
Ask organisations to self-assess their data maturity and most will place themselves at level 3 or 4 on a five-point scale. Look at their systems and talk to their data teams for a day, and the evidence usually puts them at level 2. This gap isn't dishonesty. It's a combination of optimism, definitional inconsistency, and the fact that data leaders are evaluated on progress narratives — which creates a strong incentive to report the destination rather than the current position.
The gap matters enormously for how organisations spend their data budgets. A team that believes it's at level 3 invests in level 4 initiatives. Those initiatives fail not because the technology is wrong or the team is incompetent, but because the foundation they assume — reliable data, stable pipelines, business trust in the numbers — isn't actually there. The investment disappears into a foundation that was never solid, and leadership concludes that data investment doesn't deliver returns.
Getting the self-assessment right is, in practice, the first and most important step in any data strategy.
What the Five Levels Actually Look Like
Maturity frameworks are usually presented as aspirational. The more useful version describes each level as it exists in practice — not as the framework author idealises it, but as you actually encounter it.
Level 1: Reactive
Data exists in source systems — the CRM, the ERP, the application databases. Reports are produced manually when someone asks for them, usually by whoever has database access or knows how to write a formula in Excel. There is no data team. The data platform is Excel. When a question needs answering, someone pulls a CSV and pivots it. The work takes hours or days, the methodology is inconsistent between people, and the numbers are frequently disputed because different people ran different queries at different times.
Most organisations recognise level 1 when they see it described. They are less willing to acknowledge that the description applies to them.
Level 2: Structured
A data team exists. There is a warehouse or a BI tool. Reports are scheduled rather than ad-hoc. The data team has built some pipelines and some models. This looks, from the outside, like a functioning data operation — and it is, in a limited sense. But the underlying reality is usually that data quality is inconsistent, models break when upstream schemas change, and the data team spends the majority of its time on pipeline maintenance, incident response, and answering "is this number correct?" questions rather than on analysis or new capability.
Business teams use the dashboards, but they don't fully trust them. Every significant decision triggers a round of data validation before the number is accepted. The data team is busy, often overwhelmed, and producing a lot of output — but the organisation's relationship with its data is still fundamentally fragile.
Level 3: Proactive
The data platform is stable. Pipelines are reliable enough that failures are exceptions rather than the norm. Business teams trust the data well enough to make decisions based on it without a validation round. The data team has capacity — real capacity, not just aspirational capacity — for new initiatives rather than spending everything on maintenance.
The distinguishing property of level 3 is trust. Business users accept the data without reflexively questioning it. That trust was earned by consistency and transparency, not by the data always being perfect. When something is wrong, the data team knows before the business does, and they communicate it proactively. That's what actually builds trust.
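In practice, "knows before the business does" usually means scheduled checks that run before anyone opens a dashboard. A minimal sketch in Python, assuming you can look up the latest load time for each table and post to some notification hook; the table names and thresholds here are illustrative assumptions, not a standard:

from datetime import datetime, timedelta, timezone

# Illustrative freshness rules: (table, maximum acceptable staleness).
# Real thresholds come from agreements with the business about each dataset.
FRESHNESS_RULES = [
    ("analytics.orders", timedelta(hours=2)),
    ("analytics.revenue_daily", timedelta(hours=26)),
]

def check_freshness(last_loaded_at, notify):
    """Flag tables whose latest load breaches the agreed threshold.

    last_loaded_at: callable(table) -> tz-aware datetime of the newest load.
    notify: callable(message) -> None, e.g. a chat webhook post.
    """
    now = datetime.now(timezone.utc)
    for table, max_age in FRESHNESS_RULES:
        age = now - last_loaded_at(table)
        if age > max_age:
            notify(f"{table} is stale by {age - max_age}; "
                   "flagging before business users see it.")

The point is not the tooling; any orchestrator can run a check like this on a schedule. What matters is that the data team, not a business user, is the first to see the message.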
Level 4: Predictive
Historical data is used to model future outcomes. Machine learning models run in production, not just in a notebook. The organisation can answer not just "what happened?" but "what is likely to happen?" The data team includes people with modelling skills alongside the engineers and analysts, and there is production ML infrastructure rather than a collection of prototypes.
Level 4 requires level 3 as a genuine prerequisite. ML models fed by unreliable data produce unreliable predictions. Many organisations skip to level 4 investments while still operating at level 2, which is why so many ML initiatives fail to reach production or produce value once they do.
Level 5: Self-Optimising
Systems use data to adapt their own behaviour. Feedback loops between decisions and outcomes are automated. The organisation doesn't just predict what will happen — it adjusts its own operations based on those predictions without manual intervention. This is genuinely rare. It exists in some form at companies for whom data is the core product, not a support function. It is frequently claimed by organisations that have a recommendation engine or an automated email campaign, which is not the same thing.
The Common Misclassification
The most frequent misclassification is level 2 organisations that believe they are at level 3. The indicator they cite is infrastructure: they have a data warehouse, they have dashboards, they have a data team. These are necessary conditions for level 3 but not sufficient ones.
The test is trust. Do business users make decisions based on the data without first asking whether it's correct? Do they pull dashboard numbers into board presentations without a validation round? When a number surprises them, do they question their own assumptions about the business, or do they question the data? If the routine response to a surprising number is to question the data rather than the business reality, the organisation is at level 2 regardless of its tooling.
A secondary test is time allocation. How does the data team spend its time? If the majority goes to maintaining existing pipelines, investigating quality incidents, and answering "is this right?" questions, the platform is not stable — it just looks stable from the outside because problems are being caught and fixed before they escalate. That's valuable work, but it is level 2 work.
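One way to run this test honestly is to tally logged work rather than asking people how they feel they spend their time. A rough sketch, assuming you can export work items as (category, hours) pairs from a ticket tracker; the category labels are hypothetical and need mapping to your own:

from collections import Counter

# Hypothetical labels for "keeping the lights on" work; map your own
# ticket or timesheet categories onto this set.
MAINTENANCE = {"pipeline_maintenance", "incident", "data_validation"}

def maintenance_share(work_items):
    """work_items: iterable of (category, hours) pairs."""
    totals = Counter()
    for category, hours in work_items:
        bucket = "maintenance" if category in MAINTENANCE else "new_capability"
        totals[bucket] += hours
    total = sum(totals.values())
    return totals["maintenance"] / total if total else 0.0

# Example month: 52 of 78 logged hours went to maintenance.
items = [("pipeline_maintenance", 30), ("incident", 12),
         ("data_validation", 10), ("new_dashboard", 18),
         ("ml_prototype", 8)]
print(f"maintenance share: {maintenance_share(items):.0%}")  # 67%

A share persistently above one half is the level 2 signal described above. It matters more as a trend over months than as a single reading.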
Why Correct Assessment Is the Strategy
Data strategy conversations tend to start with ambition: what capabilities do we want to build, what does our data platform look like in three years? The more productive question is where you actually are now, because that determines what you need to build next.
An organisation at level 2 that spends the next year on AI initiatives will end the year with expensive AI initiatives that don't work and a level 2 data platform. An organisation at level 2 that spends the next year on data reliability, quality, and earning business trust will end up at level 3 — and will then be positioned to do level 3 and 4 work that actually delivers. The second path is less exciting to present, and it is almost always the right one.
The practical test: look at how your data team actually spends its time. If more than half goes to pipeline maintenance, incident response, and data validation questions, you're at level 2. That's where to build from. The strategy question is how to get to level 3, not how to get to level 5.
Written by ATHING
We design and build data infrastructure, automation pipelines, and AI systems for organisations that need them to work.