We sat down with Rheta Du Preez, Managing Partner at Monocle and BCBS 239 subject matter expert, to discuss the latest pain point for European banks’ data management capabilities – technical data lineage.
Recent supervisory communications from the European Central Bank (ECB) have reiterated that progress in addressing deficiencies in risk data aggregation and reporting remains insufficient. A recurring theme is the inability of institutions to demonstrate clear, end-to-end traceability of risk data through technical data lineage.
But beyond compliance, we asked Rheta, what does strong data lineage really mean, and how can banks unlock long-term value from it?
Data lineage tells you where your data comes from, how it changes along the way, and how it ultimately appears in risk and regulatory reports. With robust data lineage in place, you can achieve visibility of your data processes from authoritative sources all the way through to the point when data is presented to executives to inform strategic decision making.
In practice, this is achieved by taking a key risk indicator (KRI), as an example, and reverse engineering it into its critical components. We identify all the input sources, transformation logic, aggregation processes and other applications that feed into these components.
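To make this concrete, the sketch below shows one way a KRI could be decomposed into its critical data elements, each traced back to source systems and transformation steps. The KRI, system names and transformation labels are purely illustrative assumptions, not a prescribed model.

```python
from dataclasses import dataclass, field

@dataclass
class CriticalDataElement:
    """A critical data element feeding a KRI, with its technical lineage."""
    name: str                   # logical name the business recognises
    source_systems: list[str]   # authoritative sources it originates from
    transformations: list[str]  # transformation and aggregation steps applied en route

@dataclass
class KeyRiskIndicator:
    """A KRI reverse engineered into the critical data elements that feed it."""
    name: str
    elements: list[CriticalDataElement] = field(default_factory=list)

# Hypothetical decomposition of a single KRI, for illustration only.
lcr = KeyRiskIndicator(
    name="Liquidity Coverage Ratio",
    elements=[
        CriticalDataElement(
            name="High-quality liquid assets",
            source_systems=["treasury_system", "securities_ledger"],
            transformations=["haircut_application", "currency_conversion", "daily_aggregation"],
        ),
        CriticalDataElement(
            name="Net cash outflows (30 days)",
            source_systems=["core_banking", "payments_platform"],
            transformations=["outflow_classification", "stress_weighting", "netting"],
        ),
    ],
)

for element in lcr.elements:
    print(f"{lcr.name} <- {element.name}: sources {element.source_systems}")
```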
A key distinction must be made between business and technical data lineage. From a business perspective, stakeholders understand metrics conceptually, for example, balance sheet exposure or average account balances. Technical lineage, however, must demonstrate how those concepts are represented within systems as critical data elements, how they are transformed, aggregated, and reconciled across applications, and how they ultimately appear in reporting outputs. It is not sufficient to align naming conventions; institutions must evidence that the underlying data definitions, calculation logic, and transformation rules are consistent and controlled across the architecture.
This is very often an area where institutions really struggle – bridging the gap between business and technical understanding. While the business might have a good understanding of how its data flows, aligning this with technical data lineage can be challenging, as it requires connecting logical critical data elements to the physical attributes found in each system.
Firstly, we need to be realistic. This is a monumental task. That is precisely why banks must rely on automation and avoid approaching this through manual effort, which is resource-intensive and extremely difficult to maintain. There are data lineage tools available that can connect to data sources and platforms to automatically scan schemas, code, logs, and metadata. Using pattern recognition and metadata mapping to infer relationships, these tools are then able to generate end-to-end lineage across your organisation’s data architecture.
Where it becomes truly valuable is when data lineage is embedded into all processes, allowing you to understand and react when an incident occurs. For instance, when something goes wrong with a particular attribute, the key question becomes how quickly you are able to identify which systems, applications, or processes are dependent on that attribute and, therefore, how efficiently you can rectify the issue.
Additionally, data lineage plays a valuable role in application changes. By using their technical data lineage, banks can easily identify which downstream or upstream systems and processes are impacted by a proposed change, which adds significant value from a testing and release management perspective.
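As a rough illustration of this kind of impact analysis, the sketch below treats attribute-level lineage as a directed graph and walks it to find every downstream consumer of a changed or broken attribute. The attribute names and edges are hypothetical; in practice they would be produced by the automated lineage scans described above.

```python
from collections import defaultdict, deque

# Hypothetical attribute-level lineage: each edge means "target is derived from source".
lineage_edges = [
    ("core_banking.account.balance", "risk_dwh.exposure.balance_eur"),
    ("fx_service.rates.eur_rate", "risk_dwh.exposure.balance_eur"),
    ("risk_dwh.exposure.balance_eur", "reporting.corep.total_exposure"),
    ("risk_dwh.exposure.balance_eur", "dashboard.kri.avg_balance"),
]

downstream = defaultdict(list)
for source, target in lineage_edges:
    downstream[source].append(target)

def impacted_attributes(changed_attribute: str) -> set[str]:
    """Breadth-first traversal: everything that directly or indirectly consumes the attribute."""
    impacted, queue = set(), deque([changed_attribute])
    while queue:
        for consumer in downstream[queue.popleft()]:
            if consumer not in impacted:
                impacted.add(consumer)
                queue.append(consumer)
    return impacted

# If an incident affects the source balance, these reports and KRIs need attention:
print(impacted_attributes("core_banking.account.balance"))
```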
What we see in the industry is that technical data lineage, despite the maturity of many existing solutions, is often treated as a regulatory tick-box exercise, particularly given the ECB’s push to get the basics right. Unfortunately, many institutions treat lineage purely as a remediation exercise, rather than embedding it as a sustainable risk management capability.
European supervisors have made clear they expect lineage from authoritative source systems through aggregation and transformation layers to final reporting outputs. Supervisory guidance emphasises the need to evidence lineage for material risk reports and the critical data elements that underpin them.
This is often extremely challenging. Automation is critical; however, the effectiveness of automated lineage extraction depends heavily on the underlying technology landscape. In legacy environments, particularly mainframe platforms, automated scanning may be partial or incomplete, often requiring supplementary validation and governance controls.
When defining scope, prioritisation is essential. Many institutions begin with authoritative source systems and key aggregation layers, particularly where data definitions are relatively stable and consistently applied. This can provide an initial level of traceability. However, without a structured plan to extend coverage across transformation, integration, and reporting layers, lineage will remain partial and of limited value.
The integration layer typically presents the greatest challenge. This is where data from multiple source systems are merged, transformed, filtered, and aggregated to create consolidated views. The complexity lies not in standardisation alone, but in evidencing how transformation logic, business rules, and derivations are applied, and ensuring that these processes are transparent, controlled, and reproducible.
Within banks, we know that processes, data, transformations, models, and calculations change on a monthly, if not daily, basis. In a bank’s data and application landscape, the only constant is change. Banks must embed lineage maintenance into formal governance and change management processes to ensure that traceability artefacts are periodically refreshed and remain aligned to system changes.
One of the best ways to ensure that technical data lineage remains up to date is to integrate it into the change control process, so that a change cannot go live unless the technical data lineage has been rescanned for that application. However, it’s a balance between compliance and agility. If rescanning cannot be performed quickly enough, overly restrictive policies may delay the delivery of critical data to stakeholders.
At the other end of the spectrum, you can also have a loose, policy-driven approach – for example, requiring technical data lineage to be updated periodically, such as at least once per year.
From a governance perspective, the optimal approach is often a hybrid one. This would involve enforcing rescanning within change control processes where possible, while allowing urgent changes to proceed when necessary. In such cases, a compensating control should ensure that the technical data lineage is subsequently refreshed and updated in the lineage solution.
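A minimal sketch of such a hybrid gate is shown below, assuming a hypothetical change record that tracks when the application’s lineage was last rescanned and whether the change is urgent. Field names, the freshness threshold and the compensating-control wording are all illustrative assumptions rather than a prescribed policy.

```python
from datetime import date, timedelta

MAX_LINEAGE_AGE = timedelta(days=30)  # illustrative freshness threshold

def release_allowed(change: dict) -> tuple[bool, str]:
    """Hybrid gate: block routine changes until lineage has been rescanned,
    but let urgent changes through with a compensating follow-up control."""
    lineage_age = date.today() - change["last_lineage_scan"]

    if lineage_age <= MAX_LINEAGE_AGE and change["lineage_scan_covers_change"]:
        return True, "Lineage is current for this application; release may proceed."

    if change["is_urgent"]:
        # Compensating control: the release proceeds, but a mandatory rescan task is raised.
        return True, "Urgent release approved; lineage rescan raised as a compensating control."

    return False, "Blocked: rescan technical data lineage for this application before release."

change_request = {
    "application": "risk_dwh",
    "last_lineage_scan": date.today() - timedelta(days=90),
    "lineage_scan_covers_change": False,
    "is_urgent": False,
}
print(release_allowed(change_request))
```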
At scale, sustaining comprehensive and auditable lineage through manual processes alone is extremely difficult. Automation has therefore become a practical necessity for large, complex institutions. Since the publication of BCBS 239, technology capabilities in this area have matured significantly, enabling institutions to reconstruct and visualise data flows across increasingly complex architectures. However, technology is only an enabler; the effectiveness of lineage ultimately depends on governance, control integration, and disciplined maintenance processes.
Technical data lineage tools typically extract structural and transformation metadata from source systems, databases, and data processing platforms. By parsing SQL, ETL logic, and other code artefacts, they reconstruct field-level dependencies and data flows across the architecture. However, raw extraction alone is insufficient. The real value lies in contextualising these relationships, enabling institutions to visualise how data moves and transforms across systems, and to identify where aggregation logic, derivations, and controls operate within that flow. This visualisation capability is a significant driver of the value delivered by technical data lineage tools, particularly when data controls are then overlaid onto that view.
Ultimately, this is the desired end state – the ability to see how data flows through systems, to demonstrate the controls in place at key risk points, and to provide stakeholders with confidence that the data they use can be trusted.
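To give a flavour of how such tools work under the hood, the sketch below uses the open-source sqlglot parser to pull source tables and output columns out of a single, hypothetical aggregation query. This is deliberately simplified: production lineage tools resolve field-level dependencies across entire ETL code bases and schemas, not one statement at a time.

```python
# Requires the open-source SQL parser: pip install sqlglot
import sqlglot
from sqlglot import exp

# A hypothetical aggregation query of the kind a lineage scanner might encounter.
sql = """
SELECT c.counterparty_id,
       SUM(l.outstanding_balance * fx.eur_rate) AS exposure_eur
FROM core_banking.loans AS l
JOIN core_banking.counterparties AS c ON c.counterparty_id = l.counterparty_id
JOIN fx_service.rates AS fx ON fx.currency = l.currency
GROUP BY c.counterparty_id
"""

parsed = sqlglot.parse_one(sql)

# Table-level lineage: which physical tables feed this consolidated view.
source_tables = sorted({f"{t.db}.{t.name}" for t in parsed.find_all(exp.Table)})

# The output fields this query produces, i.e. what downstream layers will consume.
output_columns = [projection.alias_or_name for projection in parsed.selects]

print("Source tables:", source_tables)
print("Output columns:", output_columns)
```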
I think one of the most important things for me is to open the dialogue with your joint supervisory team (JST) as early as you possibly can. We fully understand - and the banking industry understands - that there are constraints when it comes to technical data lineage. The timeline that the ECB has set for BCBS 239 compliance is placing significant pressure on these projects. As a result, many project teams are tempted to take shortcuts, implement the bare minimum and focus solely on meeting regulatory expectations rather than on unlocking real value, which represents the worst possible outcome.
Instead, I would recommend an open and transparent dialogue with the regulator. This allows you to focus on the most important parts - prioritising areas that make sense for you as an organisation, making decisions based on materiality, and applying a risk-based approach. A risk-based approach requires institutions to identify the data flows underpinning material regulatory reports and balance sheet exposures, assess where aggregation complexity and transformation risk are highest, and prioritise those areas for demonstrable traceability. Taking a proactive approach is a much more favourable alternative than simply rushing to comply.
This approach, however, is not possible without meaningful engagement with your regulator. Aligning on the approach early and ensuring that your regulator is comfortable with it will enable you to unlock significantly more value over time and ensure a sustainable solution, rather than just a once-off manual exercise to tick the compliance box.
Principles must be converted into practice. With over ten years of implementation experience, Monocle has made BCBS 239 a significant aspect of its consulting expertise since the principles were published in 2013. We perform a variety of functions, including project management, business and technical analysis, and facilitation with regulators.
Our prior engagements include remediating regulatory reporting deficiencies and participating in the end-to-end BCBS 239 implementation journey, from establishing robust programme oversight and governance frameworks to implementing comprehensive controls and effective data management strategies. We have also supported risk aggregation processes, enhanced risk reporting capabilities, and developed IT and data architectures, ensuring alignment with BCBS 239 requirements at every step.