Institutional investors don’t just want more data. They want more useful data, with a clear understanding of its origin, status and how it has been enriched.
That means, in addition to the data itself, institutional investors increasingly need detailed documentation and metadata. Access to information on data sources, collection methods and any transformations applied helps them integrate the data within their systems. Without it, ensuring the consistency and reliability of analysis becomes difficult, requiring additional manual data treatment and slowing the pace at which they can support their own decision-making and that of their end-clients.
In this article, Thomas Durif, Chief Data Officer for Securities Services at BNP Paribas, explains why qualifying data has become so important, how it helps answer such requests, and which best practices can help clients attain more efficiency.
What do we mean by “qualifying data”?
For us, the objective of data qualification is to enhance the value of what clients and internal teams consume. Qualifying the data financial institutions rely on entails more than ensuring its quality. Qualification also means describing and enriching the data: knowing its origin, definition, owner, lifecycle, confidentiality classification and level of quality. Essentially, qualifying data is about pulling together “metadata”, i.e. information about data, which helps to organise, find and understand it.
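To make this concrete, below is a minimal sketch of what one metadata record might contain, based on the attributes listed above; the field names and example values are illustrative and do not describe any specific catalogue.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical metadata record mirroring the qualification attributes above
# (origin, definition, owner, lifecycle, confidentiality, quality level).
# Field names and values are illustrative only.
@dataclass
class DatasetMetadata:
    name: str                # how the dataset is identified in the catalogue
    definition: str          # business meaning of the data
    origin: str              # source system or external provider
    owner: str               # team accountable for the data
    lifecycle_status: str    # e.g. "active", "deprecated", "archived"
    confidentiality: str     # e.g. "public", "internal", "restricted"
    quality_score: float     # aggregated result of quality controls, 0 to 1
    last_reviewed: date = field(default_factory=date.today)

# Example entry for an end-of-day valuation feed (values are invented)
valuations_meta = DatasetMetadata(
    name="eod_asset_valuations",
    definition="End-of-day valuations per instrument and portfolio",
    origin="fund accounting platform",
    owner="valuations data team",
    lifecycle_status="active",
    confidentiality="restricted",
    quality_score=0.97,
)
print(valuations_meta.owner, valuations_meta.confidentiality)
```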
Why data qualification matters
Given the amount of data exchanged between providers and institutional investors, and the complexity of the financial ecosystem, mastering data qualification is a foundational step of increasing importance as organisations activate emerging technologies.
Today, multiple versions of the same data may be created at different times and for varying purposes, e.g. when a provider sends a client two reports with different conventions or formats. Aligning data dictionaries and avoiding data duplication produces efficiencies and prevents operational errors for both parties.
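As a hedged illustration of the point, the sketch below maps two hypothetical report conventions onto one shared data dictionary and removes the resulting duplicate; the field names and mappings are invented.

```python
# Sketch of aligning two hypothetical report conventions to one shared data
# dictionary, then removing the duplicate record both reports carry.
REPORT_A_MAPPING = {"ISIN": "isin", "NAV Date": "valuation_date", "Px": "price"}
REPORT_B_MAPPING = {"isin_code": "isin", "date": "valuation_date", "unit_price": "price"}

def normalise(rows, mapping):
    """Rename provider-specific fields to the shared dictionary terms."""
    return [{mapping[k]: v for k, v in row.items() if k in mapping} for row in rows]

def deduplicate(rows, key_fields=("isin", "valuation_date")):
    """Keep one record per business key so both parties work from the same data."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

report_a = [{"ISIN": "FR0000120271", "NAV Date": "2024-05-31", "Px": 61.2}]
report_b = [{"isin_code": "FR0000120271", "date": "2024-05-31", "unit_price": 61.2}]
aligned = normalise(report_a, REPORT_A_MAPPING) + normalise(report_b, REPORT_B_MAPPING)
print(deduplicate(aligned))  # a single record instead of two conflicting copies
```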
Institutional investors are also seeking more autonomy to build their own customised reports and analytics. Having access to the provider’s data catalogue, as well as information about where the data was sourced and how it was transformed and quality checked, is essential if they are to self-serve successfully. Knowing what data is available, its quality and the purposes it can be used for allows firms to adapt faster to change and be more responsive to client requests, such as for specific dedicated reports. It also lightens the preparatory work for audits and regulatory controls.
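A self-service lookup against such a catalogue could be as simple as the sketch below; the entries, fields and purposes are hypothetical and only illustrate the kind of query a qualified catalogue makes possible.

```python
# Minimal sketch of a self-service lookup against a qualified data catalogue.
# The entries, fields and purposes are invented for illustration.
catalogue = [
    {
        "name": "eod_asset_valuations",
        "source": "fund accounting platform",
        "transformations": ["fx conversion to portfolio currency"],
        "quality_checks": ["completeness", "timeliness"],
        "approved_uses": ["client reporting", "performance analytics"],
    },
    {
        "name": "raw_market_prices",
        "source": "market data vendor feed",
        "transformations": [],
        "quality_checks": [],
        "approved_uses": ["internal investigation only"],
    },
]

def datasets_for(purpose):
    """Return the datasets a user may self-service for a given purpose."""
    return [d for d in catalogue if purpose in d["approved_uses"]]

for dataset in datasets_for("client reporting"):
    print(dataset["name"], "- sourced from", dataset["source"],
          "- transformations:", dataset["transformations"] or "none")
```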
The more structured the data is, and the more firms understand and trust the information, the further they can also push forward with automating processes. Knowing exactly what data is populating systems minimises perennial “garbage in, garbage out” concerns.
Detailed metadata describing datasets will be particularly important for enhancing artificial intelligence (AI) models and having confidence in the output. Leveraging Generative AI’s full potential will require data to be far more accurate and transparent. Providers must also be able to qualify data within an organisation if they are to train models purely with authorised data and in a fully transparent way vis-à-vis their clients.
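One hedged way to picture this is a metadata-driven filter over candidate training data, as sketched below; the consent flag and dataset names are assumptions made for the example.

```python
# Hedged sketch: use qualification metadata to keep only data a model may be
# trained on. The flags and dataset names are hypothetical.
datasets = [
    {"name": "anonymised_trade_flows", "confidentiality": "internal",
     "client_consent_for_ai": True},
    {"name": "client_portfolio_holdings", "confidentiality": "restricted",
     "client_consent_for_ai": False},
]

def authorised_for_training(dataset):
    """Only datasets with client consent and no restricted classification qualify."""
    return dataset["client_consent_for_ai"] and dataset["confidentiality"] != "restricted"

training_corpus = [d["name"] for d in datasets if authorised_for_training(d)]
print(training_corpus)  # ['anonymised_trade_flows']
```

Keeping the selection rule explicit in this way also makes it auditable and transparent towards clients.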
Qualifying data similarly benefits risk management. The greater the data’s accuracy and integrity, and the more trust portfolio management and risk teams have in the information, the better they can assess the firm’s exposures and take appropriate risk management actions. Rather than simply sending an asset valuation, for example, a provider that also explains there is a sizable spread against two other prices on the list, and that the price variation will likely widen significantly tomorrow, offers valuable context and market understanding.
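A simplified version of that kind of contextual check might look like the sketch below, which flags valuations whose spread against reference prices breaches a tolerance; the prices and the 2% threshold are invented.

```python
# Illustrative check: flag a valuation whose spread versus reference prices
# breaches a tolerance, so the figure is sent with context rather than alone.
def spread_alerts(valuation, reference_prices, tolerance=0.02):
    """Return the relative spread against each reference price that breaches the tolerance."""
    alerts = {}
    for source, price in reference_prices.items():
        spread = abs(valuation - price) / price
        if spread > tolerance:
            alerts[source] = round(spread, 4)
    return alerts

valuation = 101.8
references = {"vendor_a": 99.4, "vendor_b": 99.1}
print(spread_alerts(valuation, references))
# {'vendor_a': 0.0241, 'vendor_b': 0.0272} -> a spread worth explaining to the client
```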
How to qualify data
Failing to qualify data is no longer an option. Regulators increasingly want clear and precise descriptions of firms’ data. And institutional investors are demanding more responsiveness from their service providers to enable them in turn to improve their service to end-clients. With proper data qualification, it’s possible to slash investigation times from hours to minutes, drastically improving service levels.
Expertise and consistency are key. Providers need to own the process and demonstrate to clients – and regulators – that they have a full understanding of and control over data. The process must also be industrialised to make it efficient and, to some extent, harmonised across datasets.
A robust data qualification capability relies on three core pillars:
Documenting the information
Having the right structure to store and document metadata – the data that describes data – is crucial, yet challenging. Establishing boundaries to determine critical data that brings the most value is equally important, to avoid getting caught in a never-ending process of documenting everything.
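One possible way to draw such boundaries is to score data elements and document the critical ones first, as in the rough sketch below; the scoring rule and sample elements are purely illustrative.

```python
# Rough sketch of scoping the documentation effort: score data elements and
# document the critical ones first. The scoring rule is purely illustrative.
elements = [
    {"name": "instrument_price", "used_in_reports": 42, "regulatory": True},
    {"name": "internal_batch_id", "used_in_reports": 1, "regulatory": False},
]

def is_critical(element, min_reports=5):
    """Treat an element as critical if it is regulatory or widely consumed."""
    return element["regulatory"] or element["used_in_reports"] >= min_reports

to_document = [e["name"] for e in elements if is_critical(e)]
print(to_document)  # ['instrument_price'] -> document these before anything else
```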
Quality control
Robust controls ensure data is of good quality. If quality controls are too specific to each business, though, the results will be similarly specific, making it difficult to analyse outcomes.
At BNP Paribas, we assess data against multiple quality dimensions (accuracy, consistency, completeness, integrity, timeliness, uniqueness and validity) to determine its value. Industrialising the control structure in this way creates a level of quality standardisation and allows analysis results to be leveraged in an aggregated way. By doing data quality “by design” we can also prevent the capture of bad data ex ante.
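As an illustration of how standardised controls of this kind can be expressed, the sketch below computes a few of these dimensions over a small sample; the rules and rows are invented for the example rather than drawn from any production framework.

```python
from datetime import date

# Minimal sketch of standardised quality checks covering a few of the
# dimensions listed above (completeness, uniqueness, validity, timeliness).
rows = [
    {"isin": "FR0000120271", "price": 61.2, "valuation_date": date(2024, 5, 31)},
    {"isin": "FR0000120271", "price": 61.2, "valuation_date": date(2024, 5, 31)},  # duplicate record
    {"isin": "DE0007164600", "price": None, "valuation_date": date(2024, 5, 30)},  # missing price, late
]

def quality_report(rows, expected_date):
    """Score a dataset against a few quality dimensions."""
    n = len(rows)
    keys = [(r["isin"], r["valuation_date"]) for r in rows]
    return {
        "completeness": round(sum(r["price"] is not None for r in rows) / n, 2),
        "uniqueness": round(len(set(keys)) / n, 2),
        "validity": round(sum(len(r["isin"]) == 12 for r in rows) / n, 2),
        "timeliness": round(sum(r["valuation_date"] == expected_date for r in rows) / n, 2),
    }

print(quality_report(rows, expected_date=date(2024, 5, 31)))
# {'completeness': 0.67, 'uniqueness': 0.67, 'validity': 1.0, 'timeliness': 0.67}
```

Scoring every dataset against the same dimensions is what allows the results to be aggregated and compared across businesses.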
Data qualification pitfalls to avoid
Poorly done, qualifying data could become a Sisyphean task. Keeping business objectives front of mind is essential in managing boundaries in a way that brings value to users and avoids qualification becoming overly exhaustive.
Overengineering is another common danger. In a world of growing data volumes and complexity, data qualification needs to strike the right balance between being simple to use and understand, while being sufficiently detailed to add value. The temptation is to create a data catalogue that is beautifully engineered but that may be too complicated for actual users. Not everybody is a data scientist. Different data consumers have different needs, and the qualification service must reflect and adjust to those needs.
Deciding on the right data management tools further complicates the process. Firms have a profusion of tools available in the market to help document data, design data models, provide dashboards to present end data, etc. But navigating the choices to find tools that are sufficiently adaptable to meet your needs and deliver the results you expect can be complex.
Systematising data qualification at BNP Paribas’ Securities Services
Measures such as BCBS 239[1] have long made it mandatory for banks to document data and demonstrate what controls have been conducted along the value chain, ultimately in order to produce credit risk and liquidity risk reports.[2] As a global service provider, we handle enormous datasets and produce extensive reporting for diverse clients across the asset management universe. We have thus applied the data documentation principles gained from our regulatory experience to other datasets for the benefit of clients.
To guarantee service quality, we have made, and continue to make, significant investments in our enterprise data capabilities. Along with employing a market-leading data catalogue, we have been developing a Data Factory solution. This combines data warehouse aspects with an advanced three-pillar data qualification framework that incorporates our different data quality dimensions. Built using multiple systems, the Data Factory creates a complete, automated and industrialised data and metadata fabric. This aggregates rigorously qualified information to ensure more systematic preparation and provision of datasets we send to clients.
Through these developments, we aim to attain further data management efficiency from which our clients can benefit. Ultimately, our goal is to improve the time-to-market of data, reduce the risk associated with its management, and provide the necessary expertise and explanations for clients to best exploit that data.
[1] The Basel Committee on Banking Supervision’s Principles for effective risk data aggregation and risk reporting
[2] Principles for effective risk data aggregation and risk reporting, Bank for International Settlements, January 2013, https://www.bis.org/publ/bcbs239.pdf