What is data modeling?
The term “data modeling” has two commonly used meanings. The first definition of data modeling is the design and development of a data model that enables data to be organized meaningfully and used in a system or across a system landscape. The second definition of data modeling is the design of a series of calculations based on assumptions and parameters that are fed with data and which model the behavior of something in the real world. This article deals with the first definition of data modeling.
The objective of data modeling is to create a data model. A data model is a way to identify, structure, and describe the data that is of interest to an organization, independently of how the organization uses and processes that data. It is an intricate web, often made up of thousands of individual data characteristics called attributes. Attributes are grouped and classified into entity types, which represent the detailed records of interest to an organization. Entity types have relationships between them. For financial markets firms specifically, some examples of entity types are securities, customers, accounts and positions.
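As a rough illustration of attributes, entity types and relationships, the four example entity types could be sketched in Python as follows. The attribute names (customer_id, isin, quantity and so on) and all sample values are assumptions made for this sketch, not part of any standard model, and the security identifier is a made-up placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    customer_id: str   # key attribute (hypothetical name)
    legal_name: str


@dataclass(frozen=True)
class Security:
    isin: str          # key attribute (placeholder values only)
    description: str


@dataclass(frozen=True)
class Account:
    account_id: str
    customer_id: str   # relationship: an account belongs to a customer


@dataclass(frozen=True)
class Position:
    account_id: str    # relationship: a position is held in an account
    isin: str          # relationship: a position is in a security
    quantity: int


# Sample records, linked only through their key attributes.
customer = Customer("C001", "Example Holdings Ltd")
account = Account("A100", customer.customer_id)
security = Security("XS0000000001", "Example bond")
position = Position(account.account_id, security.isin, 500)
```

Note that nothing in the sketch says how the records are stored or which system processes them; it only describes what the data is and how the entity types relate.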
Why do I need it?
Good data modeling results in a well-designed and well-documented data model, which significantly reduces the time and cost to develop, implement, maintain and operate the systems and processes that use that data as a foundation. This is because the business context, business logic and related data attributes structured within the data model are mostly stable and change only slowly over time, while the processes and systems themselves may change much more frequently.
For example, a firm is growing: it is increasing its customer base, entering new markets and selling new products. The process of onboarding new customers shifts from a mainframe computer using COBOL to client-server computing using first C++, then Java. The customer onboarding team first moves offices, then opens a new office in a new region, then switches to working from home. All of these changes affect the way data sets are accessed and processed, but the entity types themselves remain unchanged in the data model.
What should I do to achieve good data modeling?
Although most data attributes do not change often, sometimes they do and sometimes new attributes arise in the context of doing business. Likewise, new relationships arise. Good data modeling ensures that the data model can be extended in an orderly and logical way. This means that the new relationships and new attributes can be easily set up and incorporated into processes.
For example, let’s say that a financial institution comes to realize that some of its customers have a second middle name. The attribute for “second middle name” is added to the data model, which relates it to every entity type that references a customer’s name. That attribute can, and more than likely will, exist in some form for as long as customers have names. It may not always be used, but it will persist in the data model.
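One way to picture such an orderly extension is to add the new attribute as an optional field, so that existing records remain valid without modification. The sketch below is illustrative only: the CustomerName type, its field names and the sample names are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerName:
    first_name: str
    last_name: str
    middle_name: Optional[str] = None
    # Newly added attribute: optional, so older records still load unchanged.
    second_middle_name: Optional[str] = None

    def full_name(self) -> str:
        parts = [self.first_name, self.middle_name,
                 self.second_middle_name, self.last_name]
        # Skip attributes that are not populated for this customer.
        return " ".join(p for p in parts if p)


# A record created before the extension and one that uses the new attribute.
existing = CustomerName(first_name="Maria", last_name="Silva")
extended = CustomerName(first_name="John", last_name="Smith",
                        middle_name="Paul", second_middle_name="George")
```

Because the new attribute has a default, code that only knows about first and last names keeps working, while anything that reads the full name picks up the extra attribute where it is populated.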
A further feature of good data modeling is that groups of entity types can be abstracted (lifted and combined) from the data model and used together, for example in a single screen of a user interface (UI), to perform a business task. Having access to a data abstraction or object layer above the data model allows business users to design the screen and the workflow for a task, such as identifying risk exposure to all entities of a global conglomerate, without always needing resources from IT.
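The conglomerate example can be sketched as a small function that combines two entity types into a single figure a screen could display. The LegalEntity and Exposure types, the group_id attribute and the amounts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LegalEntity:
    entity_id: str
    name: str
    group_id: str      # the conglomerate this entity belongs to (assumed attribute)


@dataclass(frozen=True)
class Exposure:
    entity_id: str     # relationship: the exposure is to a legal entity
    amount: float


def group_exposure(group_id: str,
                   entities: Iterable[LegalEntity],
                   exposures: Iterable[Exposure]) -> float:
    """Total exposure to every legal entity that belongs to one group."""
    member_ids = {e.entity_id for e in entities if e.group_id == group_id}
    return sum(x.amount for x in exposures if x.entity_id in member_ids)


entities = [
    LegalEntity("E1", "Example Corp UK", "G1"),
    LegalEntity("E2", "Example Corp US", "G1"),
    LegalEntity("E3", "Other Co", "G2"),
]
exposures = [Exposure("E1", 100.0), Exposure("E2", 250.0), Exposure("E3", 40.0)]

total_g1 = group_exposure("G1", entities, exposures)
```

The function relies only on the relationships defined in the data model, which is what allows a task like this to be assembled above the model rather than built into a particular system.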
Without good data modeling, an organization’s data governance, and by extension its corporate governance, would not be possible. Data modeling principles ensure consistent naming conventions, enforce key attributes that denote unique records, and set the logic and rules for relating entity types. Only then can data governance policies and procedures be automatically incorporated into business systems and processes, because standards are set at the most foundational and fundamental level.
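As a minimal sketch of how two of these principles might be checked automatically, the functions below flag attribute names that break a naming convention and key values that fail to identify a unique record. The snake_case convention, the function names and the sample records are assumptions for this example, not a prescribed standard.

```python
import re

# Assumed convention for this sketch: lowercase words joined by underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_attribute_names(names):
    """Return the attribute names that violate the naming convention."""
    return [n for n in names if not SNAKE_CASE.match(n)]


def find_duplicate_keys(records, key):
    """Return key values that appear on more than one record."""
    seen, duplicates = set(), set()
    for record in records:
        value = record[key]
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates


bad_names = check_attribute_names(
    ["customer_id", "LegalName", "second_middle_name"])
duplicate_ids = find_duplicate_keys(
    [{"customer_id": "C1"}, {"customer_id": "C2"}, {"customer_id": "C1"}],
    key="customer_id",
)
```

Here bad_names contains only "LegalName" and duplicate_ids contains only "C1". Checks of this kind are only possible once the model itself states what the convention and the key attributes are.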
In summary, good data modeling ensures that the resulting data model can be used productively and with confidence by people other than those who designed it.