Perfios
Graph Databases: Untangling Connections

Introduction

Every time you log into LinkedIn and see first-, second-, and third-degree connections, you are looking at the output of a professional network graph built on a graph database, which LinkedIn uses to help you grow and widen your network.

Amazon's recommendations, such as “people who bought this item also bought” or “these items are often bought together”, come from graph analytics queries designed to keep you shopping, whether by fulfilling existing needs or creating new wants.

Facebook, Twitter, and Instagram use graph databases and analytics to figure out how users relate to each other. Their algorithms use those relationships to surface the content most likely to catch your attention.

In other words, millions of people use graph databases every day without realizing it. They may well be influencing your decisions right now without you being aware of it.

What are Graph Databases?

The purpose of graph databases is to store, uncover, and navigate both obvious and hidden relationships up to many degrees of separation. Relationships can be symmetrical, such as two individuals sharing the same email account or domain address, or working in the same enterprise. Relationships can also be asymmetrical: A-B may represent a parent-child relationship while B-A represents a child-parent relationship, so how the relationship is described depends on which entity is taken first. An asymmetric relationship has two distinct and unequal roles, which often differ in age, power, or status.

A graph is a pictorial representation of data in which entities appear as nodes, the relationships between them appear as edges, and both carry properties. A graph database therefore has three components: nodes, edges, and properties:

Nodes – Objects are represented by nodes, which are instances or entities of data to be monitored and tracked. A node is a vertex in a graph. Accounts, business entities, people, locations, etc., can be nodes.

Edges – Edges define a node’s relationship with another node in a graph database. As discussed above, relationships can be symmetrical or asymmetrical, and the corresponding edges can be bidirectional or unidirectional.

Properties – Nodes have properties that describe them. Edges also have properties that describe them.
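The three components above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular graph database stores data: the class names (Node, Edge, PropertyGraph), the relationship types, and the sample people are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                                   # e.g. "Person", "Company"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    rel_type: str
    directed: bool = True                        # False models a symmetrical relationship
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, edge):
        self.edges.append(edge)

    def neighbours(self, node_id, rel_type=None):
        """Return ids reachable in one hop, honouring edge direction."""
        found = []
        for e in self.edges:
            if rel_type is not None and e.rel_type != rel_type:
                continue
            if e.source == node_id:
                found.append(e.target)
            elif not e.directed and e.target == node_id:
                found.append(e.source)
        return found


graph = PropertyGraph()
graph.add_node(Node("p1", "Person", {"name": "Asha"}))
graph.add_node(Node("p2", "Person", {"name": "Ravi"}))
# Asymmetrical: parent -> child is stored as a directed edge.
graph.add_edge(Edge("p1", "p2", "PARENT_OF"))
# Symmetrical: working in the same enterprise is stored as an undirected edge.
graph.add_edge(Edge("p1", "p2", "COLLEAGUE_OF", directed=False,
                    properties={"employer": "Acme Pvt Ltd"}))
```

Asking for the PARENT_OF neighbours of p1 returns p2, while asking the same of p2 returns nothing. The COLLEAGUE_OF edge can be followed from either end.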

Database Management System in Graph Databases

Graph databases serve two types of data processing workloads: Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP).

OLTP

Online Transaction Processing is purely operational, providing data from the database that is readily accessible. OLTP processes a large number of relatively simple data updates, insertions, and deletions. OLTP systems are designed to handle large numbers of database transactions in real time from a large number of users over the internet. Database transactions in OLTP are characterized by atomicity, or indivisibility: they succeed or fail as a whole, which maintains transaction integrity. A pending or intermediate state cannot exist. Because OLTP queries are simple and straightforward, they require little processing time and little storage space. OLTP systems are business critical: any downtime can translate into disrupted transactions, lost revenue, and damage to the company's reputation.

Consider, for instance, a database of all airports worldwide, in which each airport is a node. An OLTP query might simply retrieve the information relevant to the Mumbai airport. Now take two vertices: Mumbai airport and Heathrow airport in London. If there is a direct flight between these two cities, an edge connects them, with distance as one of its properties. Another OLTP query might ask how many direct flights are available from Mumbai; the answer comes from the edges that connect Mumbai to other airports. Similarly, in SQL terms, an OLTP query could ask for the last 10 transactions by a particular customer.
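Atomicity is easiest to see with a small transfer that either fully happens or does not happen at all. The sketch below uses Python's built-in sqlite3 module, a relational store rather than a graph database, purely because it ships with Python. The accounts table, the transfer helper, and the balances are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Fails the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

A transfer of 30 from A to B succeeds and changes both balances. A transfer of 500 fails on the debit, and the credit that was already applied inside the same transaction is rolled back, so no intermediate state is ever visible.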
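The airport example can be sketched as a small adjacency map. The airport codes are real, but the route list and the distances are rough illustrative figures rather than real flight data, and the helper names are hypothetical.

```python
# Nodes: airports with a few descriptive properties.
airports = {
    "BOM": {"city": "Mumbai"},
    "LHR": {"city": "London"},
    "DXB": {"city": "Dubai"},
    "SIN": {"city": "Singapore"},
}

# Edges: direct flights, with distance in km as an edge property.
# Distances are illustrative round numbers, not authoritative values.
routes = {
    "BOM": {"LHR": 7200, "DXB": 1930, "SIN": 3900},
    "LHR": {"BOM": 7200, "DXB": 5480},
    "DXB": {"BOM": 1930, "LHR": 5480},
    "SIN": {"BOM": 3900},
}


def airport_info(airports, code):
    """OLTP-style lookup: fetch one node's properties."""
    return airports.get(code)


def direct_flights_from(routes, code):
    """OLTP-style lookup: list the airports one edge away from `code`."""
    return sorted(routes.get(code, {}))
```

Both lookups touch a single node and its immediate edges, which is what keeps this kind of query cheap.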

OLAP

An Online Analytical Processing system is designed to process and analyze large volumes of data spread across multiple datasets in parallel, in order to uncover complex relationships between them and draw valuable insights. OLAP gives organisations the ability to pull out insights that are not obvious and to make informed decisions by analysing business data. Typically, this data is held in a data warehouse, data mart, or some other distributed and scalable database.

OLAP allows you to draw data from multiple data records for complex data analysis. OLAP databases have multidimensional schemas, which can support complex queries of multiple data facts from both historical and current data. Different OLTP databases can serve as a source of aggregated data for OLAP, and they may be organised as a data warehouse.

Continuing with the above example, an OLAP query might ask which two airports connected by a direct flight are farthest from each other. Banks, for instance, open fixed deposits for customers across the country, and an OLAP query might ask which pincode the most fixed deposits come from.
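Both OLAP questions scan the whole dataset rather than a single record. A minimal sketch, assuming a made-up route map with illustrative distances and a made-up list of fixed deposits (all names and values are hypothetical):

```python
from collections import Counter

# Direct flights with illustrative distances in km (not real data).
routes = {
    "BOM": {"LHR": 7200, "DXB": 1930, "SIN": 3900},
    "LHR": {"BOM": 7200, "DXB": 5480},
    "DXB": {"BOM": 1930, "LHR": 5480},
    "SIN": {"BOM": 3900},
}

# One record per fixed deposit opened.
deposits = [
    {"customer": "c1", "pincode": "560001"},
    {"customer": "c2", "pincode": "400001"},
    {"customer": "c3", "pincode": "560001"},
    {"customer": "c4", "pincode": "110001"},
    {"customer": "c5", "pincode": "560001"},
]


def farthest_direct_pair(routes):
    """Scan every edge and keep the longest one."""
    best = None
    for origin, destinations in routes.items():
        for destination, km in destinations.items():
            if best is None or km > best[2]:
                best = (min(origin, destination), max(origin, destination), km)
    return best


def top_pincode(deposits):
    """Aggregate all deposits and return (pincode, count) for the largest group."""
    return Counter(d["pincode"] for d in deposits).most_common(1)[0]
```

Unlike the single-node lookups typical of OLTP, both functions have to read every edge or every record before they can answer.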

Use Cases of Graph Databases

Money Laundering

Money laundering involves blending dirty money with legitimate funds and then converting it into hard assets. Circular money laundering occurs when a criminal or fraudulent actor sends large amounts of money to himself or herself but conceals the funds behind a complex and convoluted series of valid transfers between “normal accounts”. These “normal accounts” are in fact accounts created with fictitious or synthetic identities. Because they are generated from stolen identities, they tend to share similar information, which makes graph databases well suited to uncovering their fraudulent origins. To simplify fraud detection, users can build a graph with transaction edges between entities, plus edges between entities that share some information, such as email addresses, passwords, or postal addresses. Once the graph is built, a simple query reveals which accounts share information and which accounts are sending money to one another.
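The two checks described above, shared information and money flowing in a circle, can be sketched in plain Python. The account ids, attributes, and transfers below are invented, and the cycle search is a simple depth-first walk rather than the query language of any particular graph database.

```python
from collections import defaultdict

accounts = {
    "acc1": {"email": "x@example.com", "address": "12 Lake Road"},
    "acc2": {"email": "x@example.com", "address": "4 Hill Street"},
    "acc3": {"email": "y@example.com", "address": "12 Lake Road"},
    "acc4": {"email": "z@example.com", "address": "9 Park Lane"},
}

# Directed edges: (sender, receiver).
transfers = [("acc1", "acc2"), ("acc2", "acc3"), ("acc3", "acc1"), ("acc3", "acc4")]


def shared_attribute_groups(accounts):
    """Group accounts by attribute value; keep values used by 2+ accounts."""
    groups = defaultdict(set)
    for acc_id, attrs in accounts.items():
        for key, value in attrs.items():
            groups[(key, value)].add(acc_id)
    return {k: v for k, v in groups.items() if len(v) > 1}


def find_cycle_from(transfers, start):
    """Return one transfer path start -> ... -> start, or None if there is none."""
    outgoing = defaultdict(list)
    for src, dst in transfers:
        outgoing[src].append(dst)
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in outgoing[node]:
            if nxt == start:
                return path + [start]
            if nxt not in path:      # do not revisit accounts on this path
                stack.append((nxt, path + [nxt]))
    return None
```

In this toy data, acc1, acc2, and acc3 form a transfer loop and also overlap on an email address and a postal address, which is the combination the paragraph above flags as suspicious. acc4 receives money but shares nothing and sits on no loop.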

Fraud Detection

This is what the Perfios OneView product specialises in. Based on its award-winning graph technology, KScan leverages 450 crore links between 2.7+ crore businesses. This illustrates how vast the related-party network is. It allows the product to uncover and assess disclosed and undisclosed relationships, creating the biggest relationship network in the fintech industry. That makes it well suited to due diligence, since it can highlight any kind of relationship that could affect critical business decisions.

Our investigative expertise has enabled us to detect and identify many red flags and general relations, including businesses with the same address, email ID, domain, or domain ownership, as well as company and family relationships. This enables us to provide an extensive and thorough business network analysis and a graph that extends to 4th-degree connections, giving exhaustive due diligence insights.

This is how it works. For example, graph technology treats the CEO of a company as a node and constructs edges to the other entities where he serves as a director, one of which may have been adjudicated as a shell entity. The person then needs to be scrutinized further, as he may be harbouring fraudulent intent with his current company. Graph technology can also reveal a close relative of the person who was prosecuted for insider trading, which again warrants a closer look. There may also be edges between entities that share the same address or email ID. Such overlaps can be a red flag, particularly when one of the companies involved is under trial for money laundering. By exposing relationships between entities constructed on fraudulent premises, graph technology exposes fraud.
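The walkthrough above amounts to a bounded breadth-first search from the subject under review, collecting any flagged entity within a few degrees. The sketch below is a generic illustration with invented entities and relationship names. It makes no claim about how Perfios products are implemented, and only the default limit of four degrees is borrowed from the text.

```python
from collections import deque

# Entities already known to carry a red flag.
flags = {
    "ShellCo": "adjudicated shell entity",
    "Relative R": "prosecuted for insider trading",
}

# Undirected links: (entity, entity, relationship type).
links = [
    ("CEO X", "TargetCo", "DIRECTOR_OF"),
    ("CEO X", "ShellCo", "DIRECTOR_OF"),
    ("CEO X", "Relative R", "RELATIVE_OF"),
    ("TargetCo", "OtherCo", "SHARES_ADDRESS"),
]


def red_flags(links, flags, subject, max_degree=4):
    """Return (entity, degree, reason) for flagged entities near `subject`."""
    neighbours = {}
    for a, b, rel in links:
        neighbours.setdefault(a, []).append((b, rel))
        neighbours.setdefault(b, []).append((a, rel))
    degree = {subject: 0}
    queue = deque([subject])
    hits = []
    while queue:
        node = queue.popleft()
        if degree[node] == max_degree:
            continue                      # do not expand beyond the limit
        for other, _rel in neighbours.get(node, []):
            if other in degree:
                continue                  # already reached by a shorter path
            degree[other] = degree[node] + 1
            queue.append(other)
            if other in flags:
                hits.append((other, degree[other], flags[other]))
    return hits
```

Starting from TargetCo, the search reaches the CEO at the first degree and both flagged entities at the second degree. With max_degree=1 it stops at the CEO and reports nothing.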

Contact Tracing

The spread of COVID-19 brought disease contact tracing to the forefront. People who fall ill with a highly contagious disease often continue to lead normal lives, visiting movie theatres, packed gyms, and crowded weddings, and spreading the disease wherever they go. A growing share of the population remains oblivious to its presence, and it spreads like a forest fire during a heatwave.

As soon as someone is diagnosed with an infectious disease, the race is on to find everyone who has been in contact with the sick person so that they can quarantine and prevent an epidemic or pandemic. Contact tracers must work as quickly as possible, because any delay makes the disease harder to contain and more likely to become a national emergency.

The strong emphasis on relationships in graph databases makes them ideal for analyzing disease patterns. Analysts can identify hotspots and connections using information about the people who have tested positive, the friends and family they have interacted with, and the places they have visited. As a result, contact tracers can work more quickly to isolate sick people and prevent disease outbreaks.
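A person-to-place graph makes this kind of lookup a two-hop walk: from the infected person to the places they visited, then from those places to everyone else who was there. The sketch below uses invented visit records and, as a simplifying assumption, ignores the time of each visit.

```python
from collections import defaultdict

# Edges between people and places: (person, place).
visits = [
    ("patient0", "train compartment S4"),
    ("rider1", "train compartment S4"),
    ("rider2", "train compartment S4"),
    ("patient0", "gym"),
    ("member1", "gym"),
    ("bystander", "cinema"),
]


def exposed_contacts(visits, infected):
    """Map each exposed person to the places they shared with `infected`."""
    people_at = defaultdict(set)
    places_of = defaultdict(set)
    for person, place in visits:
        people_at[place].add(person)
        places_of[person].add(place)
    exposed = defaultdict(set)
    for place in places_of[infected]:
        for person in people_at[place]:
            if person != infected:
                exposed[person].add(place)
    return dict(exposed)
```

Here the two riders and the gym member are returned along with the place of exposure, while the bystander who only visited the cinema is not.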

There are three levels to contact tracing with graphs:

First, there is a need to understand and analyze people’s relationships, their communities, and the places they have been, which graphs can capture succinctly if enough mobile data is available.

Second, graphs must be used to find possible links between people who could spread the disease. Was the person travelling by train? Is it possible to identify everyone in their compartment?

Third, contact tracers must identify “super-spreaders” and isolate them as soon as possible. In other words, we are looking for people who have a wide range of contacts and ties to several communities. To find such highly connected people, we explore graphs using centrality measures such as degree and betweenness.
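The third level can be sketched with two standard measures. Degree centrality counts a person's direct contacts. Betweenness centrality counts how many shortest paths between other people pass through them, which highlights people who bridge communities. The contact graph below is invented, and the betweenness function is a compact version of Brandes' algorithm for unweighted, undirected graphs, written out here so the example needs no third-party library.

```python
from collections import deque

# Two small communities (p1-p2 and p3-p4) that only meet through p5.
contacts = {
    "p1": ["p2", "p5"],
    "p2": ["p1", "p5"],
    "p3": ["p4", "p5"],
    "p4": ["p3", "p5"],
    "p5": ["p1", "p2", "p3", "p4"],
}


def degree_centrality(adj):
    """Number of direct contacts per person."""
    return {v: len(ns) for v, ns in adj.items()}


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                           # nodes in order of distance from s
        preds = {v: [] for v in adj}         # predecessors on shortest paths
        sigma = {v: 0 for v in adj}          # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while order:                         # walk back from the farthest nodes
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair is counted from both endpoints, so halve the totals.
    return {v: total / 2 for v, total in score.items()}
```

In this toy graph p5 has the highest degree and is the only person with non-zero betweenness, since every path between the two communities runs through them. That is the profile of a potential super-spreader.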

Conclusion

Graph databases have become an indispensable tool, and often a better alternative to relational databases, for identifying relationships between entities and people. They are equally useful for identifying dependencies between constituents, where a small, seemingly inconsequential change can ripple through many of them in a domino effect. Corporations and governments around the world use graph databases to contain pandemics, recommend products, detect fraud, and identify money laundering. With so many uses, graph databases are becoming increasingly important.

About us:

Perfios is the largest data, analytics, automation, and decisioning solution provider to FIs, catering to the entire lending lifecycle from onboarding to diligence & monitoring to collections. Perfios solutions enable systemic fraud prevention, risk management, compliance & automation through superior data engineering and deep tech applications.

In a nutshell, Perfios stands on the trifecta of digitization: automation, enhanced diligence, and robust decisioning for straight-through processing; thus, creating a state-of-the-art digitization process without compromising on security and quality. Perfios is a pioneer in the services it offers and has successfully acquired a very diverse portfolio of 300+ live clients, spanning across the largest gamut of use cases in the industry.
