
Fraud Detection: Revolutionizing Financial Security - Part 2

Thought Leadership

In the last few years, the meteoric rise of generative artificial intelligence (GenAI) has driven advances in areas like predictive performance, speed-to-market, operational efficiency and bias reduction.

In our last article, AI-Powered Fraud Detection Part 1, we investigated how grammar syntax rules and fraud modus operandi play a role in identifying fraud. In this second and concluding article, we'll use that foundational knowledge to assess how Long Short-Term Memory (LSTM) networks and Graph Neural Networks (GNNs) can further detect fraudulent activity patterns across a variety of fraud modalities.

So, let’s talk about language modeling and fraud detection! 
 

Setting up language modeling for fraud detection

Our goal with fraud detection analysis is to be able to decide an outcome based on a potential series of actions. We ask ourselves two key questions: “How relevant is the event?” and “How is the event related to the outcome class?”

One way to examine this is by solving a language modeling problem, or more specifically, a sequence modeling problem. As we learned in Part One, each event, such as a password attempt, could have its own unique outcome. Some possibilities the industry looks at include:

  1. Fraudulent account opening

  2. Account takeover

  3. First-party fraud

  4. Valid transaction

Now, to advance the analysis further, we create graphs of event clusters. Each graph in our example includes nodes: data points that represent the information contained in a single data structure.

Keep in mind, the goal is to determine the fraud outcome label for each of these clusters using the sequences of events on the nodes within them. As shown in Figure 1, it might not be one sequence of events but rather groups of sequences.
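To make this concrete, here's a minimal sketch of the setup described above: a cluster groups the event sequences of several nodes, and the task is to assign the whole cluster one of the four outcome labels. The event codes and node names below are purely illustrative, not taken from real fraud data.

```python
# Illustrative only: event codes, node names, and labels are made up.
# A cluster groups the event sequences of several related nodes, and the
# modeling task is to assign one fraud-outcome label to the whole cluster.

LABELS = [
    "fraudulent_account_opening",
    "account_takeover",
    "first_party_fraud",
    "valid_transaction",
]

# One cluster: node -> ordered sequence of event codes
# (e.g., "pw_attempt" = password attempt, "addr_change" = address change).
cluster = {
    "node_1": ["login", "pw_attempt", "pw_attempt", "addr_change"],
    "node_2": ["account_open", "deposit", "withdrawal"],
}

def candidate_labels(cluster):
    """The classifier must pick exactly one of the four outcome labels."""
    return LABELS

print(candidate_labels(cluster))
```

Note the input is not one flat sequence: each node carries its own ordered sequence, which is why the groups-of-sequences framing matters.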


Figure 1: Example Sequence of Events

 

You'll remember from Part One that the real world is full of people doing legitimate activities as they go about their days. Those real people also face fraudsters perpetrating schemes concurrently against different victims or institutions. We call these "simultaneous activities," and use the previously seen fraud graphs to represent them.

Next, we'll reframe the clusters into a graph construct to get a better look at the sequences as groups, as shown in Figure 2. In fraud schemes, fraud-related events happen across different entities, which are reflected by the different nodes. We use the occurrences of certain events to order them and determine their fraudulent class labels.


Figure 2: Graph of Clusters with Event Sequences

Ultimately, we want to learn from groups of sequences and summarize our learnings into a vector (a numerical representation of the data). Vectors make it easy to analyze relationships between different entities and to make predictions. That's how we arrive at our eventual class label.

But to solve this, we need to look at all these groups of sequences, not just a single one. To do so, we use sequence-to-vector graphs, which bring LSTMs and GNNs into the mix. Let's quickly switch gears to learn more about those.
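The sequence-to-vector idea can be sketched in a few lines. In a real system the LSTM's final hidden state would produce this summary vector; here, as a stand-in, we mean-pool one-hot event encodings, just to show how an ordered sequence collapses into one fixed-length vector a classifier can consume. The event vocabulary is hypothetical.

```python
import numpy as np

# Illustrative sketch: summarize a sequence of events into one fixed-length
# vector. A real system would use an LSTM's final hidden state; here, mean
# pooling over one-hot event encodings stands in for that step.

EVENTS = ["login", "pw_attempt", "addr_change", "withdrawal"]  # hypothetical

def one_hot(event):
    v = np.zeros(len(EVENTS))
    v[EVENTS.index(event)] = 1.0
    return v

def sequence_to_vector(sequence):
    """Collapse an ordered event sequence into one summary vector."""
    return np.mean([one_hot(e) for e in sequence], axis=0)

vec = sequence_to_vector(["login", "pw_attempt", "pw_attempt"])
print(vec)  # [0.333... 0.666... 0. 0.]
```

The resulting vector lives in one fixed-size space regardless of how long the sequence was, which is exactly what downstream classification needs.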

 

Understanding Long Short-Term Memory and Graph Neural Networks

What’s a neural network? Put simply, it’s a machine learning program that mimics the way humans make decisions by using a process similar to the way biological neurons work together. For our purposes, we’re going to be talking about Long Short-Term Memory and Graph Neural Networks.

LSTMs are ideal for solving our sequence-to-vector graph problems because they’re capable of learning long-term dependencies in sequential data. Basically, they’re great at tasks related to language translation, speech recognition and time-series forecasting.
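To see why LSTMs can learn long-term dependencies, here is a minimal single LSTM cell written out in NumPy. It is an untrained sketch with randomly initialized weights, shown only to make the gating mechanics visible: the cell state c_t is updated additively through gates, so a signal can persist across many timesteps.

```python
import numpy as np

# Minimal LSTM cell in NumPy -- an untrained illustration of the standard
# gating equations, not a production model. The additive cell-state update
# (c = f*c + i*g) is what lets information survive long sequences.

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    def __init__(self, input_size, hidden_size):
        self.hidden_size = hidden_size
        # One stacked weight matrix for the input (i), forget (f),
        # cell-input (g) and output (o) gates, applied to [x_t, h_{t-1}].
        self.W = rng.normal(0, 0.1, (4 * hidden_size, input_size + hidden_size))
        self.b = np.zeros(4 * hidden_size)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)              # cell input activation g_t
        c = f * c + i * g           # additive cell-state (memory) update
        h = o * np.tanh(c)          # hidden state
        return h, c

cell = LSTMCell(input_size=4, hidden_size=8)
h = c = np.zeros(8)
for x in rng.normal(size=(5, 4)):   # a 5-step input sequence
    h, c = cell.step(x, h, c)
print(h.shape)  # (8,) -- the whole sequence summarized as one vector
```

The final hidden state `h` is the sequence-to-vector summary discussed above: one fixed-length vector per sequence, whatever its length.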

Okay, back to those sequence-to-vector graphs.

While there are multiple ways to look at these groups of sequences, we do so with a:

  1. Fully connected LSTM

  2. Graph Convolutional Network (GCN)

As shown in Figure 3, with the fully connected LSTM, we're able to input all known sequences without supplying information on how each node is connected. In other words, graph knowledge is not entered into the LSTM. Instead, the LSTM will learn the graph connections on its own.

That leaves room for some errors on the LSTM’s side, such as learning the wrong connections or wasting lots of computation on things we already know about the graph. But when we bring that Graph Convolutional Network approach into the mix, we’re able to solve for these issues with two simple fixes:

  1. First, we use the GCN to learn spatial features. When the GCN updates the spatial features of a node, it uses only adjacent nodes and disregards ones that are disconnected from it.

  2. Second, we use the GCN in the temporal domain so when we do an LSTM update, we only include the nodes that are connected to each other and ignore those that are disconnected.

By doing it this way, we’re able to do a convolution, which is essentially a small filter that can cross the whole graph — ultimately saving on costs that could stem from using just the LSTM. Let’s see what all of this would look like in action.
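The first fix, using only adjacent nodes in the spatial update, can be sketched as a single graph-convolution step. The tiny three-node graph and random weights below are illustrative; the point is that a disconnected node's update depends only on itself, so no computation is wasted learning connections we already know don't exist.

```python
import numpy as np

# Sketch of one graph-convolution step: each node's features are updated
# using only its neighbors (plus itself), so disconnected nodes contribute
# nothing. The weight matrix is random here -- an untrained illustration.

rng = np.random.default_rng(1)

# 3 nodes: nodes 0 and 1 are connected; node 2 is disconnected.
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]], dtype=float)
A_hat = A + np.eye(3)                      # add self-loops
deg = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))   # symmetric normalization
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt

X = rng.normal(size=(3, 4))                # per-node feature vectors
W = rng.normal(0, 0.1, (4, 4))

H = np.maximum(A_norm @ X @ W, 0)          # one GCN layer with ReLU

# Node 2's update depends only on itself: zeroing the other nodes'
# features leaves its output unchanged.
X_masked = X.copy()
X_masked[:2] = 0
H_masked = np.maximum(A_norm @ X_masked @ W, 0)
print(np.allclose(H[2], H_masked[2]))  # True
```

In the GCN-LSTM, this adjacency-masked aggregation supplies the spatial structure, and the LSTM handles the temporal ordering of events on each node.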

 


Figure 3: Architecture of GCN-LSTM

 

Application of GCN-LSTM methods 

To start, let’s use an example below in Figure 4 with three nodes where each has a distinct color. Node 1 is red, Node 2 is blue and Node 3 is green.

Here, we'll plot the cell input activation across timesteps, where g_t can be thought of as the encoding of the events in the sequence for each node. For Node 2, we can see g_t = -1 for the j event at timestep=3, and for Node 3, we can see g_t = -1 for the k event at timestep=5. In other words, at timesteps 3 and 5, the model is seeing something that stands out. In this case, something suspicious that could be indicative of fraud.
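Flagging these standout activations can be sketched as a simple scan over per-node g_t traces. The trace values below are made up to match the figure's description (Node 2 saturating at timestep 3, Node 3 at timestep 5); they are not real model output.

```python
# Illustrative traces of the cell input activation g_t per node per timestep.
# Values are invented to match the figure's description, not model output.
g_traces = {
    "node_1": [0.1, 0.0, 0.2, -0.1, 0.1],
    "node_2": [0.0, 0.1, -1.0, 0.2, 0.0],   # saturates at timestep 3
    "node_3": [0.1, 0.0, 0.1, 0.0, -1.0],   # saturates at timestep 5
}

def flag_suspicious(traces, threshold=-1.0):
    """Return (node, timestep) pairs where g_t saturates at the threshold."""
    return [(node, t + 1)                    # report 1-indexed timesteps
            for node, trace in traces.items()
            for t, g in enumerate(trace)
            if g <= threshold]

print(flag_suspicious(g_traces))  # [('node_2', 3), ('node_3', 5)]
```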

 


Figure 4: Gate/State Visualization

 

In Figure 5, let's look at the four examples we started with at the beginning of this post and run the GCN-LSTM across all the clusters. Remember, we're looking for one of four possibilities across all events, as depicted in their labels.

Machine learning can highlight suspicious events associated with distinct types of frauds, as well as valid transactions.

 


Figure 5: Fraud-Relevant Events Identified by the GCN-LSTM

 

  1. For Fraudulent account opening, we see the j event at timestep=3 for Node 1 and the k event at timestep=7 for Node 4

  2. For Account takeover, we see the j event at timestep=3 for Node 6 and the k event at timestep=6 for Node 7

  3. For First-party fraud, we see the j event at timestep=7 for Node 8 and the k event at timestep=4 for Node 9

  4. For Valid transaction, we see the j event at timestep=8 for Node 12 and the k event at timestep=3 for Node 13

And just like that, we’re able to detect dubious activity that could lead to fraud amongst all the regular, valid activity happening on a day-to-day basis.
 

What it all means

Over two articles, we've emphasized the ways machine learning and artificial intelligence help companies like TransUnion do more in fraud detection. By understanding the ordered sequences of word tokens and how communication is interpreted among humans, GenAI can replicate the same types of patterns to decipher fraudulent activity in large groups of sequences. What's more, it can be done at lower cost and with greater precision using methods like LSTMs and GNNs.

Throughout our research, we've attributed this success to the nature of fraudulent actors and to how humans sequence their words to effectively convey messages between parties. This gives us confidence these types of approaches are applicable across a myriad of business areas, especially event-driven use cases. Much like you, we're excited to see how GenAI capabilities will continue to grow and shape our everyday lives, from fraud prevention and beyond.

To learn more about TransUnion’s work in the fraud protection space, check out our website here.

 

Authors:
Zinan Zhao (Sr. Advisor, Data Science and Analytics), Brad Daughdrill, PhD (VP, Data Science and Analytics) and Robert Stratton (SVP, Data Science and Analytics)