Building PhonePe’s Real-Time Transaction Model to prevent Frauds: A Journey

By Nipun VK, Trust and Safety Team at PhonePe Jul 28, 2022

Payments Fraud, a constantly evolving problem

The ease and convenience of UPI has made it increasingly popular across several geographies and demographics. At PhonePe, we process over 2.5 billion transactions in a month and in April 2022, we set a benchmark by processing 100 million transactions on a single day. With the rise in adoption of digital payments, users’ susceptibility to payment fraud has also risen significantly. This calls for efficient fraud detection and prevention systems.

Fraudsters are constantly on the lookout for new methods to defraud vulnerable users, which makes the problem both challenging and constantly evolving. It is also important for an organization like PhonePe, growing at scale, to design fraud prevention systems with very high precision, as any fraudulent activity conducted via the app is a potential loss of business and a bad user experience.

The Trust & Safety Team at PhonePe aka Fraud Busters

The Trust & Safety team at PhonePe is a group of problem-solvers: analysts, engineers, and product folks. They are the subject matter experts in this field who combine domain expertise, statistical knowledge, and computer science skills to solve various problems around fraud and offer abuse. At PhonePe, we leverage a diverse set of signals and deploy sophisticated techniques to keep our users’ money safe.

Understanding the fraud, and separating it from disputes

The problem we will shed light on in this blog is the Peer-to-Peer (P2P) transaction workflow, i.e., transactions between app users.

In P2P transactions, a user sends money from their account to another user’s account. PhonePe only authenticates the money transfer, so it becomes critical to identify fraud in the authentication layer, since the money movement itself takes place outside the PhonePe ecosystem.

To reduce instances of PhonePe users trying to defraud other users, fraudulent transactions are declined in real time, and the fraudsters’ accounts are eventually deactivated after investigation.

One of the very first challenges we faced while building a fraud mitigation solution was the quality of fraud reporting. At PhonePe, we have made it easy for users to report any instance of potential fraud; the whole process can be done within a few clicks. However, this also caused an increase in false-positive fraud complaints. These are disputes that may seem fraudulent to the user because of subjective bias but are not actual payment fraud.

We found that a clear demarcation between frauds and disputes is often difficult or confusing. For example, a customer being dissatisfied with the quality of a product such as a sofa, or of a service such as carpentry, would not necessarily be a payment fraud. These are subjective opinions and count as disputes. However, a fake e-commerce website duping a customer of their money can be considered a fraud.

We segmented fraud transactions that were reported using various signals around customer demography, behavioral variables, and historical transaction patterns. This helped us separate out a segment that saw high potential false positives. Removing them from the labeled data helped us develop the model with higher accuracy.
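The label-cleaning step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each reported transaction carries a segment name and that a small manually reviewed sample is available to estimate how often reports in each segment turn out to be disputes. The `Report` fields, the segment names, and the 0.5 cut-off are all hypothetical and are not taken from PhonePe's pipeline.

```python
from dataclasses import dataclass


@dataclass
class Report:
    txn_id: str
    segment: str           # hypothetical segment name derived from user/txn signals
    confirmed_fraud: bool  # hypothetical outcome of a manual review


def noisy_segments(reviewed, max_false_positive_rate=0.5):
    """Return segments where most reviewed reports were not actual fraud."""
    totals, false_positives = {}, {}
    for report in reviewed:
        totals[report.segment] = totals.get(report.segment, 0) + 1
        if not report.confirmed_fraud:
            false_positives[report.segment] = false_positives.get(report.segment, 0) + 1
    return {
        segment
        for segment, count in totals.items()
        if false_positives.get(segment, 0) / count > max_false_positive_rate
    }


def clean_labels(reports, reviewed, max_false_positive_rate=0.5):
    """Drop reports from noisy segments before using them as training labels."""
    drop = noisy_segments(reviewed, max_false_positive_rate)
    return [report for report in reports if report.segment not in drop]
```

Dropping a whole segment trades some genuine fraud labels for a cleaner training set, so the cut-off would need to be tuned against the data at hand.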

Building the Transaction Model

The idea was to build a model that classifies transactions based on attributes of the receiver, the sender, and the transaction itself. We gathered and prepared the data, after which training and testing were carried out. Various classification algorithms were tried, and the models were tuned to ultimately build one that helps prevent fraud. We had a dedicated operations team that looked into a sample of predicted results. They went through various risk signals associated with the users and the transactions, and made calls to senders to validate the accuracy of the model results. Based on their feedback, we adjusted the model and iterated several times, recording the precision and recall of each version. After multiple iterations, we finalized a production-worthy model.

Segmentation helped us narrow down the problem, but implementing a blanket rule was not the best option. Rules are simple and quick to implement, but they can result in a high number of false positives, which is why they would not be feasible as a long-term solution.
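For readers less familiar with the two metrics tracked across model versions, here is a minimal sketch of how precision and recall are computed from a reviewed sample. The function name and the example data are illustrative only; `predicted` stands for the model's fraud flags and `actual` for the verdicts of a manual review.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Precision: of the transactions flagged as fraud, how many really were.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: of the real fraud transactions, how many were flagged.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


predicted = [True, True, True, True, False, False]
actual = [True, True, True, False, True, True]
print(precision_recall(predicted, actual))  # (0.75, 0.6)
```

In a decline-in-real-time setting, precision reflects how often a blocked transaction was truly fraudulent, while recall reflects how much fraud the model catches.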

Real-Time implementation at scale

At the time of implementing the model, we had two choices: a batch model that predicts the fraud probability of transactions once a day, or a real-time model that predicts during the transaction path. In this specific case, we opted for the real-time model. Although the batch model would accurately predict fraudulent transactions, its output would be delayed. The real-time implementation helps us not only identify fraudsters but also prevent the attempted fraud, thereby safeguarding users’ money.

Making the right decisions at the right time is important in preventing payment fraud. We compared the two approaches by simulating the real-time scenario for several weeks, and the incremental benefit of the real-time model was clearly established.

The first major challenge in moving to real time was maintaining the real-time variables needed as model input. The second challenge was generating the model output and making sure our evaluation engine (Kratos) communicates back within milliseconds during the transaction flow. To solve the first problem of maintaining real-time aggregates, we leveraged our knowledge store, Yoda. We were able to convert most of these variables to real-time aggregates; the few remaining were processed in batch mode and passed on via Profile Store, the other data repository, without a significant drop in the performance of the model.
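To make the idea of a real-time aggregate concrete, here is a minimal in-memory sketch of a sliding-window counter keyed by user. It is not how Yoda works internally, which this post does not describe. The class name, the feature names, and the one-hour window are assumptions for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque


class SlidingWindowAggregate:
    """Per-key transaction count and amount over the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # key -> deque of (timestamp, amount)
        self._totals = defaultdict(int)    # key -> running sum of amounts in window

    def _evict(self, key, now):
        # Remove events that have fallen out of the window ending at `now`.
        events = self._events[key]
        while events and events[0][0] <= now - self.window_seconds:
            _, amount = events.popleft()
            self._totals[key] -= amount

    def add(self, key, timestamp, amount):
        """Record a transaction (amount as an integer, e.g. in paise)."""
        self._evict(key, timestamp)
        self._events[key].append((timestamp, amount))
        self._totals[key] += amount

    def features(self, key, now):
        """Return the aggregates that a model could consume at scoring time."""
        self._evict(key, now)
        return {"txn_count": len(self._events[key]), "txn_amount": self._totals[key]}


agg = SlidingWindowAggregate(window_seconds=3600)
agg.add("receiver_1", timestamp=0, amount=500)
agg.add("receiver_1", timestamp=1800, amount=1500)
print(agg.features("receiver_1", now=3000))  # {'txn_count': 2, 'txn_amount': 2000}
```

Keeping a running total means each lookup costs only the eviction of expired events, which matters when the result has to be ready within the transaction path.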

Yoda was integrated with the ML flow servers and was optimized to make sure the Kratos call completed within the permitted time limits. Pre-filtering transactions based on the insights from the earlier segmentation exercises helped us further reduce the load on the system.

Models are reviewed and retrained based on user feedback, escalations, and the periodic analysis that we conduct on blocked transactions and reported fraud transactions.
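The two ideas in this paragraph, pre-filtering and a strict time limit, can be sketched together as follows. This is not the Kratos implementation. The function names, the 0.9 threshold, the 50 ms budget, and in particular the choice to allow the transaction when the model does not answer in time are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Shared worker pool for model calls (hypothetical sizing).
_pool = ThreadPoolExecutor(max_workers=4)


def decide(txn, needs_scoring, score_fn, threshold=0.9, budget_seconds=0.05):
    """Return (decision, reason) for a transaction within a time budget."""
    # Pre-filter: skip the model entirely for transactions outside the target segment.
    if not needs_scoring(txn):
        return ("allow", "prefiltered")
    future = _pool.submit(score_fn, txn)
    try:
        score = future.result(timeout=budget_seconds)
    except FutureTimeout:
        # Assumed fallback: never hold up a payment because the model was slow.
        return ("allow", "timeout")
    return ("decline", "model") if score >= threshold else ("allow", "model")


def is_high_value(txn):
    return txn["amount"] >= 1000  # hypothetical pre-filter rule


print(decide({"amount": 10}, is_high_value, lambda txn: 0.99))  # ('allow', 'prefiltered')
```

The pre-filter reduces how many transactions reach the model at all, while the timeout bounds the worst-case latency added to those that do.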

Pictorial representation of the flow of the real-time transaction model

Reaping rewards

The results of the fully developed model were remarkable as seen in the numbers below:

  • Fraud rate was reduced by 40% on consumer-to-consumer transactions on PhonePe
  • Number of daily fraud complaints reduced by 38% overall (50% in the specific segment of high-value transactions that we are targeting)
  • Blacklisting done based on this model has a very low reinstatement rate – close to 1%

Future Scope

The initial model developed by the team has been highly successful and helps prevent fraud in the P2P transaction flow. As next steps, we will use a similar approach and experiment on various other workflows to ultimately build an app with low susceptibility to fraud, thereby protecting users’ money.