Verifying an Identity

Part of the puzzle of solving and preventing fraud has to do with verifying an identity. I struggle with this problem for a number of reasons. First, it straddles multiple problems: it is a challenge in its own right, and it is hard to define. If we start by asking whether a collection of data uniquely identifies a particular individual, we are still left with the question of how we would know that to be true. And if I can do it now with a subset of the data, would I get the same answer later? If I am using the data to choose from among a set of ‘identities’, surely the problem gets harder as that set of possibilities grows? But if part of the problem is whether the ‘right’ individual even exists within the dataset, then more identities make my life easier. And on fraud, it often feels like some people say they want Identity Verification as though that were synonymous with solving ‘fraud’. That feels like conflating a feature with the resolution of the problem. There is an unexamined trust that an inability to verify leads directly to fraud.

Rethinking False Positives

I’ve been thinking about performance, fraud, and anomaly detection lately. For a long time, I told myself that solving for fraud should be seen as identifying where ‘expected behaviors’ go wrong. I feel like I need to modify that thinking. In the case of synthetic fraud, we know there are many instances where the identity’s engagement with an FI does not involve extracting money from the FI. Sometimes synthetic identities are created for reasons outside of the FS ecosystem related to a CRA business.

A go-to signal for fraud is early delinquency. A problem here is that there are non-fraudulent instances that nonetheless go delinquent or default early. We shouldn’t be blind to the parallel idea that there are instances of fraud which do not go bad early. Given both of these aspects, there is real ambiguity both in correctly measuring the effectiveness of solutions and in designing those solutions.

There are two topics related to this line of thought: 1) the conceptualization of false positives; and 2) the bias toward building solutions that classify something into one of two choices. These topics are entwined with each other as well. False positives are nearly always discussed in the context of classification, where the world is divided into True Positives, False Positives, True Negatives, and False Negatives. The framework can be extended to more than two outcome classes, but that is not the norm. Similarly, the industry standard for analytic approaches in Financial Services is based on distinguishing ‘bad’ from ‘good’. I suspect it may make more sense to frame fraud as something that can be attempted or even carried out where, from the point of view of a particular FI, it either causes immediate, unplanned losses or simply fails to yield the expected benefit to the FI. An example of the latter would be a fraudster or fraud ring that needs multiple accounts used in some synchronized fashion, where some of the accounts are not used to extract money but serve as vehicles (‘get-away cars’). Their use doesn’t align with that of a ‘normal’ consumer, and the lender wouldn’t see the kind of ROI it usually does, even though there’s no immediate, early, high-value ‘loss’.

Having some ability to recognize fraud that acts beyond immediate loss is potentially important because these fraudsters are potentially connected to the fraudsters who do extract immediate losses – they are part of the same ecosystem of individuals that makes up the threat being addressed. Circling back to false positives, it means we should consider broadening what a false positive is, recognizing that many of those who do not lead to early or high-value losses are nonetheless not valuable partners to FIs. Solving to minimize this type of ‘false positive’ will potentially eliminate important signals of the true ‘bad’ fraudsters.
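To make the broadened false-positive idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Account` fields, the cell names, and the six made-up accounts are mine for illustration, and `fraud_linked` stands in for a ground truth that is rarely known in practice. It only shows how the usual two-class tally hides the ‘get-away car’ accounts inside the false positives.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    flagged: bool       # the model raised a fraud alert
    early_loss: bool    # early delinquency/default, the usual proxy for 'bad'
    fraud_linked: bool  # hypothetical ground truth, rarely known in practice


def binary_cell(a: Account) -> str:
    """Standard two-class confusion-matrix cell, with early loss as 'bad'."""
    if a.flagged:
        return "TP" if a.early_loss else "FP"
    return "FN" if a.early_loss else "TN"


def broadened_cell(a: Account) -> str:
    """Same cells, but split out the cases where the proxy and fraud disagree."""
    cell = binary_cell(a)
    if cell == "FP" and a.fraud_linked:
        return "FP_fraud_linked"   # no early loss, yet part of the fraud ecosystem
    if cell == "TP" and not a.fraud_linked:
        return "TP_not_fraud"      # early loss, but an honest borrower
    return cell


# Made-up accounts, purely illustrative.
accounts = [
    Account(flagged=True, early_loss=True, fraud_linked=True),
    Account(flagged=True, early_loss=False, fraud_linked=True),   # 'get-away car'
    Account(flagged=True, early_loss=False, fraud_linked=False),
    Account(flagged=False, early_loss=True, fraud_linked=False),
    Account(flagged=False, early_loss=False, fraud_linked=False),
    Account(flagged=True, early_loss=True, fraud_linked=False),
]

binary_counts = Counter(binary_cell(a) for a in accounts)
broadened_counts = Counter(broadened_cell(a) for a in accounts)
print(binary_counts)
print(broadened_counts)
```

In the two-class view both flagged accounts without an early loss count against the model. In the broadened view one of them is exactly the kind of signal that tuning for fewer ‘false positives’ would throw away.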

Onfido again

Another report from a group called ‘Onfido’ came to my attention in the past year. Yes, I’m way behind on the articles I want to read, almost as badly as I am with books. So, first, who is Onfido? It appears to be a group of Oxford students who, in maybe 2012, figured out how to automate part of (all?) background checks using AI/ML. They compare selfie photos with those on scanned documents. They do Optical Character Recognition. Since 2016 they’ve moved into the financial services area with attention on KYC compliance. In 2016 they also started getting heavy venture capital funding, and they have a link to an ex-Google dude as CCO (Chief Compliance Officer?). They found a niche and know how to do what others do these days when they find one.

So their background justifies the article I’m reading.

The article is a review of document fraud, which they sort into the following categories:

  1. Stolen blank government documents
  2. Forgeries: documents created by imitation
  3. Doctored copies: a legitimate, valid document that has been altered
  4. Stolen documents: tied to identity fraud
  5. Legitimate documents obtained by fraud
  6. Fictitious or mostly fictitious documents: they suggest this is rare and includes things like ‘Republic of Texas’
  7. Fudged/edited ‘dummy’ or ‘display’ documents: ones which governments keep around but are not intended for use (I think this is analogous to pocketbook SSNs)

They also categorize by sophistication level into 4 ‘tiers’. The most difficult to spot make up maybe 5% and are often connected to criminal rings, while the easiest make up maybe 20% and are typically pretty amateurish.

It is an interesting survey of the methods of document fraud. I’m wondering if there aren’t classes of such fraud that are tied to digitized mechanisms for document generation. For instance, could someone hack a DMV and get driver’s licenses shipped wherever they want or need? I guess you could lump that into either ‘stolen’ or ‘legitimate but obtained fraudulently’, but the sense I have is that the authors see all of this as being conducted by the literal handiwork of humans. I would guess that they would make allowances for computer-aided fraud with regard to doctoring, etc.

The paper also talked about things to check that can help – for instance details of the document, consistency of things like font, logical errors, and the like. Most of them again seem tied to a pretty careful examination by a human. I suspect font consistency could be computerized.

Not bad for a short read. It makes me wonder how this is used in financial services – I’m sure it is, since our company has a document authentication service. Embarrassingly enough, I don’t know a lot about it.

 

 

Javelin’s Fraud Trends from Last Year

I had occasion to go through Javelin’s anticipated fraud trends, written by the prolific Al Pascual. My intention is to do a quick search on the items mentioned there. As I’ve indicated in the past, I find a lot of material from places like Javelin and Aite to be… ‘light’ or ‘skimpy’. It feels like it is meant to provide bullet points, but without serious or deep thought to buttress them. This one is similar, but I don’t feel like it is trying to portray itself as something it’s not. As a result, it was not a bad read. Oh, the reference – ‘2019 Fraud Trends’.

In short, Javelin notes three areas they expect to be active with regard to fraud and fraud-related tools: 1) increased speed in payments systems; 2) the impact of the changing regulatory environment; 3) the evolution of the authentication environment. Specifically, the point about payments systems is that the growth in Peer-2-Peer technology (I think that means things like Apple Wallet or Venmo) will make it an increasingly attractive target, especially as smaller banks start to use it without the kinds of defenses the big boys have. With the regulatory environment, Javelin provided an interesting discussion of the impact of both Europe’s new GDPR and the new California privacy legislation. Those will likely be the cutting edge of a general trend. I think about it in the context of the blockchain use case I’ve read about, in which consumers own and essentially sign for their identities. Finally, Javelin sees authentication continuing to move in the biometric direction, but more with regard to visual imagery of individuals, driven by really good progress in the ease and efficiency of that technology. They suggest that OTP is already ‘outmoded’.

A quick Google check on P2P fraud showed that others are anticipating growth of fraud there, especially around account takeover. There’s an article from a group with a site ‘SecurityIntelligence.com’ who talk about it and how to detect the signs of it (essentially unexpected movement of large amounts of $). I could not tell from a quick search whether the actual incidence of this is increasing.

Spring Labs

While blockchain is on my mind, there were a couple of other news items I saw this year around a group called Spring Labs. Actually the two pieces are both dated June 12, 2019, one from Forbes and one from Crain’s Chicago Business. The Forbes piece is about GM Financial extending its membership with a consortium using Spring Labs blockchain technology for identity and fraud prevention. The Crain’s piece is about the consortium in general.

The Crain’s piece discusses it in the context of suppressing loan stacking. Reading between the lines, it’s a way for consumer identities to be tokenized on a distributed ledger – which makes almost scary sense to me. Much like the Fraud Prevention Exchange, this framework needs to function at scale before it is effective. Right now, the Spring Labs use is based on 25 partners: Avant, GMF, SoFi, On Deck Capital and others.

The Forbes piece is centered on GM and the use case for fraud in automotive financing, which always seemed strange to me – can identities be effectively disguised when the consumer is right there in person?

People: a Spring Labs co-founder is John Sun, who was also a co-founder of Avant. (I think I met him.) It’s based in Chicago and San Francisco. Connections: Anna Fridman, the wife of Avant’s CEO, is chief counsel. There’s a connection to others in my world, apparently.

Blockchain 2019

Just prior to and during my holiday season time off, I went through an article reviewing the current state of blockchain. I’m used to these types of write-ups being very shallow and full of unsubstantiated marketing buzzwords. While this one had that as well, it actually felt like a decent survey of what the state of the market and R&D look like ‘out there’. It was put out by a group called ‘CB Insights’.

On blockchain itself, I tend to find myself still struggling with mapping the underlying ‘distributed ledger’ technology to something that feels like currency – since that’s how it was initially put forth. I never did (and still don’t) quite grok the link between a computer doing ‘something’ and having that labor be converted to digital value. How does that work? Who determines what the ‘value’ is? And what exactly is the connection to blockchain and the ledger? I think some of my issue with the blockchain piece is that I don’t myself have a lot of experience with the use of any type of ledger.

The paper in hand is nice in that, as a survey of its use and development, its examples give me a better idea or gestalt of what blockchain is. It helped, in reading this, to think of it as a book-keeping device with built-in error checking that has no ‘owner’. Most of the areas of relevance described here have a need for that type of facility. It’s actually a little depressing to me since, while clearly needed, these are mostly things that I would have taken for granted; they fall into the category of things that you only notice when they do not work well. Improving them just doesn’t feel like a real advancement of anything novel or creative.
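The ‘book-keeping device with built-in error checking’ picture can be sketched as a simple hash chain. This is my own toy illustration, not anything from the CB Insights paper, and the function names and sample payloads are made up. It covers only the error-checking half of the idea: each entry stores a hash of its contents plus the previous entry’s hash, so a change to old data is detectable. The ‘no owner’ half (many parties agreeing on one copy) is a separate consensus problem that this sketch does not attempt.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # placeholder 'previous hash' for the very first entry


def entry_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the entry's contents."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger, payload):
    """Add an entry that is chained to the hash of the entry before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    index = len(ledger)
    ledger.append({
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": entry_hash(index, prev_hash, payload),
    })


def first_bad_entry(ledger):
    """Return the index of the first inconsistent entry, or None if all check out."""
    prev_hash = GENESIS
    for i, entry in enumerate(ledger):
        recomputed = entry_hash(entry["index"], entry["prev_hash"], entry["payload"])
        if entry["index"] != i or entry["prev_hash"] != prev_hash or entry["hash"] != recomputed:
            return i
        prev_hash = entry["hash"]
    return None


ledger = []
append(ledger, {"from": "A", "to": "B", "amount": 10})
append(ledger, {"from": "B", "to": "C", "amount": 4})
append(ledger, {"from": "C", "to": "A", "amount": 1})

# Quietly edit an old entry in a copy and see whether the books still balance.
tampered = copy.deepcopy(ledger)
tampered[1]["payload"]["amount"] = 400

print(first_bad_entry(ledger))    # None
print(first_bad_entry(tampered))  # 1
```

Whoever holds a copy can rerun the check, which is the sense in which the error checking is built in rather than dependent on trusting the book-keeper.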

Those misgivings aside, the paper frames the status of blockchain by looking at use cases along two axes: 1) Market Strength; and 2) Industry Adoption. Market Strength means how broad a particular need is. High market strength means the adoption of a particular use for blockchain would impact a very large part of a particular market, draw lots of attention in earnings-call commentary, and draw lots of quality investment and capital. High Industry Adoption means lots of industry attention, lots of customers engaging with a use case, and more and more start-ups appearing. These two dimensions then split the universe into 4 categories:

(Italicized applications are ones I look at in a little more detail below.)

Transitory: High adoption, low market strength: These are use cases that are in play today but serve a limited market. Examples explored: Initial Coin Offerings, Smart Contract Platforms.

Experimental: Low adoption, low market strength: Identity Management, DAOs, Non-fungible tokens, Data marketplaces, Decentralized exchanges.

Necessary: High adoption, high market strength: Bitcoin mining, Fiat-crypto exchanges, Supply Chain Distributed Ledger use.

Threatening: Low adoption, high market strength: DLT in IoT, Bitcoin, Privacy coins, DLT in clearing and settlement.
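The two-axis split above amounts to a small lookup table. A minimal sketch follows; reducing each axis to a high/low boolean is my simplification, since how CB Insights actually scores the axes isn’t covered here, and the function name `quadrant` is mine.

```python
def quadrant(high_adoption: bool, high_market_strength: bool) -> str:
    """Map (industry adoption, market strength) onto the four category names."""
    labels = {
        (True, False): "Transitory",
        (False, False): "Experimental",
        (True, True): "Necessary",
        (False, True): "Threatening",
    }
    return labels[(high_adoption, high_market_strength)]


# A few use cases placed as the list above places them:
# (high adoption?, high market strength?)
use_cases = {
    "Smart Contract Platforms": (True, False),
    "Identity Management": (False, False),
    "Supply Chain Distributed Ledger": (True, True),
    "DLT in clearing and settlement": (False, True),
}

for name, (adoption, strength) in use_cases.items():
    print(f"{name}: {quadrant(adoption, strength)}")
```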

Some interesting ones:

Identity Management – Even though it is not discussed in detail, it could impact places like credit reporting agencies or LexisNexis to a great extent. I think the idea is that blockchain technology offers a way to have a source of truth about an identity and to own it: changes are authenticated, which defeats the transitory alterations suggestive of fraud, and the owner has a say over when the identity is used by anyone. It’s still very vague, but it seems to me like the seeds of this are potentially revolutionary and something to look for.

Supply Chain Distributed Ledger – this falls into that kind of boring area (to me anyway) that could nonetheless be important. The idea is that the supply chain in most markets features exchange of things between parties at multiple points – creators, growers, retailers, transport, etc. The pattern repeats throughout economies. DLT offers a common, error-correcting, efficient way to manage that data flow.

Smart Contract Platforms – I was a little surprised to see this veer into the ‘transitory’ area since, as I understand it, it represents a potentially natural way to fulfill activity associated with commodities, stocks, and the like. Why rely on humans to figure out when to execute something? (Removing that reliance is my understanding of what smart contracts would allow.) Ethereum is actually a smart contract platform. CB Insights seems to suggest that the challenge here is getting the coding/developer contributions needed to make the software common and functional. That is, it remains a challenge to get buy-in from software developers.

Clearing and Settlement – Again, thinking about the authentication/issuer dance for credit card transactions, the idea of the ledger as an error-correcting way to be efficiently open to multi-party activity seems natural to DLT, and clearing and settlement activity is probably a very natural and imminent target for DLT adoption. Adoption is potentially there for current providers of this activity. A big player is DTCC, the Depository Trust & Clearing Corporation.