Since all models are wrong the scientist must be alert to what is importantly wrong. It is inappropriate to be concerned about mice when there are tigers abroad.
Some History
The last ten years of machine learning and engineering have been extraordinary. We have seen unimaginable progress in computer vision, natural language processing, and tabular data modeling.
I am lucky to have started my career at a time before much of the excitement started and to have seen the innovation happen first hand. To briefly elaborate, I finished my first master’s in economics and statistics in the fall of 2011 and worked in what was called “predictive analytics and data mining”. I decided to pursue another master’s part time in data science in 2014 where I focused all of my studies on machine learning, computer science, and math. So getting the opportunity to study the field academically while working in it professionally during its glorious rise felt like (1) dumb luck and (2) a terribly good time.
So, how did we get from machine learning being mostly laughable to an extraordinary internet craze?
Things improved for three technical1 reasons:
Better hardware
Better software
More data
Many of the impactful algorithms that have been successful were developed in the decades before but lacked the juice necessary to do some massive computations.
As the three magical ingredients collided, a new era of machine learning was born and the world has changed significantly for it.
In the past year the release of novel models by some of the biggest AI labs (OpenAI, Meta AI, DeepMind, Google Brain) have raised the stakes further, providing what feels like yet another colossal change in machine learning.
Stable Diffusion, Multilingual Machine Translation, GPT3 (Generative Pre-trained Transformer 3), and other state of the art ML models are quite simply astonishing. Of course, they have their flaws but the good news is researchers at these labs are constantly iterating to improve them. So these models and neural architectures will get better over time, but it’s important to understand that going from the intelligence of an ant to the intelligence of a dumb human is a much greater leap than getting from a dumb human to a smart one…but these models don’t quite have the intelligence of an ant. You wouldn’t be able to take GPT-3 and plug it into a robot and suddenly watch it learn how to walk or become sentient. It relies on an extraordinary amount of engineering, so we’re not really at the “dumb human” level. We’ve just trained this model to be very good at language.
The innovation coming from these open sourced generative models is exciting. People are changing their online profile pictures to ones generated using Stable Diffusion, posting screenshots of GPT conversations, launching AI startups, but few people have mentioned any useful Fintech applications.
As a brief aside, if you are building a business with a small product on top of an open sourced model you’ll probably get copied because there are no barriers to entry—so make sure to create some if you want your business to thrive. 😅
Fintech and AI
David asked a great question (a16z wrote about this topic more broadly) and the replies to him mostly centered around synthetic data generation for fraud. Unfortunately, I don’t think that would be a great idea2 as it could be generating samples that give you useless, or worse, wrong data (and wrong data will ruin your business and we surely don’t want that in this economy 😉).
ChatBots and better customer experiences seem like a natural option, right? Sadly, no. ChatBots are often quite terrible and I’m skeptical that they would outperform basic search and curated documentation, which is mostly the illusion of a ChatBot with some slick UX and, in my opinion, is a much better alternative. (Why? More on that later.)
So what, in particular, can generative models be used for in Fintech? And what, in general, can machine learning be used for in Fintech?
Financial Data is Chaos
Finance is very different from language (NLP) and computer vision (CV). Not to trivialize all of the CV research in the past decade but the mechanism that we use to quite literally see objects in the world and represent them in Red, Green, Blue pixels stored in Tensors is rich in its structure…and also for the most part a complete and accurate representation.
By which I mean that if you see a picture of a cat and I told you “this is a cat”, that information is unambiguous. The image your brain processed can see the image again and say “this is a cat”. You don’t even have to know what a cat actually is, you just know it’s this thing. Isn’t that kind of amazing?!
With language this becomes more challenging as there is latent context missing and implicit in writing.
As an example, if I told you “Donald Trump sold NFTs this week” you may have a lot of contextual knowledge about Donald Trump the person, businessman, and former president. You also might have knowledge about what an NFT (non-fungible token) is and that “this week” means the week of December 18th-24th 2022 because you understand time. That’s a lot of hidden data in six words. Luckily word embedding and self-attention (some of the key ingredients to the success of Large Language Models) handle this information reasonably well but they have their limitations.
So what about finance data (i.e., tabular data)? The structure of financial data (especially consumer finance) is rich but in a much more non-obvious and kind of arbitrary way—i.e., the Data Generating Process is chaotic.
How? Well, we use arbitrary numbers for rules in systems that encode more numbers and rules with lots of implicit information about other random rules about dates and thresholds, which makes the structure of the data really just kind of weird.
For example, the credit bureaus store information about loans (but only from lenders that both choose to give a loan to the customer and furnish that data…and only when they want to) in a database that represents how much a person borrowed and paid back at different points in time. That data represents how the interest accrued (but not for everyone!), the dates when someone is supposed to pay it back, and the amount of credit someone has available—which are all made up numbers in a made up system with somewhat arbitrary (though governed) rules. 🥲
This isn’t to say that these numbers don’t mean anything, of course they do. It’s just everything surrounding this money stuff was kind of made up by humans to keep track of things. That is a profoundly different phenomenon than your visual cortex, linguistics, or the laws of physics.
Why does any of that matter?
The set of problems you can use AI for in Fintech is more narrow than other industries
Generative models are often wrong in silly ways and that is dangerous and potentially expensive in Fintech (maybe even criminal 😳)
Most industry practitioners will tell you that deep learning on credit/fraud data underperforms compared to XGBoost
Interestingly, (3) is mostly because consumer finance operates within a system that, as I mentioned, is rather discrete in nature and constructed by arbitrary boundaries which is a property that gradient boosting machines joyfully exploit.
Some fun history here: Dan Steinberg, founder of TreeNet (aqcuired by MiniTab) was, to my knowledge, the first to professionally release an implementation of Gradient Boosted Machines authored by Jerome Friedman himself 🤯. Dan first sold the software to Capital One many years ago (late 2000s?), who had been aggressively using TreeNet since. So GBMs have been in use in Fintech for over 10 years. I had the pleasure of getting to spending time with Dan back in 2013 and 2015 at both AIG and the Commonwealth Bank. He’s brilliant.
Consumer Fintech is a Math Puzzle
I still haven’t answered the question yet.
Credit and Fraud modeling are what everyone uses AI/ML for in consumer Fintech, so naturally that’s the obvious answer. Will Generative models make them better? No, they won’t. Don’t waste your time or money on it. Generating fictitious data that can harm your currently accurate models is dumb (and that’s not you!).
Will innovations in ML algorithms improve risk models? Yes but those are increasingly incremental and short lived because people catch on quickly.
Where’s the opportunity then? Engineering. 😇
The data, systems, and software surrounding artificial intelligence are rapidly changing and that’s where there’s a massive opportunity to innovate in consumer finance for risk modeling because consumer finance is basically a multivariate constrained stochastic optimization problem (that’s a mouthful).
Companies like Databricks, dbt Labs, Prefect, and Tecton are changing the industry and I think there’s still a lot of opportunity here.
I’d say the same things for consumer insurance. There are innovations happening in how data is measured (e.g., in auto insurance with tracking devices) but much of insurance still uses actuarial models for their risk modeling. This is even more true for commercal insurance and that makes sense as machine learning really isn’t the right tool for low frequency high dollar risk pricing.
Back to ChatBots
What about ChatBots and customer experiences? Surely we can use GPT3 to answer basic questions about a customer’s account? No. A pathologically extreme example would be if a consumer asked about whether they could apply to get a loan and the ChatBot said “Yes” but then didn’t give them the loan or gave them inaccurate information about the loan. The consumer would be upset and I’m sure this would result in a world of legal pain for the lender.
Suppose you asked a ChatBot for your checking account balance and it gave you the wrong number? Maybe it led you to make an overdraft, wouldn’t that make you absolutely furious? You’d be mad at the bank for using this technology on something so important (yet technically trivial)—and rightly so.
GPT3 and Neural Architectures don’t really care about individual facts in a database, they care about on average phenomena and structures that are on average true. This is why it reflects the structure of a right sentence but gets the details wrong.
Fun internet ChatBots without legal ramifications can do that, but not financial services companies, and that’s a good thing.
In short, the cost of being wrong is high and having your customer's shout (read: type in caps lock) chaotically at a ChatGPT is just pure guessing. Manually curated workflows masked behind a Conversational Form based ChatBot is a better option because you dictate the end state, not some random matrix multiplication machine3. Your legal teams will thank you.
Artificially Intelligent Marketing
Fintech and the battle to reduce CAC (Customer Acquisition Cost) is a well known challenge in the industry. Most build propensity models for Direct Mail, Email, Ads, and maybe even in-app events if you’re fancy enough. Can new generative models add value here?
Potentially! But more through enhancing copy and the ad creatives more so than the other machinery. Copy.AI, Jasper.AI, and others do a lot of this and it’s broader than Fintech alone but it does seem like a possibly useful option, though I’m not sure the gains will be significant over things people can create…unless you make an excessive combination of your ad creatives and copy…and that feels like less of a strategy and more like throwing stuff at a wall and seeing what sticks.
Churn and SQL
Customer attrition is quite well understood and we actually have enough ML tools for the problem. More importantly, I think the tools we currently have are more than what we need for the job. For understanding why your customers are leaving, I recommend (1) talking to your customers, (2) looking at your data, and (3) creating a funnel to see what in your product is broken.
No need for ML.
Recommendation Engines and the Path to BananaCard
Several years ago I spoke with a few different founders who wanted to build a recommendation engine to help consumers improve their credit so that they could get approved for a mortgage, a credit card, or whatever consumer credit product. This came after the announcement of the Path to Apple Card.
That’s not actually a recommendation engine in the technical sense but the goal is to give the consumer recommendations about how they can change their credit profile to get approved by some sort of rules system, which makes the misnomer more palatable.
This is actually much more like an optimization problem commonly solved by Newton’s Method for optimization (though I should note this is more challenging as it’s actually a multivariate optimization problem and the function isn’t always guaranteed to be twice differentiable—the approach is actually much more simple in practice).
The point is that this product feature is well solved already so I don’t think there’s much novelty here, maybe the algorithms to solve the optimization can run faster but beyond that I don’t see anything wildy new happening.
For actual recommendation engines, the machine learning systems that exist today are actually quite good already so the gains I suspect will only be incremental.
Automated Insights
Personal Financial Management Applications are heavily reliant on ML, business logic, and a miserable amount of data transformations/pipelines.
I built one myself so I know this in excruciating detail. There are companies that now offer this product as a service (e.g., Pave) and I think this could actually be a good opportunity for generative models because there’s a general meta-pattern to these things and they can assist ML Engineers in creating insights faster and maybe even more accurately. In many ways this is similar to GitHub Copilot but for Data Scientists, and I think that could be really useful.
It’s important to highlight that the Data Scientists have to be steering things as getting data insights wrong about a consumer’s data (that they have kindly permitted you to use) would be potentially harmful and we don’t want that! A concrete example would be misstating something about their current balance and making the consumer believe they could spend more than they should—that would be disastrous.
…But generating faster insights from standard transaction data could be an extremely powerful feature for Generative AI. I should note that this, again, could be used more broadly than this single use case (and is actually an active area of research called Meta-Learning).
Algorithmic Trading and Asset Management
Algorithmic trading has been using ML for a very, very long time. Two Sigma was founded in 2001, D.E. Shaw in 1988, Renaissance Technologies in 1982, and Quantitative Managament Associates in 1975. So the fight for alpha continues, maybe with short term battles left to be won, but I don’t think this area will be disrupted substantially.
Using generative models to create synthetic data is only marginally different than Monte Carlo simulations which is basically the old school approach of sampling from a multivariate Gaussian. Since there’s already published work on using Generative Adversarial Networks for trading strategies, my guess is that these aren’t that impactful. People don’t rush to publish the quantitative strategies that are the most effective. 😉
Asset Management has long used quantitative methods (i.e., statistics and machine learning) to understand risk exposure and optimize their portfolio. Some places use even more indirect ways of using machine learning to find emerging companies (I actually did some of this work back in 2015), so I expect this space to see only modest improvements as well.
Closing Thoughts
At the moment founders and venture capitalists seem to be jumping on the AI bandwagon (have we learned nothing from the NFT craze?), largely driven by the recent and dramatic innovations in AI, but that innovation was driven by large research labs funded by the largest technology companies that put in a decade (arguably more) of work and billions of dollars.
Startups will be able to use these open source models and iterate quickly with some slick UI but I am extremely skeptical that there’s a valuable venture scale business model from that alone4. The cost to build/train one of these open source models is extraordinary (i.e., millions of dollars). So tread cautiously around those selling you their AI startup or AI products as they’re either (1) using an open source model and their business is trivially exchangeable, (2) don’t know what they’re talking about (probably because of the Dunning-Kruger Effect), or (3) are lying.
Hot takes aside, the future of AI is extremely bright in Fintech5 and more broadly. Breakthroughs are continuing to happen and I have never been more excited about this ever evolving field. I am particularly bullish on the innovation in the software, data, and infrastructure surrounding these large models—that’s why I have dedicated my career6 to it.
If you are interested in staying up to date with the latest innovation in the field and less on the hype, I recommend following Yann LeCun (Chief AI Scientist at Meta, Turing Award Winner, Silver Professor at New York University, and a lot of other impressive stuff too!). He is probably the best person to follow for a broad set of information about AI. Not only is he one of the Godfathers of Deep Learning, he’s also quite active on social media giving a wonderful perspective about the history of many innovations (back when AI was boring).
I will leave you with his brilliance.
Most of human and animal learning is unsupervised learning. If intelligence was a cake, unsupervised learning would be the cake, supervised learning would be the icing on the cake, and reinforcement learning would be the cherry on the cake. We know how to make the icing and the cherry, but we don’t know how to make the cake. We need to solve the unsupervised learning problem before we can even think of getting to true AI.
-Yann LeCunn
Happy innovating.
-Francisco 🤠
Some Content Recommendations
’s latest from reviewed his 2022 predictions, most of which centered on crypto getting absolutely wrecked, was fantastic. His predictions were quite accurate. Certainly at the time the consensus opinions were that Crypto was going to be fine but things have only gotten worse. That said, I’m certainly optimistic about crypto’s future (those that do real work in the space and not embezzle money, of course). Alex Johnson at
wrote about Chime’s attempted acquisition of DailyPay, a new Fintech focusing on founders’ retirement, and Goldman layoffs at Marcus. He makes some good points about why Marcus struggled and some of the fight for attention was true based on my experience working there. Things changed quite dramatically under Solomon’s leadership compared to Blankfein’s (who initially sponsored the initiative). Broader GS didn’t really like a product that lost money whatsoever and they’ve been pulling back on it over the last few years, so it’s unfortunately not surprising. It’s a shame as Marcus’ Personal Loan product was truly innovative and consumer friendly when it launched, but Solomon is facing a lot of market pressure and it’s mostly because GS struggles to be as nimble as other competitors due to typical internal corporate bickering. During my time there, Lloyd had some pretty explicit sheltering to avoid too much internal interference—that changed dramatically after launch and probably caused much of the problems. To be explicit, there was just bad partnership across the firm due to competing incentives, and my honest take is that Marcus just stopped innovating as it grew. C'est la vie. by Simon Taylor wrote a beautiful piece about the 2023 State of Fintech. I absolutely loved it and highly recommend reading it as it’s so thoughtful and balanced about where the industry is and where there is opportunity.Postscript
Did you like this post? Do you have any feedback? Do you have some topics you’d like me to write about? Do you have any ideas how I could make this better? I’d love your feedback!
Feel free to respond to this email or reach out to me on Twitter! 🤠
One important non-technical reason for the growth of AI was the extraordinary open source community that built software and helped educate people that were just getting started. This was evident on Reddit, GitHub, LinkedIn, Twitter, and academic conferences. It really took tens of thousands of the brightest minds in computer science, statistics, and engineering to innovate successfully over the past ten years and it’s truly quite wonderful to see so many people rally for the progress of our collective digital intelligence.
I should note that there is a rich literature already on synthetic data generation and, in some sense, even the classic bootstrap is a form of data generation. So this isn’t as exceptionally new as one might think. I personally see the GAN approach to be similar to data augmentation, which reduces to a form of regularization, which is already well established in the field. Some folks probably will have disagreements with me on this.
Though aren’t we humans really nothing more than meat-based matrix multiplication machines? Actually, kind of.
That’s not to say that there can’t be good businesses in general but venture returns need much higher return value to repay the fund.
I should note that I tried to cover many application areas of AI in Fintech but I’m certainly omitting some. I would love to hear your ideas of ones that I missed!
There are 3 possible ways to interpret this: (1) I’m heavily biased, (2) I have strong conviction, or (3) I have no idea what I’m doing and just stumbled into it. It’s mostly (2) and (3).
Thank you for this inspiring post! Curious about how you see generative models to help with insights creation (and how you define insights?)
Maybe you have some system like this in mind?
* Input = transactions data of the user
* Output = some text giving the user some insights into their personal finance, maybe some advice, e.g. "Bill X is due in 3 days, remember to transfer $Y from Savings!"
Or are you thinking about it more in terms of developer / data scientist tooling?
One use case that I’m excited to see is using chatGPT for chargebacks/disputes. There is a known process for filing chargebacks and winning them is all about saying the right things, having the necessary proof, and following the right process. Chargeback win rates are notoriously low in the industry and friendly fraud is extremely high and prevalent. Further, people that commit friendly fraud are way more likely to do it again. Early stage fintechs can start beefing up their disputes management process with fewer team members by training GPT on their processes. Won’t be long before a startup emerges to automate this for fintechs.