AI in Finance and Banking, September 15, 2025

This semi-monthly column highlights news, government documents, NGO/IGO papers, conferences, industry white papers and reports, academic papers and speeches, and central bank actions on the subject of AI’s fast-paced impact on the banking and finance sectors. The links, listed chronologically, point to the primary sources and, where available, to alternate free versions.


NEWS:

Goldman Sachs bankers explore limits of AI: ‘The risk is over-reliance’. no paywall. One partner explains the efficiencies and limitations of new tools. Financial Times, September 14, 2025. Kerry Blum, who works within Goldman’s private wealth management business, says she is ‘finding new ways to use [AI] every day’ and that it is saving her a few hours a week. When Goldman Sachs partner Kerry Blum was wrestling with how to communicate a new project to staff, she found a quick solution. “Candidly I was having a bit of writer’s block . . . While I could have spent time iterating on the framing of the proposal on my own, I decided to brainstorm with the AI assistant.” The session, which she says sped up and enhanced her work, is an example of the new efficiencies bankers like Blum are finding through AI. Goldman rolled out its generative AI-powered platform — GS AI Assistant — to all its roughly 46,000 employees in June, telling staff the aim was for it to help with tasks such as summarising complex documents, drafting content and performing data analysis. Its Wall Street rivals have provided similar tools to tens of thousands of staff in recent months as they attempt to boost productivity. Finance has used forms of AI for decades — to manage funds, decide if customers qualify for loans or detect fraud, for example — but rapid adoption of new generative or agentic tools could transform work across divisions. In a Bloomberg survey of banks published last month, 70 per cent indicated generative AI would be widely used or critical to their business in the next two years, compared with 24 per cent now. Blum, who started at Goldman full time in 2001 and now runs the equity structuring group within its private wealth management business, says she uses the tool for as many as 10 tasks a day. She sees one big risk with such AI programmes on Wall Street: that bankers become too dependent on them.


The State of AI in Financial Services in 2025 — views from our front row seats. Medium, August 28, 2025. That’s what we’ve been hearing across financial institutions. We are in an unprecedented moment where traditional financial institutions are actively seeking to adopt new technology, even from early-stage companies. Historically, large corporations have built a reputation for being reactive. They’ve avoided being the first to adopt new technology, being wary of the risks of failure. But the narrative is changing. Today the most successful financial institutions are the ones that are “shifting left”, working with earlier stage companies that are at the edge of new ideas or even actively building new technology out themselves. This shift is helping them differentiate and establish category leadership for the years ahead. It’s an exciting time to be involved in startups, as an explosion of ideas is being created and funded (and even getting acquired or going public). Companies are landing customers faster than ever, and AI has created a catalyst for financial institutions to establish an imperative in adopting new technology. At Illuminate, we have front row seats to the global problem statements and priorities of technology across the leading financial institutions, and are collaborating with each of them to establish the next generation of technology. These are our strategic partners — JPM, Citi, Barclays, BNY, Jefferies, S&P Global, Euroclear, Deutsche Börse, and SGX — and we work with senior leaders and executives every day. Over the course of the last two years, we’ve been listening closely to their needs and connecting founders with decision makers. We’re excited to share the areas where we’re seeing the strongest demand and growth across our conversations.


PAPERS – NBER:

AI Agents for Economic Research. Anton Korinek. Working Paper 34202. DOI 10.3386/w34202. Issue Date

The objective of this paper is to demystify AI agents – autonomous LLM-based systems that plan, use tools, and execute multi-step research tasks – and to provide hands-on instructions for economists to build their own, even if they do not have programming expertise. As AI has evolved from simple chatbots to reasoning models and now to autonomous agents, the main focus of this paper is to make these powerful tools accessible to all researchers. Through working examples and step-by-step code, it shows how economists can create agents that autonomously conduct literature reviews across myriads of sources, write and debug econometric code, fetch and analyze economic data, and coordinate complex research workflows. The paper demonstrates that by “vibe coding” (programming through natural language) and building on modern agentic frameworks like LangGraph, any economist can build sophisticated research assistants and other autonomous tools in minutes. By providing complete, working implementations alongside conceptual frameworks, this guide demonstrates how to employ AI agents in every stage of the research process, from initial investigation to final analysis.


The Algorithm Advantage: Ranked Application Systems Outperform Decentralized and Common Applications in Boston and Beyond. Christopher Avery, Geoffrey Kocks & Parag A. Pathak. Working Paper 34207. DOI 10.3386/w34207. Issue Date

School choice systems increasingly use common applications, where students can apply to multiple schools on a single form, though schools make admission decisions independently. We model three application systems: a common application, a decentralized system with costly separate applications, and a ranked-choice system using a matching algorithm. Our model shows that while a common application may expand access, it increases competition and may produce worse matches than a decentralized system where application costs encourage more selective applications. Ranked-choice systems combine reduced application costs with preference-based matching that reduces mismatches. We examine these predictions by analyzing how Boston’s charter school sector was affected when it adopted an online common application. Counterfactual simulations suggest the common application performs no better than alternatives on several metrics and did little to increase access for disadvantaged groups. A ranked system consistently outperforms a common application across various levels of competition and assumptions on preference stability between application and enrollment stages.


Artificial Writing and Automated Detection. Brian Jabarian & Alex Imas. Working Paper 34223. DOI 10.3386/w34223. Issue Date

Artificial intelligence (AI) tools are increasingly used for written deliverables. This has created demand for distinguishing human-generated text from AI-generated text at scale, e.g., ensuring assignments were completed by students, product reviews written by actual customers, etc. A decision-maker aiming to implement a detector in practice must consider two key statistics: the False Negative Rate (FNR), which corresponds to the proportion of AI-generated text that is falsely classified as human, and the False Positive Rate (FPR), which corresponds to the proportion of human-written text that is falsely classified as AI-generated. We evaluate three leading commercial detectors—Pangram, OriginalityAI, GPTZero—and an open-source one—RoBERTa—on their performance in minimizing these statistics using a large corpus spanning genres, lengths, and models. Commercial detectors outperform open-source, with Pangram achieving near-zero FNR and FPR that remain robust across models, threshold rules, ultra-short passages, “stubs” (≤ 50 words) and “humanizer” tools. A decision-maker may weight one type of error (Type I vs. Type II) as more important than the other. To account for such a preference, we introduce a framework where the decision-maker sets a policy cap—a detector-independent metric reflecting tolerance for false positives or negatives. We show that Pangram is the only tool to satisfy a strict cap (FPR ≤ 0.005) without sacrificing accuracy. This framework is especially relevant given the uncertainty surrounding how AI may be used at different stages of writing, where certain uses may be encouraged (e.g., grammar correction) but may be difficult to separate from other uses.
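
For readers who want to see how the two statistics and the policy cap in the abstract above fit together, here is a minimal Python sketch. It is an illustration only and not the authors' code: the names `ErrorRates`, `error_rates` and `meets_fpr_cap` and the toy labels are assumptions made for this example. Only the FNR and FPR definitions and the 0.005 cap come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class ErrorRates:
    fnr: float  # share of AI-generated texts wrongly classified as human
    fpr: float  # share of human-written texts wrongly classified as AI


def error_rates(is_ai, flagged_ai):
    """Compute FNR and FPR from true labels and detector decisions.

    is_ai[i] is True when text i was AI-generated; flagged_ai[i] is True
    when the (hypothetical) detector classified text i as AI-generated.
    """
    if len(is_ai) != len(flagged_ai):
        raise ValueError("labels and decisions must have the same length")
    ai_total = sum(1 for y in is_ai if y)
    human_total = len(is_ai) - ai_total
    if ai_total == 0 or human_total == 0:
        raise ValueError("need at least one AI text and one human text")
    false_neg = sum(1 for y, p in zip(is_ai, flagged_ai) if y and not p)
    false_pos = sum(1 for y, p in zip(is_ai, flagged_ai) if not y and p)
    return ErrorRates(fnr=false_neg / ai_total, fpr=false_pos / human_total)


def meets_fpr_cap(rates, cap=0.005):
    """Check a policy cap on false positives (default taken from the abstract)."""
    return rates.fpr <= cap


# Toy data, invented for illustration: four AI texts, four human texts.
is_ai = [True, True, True, True, False, False, False, False]
flagged_ai = [True, True, True, False, False, False, False, True]

rates = error_rates(is_ai, flagged_ai)
print(rates)
print("Meets FPR cap:", meets_fpr_cap(rates))
```

In this toy sample the detector misses one of four AI texts and wrongly flags one of four human texts, so both rates are 0.25 and the strict cap is not met. A cap on false negatives could be checked the same way by comparing `rates.fnr` against a chosen tolerance.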


NGOs/IGOs:

Managing explanations: how regulators can address AI explainability. FSI Occasional Papers No 24, 08 September 2025. By Fernando Perez-Cruz, Jermy Prenio, Fernando Restoy and Jeffery Yong. PDF full text

The increasing adoption of artificial intelligence (AI) by financial institutions is transforming their operations, risk management and customer interactions. Nevertheless, the limited explainability of complex AI models, particularly when used in critical business applications, poses significant challenges and issues for financial institutions and regulators. Explainability, or the extent to which a model’s output can be explained to a human, is essential for transparency, accountability, regulatory compliance and consumer trust. Yet, complex AI models, such as deep learning and large language models (LLMs), are often difficult to explain. While there are existing explainability techniques that can help shed light on complex AI models’ behaviour, these techniques have notable limitations, including inaccuracy, instability and susceptibility to misleading explanations.

Limited model explainability makes managing model risks challenging. Global standard-setting bodies have issued – mostly high-level – model risk management (MRM) requirements. However, only a few national financial authorities have issued specific guidance, and they tend to focus on models used for regulatory purposes. Many of these existing guidelines may not have been developed with advanced AI models in mind and do not explicitly mention the concept of model explainability. Rather, the concept is implicit in the provisions relating to governance, model development, documentation, validation, deployment, monitoring and independent review. It would be challenging for complex AI models to comply with these provisions. The use of third-party AI models would exacerbate these challenges.

As financial institutions expand their use of AI models to their critical business areas, it is imperative that financial authorities seek to foster sound MRM practices that are relevant in the context of AI. Ultimately, there may be a need to recognise trade-offs between explainability and model performance, so long as risks are properly assessed and effectively managed. Allowing the use of complex AI models with limited explainability but superior performance could enable financial institutions to better manage risks and enhance client experiences, provided adequate safeguards are introduced. For regulatory capital use cases, complex AI models may be restricted to certain risk categories and exposures or subject to output floors. Regulators must also invest in upskilling staff to evaluate AI models effectively, ensuring that financial institutions can harness AI’s potential without compromising regulatory objectives.
