So Much AI, So Much AND So Little Progress at the Same Time

All the headlines we didn’t have time to include in the newsletter last week… :)

May 16, 2025

Dear Friend,

Another week, another whiplash in (AI) headlines. On one hand, you have Google’s DeepMind unit announcing AlphaEvolve, an AI coding assistant that can design its own advanced algorithms; on the other hand, you see a distinct lack of AI-driven code contributions to open-source software projects, indicating that AI-generated code isn’t up to snuff (see our reporting below). The moral of the story: Ethan Mollick’s Jagged Frontier is alive and kickin’…

And now, this…

Headlines from the Future

The (AI) Canary in the Coal Mine ↗

Here is an interesting question: If AI is truly so good at creating code, why don’t we see many more code contributions in Open Source repositories made with the help of AI?

Here is Satya Nadella, CEO of Microsoft:

“Maybe 20 to 30 percent of the code that is inside our repos today in some of our projects is probably all written by software.”

That’s a lot of “maybe” and “probably”… And we just can’t know, as Microsoft’s code repository is (of course) private. But Open Source code lives in public places like GitHub and thus is inspectable. And when you look closely, you will find very little evidence that code in those repositories is written by AI.

Admittedly, a lot of Open Source projects aren’t particularly excited about AI-generated pull requests:

“It’s true that a lot of open source projects really hate AI code. … the biggest one is that users who don’t understand their own lack of competence spam the projects with time-wasting AI garbage.”

But that aside, when you look at the data, it’s just not there:

“TL/DR: a lot of noise, a lot of bad advice, and not enough signal, so we switched it off again.”

In many ways, AI keeps furthering the skill gap:

“The general comments … were that experienced developers can use AI for coding with positive results because they know what they’re doing. But AI coding gives awful results when it’s used by an inexperienced developer.”

Overall, a good reminder to look past the marketing and hype…

Here is the full article.

—//—

Should We Trust AI With Our Health When It Can’t Even Draw a Simple Map? ↗

On one hand, we have OpenAI announcing HealthBench, a physician-designed benchmark that rigorously evaluates AI models on 5,000 realistic health conversations across diverse scenarios, thus paving the way for Doc ChatGPT to become a reality. On the other hand, you have LLM-skeptic Gary Marcus trying something as trivial as having ChatGPT draw a map, to hilarious effect:

“It was very good at giving me bullshit, my dog-ate-my-homework excuses, offering me a bar graph after I asked for a map, falsely claiming that it didn’t know how to make maps. A minute later, as I turned to a different question, I discovered that it turns out ChatGPT does know how to draw maps. Just not very well.”

After quite the saga (it’s worth reading Gary’s article in full), he concludes:

“How are you supposed to do data analysis with ‘intelligent’ software that can’t nail something so basic? Surely this is not what we always meant by AGI.”

—//—

The Reports of the Finance Profession’s Death Are Greatly Exaggerated ↗

Think AIs will come for your CFO and his team? Not quite, not yet… Vals.ai’s Finance Agent Benchmark clearly shows we are some ways off from the professions caving in to our LLM-powered overlords:

“The foundation models are currently ill-suited to perform open-ended questions expected of entry-level finance analysts”

But fret not — not all is lost:

“Models on average performed best in the simple quantitative (37.57% average accuracy) and qualitative retrieval (30.79% average accuracy) tasks. These tasks are easy but time-intensive for finance analysts.”

Link to benchmark and analysis.

—//—

The Metaverse is Truly Dead ↗

Talk about a nail in the coffin: Minecraft, the proto-3D world-building game played by hundreds of millions, is shutting down its VR and Mixed Reality support.

If there ever was a question of whether the Metaverse is dead (though calling it ‘dead’ assumes it was once alive, which is debatable), this might be the definitive answer.

“Microsoft’s latest update to Minecraft’s Bedrock Edition, version 1.21.80, removes support for both virtual reality and Mixed Reality, Microsoft’s own take on AR that it killed off years ago. Now, Minecraft remains as a game that can be played on multiple consoles and platforms — just not VR.”

Of course, it is not just Minecraft: after losing oodles of money, Meta (the company that conveniently rebranded itself to signal it is the king of VR) has been shuttering large parts of its VR efforts for quite a while.

Makes me wonder: what happened to all the Chief Metaverse Officers (the “new” CMOs)? And yes, THAT was indeed a thing for a hot minute…

A good reminder to keep applying Chris Yeh’s brilliant “Does it have utility?” framework, which rests on three factors, Frequency / Density / Friction: How often do you encounter the problem? How much time/energy do you spend in the problem space? How much pain does it cause you? If those factors are too low, you simply don’t have a problem worth solving. Metaverse: Never/None/Zero. Q.E.D.

Game over.

What We Are Reading

😰 Employees Fear the Stigma of Using AI at Work
Employees are being encouraged to use AI; however, many worry they will be perceived as lazy if they do. @Jane

🤖 2025: The Year the Frontier Firm Is Born
In a report filled with interesting stats, Microsoft Worklab envisions a future of augmented knowledge work steered by human employees in the role of “agent bosses.” @Jeffrey

🎯 Scenario Planning Is Getting a Stress Test
When the world keeps throwing curveballs, scenario planning is undergoing its own stress test. Because in 2025, even your backup plan needs a backup plan. @Kacee

🔄 The World Is Wooing U.S. Researchers Shunned by Trump
A highly unusual shift in the international flow of academic talent. While other nations usually combat a brain drain to the United States, they are now scrambling to present the best pitch to U.S. researchers looking to move. @Julian

🪞 AI for Thee, but Not for VC
It should give you pause when the billionaire funding much of the drive to replace every conceivable job with AI believes that only he himself will be irreplaceable. What was that story about Icarus again? @Pascal