So Much AI, So Much AND So Little Progress at the Same Time
All the headlines we didn’t have time to include in the newsletter last week… :)
Dear Friend,
Another week, another whiplash in (AI) headlines. On one hand, you have Google’s DeepMind unit announcing AlphaEvolve, an AI coding assistant that can design its own advanced algorithms; on the other hand, you see a distinct lack of AI-driven code contributions to open-source software projects, indicating that AI-generated code isn’t up to snuff (see our reporting below). The moral of the story: Ethan Mollick’s Jagged Frontier is alive and kickin’…
And now, this…
Headlines from the Future
The (AI) Canary in the Coal Mine
Here is an interesting question: If AI is truly so good at creating code, why don’t we see many more code contributions in Open Source repositories made with the help of AI?
Here is Satya Nadella, CEO of Microsoft:
“Maybe 20 to 30 percent of the code that is inside our repos today in some of our projects is probably all written by software.”
That’s a lot of “maybe” and “probably”… And we just can’t know, as Microsoft’s code repository is (of course) private. But Open Source code lives in public places like GitHub and thus is inspectable. And when you look closely, you will find very little evidence that code in those repositories is written by AI.
Admittedly, a lot of Open Source projects aren’t particularly excited about AI-generated pull requests:
“It’s true that a lot of open source projects really hate AI code. … the biggest one is that users who don’t understand their own lack of competence spam the projects with time-wasting AI garbage.”
But that aside, when you look at the data, it’s just not there:
“TL/DR: a lot of noise, a lot of bad advice, and not enough signal, so we switched it off again.”
In many ways, AI keeps furthering the skill gap:
“The general comments … were that experienced developers can use AI for coding with positive results because they know what they’re doing. But AI coding gives awful results when it’s used by an inexperienced developer.”
Overall, a good reminder to look past the marketing and hype…
—//—
Should We Trust AI With Our Health When It Can’t Even Draw a Simple Map?
On one hand, we have OpenAI announcing HealthBench, a physician-designed benchmark that rigorously evaluates AI models on 5,000 realistic health conversations across diverse scenarios, paving the way for Doc ChatGPT to become a reality. On the other hand, we have LLM-skeptic Gary Marcus asking ChatGPT to do something as trivial as drawing a map, to hilarious effect:
“It was very good at giving me bullshit, my dog-ate-my-homework excuses, offering me a bar graph after I asked for a map, falsely claiming that it didn’t know how to make maps. A minute later, as I turned to a different question, I discovered that it turns out ChatGPT does know how to draw maps. Just not very well.”
After quite the saga (it’s worth reading Gary’s article in full), he concludes:
“How are you supposed to do data analysis with ‘intelligent’ software that can’t nail something so basic? Surely this is not what we always meant by AGI.”
—//—
The Reports of the Finance Profession’s Death Are Greatly Exaggerated
Think AIs will come for your CFO and his team? Not quite, not yet… Vals.ai’s Finance Agent Benchmark clearly shows we are some ways off from the professions caving in to our LLM-powered overlords:
“The foundation models are currently ill-suited to perform open-ended questions expected of entry-level finance analysts”
But fret not — not all is lost:
“Models on average performed best in the simple quantitative (37.57% average accuracy) and qualitative retrieval (30.79% average accuracy) tasks. These tasks are easy but time-intensive for finance analysts.”
Link to benchmark and analysis.
—//—
The Metaverse is Truly Dead
Talk about a nail in the coffin: Minecraft, the proto-3D world-building game played by hundreds of millions, is shutting down its VR and Mixed Reality support.
If there ever was a question of whether the Metaverse is dead (though calling it ‘dead’ assumes it was once alive, which is debatable), this might be the definitive answer.
“Microsoft’s latest update to Minecraft’s Bedrock Edition, version 1.21.80, removes support for both virtual reality and Mixed Reality, Microsoft’s own take on AR that it killed off years ago. Now, Minecraft remains as a game that can be played on multiple consoles and platforms — just not VR.”
Of course, it is not just Minecraft: after losing oodles of money, Meta (the company which conveniently rebranded itself to indicate it is the king of VR) has been shuttering large parts of its VR efforts for quite a while.
Makes me wonder what happened to all the Chief Metaverse Officers (the “new” CMOs)? And yes, THAT was indeed a thing for a hot minute…
A good reminder to keep applying Chris Yeh’s brilliant “Does it have utility?” framework of Frequency / Density / Friction. Frequency: How often do you encounter the problem? Density: How much time/energy do you spend in the problem space? Friction: How much pain does it cause you? If those factors are too low, you simply don’t have a problem worth solving. For the Metaverse: Never/None/Zero. Q.E.D.
What We Are Reading
😰 Employees Fear the Stigma of Using AI at Work: Employees are being encouraged to use AI; however, many worry they will be perceived as lazy if they do. @Jane
🤖 2025: The Year the Frontier Firm Is Born: In a report filled with interesting stats, Microsoft Worklab envisions a future of augmented knowledge work steered by human employees in the role of “agent bosses.” @Jeffrey
🎯 Scenario Planning Is Getting a Stress Test: When the world keeps throwing curveballs, scenario planning is undergoing its own stress test. Because in 2025, even your backup plan needs a backup plan. @Kacee
🔄 The World Is Wooing U.S. Researchers Shunned by Trump: A hugely unusual shift in the international flow of academic talent. While other nations usually combat a brain drain to the United States, they are now scrambling to present the best pitch to U.S. researchers looking to move. @Julian
🪞 AI for Thee, but Not for VC: It should give you pause when the billionaire funding much of the drive to replace every conceivable job with AI believes that only he himself will be irreplaceable. What was that story about Icarus again? @Pascal
Rabbit Hole Recommendations
Cloudflare CEO warns AI and zero-click internet are killing the web’s business model
AI’s energy demands are out of control. Welcome to the internet’s hyper-consumption era
As an experienced LLM user, I actually don’t use generative LLMs often
Burrito now, pay later – liquid lunches and the serious case for BNPL securitization
China just made the world’s fastest transistor, and it is not made of silicon
Bus stops here: Shanghai lets riders design their own routes
Tesla has yet to start testing its robotaxi service without driver weeks before launch
Happy Distractions
🧱 LegoGPT: Generating physically stable and buildable LEGO® designs from text
📎 Microsoft Clippy (yep, THAT ONE!) is back as an interface to your LLM of choice…
🏎️ Lego built full-size F1 cars for the Miami GP drivers’ parade. Here’s how they did it
🎂 We are the robots! Robotics meets the culinary arts
🥖 This wrapping paper turns all your presents into bread
Do your fingers wrinkle the same way every time you’re in the water too long? Research says yes
☕ Why are coffee stains darkest at the edges when they dry?
📜 Secret messages detected on Egyptian obelisk in Paris
📜 A lovely collection of early internet artifacts - you may touch the artifacts
🌐 A truly stunning visualization of the Internet from its inception to today