JPEG Lessons for AI: Compression Leads to Collapse
Just like images, AI models risk losing clarity—what’s the way forward?
A few days ago, OpenAI upped the ante in foundation model development with the release of the o1 model. Early usage reports indicate that it performs significantly better on specific tasks (e.g., coding) thanks to its enhanced reasoning capabilities, though this improvement comes at the cost of processing time: o1 needs substantial time to think. It turns out the old adage "You want it good, fast, and cheap? Pick two" is true for AI as well.
OpenAI’s strategy with o1 diverges from previous approaches that relied on training models on an ever-increasing volume of data. Instead, it closely mimics human thought processes: step by step, with each output serving as input for the next.
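As a rough illustration, here is what such a loop could look like in code. This is a hypothetical sketch, not OpenAI's actual mechanism (which is not public); generate() stands in for a call to any text-generation model.

```python
# Hypothetical sketch of step-by-step reasoning. `generate` is a
# placeholder for any text-generation API, not a real library call.
def generate(prompt: str) -> str:
    raise NotImplementedError("wire up a text-generation model here")

def reason(question: str, steps: int = 5) -> str:
    context = question
    for _ in range(steps):
        # Each intermediate thought is appended to the context,
        # so the model's output becomes part of its next input.
        thought = generate(f"{context}\n\nState the next reasoning step briefly.")
        context = f"{context}\n{thought}"
    return generate(f"{context}\n\nNow give the final answer.")
```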
Some experts argue that we have reached—or are nearing—a point where there is insufficient useful data to train our models beyond their current knowledge. As more and more (training) content is generated by AI itself—often referred to as “slop” in the content industry—we risk encountering a significant issue known as “model collapse.”
Model collapse occurs when a model trained on the output of earlier models converges on a limited set of outputs, losing the diversity and generalization present in the original data. In simple terms: AI-generated content tends toward the average, merely rehashing existing knowledge. Train an AI on such content and you get a model that produces even more average output. Repeat the process often enough and the AI becomes increasingly ineffective.
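A toy experiment makes this tangible. The sketch below is a deliberate oversimplification (a "model" here is just a word-frequency table, nothing like a real LLM): each generation re-estimates the table from a finite sample of the previous generation's output, and any word that fails to appear in that sample is gone for good. Run it, and you can watch the vocabulary shrink generation by generation.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Generation 0": a Zipf-like distribution over a 1,000-word vocabulary,
# standing in for the diversity of human-written text.
vocab_size = 1_000
probs = 1.0 / np.arange(1, vocab_size + 1)
probs /= probs.sum()

# Each generation "trains" a new model by re-estimating word frequencies
# from a finite sample of the previous model's output. A word that never
# appears in the sample gets probability zero and can never come back.
for generation in range(1, 11):
    sample = rng.choice(vocab_size, size=5_000, p=probs)
    counts = np.bincount(sample, minlength=vocab_size)
    probs = counts / counts.sum()
    print(f"gen {generation}: {np.count_nonzero(probs)} distinct words left")
```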
To illustrate this concept, consider JPEG image compression[1]. JPEG is the dominant image format; it shrinks file sizes using a lossy compression algorithm. After the first compression, the differences between the original image and the JPEG file are usually imperceptible. But re-compressing the JPEG again and again introduces noticeable artifacts, such as blockiness and smearing. Each pass degrades the image further until, in the end, it is just a mess.
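If you want to watch this generation loss happen on your own machine, a few lines of Python with the Pillow library will do. The filename and quality setting below are placeholders; the one-pixel crop each round mimics the small edits of real-world re-use (without it, re-saving at a fixed quality settles into a stable state after a few generations).

```python
from PIL import Image  # Pillow

# Open an image and re-save it as JPEG over and over. Each save re-runs
# the lossy compression step, so artifacts accumulate with each generation.
# "photo.jpg" is a placeholder filename; quality=75 is Pillow's default.
img = Image.open("photo.jpg").convert("RGB")
for generation in range(1, 101):
    # A one-pixel crop misaligns the 8x8 JPEG blocks, standing in for
    # the small edits an image picks up as it gets reused and reposted.
    img = img.crop((1, 1, img.width, img.height))
    img.save(f"gen_{generation:03d}.jpg", format="JPEG", quality=75)
    img = Image.open(f"gen_{generation:03d}.jpg")
```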
And just as in photo editing, different algorithms lead to different results. Which brings us back to the wonderful world of AI: there is a chance that we will hit (or already have hit) a ceiling with the current approaches. Your next model update might deliver only marginal gains, as pretty much every update over the last year or so has. The next big breakthrough is more likely to come from new approaches to AI, and who knows which team or company will take the lead there?
@Pascal
[1] PetaPixel video: "What Happens When You Re-Save an Image 500 Times in Different Formats"