AI meets GIGO
I’m a frequent user of what passes for AI on Google. I used to just Google key words, but now I’ll Google questions. Google AI answers those questions. Usually, they’re simple questions like “When did ___ happen?” or “How old is ___?” For anything esoteric that I care about, I can follow up with my own search. So far, I’ve only caught Google AI in one mistake.
But thanks to internet pollution caused by ChatGPT, that could change:
“The rapid rise of ChatGPT — and the cavalcade of competitors’ generative models that followed suit — has polluted the internet with so much useless slop that it’s already kneecapping the development of future AI models.
“As the AI-generated data clouds the human creations that these models are so heavily dependent on amalgamating, it becomes inevitable that a greater share of what these so-called intelligences learn from and imitate is itself an ersatz AI creation.
“Repeat this process enough, and AI development begins to resemble a maximalist game of telephone in which not only is the quality of the content being produced diminished, resembling less and less what it’s originally supposed to be replacing, but in which the participants actively become stupider. The industry likes to describe this scenario as AI “model collapse.”
My guess is that software will be written to (a) filter for ChatGPT, (b) append reliable references to search results and/or (c) score the results for reliability. Meanwhile, caveat lector.
ChatGPT is polluting the database for AI

Joel:
What I have noticed is that AI may not answer questions that may be controversial. Some topics are off limits to it. It appears to be artificially smart enough to avoid such questions. Such avoidance makes me laugh, as it is a very human type of avoidance when an answer may lead to a negative reaction or to being labeled as slanted.
@joel,
Yep – you nailed it. Gives new meaning to built-in obsolescence.
It’s called “model collapse”. There was an interesting paper on it last summer: “AI models collapse when trained on recursively generated data”.
Note that an early sign of the problem is the tails of the distribution vanishing as the variance shrinks. Interesting or unconventional information or solutions become harder to find. Also note the problem of deliberate poisoning, i.e. adversarial training data. This is going to come from the introduction of advertising and the LLM equivalent of SEO.
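For anyone who wants to see the shrinking-variance effect for themselves, here is a toy sketch. It is not the paper's experiment and has nothing to do with a real language model: it assumes the "model" is just a bell curve fitted to a small sample, and each new generation is trained only on samples drawn from the previous generation's fit. The function name `simulate_collapse` and its parameters are made up for illustration.

```python
import random
import statistics


def simulate_collapse(generations, sample_size, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the fitted standard deviation at each generation,
    starting with the original "human" distribution (std = 1.0).
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the original data distribution
    stds = [sigma]
    for _ in range(generations):
        # "Train" the next model only on output from the current one.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        # Maximum-likelihood estimate; on average it slightly
        # underestimates the spread, and the errors compound.
        sigma = statistics.pstdev(samples)
        stds.append(sigma)
    return stds


if __name__ == "__main__":
    stds = simulate_collapse(generations=200, sample_size=10)
    for gen in (0, 50, 100, 200):
        print(f"generation {gen:3d}: std = {stds[gen]:.3g}")
```

With a small sample per generation, the fitted spread drifts toward zero, so rare values out in the tails stop being generated at all. That is the toy version of unconventional information becoming harder to find.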
@Kaleberg,
Yes, model collapse is specifically mentioned in the link and in my pull quote. Thanks for the thorough explication.