Degenerative AI

In an attempt to catch up with Google, a company that has successfully destroyed its popular search product through sheer force of greed, Microsoft invested a further $10 billion in ChatGPT creator OpenAI (six days after laying off 10,000 people), hoping to make its own search engine Bing better, or more popular, but certainly not more profitable. The net result is a search engine with narcissistic personality disorder, responding to user queries with emotionally manipulative and outright false claims.

I Do Not Know What You Are Talking About, Man!

If you haven’t been keeping up with this stuff, the latest gold rush in technology is in ‘generative AI,’ meaning in broad terms that an artificial intelligence “generates” content as a result of a user’s input. ChatGPT, created by OpenAI, is the largest and most popular example. Companies can use its API - a way of connecting one application to another - to power their own products.

For example, there are generative AIs that can create images, or write articles about subjects, all from a user prompt. You can type “write me 1000 words on where the finest crumpets are found” into ChatGPT, and it will produce 1000 words on that subject. Stable Diffusion does the same thing with images - you can ask it to, say, make a watercolor painting of a duck with a Remington shotgun, and it will do its best to create it.

To do so, these products are “trained” on datasets, which creates massive issues of bias and copyright infringement, because these generative AIs have to learn to construct whatever they’re generating from somewhere. Ultimately, these products are learning from other people’s work.

The problem with these AI-powered search engines is that they do not have consistent results. Both Google’s Bard AI and Bing AI had obvious errors in their first demos, in large part because they are, from what I can tell, making their best guess every time. As a result, Bing couldn’t tell the difference between a cordless and a corded vacuum cleaner, or correctly interpret financial reports - tasks that one would not expect Bing or Google to do, unless the CEOs of both companies made statements about them being able to do so and then showed the world them trying to do so in a demo for the press.

In a vacuum, these are minor errors. They are obvious problems that could be overshadowed by how cool it is that an AI can do this. But when put in the context of “this is how search engines are going to work going forward,” they are absolutely fucking ridiculous.

The point of a search engine is that you are searching for something, such as a solution to a problem or the location of something. The results - though manipulated to extract as much capital out of the user as possible - were never truly positioned as “answers.” You, the user, chose what was “right” in your results. The search engine offered ideas but not conclusions, meaning that while manipulations were possible (and quite common!), these engines were rarely prescriptive.

This is, if I had to guess, because these companies did not want the legal liability of being wrong. Search engines and social media platforms have always been terrified of the possibility of being seen as anything but objective - despite the fact that they quite literally manipulate the results of what you see algorithmically. The defense mechanism of blaming the vague idea of an “algorithm” worked well, because it shifted the blame from individuals to abstract lines of computer code.

The algorithm only suggested something was “good” or “right” or “best.” It left the actual judgment to the content it was presenting to the users. Even in the case of Facebook and Cambridge Analytica (and the latter’s Canadian evil twin, Aggregate IQ, which played a decisive role in the Brexit referendum), users were manipulated by the content offered by the algorithm, but Facebook itself did not claim ownership or endorse any of these beliefs.

Though I’m not a lawyer, it seems fairly obvious that a search engine writing its own answers becomes a threat to numerous providers’ protections under Section 230 of the Communications Decency Act. To quote The Verge:

Section 230 of the Communications Decency Act, which was passed in 1996, says an “interactive computer service” can’t be treated as the publisher or speaker of third-party content. This protects websites from lawsuits if a user posts something illegal, although there are exceptions for copyright violations, sex work-related material, and violations of federal criminal law.

Section 230 specifically protects platforms - so Google, or Bing - from being held accountable, both civilly and criminally, for users’ content. But Microsoft has already admitted that Bing may misrepresent the information that Bing’s AI finds, meaning that while content may be attributed to particular sources, it is being “created” by Bing - and that is something for which Microsoft could potentially be liable.

Again, this is something Google should know about. In 2022, an Australian court found the company liable for allegedly defamatory content published on YouTube pertaining to John Barilaro, who, prior to the case, served as the Deputy Premier of New South Wales. Without the protections of Section 230, Google was forced to pay Barilaro A$715,000 and (along with the YouTuber) was referred to a separate court for possible criminal contempt charges over allegations they put “improper pressure” on the former politician during the trial.

In fact, James Vincent of The Verge raised several troubling ramifications of generative AI based on regulation alone:

For example, will EU publishers want AI search engines to pay for the content they scrape the way Google now has to pay for news snippets? If Google’s and Microsoft’s chatbots are rewriting content rather than merely surfacing it, are they still covered by Section 230 protections in the US that protect them from being held liable for others’ content? And what about privacy laws? Italy recently banned an AI chatbot called Replika because it was collecting information on minors. ChatGPT and the rest are arguably doing the same. Or how about the “right to be forgotten”? How will Microsoft and Google ensure their bots aren’t scraping delisted sources, and how will they remove banned information already incorporated into these models?

Bard and ChatGPT are great examples of products created from a purely US-centric worldview. It’s something that makes sense in a world where the protections granted by the First Amendment are the norm. Except that very much isn’t the case everywhere. Even within the democratic and free West, societies have wildly different standards for what kinds of speech they are willing to tolerate.

Fun fact: Many European countries - notably France, Germany, Belgium, and Austria - criminalize both Holocaust denial and the veneration of the Third Reich. With the horrors of Nazism still within living memory, they are unwilling to tolerate any form of revisionism or equivocation.

What does that have to do with AI? Well, in 2016 Microsoft launched a conversational chatbot called Tay, which pioneered many of the concepts found in today’s splashy generative AI products. Within one day of going live, Tay (which was engineered to replicate the writing patterns of a 19-year-old American girl, nothing creepy about that) was loudly proclaiming the supremacy of Adolf Hitler and declaring vile antisemitic views.

Although hugely embarrassing for Microsoft, Tay was a small-scale experiment designed to test AI concepts outside the laboratory. It was never a real product, and thus it didn’t attract attention from lawmakers or regulators. But it’s easy to imagine that if Bing AI or Bard suddenly starts referring high school history students to the works of David Irving, the authorities in Berlin or Paris might take action.

To be fair to ChatGPT, the reinforcement bit of its learning model (where it adjusts results based on human feedback) is done in-house, in part by an army of Kenyan contractors earning just $2 an hour. Tay allowed anyone with a Twitter account to give feedback, and hence, it was easily gameable. But even with that caveat, there’s still the possibility that something could go terribly wrong.

You see, these generative AIs require massive datasets that they can learn from. ChatGPT uses a dataset of 45 terabytes of text, from sources including Wikipedia, a community-maintained encyclopedia, and miscellaneous “books.” Finding the content within these sets that is “right” or “good” or “factually correct” is difficult enough before you consider whether the content itself can be legally used. And I can’t imagine it’s easy for ChatGPT or any of these models to “unlearn” something once they’ve learned it - and the more a model consumes, the more difficult it will be to untangle the bias of something it has already learned.

Where’s Your Ed At is a free newsletter, but if you like my work and want to kick me a few dollars, you can do so here. I really appreciate your support.

The Rotten AI

On top of that, these products are extremely expensive to run. ChatGPT burns millions of dollars a day in computing power, while charging a per-token price for companies to plug into its models. OpenAI expects $200 million in revenue in 2023 and claims that it’ll be making a billion dollars a year in revenue in 2024, but as its AI becomes more widely adopted into other products, the cost of providing that service will grow too, and I see nothing about how this company could possibly be profitable. What’s more, OpenAI will be handing over 75% of its profits until Microsoft has recouped its $10 billion investment. To survive and perform even its most basic tasks, OpenAI must constantly consume information and burn capital, and as it grows in complexity, so will its technological demands, ethical concerns and genuine threats to society.

ChatGPT and generative AI are not themselves the problem, but as with many things in tech, the way that they are thrown into the market has made them potentially dangerous. On its own, ChatGPT is considered fallible and experimental - a thing to mess around with - but when plugged into something else silently, it now lacks the labels that are necessary to understand what it is and what it does (or can do).

As I said last week, tech companies are incentivized to grow at all costs, even if said costs involve them acting in reckless, unethical ways. A ChatGPT-powered search engine marketed by a massive tech company as a search engine is one whose answers users are likely to believe, meaning that when said search engine gives patently incorrect and “unhinged” results, we are likely to see situations of misinformation that dwarf the damage caused by Cambridge Analytica and Facebook. Microsoft rushed out Bing AI because they wanted their shareholders to see them as constantly growing, despite the fact that it is both regularly wrong and actively manipulative toward users.

Google, terrified that Microsoft had one up on them, rushed out their own “Bard AI” search engine, despite the fact that, according to Alphabet’s Chairman, the product “wasn’t really ready.” These companies were and are fully willing to do things that are reckless and dangerous to constantly market themselves as “growing” both in revenue and market share, even if the way in which they grow is both unsustainable and actively harmful to society.

What happens if Bing AI or Bard become so widespread that they become sources of information that consumers rely on? What happens if Bing AI tells somebody suffering from crippling depression that they have “not been a good user” at the wrong time? What happens if somebody relies on Bing for “advice” and that advice results in them doing something unethical or illegal? While these are problems one might have with ChatGPT itself, they become much more pronounced when the source of this information is a search engine, which is a product that hundreds of millions of people have been conditioned to trust and respect.

Finally, what safeguards do we have against Google and Microsoft using their AI products to further their own agenda, or distort the democratic process to their own advantage? Companies - especially large multi-billion-dollar tech conglomerates - are not apolitical entities. Both Google and Microsoft spend millions each year on lobbyists, all with the goal of advancing legislation they like and halting laws they don’t. If Bard and Bing AI are tools for misinformation, as I fear they are, what’s to stop their owners from - even accidentally - becoming the world’s largest misinformation engine?

And the reason that these risks are so significant is that both Microsoft and Google were desperate to show that they will never, ever stop growing. They did not slowly and methodically roll out these products with constant reminders of their fallibility - they branded and marketed them as the future, as the new way in which we request and process information. Microsoft and Google’s customers are victims of rich people playing with toys, hoping to find ways to please rich people who invest the money of other rich people.

The outcome will be genuine human suffering, caused by dressing a fallible, experimental intelligence up as an authoritative source of information. People will be inspired to harm themselves or others, all while their ideas and likenesses are counterfeited and monetized, all so that wealthy executives can tell the world - and Wall Street - that their companies are “innovative.”

Ed Zitron
