ForeWords

Knowledge also creates suffering. That is just for completeness, and as background for the question mark.

This blog is a personal question-and-answer journey, with the aim of bringing into {and out of} reality what is also there: *me.

Knowledge creates joy? How? By applying today's knowledge instead of relying on yesterday and its "knowledge."

Enjoy the read.

Friday, May 5, 2023

Maschinen lernen schneller als Menschen. Warum?

Poetry

Title: How does a machine program learn?

The New York Times, Technology
For subscribers, May 5, 2023

How generative A.I. really works

Hello! We’re back with a bonus edition of On Tech: A.I., adding to our five-part series. Over the next few weeks and months, we’ll be bringing you highlights of New York Times A.I. coverage, especially pieces that help you understand the underlying technology, and how to use it yourself.

My colleague Aatish Bhatia wrote a fascinating article that reveals the inner workings of generative artificial intelligence software like ChatGPT.

If you remember what Cade Metz and Kevin Roose showed you earlier, these large language models are notoriously opaque, but the basic idea behind them is surprisingly simple: They are trained by going through mountains of text, repeatedly guessing the next few letters and then grading themselves against the real thing.
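That guess-and-grade setup can be sketched concretely. Here is a minimal, hypothetical illustration of how raw text becomes next-character training examples (the prompt text and context length are illustrative, not taken from the article's actual code):

```python
# Turn raw text into (context, next character) training pairs:
# the model sees `block_size` characters and must guess the next one.
text = "You must decide for yourself, said Elizabeth."
block_size = 8  # how many characters of context the model sees

examples = []
for i in range(len(text) - block_size):
    context = text[i : i + block_size]
    target = text[i + block_size]  # the character the model must guess
    examples.append((context, target))

print(examples[0])  # first context window and its true next character
```

During training, the model's guess for each context is compared against the true `target`, which is how it "grades itself against the real thing."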

To show you what this process looks like, Aatish trained six tiny language models on some bodies of text: the complete works of Jane Austen and Shakespeare, plus the Federalist Papers, transcripts of the TV show “Star Trek: The Next Generation,” “Moby Dick,” and the Harry Potter novels.

State-of-the-art models like GPT-4 from OpenAI are trained on hundreds of billions of words, for weeks and months. BabyGPT — the model created by Aatish for his article — is ant-size in comparison.

It was trained for about an hour on a consumer-grade laptop on text sources of up to a million words. But that stripped-down approach makes it easier to peek under the hood and see how large language models really operate.

Let’s walk through the process, starting with the Jane Austen bot, which we’ll prompt with the text, “‘You must decide for yourself,’ said Elizabeth.”

Before training: Gibberish

Initially, BabyGPT’s guesses are completely random and include lots of special characters. BabyGPT hasn’t yet learned which letters are typically used in English, or that words even exist.

This is how language models usually start off: They guess randomly and produce gibberish. But they learn from their mistakes, and over time, their guesses get better. Over many, many rounds of training, language models can learn to write by figuring out statistical patterns that piece words together into sentences and paragraphs.
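Before any training, “guessing randomly” just means drawing each next character uniformly from the vocabulary. A toy sketch of why untrained output looks like gibberish (the character set here is illustrative):

```python
import random

# An untrained model effectively picks each next character uniformly
# at random from its vocabulary, so the output is gibberish.
vocab = list("abcdefghijklmnopqrstuvwxyz .,;:!?'\"$&@#")
random.seed(0)  # fixed seed so the example is reproducible
gibberish = "".join(random.choice(vocab) for _ in range(40))
print(gibberish)
```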

250 rounds: English letters

After about 250 iterations, or 30 seconds of processing on a modern laptop, BabyGPT has learned its ABC’s and is starting to babble:

In particular, our model has learned which letters, like “E,” are most frequently used in the text. It has also learned some small words (I, to, the, you, and so on), and it is inventing its own words, like “athok,” “trnglad” and “hastamt.”
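Learning which letters are most frequent is, at its simplest, just counting. A hedged illustration, with a single Austen sentence standing in for the full training corpus:

```python
from collections import Counter

# Count how often each letter appears in a small sample of text.
sample = ("It is a truth universally acknowledged, that a single man in "
          "possession of a good fortune, must be in want of a wife.")
counts = Counter(ch for ch in sample.lower() if ch.isalpha())
print(counts.most_common(5))  # the five most frequent letters
```

A real model never counts letters explicitly like this, but its early guesses end up reflecting the same frequency statistics.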

Obviously, these guesses aren’t great. But BabyGPT keeps a score of exactly how bad its guesses are.

Every round of training, it goes through the original text and compares its guesses with what actually comes next. It then calculates a score, known as the “loss,” which measures the difference between its predictions and the actual text. BabyGPT’s goal is to try to reduce this loss and improve its guesses.
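The “loss” described above is typically the average negative log-probability the model assigned to each true next character (cross-entropy). A toy calculation with made-up probabilities:

```python
import math

# Suppose the model assigned these probabilities to the characters
# that actually came next in the text (illustrative numbers only).
probs_assigned = [0.02, 0.5, 0.1, 0.9]

# Cross-entropy loss: average negative log-probability. Confident,
# correct guesses (p close to 1) contribute little; bad ones a lot.
loss = -sum(math.log(p) for p in probs_assigned) / len(probs_assigned)
print(round(loss, 3))
```

Driving this number down across millions of predictions is the entire objective of training.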

500 rounds: Small words

After a minute on a laptop, it can spell a few small words:

It’s also starting to learn some basic grammar, like where to place periods and commas. But it makes plenty of mistakes.

5,000 rounds: Bigger words

Ten minutes in, BabyGPT’s vocabulary has grown:

BabyGPT now makes fewer spelling mistakes. It still invents some longer words, but not as often. It’s also starting to learn some names that occur frequently in the text. Its grammar is improving, too.

Under the hood, the model is a large collection of internal numbers, its parameters. Every round of training, an algorithm adjusts these numbers to try to improve its guesses, using a mathematical technique known as backpropagation. Tuning these internal numbers to improve predictions is how a neural network “learns.”
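The adjust-to-reduce-loss step can be illustrated with a single parameter. This toy gradient-descent loop is not backpropagation through a real network, just the same downhill idea in miniature:

```python
# One parameter p is nudged downhill on a simple loss by its gradient.
# Real networks do this for millions of parameters at once, with the
# gradients computed by backpropagation.
p = 5.0        # a single model parameter, starting far from ideal
target = 2.0   # the value that would make the loss zero
lr = 0.1       # learning rate: how big each adjustment is

for step in range(100):
    loss = (p - target) ** 2   # how wrong we currently are
    grad = 2 * (p - target)    # derivative of the loss w.r.t. p
    p -= lr * grad             # gradient descent update

print(round(p, 4))  # p has converged very close to 2.0
```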

30,000 rounds: Full sentences

An hour into its training, BabyGPT is learning to write in full sentences. Just an hour ago, it didn’t even know that words existed!

The words still don’t make sense, but they definitely look more like English.

BabyGPT doesn’t copy and paste sentences verbatim; it stitches new ones together, letter by letter, based on statistical patterns that it has learned from the data. The neural network generates probabilities, rather than actual letters or words, which is why you can get a different answer every time you generate a new response.
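Sampling from the model's probabilities, rather than always taking the single most likely character, is what makes each generation different. An illustrative sketch with made-up probabilities:

```python
import random

# A language model outputs a probability for every possible next
# character; generation samples from that distribution, so repeated
# runs can produce different text.
chars = ["e", "t", "a", "q"]
probs = [0.45, 0.30, 0.20, 0.05]  # illustrative, not from a real model

next_char = random.choices(chars, weights=probs, k=1)[0]
print(next_char)  # usually "e", but sometimes a rarer character
```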

Diminishing returns

Because the training data is relatively small, and we used a laptop rather than a huge array of computers, we quickly hit a point at which BabyGPT isn’t going to get much smarter.

BabyGPT still has a long way to go before its sentences become coherent or useful. It can’t answer a question or debug your code. It’s mostly just fun to watch its guesses improve.

But it’s also instructive. In just an hour of training on a laptop, a language model can go from generating random characters to a very crude approximation of language. Larger language models use more data and computing power to mimic language more convincingly.

The week in A.I. news

BabyGPT used an algorithm developed by Andrej Karpathy, a prominent A.I. researcher who recently joined OpenAI, the company behind ChatGPT. Karpathy was first exposed to artificial intelligence as a student at the University of Toronto, where he took a class taught by Geoffrey Hinton, known as “the Godfather of A.I.”

Earlier this week, Cade scored an exclusive scoop: Hinton quit his job at Google, where he had worked for more than a decade, so he can freely speak out about the risks of A.I.

“It is hard to see how you can prevent the bad actors from using it for bad things,” Hinton said.


