How has Artificial Intelligence progressed?

2024-09-07 15:16:55, Tech CNA

During the summer of 1956, a small but famous group of scholars gathered at Dartmouth College in New Hampshire.

The group included Claude Shannon, the creator of information theory, and Herb Simon, the only person to have won both the Nobel Memorial Prize in Economic Sciences, awarded by the Royal Swedish Academy of Sciences, and the Turing Award, awarded by the Association for Computing Machinery.

They were invited by a young researcher, John McCarthy, who wanted to discuss "how to make machines use language, form abstractions and concepts" and "solve kinds of problems now reserved for humans."

This was the first academic meeting devoted to what McCarthy called "artificial intelligence". Over the next 60 years or so, the field's advances would fall short of its researchers' ambitions.

The meeting at Dartmouth College did not mark the beginning of scientific research into machines that could think like humans. Alan Turing, after whom the Turing Award is named, had already pondered the idea; so had John von Neumann. By 1956, there were already a number of approaches to the problem.

Historians think that one of the reasons why McCarthy coined the term "artificial intelligence" for his project was that the name was all-encompassing, leaving open the question of what the best approach might be.

Some researchers liked systems that combined facts with axioms about the world, such as those in geometry and symbolic logic, in order to derive the right answers.

Others preferred building systems where the probability of something depended on the updated probabilities of many other things.

In the following decades, there was much intellectual debate on the topic, but by the 1980s a general agreement had been reached on the way forward: "expert systems" that used symbolic logic to capture and apply the best of human knowledge.

The Japanese government, in particular, gave financial backing to such systems and to the hardware they might need.

But for the most part, these systems couldn't handle the messiness of the real world. By the late 1980s, the reputation of artificial intelligence had declined. Scholars began to avoid the term.

How today's boom was born

Today's boom was born precisely from the persistent few who remained. As a deeper understanding of how brain cells work emerged in the 1940s, scientists began to wonder whether machines could be wired in the same way as neurons.

In a human brain, neurons are connected in a way that allows activity in one neuron to stimulate or suppress activity in another.

This makes a neuron dependent on what other neurons connected to it are doing.

The first attempt to recreate this model in the laboratory (by Marvin Minsky, one of the participants at the Dartmouth meeting) used hardware to mimic networks of neurons. Since then, layers of interconnected neurons have been simulated in software.

These artificial neural networks are not programmed using explicit rules. Instead, they "learn" by being exposed to many examples.

During this training, the strength of the connections between neurons is adjusted repeatedly so that, eventually, a given input produces an appropriate output.
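As a rough illustration of this kind of training, the toy sketch below teaches a single artificial neuron the logical OR function by nudging its connection strengths whenever it gives a wrong answer. The function names, the learning rate and the choice of task are assumptions made for this example; real networks use many neurons and more elaborate update rules.

```python
# Toy illustration: one artificial neuron learns the logical OR function.
# All names and numbers here are illustrative assumptions.

def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else stay silent (0)."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train(examples, epochs=20, learning_rate=0.1):
    """Repeatedly adjust connection strengths until outputs match the examples."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            # Strengthen or weaken each connection in proportion to its input.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Each example pairs an input with the output the neuron should produce.
OR_EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train(OR_EXAMPLES)
outputs = [predict(weights, bias, inputs) for inputs, _ in OR_EXAMPLES]
print(outputs)
```

No rule for OR is ever written down; the neuron arrives at suitable connection strengths purely from the four examples.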

Minsky himself abandoned the idea, but others took it forward. By the early 1990s, neural networks had been trained to help sort mail by recognizing handwritten numbers.

Researchers thought that adding more layers of neurons would allow more sophisticated achievements. The downside was that the systems ran much more slowly.

A new type of computer hardware provided a way out of the problem. Its potential was clearly demonstrated in 2009, when researchers at Stanford University made an artificial neural network run 70 times faster using a gaming computer in their dorm room.

This was possible because, in addition to the "central processing unit" (CPU) found in all computers, this machine also had a "graphics processing unit" (GPU) for creating game worlds on the screen. The GPU happened to be designed in a way that suited running neural network code.

Coupling this hardware with more efficient training algorithms meant that networks with millions of connections could be trained in a reasonable amount of time.

Neural networks could also handle larger inputs and, more importantly, they could be given more layers. These "deeper" networks turned out to be much more capable.

"Deep Learning"

The power of this new approach, which became known as "deep learning", was made evident in the ImageNet competition of 2012.

Image recognition systems were provided with a database of more than one million image files. For any given word, such as "dog" or "cat," the database contained several hundred photos.

Using these examples, image recognition systems were trained to "translate" inputs in the form of images into outputs in the form of one-word descriptions.

Subsequently, the systems began to produce such descriptions even when fed with previously unseen images.

In 2012, a team led by Geoff Hinton, then at the University of Toronto, used "deep learning" to achieve an accuracy of 85%. This was a huge breakthrough.

By 2015, almost all researchers in the field of image recognition were using "deep learning" and the accuracy had reached 96%, better than the average human score.

The method was also being applied to a number of other "problems...reserved for humans", most of which came down to mapping one kind of thing onto another: for example, speech recognition, face recognition and translation.

In all of these applications, the large amount of data that could be obtained over the Internet was vital to achieving success. Moreover, the possibility of large markets was also opened up, thanks to the large number of people using the Internet.

And the bigger (that is, the deeper) the networks got and the more training data they were given, the more their performance improved.

"Deep learning" was quickly applied to all kinds of new products and services.

Voice-controlled devices such as Amazon's Alexa appeared. Online transcription services became genuinely useful. Internet browsers offered automatic translation. Artificial intelligence began to seem useful and became a part of everyday life.

Qualitative changes

In 2017, a qualitative change was added to the quantitative gains made possible by more computing power and more data: a new way of arranging the connections between neurons, called a transformer.

Transformers enable neural networks to keep track of patterns in their input, even if the elements of the pattern are far apart. This allows them to pay "attention" to particular features in the data.
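A very rough sketch of how such attention can be computed is shown below, assuming the commonly described "scaled dot-product" form. The two-number vectors are made up for illustration, and `attend` and `softmax` are names chosen for this example rather than part of any particular library. The last position in a five-item input ends up weighting the first position most heavily, regardless of the distance between them.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Score every position by how well its key matches the query.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    width = len(values[0])
    # Blend the values, giving more say to the better-matching positions.
    blended = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(width)]
    return weights, blended

# Five input positions with made-up two-number keys and values.
keys = [[4.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.0]  # hypothetical query issued by the last position

weights, blended = attend(query, keys, values)
most_attended = weights.index(max(weights))
print(most_attended)
```

In a real transformer the queries, keys and values are themselves learned during training; here they are fixed by hand only to show the weighting step.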

Transformers made networks better understand context, which made them suitable for a new technique called "self-supervised learning."

Under this technique, put simply, some words are randomly hidden during training and the model teaches itself to fill each gap with the most likely candidate.

Because the training data does not need to be pre-labeled, such models can be trained using billions of words from raw text taken from the Internet.
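The data-preparation step described here can be sketched in a few lines, under the assumption that text is split on spaces and one word per sentence is hidden. The `[MASK]` placeholder and the function name are hypothetical choices for this example. The point is that the "label" (the hidden word) comes from the raw text itself, so no human labeling is needed.

```python
import random

MASK = "[MASK]"  # placeholder token; the exact name is an assumption

def make_masked_example(sentence, rng):
    """Hide one randomly chosen word and return (masked text, hidden word)."""
    words = sentence.split()
    position = rng.randrange(len(words))
    hidden_word = words[position]
    masked_words = words.copy()
    masked_words[position] = MASK
    return " ".join(masked_words), hidden_word

rng = random.Random(0)  # fixed seed so the example is repeatable
sentence = "the cat sat on the mat"
masked_text, answer = make_masked_example(sentence, rng)
print(masked_text, "->", answer)
```

A model would then be trained to predict `answer` from `masked_text`, and the same procedure can be repeated across billions of words.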

The OpenAI Revolution

Transformer-based large language models (LLMs) began to attract wider attention in 2019, when a model called GPT-2 was released by the artificial intelligence startup OpenAI.

These large language models were capable of exhibiting behaviors for which they were not specifically trained.

Absorbing large amounts of text made them extremely adept not only at linguistic tasks such as summarizing or translating, but also at things like simple arithmetic and writing software, skills that were only implicit in their training data. Unfortunately, along with these new skills, the models also picked up the social prejudices contained in that data.

In November 2022, a larger model created by OpenAI, called GPT-3.5, was presented to the public in the form of a chatbot. Anyone with internet access could make a request and receive a response.

No consumer product had ever become so popular so quickly. Within weeks, ChatGPT was creating everything from student essays to computer code. Artificial intelligence had taken another big step forward.

Where the first wave of AI-powered products was based on recognition, this second wave is based on generation.

Deep learning models such as Stable Diffusion and DALL-E use a technique called diffusion to turn text prompts into images. Other models can produce surprisingly realistic video, speech or music.

The progress is not only technological. The nature of what these systems do has changed too. ChatGPT and rivals such as Gemini (from Google) and Claude (from Anthropic, founded by researchers who previously worked at OpenAI) produce their outputs from calculations, just as other systems trained with a "deep learning" approach do.

But they respond to requests by generating something new, which makes them feel very different from software that recognizes faces, takes dictation, or translates menus.

These new systems do seem to "use language" and "form abstractions", as McCarthy had hoped. / Monitor.al