A recent study has found that GPT-3 can solve reasoning problems at a level that matches or exceeds that of undergraduate students.
People are good at solving unfamiliar problems without any special training or practice. They do this by comparing a new problem to ones they have already solved and adapting the old solution to the new situation. That process, known as analogical reasoning, has long been considered a uniquely human ability.
Researchers at the University of California, Los Angeles (UCLA) found that the GPT-3 large language model performed about as well as US college undergraduates when asked to solve reasoning problems of the kind that appear on intelligence tests and standardized tests such as the SAT.
"No matter how impressive our results, it's important to emphasize that this system has major limitations," said Taylor Webb. "It can do analogical reasoning, but it can't do things that are very easy for people, such as using tools to solve a physical task. When we gave it those sorts of problems, some of which children can solve quickly, the things it suggested were nonsensical."
To compare GPT-3's reasoning capabilities to those of humans, Webb and his team designed a series of tests inspired by Raven's Progressive Matrices, which require the subject to predict the next image in a complex arrangement of shapes. Because GPT-3 cannot see images, Webb converted the pictures into a text format the model could process. This approach also ensured that the AI had never seen the questions before.
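The study's exact text encoding is described in the researchers' paper; purely as an illustration, the hypothetical Python sketch below shows one way a matrix-style puzzle could be serialized into plain text for a language model. The function name, grid contents, and wording are invented for this example and are not the study's actual items.

```python
# Hypothetical sketch: serialize a 3x3 matrix puzzle (last cell unknown)
# into plain text a language model can read. Not the study's real format.

def matrix_to_prompt(grid):
    """Render a grid of numbers, with None as the blank cell, as a text puzzle."""
    lines = []
    for row in grid:
        cells = "  ".join("?" if cell is None else str(cell) for cell in row)
        lines.append(f"[ {cells} ]")
    return "\n".join(lines) + "\nWhat number completes the pattern?"

# Each row increments by one; the intended answer for the blank cell is 9.
puzzle = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, None],
]

print(matrix_to_prompt(puzzle))
```

Rendering the puzzle as text like this means the model must infer the underlying pattern from symbols alone, with no visual cues.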
The researchers gave the same questions to 40 first-year college students at UCLA. Surprisingly, not only did GPT-3 do about as well as the humans, but it also made similar mistakes, said UCLA psychology professor Hongjing Lu, the study's senior author. GPT-3 solved 80% of the problems correctly, well above the human participants' average score of just below 60%, though within the range of the highest human scores.
The researchers also prompted the model to solve a set of SAT "analogy" questions, which ask test-takers to select pairs of words that share the same relationship. The team believed these questions had not been published on the internet and, therefore, could not have appeared in the vast amount of data the model was trained on. Compared with the SAT scores of college applicants, the AI performed better than the average human score.
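The paper documents the actual prompt wording; as a rough illustration only, the hypothetical sketch below shows how a four-choice verbal analogy of this kind might be rendered as a text prompt. The word pairs are invented and are not items from the study or the SAT.

```python
# Hypothetical sketch: format a four-choice verbal analogy as a text prompt.
# The items below are invented examples, not questions used in the study.

def analogy_prompt(pair, choices):
    """Build a multiple-choice analogy question as plain text."""
    a, b = pair
    options = "\n".join(f"{letter}. {x} : {y}"
                        for letter, (x, y) in zip("ABCD", choices))
    return (f"{a} is to {b} as:\n{options}\n"
            "Answer with the letter of the pair that has the same relationship.")

prompt = analogy_prompt(
    ("glove", "hand"),
    [("sock", "foot"), ("hat", "scarf"), ("shoe", "lace"), ("ring", "necklace")],
)
print(prompt)
```

Phrasing the question this way turns a relational judgment into a text-completion task, which is the only interface a language model has.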
The researchers then asked GPT-3 and student volunteers to solve analogies based on short stories, prompting them to read one passage and then identify a different story that conveyed the same meaning. The model did less well than the students on those problems, although GPT-4, the latest iteration of OpenAI's technology, performed better than GPT-3.
GPT-3 has so far been unable to solve problems that require an understanding of physical space, the researchers said. "Language learning models are just trying to do word prediction, so we're surprised they can do reasoning," Lu remarked. "Over the past two years, the technology has taken a big jump from its previous incarnations."
GPT-3 might be thinking like a human, added UCLA psychology professor Keith Holyoak, a co-author of the study. On the other hand, people did not learn by ingesting the entire internet, so the training method is entirely different. The researchers would like to know whether the model reasons the way people do, or whether it is doing something brand new (a genuine artificial intelligence), which would be remarkable in its own right. It is not fully general, human-level intelligence, but it has made real progress in this particular area.
To identify the fundamental cognitive processes the AI models are using, the researchers said they would need access to the software and the data used to train it, as well as the ability to administer tests they are certain the model has never taken before. That, they said, would be the next step in deciding what AI ought to become.