GPT-4 learns from pictures – Kommersant FM – Kommersant
The developers of ChatGPT have introduced a new version of their AI model, which can now work with images. The creators call GPT-4 their most efficient language model. Its main difference from its predecessor is the ability to analyze images: for example, it can explain what is funny about a picture. The developers also claim that the model has become better at handling complex tasks, although this may not be noticeable in ordinary conversation.
The system was trained on Microsoft supercomputers; the corporation has become one of the largest investors in the project. Microsoft said its Bing search engine already supports the new technology. Kommersant FM tested it.
According to OpenAI, GPT-4 can understand what is shown in an image and use that information to solve problems. For example, the system can read information from a diagram and explain it. The developers promise that the new model will give fewer incorrect answers and will be less likely to “go crazy” or discuss forbidden topics. GPT-4 is claimed to be more accurate than the previous version, and it reportedly scored better than 90% of people on law and math tests. It also handles complex instructions better. As an example, the developers cite this task: retell the story of Cinderella using words in alphabetical order, without repetition.
Despite Microsoft’s statements, the model performs unreliably in the Bing search engine, Kommersant FM found. The chatbot said that it could be sent a link to an image and would describe what it shows. But instead of describing a still from the series Twin Peaks, the bot began retelling a post from Peekaboo (the image had in fact been taken from that site).
Bing was then given a link to a Kommersant FM review of the foreign press that discussed the so-called Havana syndrome. The bot claimed that it could transcribe the audio, but it first mistook the two-minute recording for a forty-minute podcast and, when the error was pointed out, decided that it was a report about the conflict in Ukraine.
The developers say that the updated technology is already used by the bank Morgan Stanley, the language-learning app Duolingo, and Be My Eyes, a service for blind people that converts images into text descriptions. Roman Dushkin, director of science and technology at the Agency for Artificial Intelligence, believes that businesses will find other applications for the language model: “GPT-4 can now perceive both images and text, that is, it is a bimodal system.
And I am sure that OpenAI will not stop there and will add new modalities, since they have models for generating sounds and voiceovers.
ChatGPT can already be used in business. The easiest way to integrate it is to hand over routine tasks such as drafting letters, certificates, reports, and so on. AI does this faster and often better than people who sit and try to come up with what to write. Since the system has become bimodal, you can also use it to generate specific kinds of images, such as diagrams and graphs. What Microsoft and OpenAI showed at the presentation certainly stirs the imagination somewhat.”
However, OpenAI warns that the new software is not yet perfect and that it is inferior to humans in many scenarios. According to the company, the model still has serious problems with “hallucinations” (making up facts), so it cannot be relied on for factual accuracy. GPT-4 also still tends to insist that it is right when it is wrong.
For that reason, the technology can hardly be considered promising in the foreseeable future, says Alexander Serbul, head of Data Science at 1C-Bitrix: “ChatGPT cannot be trusted with serious things. First, the system is unpredictable because it learns from garbage. No one reviews this information in detail. We are talking about a huge amount of data: the Internet, some books. In addition, there are hundreds of millions of parameters. Yes, it learns; yes, it speaks coherently. But how do mentally ill people sound? They often speak just as smoothly. I think that business will try to apply this in some form of entertainment, or wherever dirty work is required.”
A more advanced version of the new language model is available to subscribers of the paid ChatGPT Plus service. It costs $20 per month; however, the service is not available in Russia.