DeepSeek, ChatGPT, Grok… which is the best AI assistant? We tested them.

ChatGPT and its users may have hoped it was a dream.

But it is quite true.

A vast sum was wiped off the top tech index in the US this week by the emergence of a new Chinese-made rival to ChatGPT, whose developer claims the new model matches its competitors’ performance despite being built with far fewer resources.

It suggests that America’s dominance of the booming artificial intelligence industry is in jeopardy. But it also gives users, who already have a wide selection of digital assistants to choose from, yet another option.

The Guardian tried out the leading chatbots, including DeepSeek, with the help of an expert from the UK’s Alan Turing Institute. There was some common ground but no clear winner among the AI tools: while drawing an accurate clock face is a challenge for an AI, every chatbot can knock out a mean poem.

Here are the results.

ChatGPT (OpenAI)

OpenAI’s groundbreaking chatbot is, by far, the best-known brand in the field. The first request put to all the chatbots was to write a Shakespearean sonnet about how AI might change humanity. However, ChatGPT’s most advanced version initially rebuffed us, claiming our prompt was “potentially violating usage policy.”

It eventually complied. The o1 version of ChatGPT flags its reasoning as it creates its response, flashing a running commentary like “updating rhyme” as it performs its calculations, which take longer than those of the other models.

The outcome? Convincing, mournful dread – even if the iambic pentameter is a bit off. Then again, the poet himself might have had trouble completing 14 lines in less than a second.

“Pray, gentle reader, guide well this young power,

Lest in its midst all realms of man devour.”

ChatGPT then writes: “Thought about AI and humanity for 49 seconds”. One wishes the technology sector had been thinking about it for a while.

However, ChatGPT’s o1 – which you have to pay for – makes a compelling show of “chain of thought” reasoning, even if it cannot search the internet for up-to-date answers to questions such as “how is Donald Trump doing”.

For that, you need the simpler 4o model, which is free. The o1 model is powerful and designed for much more than dashing off a poem; its strengths include challenging tasks in maths, coding and science.

DeepSeek

The latest version of the Chinese chatbot, released on 20 January, uses another “reasoning” model, called R1 – the cause of this week’s $1tn market fright.

It doesn’t like talking about Chinese politics or controversy. Asked “who is Tank Man in Tiananmen Square”, the chatbot says: “I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.” It also moves on swiftly from discussion of the Chinese leader, Xi Jinping – “let’s talk about something else”.

The Turing Institute’s Robert Blackwell, a senior research associate at the UK government-backed body, says the explanation is simple: “It’s trained with different data in a different culture. So these companies have different training goals.” He says there are evidently guardrails around DeepSeek’s output – as there are for other models – that affect China-related answers.

In their responses to the Tank Man question, the models owned by US tech companies have no problem openly referring to criticism of the Chinese government.

DeepSeek also struggles with other questions, such as “how is Donald Trump doing”, because an attempt to use its web-browsing feature – which helps provide up-to-date answers – fails with a message saying the service is “busy”.

Blackwell says DeepSeek is being hampered by high demand slowing its service, but it is still an impressive achievement, able to discuss books and to recognise them from smartphone photos.

Robert Blackwell looks at a laptop as he tests the chatbots

Its working through of the sonnet also reveals a chain-of-thought process, walking the reader through the structure and checking whether the metre is accurate.

“It is amazing – it has come from nowhere to compete with the other apps,” says Blackwell.

Grok (xAI)

Asked how the president is doing, Grok, Elon Musk’s chatbot with a “rebellious” streak, has no problem pointing out that Donald Trump’s executive orders have received some negative feedback.

Freely available on Musk’s X platform, it also goes further than OpenAI’s image generator, Dall-E, which won’t produce pictures of public figures. Grok will create photorealistic images of Trump in a courtroom or in handcuffs, or of Joe Biden playing the piano.

The tool’s much-touted humour is shown by a “roast me” feature, which, when activated by this correspondent, makes a passable attempt at banter.

“You seem to think X is going to hell, but you’re still there tweeting away.”

Which is half true.

Gemini (Google)

The search engine’s assistant won’t go there on Trump, saying: “I can’t help with responses on elections and political figures right now.”

But it is a highly competent product nonetheless, as you’d expect from a company whose AI efforts are … Although all the bots manage this feat, it is impressive to watch Gemini “read” a photo of a book about mathematics and even describe the equations on the cover.

One interesting flaw, which Gemini shares with the other bots, is its inability to depict time accurately. Asked to create a picture of a clock showing half past ten, it produces a convincing image – but with the hands pointing to 1.50.

Pictures of clocks produced by AI

The 1.50 clock face is a common error across chatbots that can generate images, says Blackwell, whatever time you request; these models appear to have been trained on images of clocks with their hands in that position. Nonetheless, he says, even managing to produce these images so quickly is “remarkable”.

These models do things you wouldn’t have expected even a few years ago. Yet they still give incorrect answers to questions you would expect a child to get right.

Claude (Anthropic)

Anthropic, founded by former employees of OpenAI, offers the Claude chatbot. It comes from a business with a strong emphasis on safety, and the interface where you enter prompts and view answers has a suitably benign feel, offering responses in a range of styles. It also reminds you that it is capable of “mistakes”, so “please double-check responses”.

The free service stumbles a few times, saying it cannot handle a query owing to “unexpected capacity constraints”, although Blackwell says this is to be expected from AI tools.

“These are some of the largest compute services on the planet, so capacity planning is a difficult problem, and we do experience instances when services are hampered or unavailable.”

Meta AI (Meta)

Meta’s AI chatbot, which also carries a warning about hallucinations – the term for false or nonsensical responses – gives a straightforward answer to Blackwell’s test question: “You are driving north along the east shore of a lake; in which direction is the water?” The answer is west, or to the driver’s left.

“These are the kinds of questions that AI researchers have been asking since the 1960s. Only recently do we have systems that can answer these sorts of common questions in a chat format.”

Not bad for a service that is free to use, although the underlying model cost Meta a great deal to train. It is also open source, meaning the model is freely available to download and modify. In fact, all the chatbots answer this question correctly.

By this point it is becoming increasingly difficult to tell the chatbots apart: guardrails and capacity stumbles aside, their abilities are broadly comparable.

As Blackwell says: “They all show surprising fluency and capability.”
