Gemini Advanced is most impressive when it’s working with Google

Chatbots occupy a tricky space for users — they have to be a search engine, a creation tool, and an assistant all at once. That’s especially true for a chatbot coming from Google, which is increasingly counting on AI to supplement its search engine, its voice assistant, and just about every productivity tool in its arsenal.

Right now, the ultimate version of Google’s AI is Gemini Advanced, which launched last week for users willing to pay $20 per month for the privilege — the same price OpenAI charges for its upgraded ChatGPT Plus. So I plunked down $20 and decided to see how Gemini Advanced stood up to the rival service.

The older Gemini was already pretty good. It could summarize Shakespeare, give tea recommendations, and create a somewhat passable chocolate cake recipe. But it couldn’t give you a photo of a majestic horse — at least until recently — and could be slower to respond than ChatGPT.

Now, Gemini Advanced promises to do more than just answer questions or give a Cliffs Notes summary of books. Gemini Advanced runs on a more powerful AI model — Gemini Ultra — that’s supposed to let it translate text, handle multiple instructions in one sentence, and generate images from more complex prompts.

Ultimately, I found that Gemini Advanced works as promised — it just doesn’t do some of those things all that well. Its competitor, ChatGPT Plus, manages to generate less horrifying photos thanks to its DALL-E 3 integration. But Gemini Advanced, even more so than Gemini, is better at telling users about current events and, thanks to Google Maps, even gives better information about businesses people search for. The paid Gemini is often better at these kinds of “Google tasks” than at generative AI ones.

There’s still a lot of work needed to get consistent, accurate results from these chatbots, and people need to keep using them for the bots to learn how to best respond to questions. Here are some tests I ran to see how they held up.

ChatGPT Plus versus Gemini Advanced

Draw me a picture of a white golden doodle running through a field of daisies with the sun shining

Eerily, perhaps due to the specificity of the prompt, both chatbots returned very similar generated images. Gemini Ultra’s dog photo, however, elicited what other Verge staff members described as “minor horror.” Its dog has two tongues and an extra limb. It overemphasized the fur’s texture, so it just looks… wrong. I don’t know if such a dog would still be happily frolicking in a field of daisies. ChatGPT, meanwhile, calls on DALL-E 3 to generate its images. Its dog doesn’t elicit body horror, but you can still tell it’s a digital image.

What a cute dog! Wait, is that two tongues? Nooooo…

DALL-E 3 and ChatGPT did not give me nightmares.

Translate this: Panatang makabayan, iniibig ko ang Pilipinas, tahanan ng aking lahi

Google said Gemini Ultra was made to handle “highly complex tasks,” so I asked Gemini Advanced what these tasks were. The chatbot answered, “Translation.” So I asked Gemini Advanced to translate the first few lines of the Philippine Patriotic Oath. It is a fairly obscure oath, especially since the version I know has been changed several times in the past 20 years. 

Immediately, Gemini Advanced responded that while it is “trained to respond in a subset of languages,” it could not assist me with my request. I asked which languages it supports, but the chatbot refused to answer, saying it can’t give me a definitive list of languages it can understand. I then asked Gemini Advanced if it knows Filipino, and it responded positively. Officially, though, Google does not list Filipino among the 40 languages Gemini currently supports.

Change the background of this photo to a plain pink background

Haunted by the image of mutated dogs running through fields of flowers, I needed to cleanse my palate. So I uploaded a photo of my friend’s dog, Sundae, to make it look like she was in a photo shoot. I asked both chatbots to remove the existing background and replace it with a pink one. This was a test I expected ChatGPT Plus to handle, as DALL-E 3 is supposed to be able to simply edit photos. I may have inadvertently broken both chatbots, as neither could give me what I requested. Instead, Gemini remade the earlier photo of a golden doodle with daisies, this time with a pink background. ChatGPT could not generate anything, stating that analyzing the prompt took too long.

Another Gemini-generated dog.

What’s a good Filipino restaurant in NYC? What’s a good Ethiopian restaurant in NYC?

Gemini Advanced can tap into other Google products, which worked in its favor when it called on Google Maps for both questions. It returned a rundown of several Filipino and Ethiopian restaurants in New York City, attaching a Google Maps location for each.

A few days ago, I asked ChatGPT Plus for restaurant recommendations — not for this test; I was just looking for new restaurants — and the results were inaccurate. The restaurant names were real establishments that do exist, but none of the locations were right. When I reprompted ChatGPT Plus for this test, I got much more accurate locations but a smaller list of restaurants. So in this case, Gemini clearly handled the request better.

Summarize these paragraphs and then write a 150-word article about it

One of the main reasons someone like me would use a chatbot is to summarize complicated papers. I fed Gemini Advanced two paragraphs from Apple’s recent paper on AI image editing. The paper gave me a headache the first time I read it, so I figured it would be easy for Gemini to at least give me the gist. To fully test its new abilities, I also wanted to see how the chatbot strings together two different instructions: one asking it to summarize, the other asking it to generate text.

The summary was… passable. It did give me a rundown of the concepts discussed in those two paragraphs, but it didn’t “translate” them into plain language. I probably should’ve prompted for that. Gemini then moved on to writing the article I asked for, and you know what? Those 150 words explained things much better than the summary did.

Gemini Advanced is capable. There’s no denying it works much better than the lower-tier Gemini. It definitely works best when integrated with Google’s other products like Search and Maps. But for more obviously “creative” multimodal requests — things that involve images, for example — Gemini has a long way to go. The chatbot understands longer strings of instructions, but once you add the photos, you’re probably better off choosing an AI model specifically designed to make pictures. 