Monday, December 11, 2023

Google’s Gemini

Casey Newton:

Google this morning announced the rollout of Gemini, its largest and most capable large language model to date. Starting today, the company’s Bard chatbot will be powered by a version of Gemini, and will be available in English in more than 170 countries and territories. Developers and enterprise customers will get access to Gemini via API next week, with a more advanced version set to become available next year.

How good is Gemini? Google says the performance of its most capable model “exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in LLM research and development.” Gemini also scored 90.0% on a test known as “Massive Multitask Language Understanding,” or MMLU, which assesses capabilities across 57 subjects including math, physics, history and medicine. It is the first LLM to perform better than human experts on the test, Google said.

Sundar Pichai (Hacker News):

Our first version, Gemini 1.0, is optimized for different sizes: Ultra, Pro and Nano. These are the first models of the Gemini era and the first realization of the vision we had when we formed Google DeepMind earlier this year. This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company.

Demis Hassabis:

This promise of a world responsibly empowered by AI continues to drive our work at Google DeepMind. For a long time, we’ve wanted to build a new generation of AI models, inspired by the way people understand and interact with the world. AI that feels less like a smart piece of software and more like something useful and intuitive — an expert helper or assistant.

Today, we’re a step closer to this vision as we introduce Gemini, the most capable and general model we’ve ever built.

Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video.

John Gruber:

Loosely speaking, Gemini Ultra is competing with GPT 4, and Gemini Pro with GPT 3.5. Nano, the on-device model, will first appear on Pixel 8 Pro phones.


It seems like the whole demo ought be considered fraudulent — a fake. What’s wrong with Google as a company that they repeatedly try to pass off concept videos as legitimate demos of actual products?

Nick Heer:

If you read the disclaimer at the beginning of the demo in its most literal sense, Google did not lie, but that does not mean it was fully honest. I do not get the need for trickery. The real story would have undoubtedly come to light, if not from an unnamed Google spokesperson, and it undermines how impressive this demo is. And it is remarkable, so why not make the true version part of the story? I do not think I would have found it any less amazing if I had seen a real-time demonstration of the still frame of the video being processed by Gemini with its actual output, and then I saw this simplified version.

