Following the recent launch of its Gemini models, Google has introduced Gemma, a new family of lightweight open-weight models. Comprising Gemma 2B and Gemma 7B, these models, described as "inspired by Gemini," are available for both commercial and research use. While Google has not yet released a detailed comparison against similar models from Meta and Mistral, the company touts Gemma as "state-of-the-art." Google notes that these are dense decoder-only models, sharing the architecture of the earlier Gemini and PaLM models, and promises benchmarks on Hugging Face's leaderboard later today.
Developers eager to explore Gemma can use ready-made Colab and Kaggle notebooks, along with integrations for Hugging Face, MaxText, and Nvidia's NeMo. Pre-trained and fine-tuned variants are designed to be deployable across a range of environments. Despite the "open models" label, it's important to note that Gemma models are not open source. In a press briefing, Google's Jeanine Banks emphasized the company's commitment to openness, clarifying that while access to the models is broad, redistribution and ownership of fine-tuned variants remain subject to the models' specific terms of use.
Tris Warkentin, director of product management at Google DeepMind, emphasized how much generation quality has improved over the past year, highlighting that developers can run inference and fine-tuning on local machines or on Google Cloud with Cloud TPUs. With similar offerings available from competitors, Gemma will need to prove itself in real-world use. Alongside the new models, Google is introducing a responsible generative AI toolkit that offers guidance and tooling for building safer AI applications with Gemma, as well as a debugging tool to improve the developer experience.