Microsoft Unveils Phi-2, a Compact Yet Powerful Language Model

December 19, 2023 | Anjana Suresh

In the expansive realm of large language models dominated by giants like GPT-4 and Bard, Microsoft has introduced its latest creation, Phi-2. Boasting 2.7 billion parameters, Phi-2 is an upgrade from its predecessor, Phi-1.5. Now available through the Azure AI Studio model catalogue, Phi-2, Microsoft asserts, surpasses larger models such as Llama-2, Mistral, and Gemini Nano 2 across diverse generative AI benchmarks.

Satya Nadella first announced Phi-2 at Ignite 2023, and it officially debuted this week. Crafted by the Microsoft research team, the generative AI model is heralded for its prowess in "common sense," "language understanding," and "logical reasoning." Microsoft contends that Phi-2 even outshines models 25 times its size on specific tasks, showcasing its efficiency and capabilities.

Phi-2's training regimen involved "textbook-quality" data, encompassing synthetic datasets, general knowledge, theory of mind, and daily activities. As a transformer-based model trained with a next-word prediction objective, Phi-2 stands out for its efficiency: unlike the 90-100 days reportedly required to train GPT-4, Microsoft completed Phi-2's training in just 14 days on 96 A100 GPUs, underscoring its cost-effectiveness.

Beyond linguistic finesse, Phi-2 demonstrates proficiency in solving complex mathematical equations and physics problems. Impressively, it can even identify errors in student calculations.
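To unpack the "next-word prediction objective" mentioned above: a causal language model is trained to minimise the cross-entropy between its predicted distribution over the vocabulary and the token that actually comes next. The toy Python sketch below (a hypothetical illustration, not Microsoft's training code) computes that loss for a single sequence, given per-position vocabulary scores:

```python
import math

def next_token_loss(logits, target_ids):
    """Average next-word prediction loss for one sequence.

    logits[t] is a list of raw scores over the vocabulary for the
    token at position t+1; target_ids[t] is the index of the token
    that actually appears there.
    """
    total = 0.0
    for scores, target in zip(logits, target_ids):
        # Numerically stable log-softmax: subtract the max score
        # before exponentiating, then take the negative log-likelihood
        # of the true next token.
        m = max(scores)
        z = sum(math.exp(s - m) for s in scores)
        log_prob = (scores[target] - m) - math.log(z)
        total -= log_prob
    return total / len(target_ids)

# With uniform scores over a 2-word vocabulary, the loss is ln(2):
# the model is maximally uncertain about the next word.
print(next_token_loss([[0.0, 0.0]], [0]))
```

Training drives this quantity down across billions of tokens; the "textbook-quality" data Microsoft describes changes what the model learns, not the objective itself.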

In benchmark assessments covering commonsense reasoning, language understanding, math, and coding, Phi-2 outperforms models such as the 13B Llama-2 and the 7B Mistral. Notably, Microsoft reports that it also surpasses the 70B Llama-2 and Google's Gemini Nano 2, a 3.25B model that runs natively on the Pixel 8 Pro.

The significance of a smaller model excelling over larger counterparts lies in cost-effectiveness: it requires less power and fewer computing resources. Phi-2's agility in being tailored for specific tasks and running natively on devices further reduces output latency. Developers keen on leveraging Phi-2's capabilities can access the model through Azure AI Studio, marking a notable advancement in the landscape of language models.