Microsoft Unveils Phi-3 Mini: A Lightweight Language Model That Rivals Larger Models on Modern Smartphones

Phi-3 Mini: Microsoft’s Small Language Model for Smartphone Use

Microsoft has announced the launch of a new small language model, Phi-3 mini, designed to deliver performance similar to OpenAI’s GPT-3.5 on modern smartphones. This iteration of Microsoft’s lighter language model was trained on 3.3 trillion tokens from “larger and more advanced” data sets than its predecessor, Phi-2, which was trained on 1.4 trillion tokens.

Phi-3 mini consists of 3.8 billion parameters, making it suitable for modern smartphones: when quantized to 4 bits it occupies only around 1.8GB of memory, according to a technical report published on arXiv.org. Researchers tested Phi-3 mini on an iPhone 14 with an A16 Bionic chip and found that it runs natively and completely offline, generating more than 12 tokens per second. The overall performance of this model “rivals” that of larger models like Mixtral 8x7B and GPT-3.5.
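The reported ~1.8GB footprint follows from simple arithmetic: 3.8 billion parameters at 4 bits each is about 1.9 billion bytes (roughly 1.8 GiB). A minimal back-of-envelope sketch, with an illustrative helper name of our own choosing:

```python
def quantized_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight-storage footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9

# Phi-3 mini: 3.8 billion parameters quantized to 4 bits (0.5 bytes each)
print(round(quantized_footprint_gb(3.8e9, 4), 2))  # prints 1.9
```

Note this covers only the weights; the runtime also needs memory for activations and the key-value cache, so the on-device total is somewhat higher.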

The model uses a transformer decoder architecture with a default context length of 4K tokens and is built on a block structure similar to Meta’s Llama 2, which benefits the open-source community because packages developed for Llama 2 can be applied to it directly. Additionally, Phi-3 mini supports a conversational chat format and has been aligned for robustness and safety in line with Microsoft’s values.
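To make the "conversational chat format" concrete, the sketch below builds a single-turn prompt using the role markers reported for the instruct variant of the model. This is a hedged illustration: the helper name is ours, and the exact special tokens should be taken from the model's own documentation rather than hard-coded in production.

```python
def build_phi3_prompt(user_message: str) -> str:
    # Assumed single-turn template: each turn is wrapped in a role marker,
    # <|end|> closes the turn, and <|assistant|> cues the model's reply.
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"

print(build_phi3_prompt("Summarize the Phi-3 mini announcement in one sentence."))
```

In practice, libraries that already support Llama 2-style models can apply such a template automatically before tokenization.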

Microsoft has also trained two other models from the same family: Phi-3 small with 7 billion parameters and Phi-3 medium with 14 billion parameters, both trained on 4.8 trillion tokens.

In conclusion, Phi-3 mini demonstrates Microsoft’s commitment to innovation in small language models: it rivals far larger models like Mixtral 8x7B and GPT-3.5 while running entirely on a modern smartphone, and its Llama 2-like block structure makes it immediately usable by the open-source community.