Satya Nadella, CEO of Microsoft
Microsoft has unveiled Phi-3, a smaller AI model aimed at reducing costs for users who cannot afford its large language models (LLMs). The announcement came in a statement on Tuesday, which emphasized Phi-3-mini’s ability to outperform models twice its size across benchmarks evaluating language, coding, and math capabilities.
Smaller AI models like Phi-3 are designed for simpler tasks, making them more accessible to companies with limited resources. For instance, Phi-3 could be used to summarize lengthy documents and extract relevant insights from market research reports.
According to Eric Boyd, Corporate Vice President of Microsoft Azure AI Platform, Phi-3-mini is as capable as larger LLMs like GPT-3.5, just in a smaller form factor. Boyd explained that developers trained Phi-3 using a “curriculum,” inspired by how children learn from simplified books and stories.
Boyd elaborated on the training process, stating, “There aren’t enough children’s books out there, so we took a list of more than 3,000 words and asked an LLM to make ‘children’s books’ to teach Phi.”
Phi-3 builds on the lessons of its predecessors, including Phi-2, released in December. Microsoft claimed that Phi-3 performs even better than Phi-2, providing responses close to those of models ten times its size.
Smaller models like Phi-3 require less processing, which enables big tech providers to offer them at a lower cost. Microsoft anticipates that this affordability will enable more customers to leverage AI in areas where larger models were previously too expensive.
Microsoft said that using Phi-3 would be “substantially cheaper” than using larger models like GPT-4, but it did not provide specific pricing details.
Other tech giants have their own small AI models catering to simpler tasks such as document summarization and coding assistance. For example, Google’s Gemma 2B and 7B are suitable for chatbots and language-related tasks, while Anthropic’s Claude 3 Haiku specializes in summarizing dense research papers. Additionally, Meta’s recently released Llama 3 8B is geared towards chatbots and coding assistance.