Date: 2024-04-19
Meta Unveils Llama 3: Here’s All You Need to Know and Consider about the LLM

With the introduction of Llama 3 and Meta AI, the tech giant has begun in earnest its battle against leading AI players such as OpenAI, the maker of ChatGPT.

What’s the news: Meta released two models of its latest large language model (LLM), Llama 3, on April 18, 2024. As per a company blog post, the models will soon be available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm. The company also announced Meta AI, an AI assistant built on Llama 3, integrated into Meta’s platforms, and hailed as Meta’s latest AI champion against OpenAI.

“In the coming months, we expect to introduce new capabilities, longer context windows, additional model sizes, and enhanced performance, and we’ll share the Llama 3 research paper,” said Meta in its blog.

What can Llama 3 do? As per the information on Llama 3’s page, the new LLM “excels at language nuances, contextual understanding, and complex tasks like translation and dialogue generation.” Meta says the model handles multi-step tasks with “significantly lower false refusal rates,” improved response alignment, and greater diversity in model answers. It also claims enhanced capabilities in reasoning, code generation, and instruction following.

In terms of model performance, Meta developed a new high-quality human evaluation set which contains 1,800 prompts covering 12 key use cases: asking for advice, brainstorming, classification, closed question answering, coding, creative writing, extraction, inhabiting a character/persona, open question answering, reasoning, rewriting, and summarization. It then compared Llama 3’s performance across these categories against Claude Sonnet, Mistral Medium, and GPT-3.5.

“Preference rankings by human annotators based on this evaluation set highlight the strong performance of our 70B instruction-following model compared to competing models of comparable size in real-world scenarios,” said Meta.

Data used to train Llama 3: As per the blog post, Llama 3 uses a tokenizer with a vocabulary of 128K tokens that Meta says encodes language much more efficiently. To improve the inference efficiency of Llama 3 models, Meta said it adopted grouped query attention (GQA) across both the 8B and 70B sizes.
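For readers unfamiliar with grouped query attention, the idea is that several query heads share one key/value head, which shrinks the key/value cache a model must keep in memory during inference. The sketch below is only an illustration of that idea, not Meta's implementation: the function names are hypothetical, and the head counts, head dimension, and layer count are illustrative assumptions that do not come from the article.

```python
def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Map a query head index to the key/value head it shares under GQA."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly into KV heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


def kv_cache_entries(num_kv_heads, head_dim, seq_len, num_layers):
    """Count cached numbers: keys plus values, per head, per position, per layer."""
    return 2 * num_kv_heads * head_dim * seq_len * num_layers


# Illustrative configuration (assumed, not taken from the article).
NUM_QUERY_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
NUM_LAYERS = 32
SEQ_LEN = 8192  # the training sequence length quoted by Meta

# Query heads 0-3 share KV head 0, heads 4-7 share KV head 1, and so on.
mapping = [
    kv_head_for_query_head(h, NUM_QUERY_HEADS, NUM_KV_HEADS)
    for h in range(NUM_QUERY_HEADS)
]

# Standard multi-head attention keeps one KV head per query head.
full_cache = kv_cache_entries(NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN, NUM_LAYERS)
gqa_cache = kv_cache_entries(NUM_KV_HEADS, HEAD_DIM, SEQ_LEN, NUM_LAYERS)

print(mapping[:8])              # [0, 0, 0, 0, 1, 1, 1, 1]
print(full_cache // gqa_cache)  # 4
```

Under these assumed numbers, the cache is four times smaller than it would be if every query head had its own key/value head, which is the kind of inference saving GQA is meant to deliver.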

“We trained the models on sequences of 8,192 tokens, using a mask to ensure self-attention does not cross document boundaries,” said Meta.
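The masking Meta describes can be pictured with a small example. When several documents are packed into one training sequence, each token is allowed to attend only to earlier tokens from the same document. The following is a minimal sketch under that reading of the quote; the function name and the tiny five-token sequence are hypothetical, and it assumes every document in a sequence carries a distinct ID.

```python
def document_causal_mask(doc_ids):
    """Return mask[i][j] == True when position i may attend to position j.

    A position may look only at itself and earlier positions (causal),
    and only at positions from the same document.
    """
    n = len(doc_ids)
    return [
        [j <= i and doc_ids[i] == doc_ids[j] for j in range(n)]
        for i in range(n)
    ]


# Two documents packed into one sequence: tokens 0-2 belong to
# document 0, tokens 3-4 belong to document 1.
doc_ids = [0, 0, 0, 1, 1]
mask = document_causal_mask(doc_ids)

for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x....
# xx...
# xxx..
# ...x.
# ...xx
```

The fourth token starts a new document, so its row contains no marks under the first three columns: nothing from the previous document leaks into its context.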

The model is also pretrained on over 15T tokens collected from publicly available sources. Llama 3’s training dataset is described as seven times larger than that of Llama 2 and as including four times more code. To prepare for multilingual use cases, over 5 percent of the Llama 3 pretraining dataset consists of high-quality non-English data that covers over 30 languages. Even so, the company cautioned that it did not expect the same level of performance in these languages as in English.

“Thanks to improvements in pretraining and post-training, our pretrained and instruction-fine-tuned models are the best models existing today at the 8B and 70B parameter scale,” said Meta.

Llama 3 helps build Meta AI: Along with Llama 3, the tech giant also announced Meta AI, which is built using the new LLM. The AI assistant will be available on Facebook, Instagram, WhatsApp, and Messenger in the US, Australia, Canada, Ghana, Jamaica, Malawi, New Zealand, Nigeria, Pakistan, Singapore, South Africa, Uganda, Zambia, and Zimbabwe.

Meta AI speeds up image generation: Meta AI can be used to create images from text. The beta version of this feature is available on WhatsApp and the Meta AI web experience in the US. As per the blog post, users will see an image appear as they start typing — and it’ll change with every few letters typed. The app can also animate preferred images or “iterate on it in a new style or even turn it into a GIF to share with friends.”

“Meta AI will even provide helpful prompts with ideas to change the image, so you can keep iterating from that initial starting point,” said Meta.

Easier interaction with feed: Meta AI can also help access real-time information from across the web without having to switch between apps. For example, if you are scrolling through your Facebook feed and come across a post about flower-viewing in Japan, you can ask Meta AI about the best time or place in Japan to witness such events.

On WhatsApp, users can ask Meta AI a question right from the search feature at the top of their chats to get information about sports, entertainment, and current events via search providers.

Is Llama really open-sourced?

Right from the headline of its announcement, Meta has talked about Llama 3 being an open-source model. In its blog it even said, “We are embracing the open-source ethos of releasing early and often to enable the community to get access to these models while they are still in development.” However, Meta’s definition of open source has raised some debate in the past. For example, in July 2023, the Open Source Initiative (OSI), a public benefit corporation, took issue with Meta’s claim that Llama 2 was open source, pointing out that the company had made “resources available to some users under some conditions.”

To be fair, Lea Gimpel from the Digital Public Goods Alliance pointed out during Carnegie India’s Global Technology Summit in December 2023 that “there is no agreed-upon open source definition for AI.” Giving the example of Meta’s Llama, she said the model wouldn’t qualify as open source under the open source software definition because restrictions are baked into its license.

At the same time, OpenAI’s ChatGPT is open access, but neither the datasets used to train the model nor the codebase is public. OpenAI also uses different definitions of what open source means for its different projects. In effect, different companies resort to different interpretations of what open-sourcing means for AI.

During the same talk, Gimpel also raised the question of whether open source is always the best choice for AI companies, considering data and privacy needs.


