Introducing Llama 3, Meta's new AI
William Karkegi
Meta is excited to introduce Meta Llama 3, the next generation of their state-of-the-art open large language model. Llama 3 will soon be available on major platforms such as AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake.
It will also be supported by hardware manufacturers like AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.
Llama 3 offers enhanced capabilities compared to its predecessors. The new models, with 8 billion (8B) and 70 billion (70B) parameters, demonstrate cutting-edge performance across a wide range of industry benchmarks.
These models are pre-trained and fine-tuned with instructions, supporting a variety of uses, including reasoning, code generation, and instruction following.
Meta's goal with Llama 3 is to create the best open models, comparable to the best proprietary models available today.
They have taken developer feedback into account to increase the overall utility of Llama 3 while continuing to play a leading role in the responsible use and deployment of LLMs.
They aim to make Llama 3 multilingual and multimodal, with longer context windows and improved performance in the core capabilities of LLMs.
Meta is committed to developing Llama 3 responsibly. They provide trust and safety tools such as Llama Guard 2, Code Shield, and CyberSec Eval 2 to help users use this model ethically and securely.
Llama 3 uses a decoder-only transformer architecture. Its tokenizer has a vocabulary of 128,000 tokens, which encodes language considerably more efficiently than Llama 2's.
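To see why a larger vocabulary encodes text more efficiently, here is a toy sketch (this is not Llama 3's actual BPE tokenizer, and the vocabularies are made up for illustration): a greedy longest-match tokenizer produces fewer tokens for the same text when the vocabulary contains longer pieces.

```python
def tokenize(text, vocab):
    """Greedily match the longest vocabulary entry at each position;
    fall back to a single character when nothing matches."""
    max_len = max(map(len, vocab))
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(len(text) - i, max_len), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens

small_vocab = {"th", "in", "er", "an"}  # short pieces only
large_vocab = small_vocab | {"the", "token", "izer", "learn", "ing"}

text = "thetokenizerlearning"
print(len(tokenize(text, small_vocab)))  # many mostly single-character tokens
print(len(tokenize(text, large_vocab)))  # far fewer tokens for the same text
```

Fewer tokens per sentence means more text fits in the same context window and each training or inference step covers more content, which is the efficiency gain the larger vocabulary buys.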
Meta pre-trained Llama 3 on more than 15 trillion (15T) tokens from public sources. Their dataset is seven times larger than that used for Llama 2 and includes four times more code.
Additionally, over 5% of Llama 3's training dataset consists of high-quality non-English data covering more than 30 languages.
Meta developed detailed scaling laws to assess downstream performance and optimize the use of their training resources.
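The core idea behind a scaling law can be sketched in a few lines. Assuming loss follows a power law L(C) = a · C^(−b) in training compute C, taking logarithms turns it into a straight line that can be fit by least squares and then extrapolated to larger budgets. The data points below are synthetic, purely for illustration; they are not Meta's measurements.

```python
import math

# Synthetic (compute, loss) observations: loss falls as compute grows.
compute = [1e20, 1e21, 1e22, 1e23]   # training FLOPs (made up)
loss    = [2.8, 2.3, 1.9, 1.6]       # evaluation loss (made up)

# Least-squares line fit in log-log space: log L = log a - b * log C.
xs = [math.log(c) for c in compute]
ys = [math.log(l) for l in loss]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
b = -slope                            # power-law exponent
a = math.exp(mean_y - slope * mean_x)

def predict_loss(c):
    """Extrapolate the fitted power law to a new compute budget."""
    return a * c ** (-b)

print(f"fitted exponent b = {b:.3f}")
print(f"predicted loss at 1e24 FLOPs = {predict_loss(1e24):.2f}")
```

A fit like this lets a lab predict, before committing the budget, roughly how much a larger training run should improve downstream performance, which is how such laws guide resource allocation.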
Their 8B and 70B parameter models continue to improve even after being trained on 15T tokens. To train their largest models, they combined three types of parallelism (data, model, and pipeline), achieving over 400 TFLOPS of compute throughput per GPU.
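As a back-of-the-envelope check on that throughput figure: assuming the training cluster used NVIDIA H100 GPUs with a dense BF16 peak of roughly 989 TFLOPS (an assumption; the figure above does not state the hardware peak), 400 TFLOPS of achieved throughput would correspond to about 40% model FLOPs utilization (MFU), which is strong for training at this scale.

```python
# Hypothetical utilization estimate; the peak value is an assumption.
achieved_tflops = 400.0   # reported per-GPU throughput
peak_tflops = 989.0       # assumed NVIDIA H100 dense BF16 peak

mfu = achieved_tflops / peak_tflops   # model FLOPs utilization
print(f"estimated MFU = {mfu:.0%}")
```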
To unlock the full potential of their pre-trained models in chat use cases, Meta innovated their instruction fine-tuning approach, combining supervised fine-tuning, rejection sampling, proximal policy optimization (PPO), and direct preference optimization (DPO).
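Of those techniques, DPO is simple enough to sketch for a single preference pair. Following the DPO objective of Rafailov et al. (2023), the loss rewards the policy for preferring the chosen answer more strongly than the reference model does; the log-probabilities below are toy values, not Llama 3 outputs.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """-log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))
    for one (chosen, rejected) preference pair."""
    margin = (policy_chosen_logp - ref_chosen_logp) \
           - (policy_rejected_logp - ref_rejected_logp)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Toy values: the policy has shifted toward the chosen answer relative
# to the reference, so the loss drops below -log(0.5) ~ 0.693.
loss = dpo_loss(policy_chosen_logp=-12.0, policy_rejected_logp=-15.0,
                ref_chosen_logp=-13.0, ref_rejected_logp=-14.0)
print(f"{loss:.4f}")
```

Unlike PPO, this formulation needs no separate reward model or sampling loop, which is one reason labs combine it with the other techniques rather than relying on PPO alone.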
They also conducted internal and external red-teaming exercises to evaluate and strengthen the safety of their models.
Llama 3 will soon be available on all major platforms, including cloud providers, model APIs, and more.
Meta has integrated their latest models into Meta AI, available on Facebook, Instagram, WhatsApp, Messenger, and the web. You can also try multimodal Meta AI on their Ray-Ban Meta smart glasses.
Meta firmly believes that openness leads to better products, faster innovation, and a healthier market. They are excited to see all the amazing creations you will develop with Meta Llama 3.
For more information, visit the Llama 3 website and check out their getting started guide.