Is Meta leaking?


2023 Mar Week 3 Vol 08

All in One

AI DATA SOLUTION-

DATUMO

Meta leaks on torrent /

Amazon partners with Hugging Face

AI Topic→ →

“Do not go where the path may lead; go instead where there is no path and leave a trail.” This quote, often attributed to Ralph Waldo Emerson, seems to have been taken to heart at Meta, whose Large Language Model Meta AI (LLaMA) is now online for anyone to download.

What’s happening?

The large language model (LLM), which was supposed to be available only to a select body of approved researchers, government officials, and members of civil society, was shared on 4chan. This is the first time a major tech company’s proprietary AI model has leaked to the public. Many people blame Meta for the leak, because access to the model was granted to people (with a .edu email address) who applied through a Google form.

A little bit about LLaMA

LLaMA is a family of language models trained on more than 1 trillion tokens of data from sources including Wikipedia, GitHub, CommonCrawl, and a host of others. It comes in four sizes, with 7B, 13B, 33B, and 65B parameters. These models have performed well on zero-shot reasoning tasks and other benchmarks, but not so well (yet) on the Massive Multitask Language Understanding (MMLU) benchmark.

Why’s the leak a big deal?

There is worry in the AI community that too many unchecked AI models in the public domain could end in catastrophe. The concern is that these models have unknown capabilities, including dangerous features that bad-faith actors could discover and use for nefarious activities. This could make AI governance harder, because there will be too many different models in circulation to control.

In the past, there was a long gap between when AI features were developed and when they became available to the public. This is because these features are usually built by a small group of experts, and integrating them into larger systems takes a lot of time and money. With LLaMA now available to the public, the time between the development of cutting-edge features like those in GPT-3 and their spread to the general public is getting shorter.

What’s next for LLaMa?

While Meta has not denied the leak, in a statement to Motherboard, a spokesperson wrote, “It’s Meta's goal to share state-of-the-art AI models with members of the research community to help us evaluate and improve those models. LLaMA was shared for research purposes, consistent with how we have shared previous large language models. While the model is not accessible to all and some have tried to circumvent the approval process, we believe the current release strategy allows us to balance responsibility and openness.”

In the meantime, it looks like Meta is issuing takedown requests to get the model removed from the sites hosting it and stop it from spreading.

Smile a little; it’s not all gloomy news for AI - Hugging Face and Amazon Web Services partner to make AI more accessible.

Hugging Face, with its mission to “democratize good machine learning one commit at a time,” has announced a partnership with Amazon Web Services (AWS). Hugging Face is best known for its Transformers library, which works with frameworks such as PyTorch, TensorFlow, and JAX. It is a key part of the ML and AI industries because it hosts more than 100,000 free machine learning (ML) models.

What’s the partnership?

Both companies say they want to make AI open and available to everyone by sharing next-generation models with the global AI community and making machine learning more democratic. At the moment, building, training, and deploying large language and vision models is an expensive and time-consuming process that requires a lot of machine learning expertise. Also, many developers can't use generative AI because the models are so complex and can have hundreds of billions of parameters.

As part of the agreement, AWS will become the preferred cloud provider for Hugging Face. Customers will be able to fine-tune and deploy Hugging Face models on Amazon SageMaker and Amazon Elastic Compute Cloud (EC2) using purpose-built machine learning accelerators like AWS Trainium and AWS Inferentia, optimizing their models for specific use cases at lower cost.

Big deal alert:

This matters because, just as Microsoft, through Azure, is providing OpenAI with cloud services to scale its GPT offerings, and Google is backing Bard heavily, Amazon is entering the game with a heavyweight partner of its own. If integrating Hugging Face models drives more usage of AWS, more deals like this one are expected in the AI ecosystem in the coming year, potentially expanding it faster. All of this puts Amazon in the running to be the biggest provider of AI technologies in the near future.

DATUMO FST → →

Use Datumo FST for FREE!

The ultimate SaaS for data analysis and curation. Based on feature space, you can visualize data distribution and improve dataset coverage!

With more freedom in data analysis and curation, you will be able to:

Analyze edge cases
Look up data similar to edge cases, based on algorithms
Extract particular data that represent the whole
Save time and cost as a result

If you would like to try out our new Datumo FST for free, please join the waitlist by clicking the link below.

Join the waitlist for DATUMO FST alpha test->>

Weekly AI Issues → →

Making Deepfakes Gets Cheaper and Easier Thanks to A.I.

Meme-makers and misinformation peddlers are embracing artificial intelligence tools to create convincing fake videos on the cheap.

Indian Student Uses AI To Translate ASL In Real Time

Priyanjali Gupta, a fourth-year computer science student specializing in data science at the Vellore Institute of Technology, went viral on LinkedIn after using AI to translate American Sign Language (ASL).

Spotify Asks What Would Happen if Artists Controlled TikTok

Audio company reveals new details about its big plans for video

Promotion → →

Free data labeling service

Datumo offers free open datasets and free data labeling services.

Papers using our datasets have been accepted to world-renowned conferences such as NeurIPS, CVPR and EMNLP.

Feel free to contact us to see if we can support your project!

Apply for 2023 Datumo Data Sponsorship →

DATUMO Inc.
📨 For any enquiries, email: contact@datumo.com
HQ: 10F 11F, 20, Teheran-ro 20-gil, Gangnam-gu, Seoul, Republic of Korea
