IN TODAY'S SIGNAL
Read time: 3 min 53 sec
Top News
WebAI
Trending Signals
Top Repos
- mem0: Store, search, and update personalized memories in LLMs with ease.
- Crawlee: Powerful Python library for web scraping and automation.
- LLaMA-Factory: Efficiently fine-tune 100+ language models.
AssemblyAI
Tutorial
If you're enjoying AlphaSignal, please forward this email to a colleague. It helps us keep this content free.
TOP NEWS
OpenAI
GPT-4o mini: 20x cheaper yet as performant as GPT-4o
14,921 Likes
What's New
OpenAI has introduced GPT-4o mini, a cost-efficient small model aimed at making AI more accessible by cutting costs while maintaining high performance.
GPT-4o mini costs 15 cents per million input tokens and 60 cents per million output tokens, roughly 20 times cheaper than GPT-4o while coming close to its performance across benchmarks.
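At those prices, per-request cost is simple arithmetic. A minimal sketch using the per-million-token rates from the announcement (the token counts are illustrative, not from the source):

```python
# Published GPT-4o mini prices, in dollars per million tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single API request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a request using the full 128K context and 16K output tokens.
print(f"${request_cost(128_000, 16_000):.4f}")  # $0.0288
```

Even a maximal full-context call comes in under three cents, which is what makes high-volume use cases viable.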
GPT-4o mini supports text and vision in the API
The model has a 128K token context window and supports up to 16K output tokens per request. It leverages an improved tokenizer for more cost-effective handling of non-English text. Future updates will include support for video and audio inputs and outputs.
Early testing shows GPT-4o mini's practical applications
The model outperforms GPT-3.5 Turbo at extracting structured data and generating high-quality email responses. Its low cost suits tasks that chain or parallelize multiple model calls, pass large volumes of context, or require real-time text responses.
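Parallelizing many cheap model calls is the pattern that low per-token pricing makes attractive. A sketch using a thread pool, with an invented stand-in function where the real API call would go (the `classify` stub and ticket texts are illustrative, not a real client):

```python
from concurrent.futures import ThreadPoolExecutor

def classify(ticket: str) -> str:
    """Stand-in for a GPT-4o mini call; a real version would hit the
    Chat Completions API. Here we fake the model's answer locally."""
    return "refund" if "refund" in ticket.lower() else "other"

tickets = [
    "I want a refund for my last order",
    "How do I reset my password?",
]

# Fan the per-ticket calls out across threads so I/O-bound API
# requests overlap instead of running back to back.
with ThreadPoolExecutor(max_workers=8) as pool:
    labels = list(pool.map(classify, tickets))

print(labels)  # ['refund', 'other']
```

`pool.map` preserves input order, so results line up with the original tickets even though the calls complete concurrently.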
Performance metrics highlight GPT-4o mini's capabilities
- MMLU: Scores 82%, surpassing Gemini Flash (77.9%) and Claude Haiku (73.8%).
- MGSM: Scores 87% in math reasoning, outperforming Gemini Flash (75.5%) and Claude Haiku (71.7%).
- HumanEval: Achieves 87.2% in coding performance, compared to Gemini Flash (71.5%) and Claude Haiku (75.9%).
- MMMU: Scores 59.4% in multimodal reasoning, higher than Gemini Flash (56.1%) and Claude Haiku (50.2%).
Availability
Developers can access GPT-4o mini via the Assistants API, Chat Completions API, and Batch API. Free, Plus, and Team users in ChatGPT can use it immediately, with Enterprise access starting next week.
READ MORE
Get an insider's look at the future of AI at WebAI's Summer Release event
Learn how companies are leveraging AI to solve real-world business problems:
• Deploy some of the world's largest models locally on distributed networks
• Train and customize models quickly and easily
• Uphold data privacy and IP protection by building and deploying locally

GET YOUR SEAT
partner with us
TRENDING SIGNALS
Language Models
3,310 Likes

LLaMa
356 Likes

Inference
342 Likes

Light LLMs
2,129 Likes

Function Calling
1,458 Likes
Why Build When You Can Deploy Speech AI Instantly?
Not sure whether to build or buy an AI speech recognition system? Our comprehensive guide breaks down the key considerations, from accuracy and internal resources to speed of iteration and data security.
Learn more with the build-or-buy checklist.
Get the checklist
TRENDING REPOS
Memory
14,510 Stars
Mem0 helps store, search, and update personalized memories in LLMs. It supports multi-level memory retention, adaptive personalization, and cross-platform consistency through a developer-friendly API, and integrates with Qdrant for production environments.
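The store/search/update loop Mem0 provides can be pictured with a toy in-memory version. This is a conceptual sketch, not Mem0's actual API; the class, method names, and facts are invented, and real systems match by embeddings rather than keyword overlap:

```python
class ToyMemory:
    """Toy stand-in for a per-user memory layer: store, search,
    and update facts. Not the real Mem0 API."""

    def __init__(self):
        self._facts: dict[str, list[str]] = {}

    def add(self, user_id: str, fact: str) -> None:
        self._facts.setdefault(user_id, []).append(fact)

    def search(self, user_id: str, query: str) -> list[str]:
        # Keyword overlap stands in for vector similarity search.
        words = set(query.lower().split())
        return [f for f in self._facts.get(user_id, [])
                if words & set(f.lower().split())]

    def update(self, user_id: str, old: str, new: str) -> None:
        facts = self._facts.get(user_id, [])
        self._facts[user_id] = [new if f == old else f for f in facts]

mem = ToyMemory()
mem.add("alice", "prefers vegetarian food")
mem.update("alice", "prefers vegetarian food", "prefers vegan food")
print(mem.search("alice", "what food does she like"))
```

The point is the lifecycle: memories are written, revised as preferences change, and retrieved by relevance at prompt time.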
Agents
13,602 Stars
Crawlee is a web scraping and browser automation library for Python for building reliable crawlers. It extracts data for AI, LLMs, RAG, or GPTs and downloads HTML, PDF, JPG, PNG, and other files from websites. It works with BeautifulSoup, Playwright, or raw HTTP, runs in both headful and headless modes, and supports proxy rotation.
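At the core of any crawler, Crawlee included, is a frontier queue plus a visited set. A toy sketch over an in-memory "site" (the page graph is invented; a real crawler fetches links over HTTP and this is not Crawlee's API):

```python
from collections import deque

# Toy "site": page -> outgoing links, standing in for fetched HTML.
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/api"],
    "/blog": ["/"],
    "/docs/api": [],
}

def crawl(start: str) -> list[str]:
    """Breadth-first crawl that deduplicates already-seen pages,
    so cycles (like /blog -> /) don't loop forever."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in SITE.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

print(crawl("/"))  # ['/', '/docs', '/blog', '/docs/api']
```

Libraries like Crawlee layer retries, concurrency, and proxy rotation on top of exactly this queue-and-visited pattern.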
Code Manager
27,413 Stars
LLaMA-Factory helps you efficiently fine-tune 100+ language models using techniques such as LoRA and QLoRA along with various optimization algorithms. It supports scalable resources, faster inference, and detailed experiment monitoring, delivering up to 3.7x faster training and improved GPU memory efficiency.
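LoRA keeps fine-tuning cheap by freezing the base weight matrix and training only a low-rank update. A small sketch of the parameter savings (the layer size and rank are illustrative, not from the source):

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Trainable parameters in a full weight update vs. a LoRA update.

    LoRA freezes W (d_out x d_in) and trains two small matrices,
    B (d_out x r) and A (r x d_in), so the learned update is B @ A.
    """
    full = d_out * d_in
    lora = d_out * rank + rank * d_in
    return full, lora

# A hypothetical 4096x4096 projection with rank-8 adapters:
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora)  # 16777216 65536 -> 256x fewer trainable params
```

That 256x reduction in trainable parameters is where the GPU-memory and speed gains of LoRA-style fine-tuning come from.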
TUTORIAL
Notebooks: Using Mistral NeMo with 60% less memory
Unsloth has released two free notebooks for Mistral NeMo 12B, enabling 2x faster finetuning with 60% less memory. Mistral's latest free LLM is the largest multilingual open-source model that fits on a free Colab GPU.
Additionally, Unsloth has uploaded 4-bit pre-quantized base and instruction-tuned models for 8x faster downloads, identified and fixed several bugs, and published a blog post detailing the findings in the new Unsloth AI release.
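The memory savings of 4-bit weights follow from simple arithmetic. A sketch that ignores activations, optimizer state, and quantization metadata:

```python
def weight_bytes(n_params: float, bits: int) -> float:
    """Approximate bytes needed for model weights alone,
    given parameter count and bits per parameter."""
    return n_params * bits / 8

n = 12e9  # Mistral NeMo's 12B parameters
fp16 = weight_bytes(n, 16) / 1e9
int4 = weight_bytes(n, 4) / 1e9
print(f"fp16: {fp16:.0f} GB, 4-bit: {int4:.0f} GB")  # fp16: 24 GB, 4-bit: 6 GB
```

Dropping from 16-bit to 4-bit weights cuts the download and the on-GPU footprint by roughly 4x, which is why the 12B model can squeeze onto a free Colab GPU.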
LAST WEEK'S GREATEST HITS