IN TODAY'S SIGNAL
Read time: 3 min 53 sec
Top News
WebAI
Trending Signals
Top Repos
- mem0: Store, search, and update personalized memories in LLMs with ease.
- Crawlee: Powerful Python library for web scraping and automation.
- LLaMA-Factory: Efficiently fine-tune 100+ language models.
AssemblyAI
Tutorial
If you're enjoying AlphaSignal, please forward this email to a colleague. It helps us keep this content free.
TOP NEWS
OpenAI
GPT-4o mini: 20x cheaper yet as performant as GPT-4o
14,921 Likes
What's New
OpenAI has introduced GPT-4o mini, a cost-efficient small model aimed at making AI more accessible by cutting costs while maintaining high performance.
GPT-4o mini costs 15 cents per million input tokens and 60 cents per million output tokens, roughly 20 times cheaper than GPT-4o while coming close to its performance across benchmarks.
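At those prices, per-request cost is simple arithmetic. A minimal sketch using the per-million-token rates from the announcement (the token counts are illustrative, not from the source):

```python
# Published GPT-4o mini prices, in dollars per million tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single API request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a request using the full 128K context and 16K output tokens.
print(f"${request_cost(128_000, 16_000):.4f}")  # $0.0288
```

Even a maximal full-context call comes in under three cents, which is what makes high-volume use cases viable.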
GPT-4o mini supports text and vision in the API
The model has a 128K token context window and supports up to 16K output tokens per request. It leverages an improved tokenizer for more cost-effective handling of non-English text. Future updates will include support for video and audio inputs and outputs.
Early testing shows GPT-4o mini's practical applications
The model outperforms GPT-3.5 Turbo at extracting structured data and generating high-quality email responses. Its low cost suits tasks that chain or parallelize multiple model calls, pass large volumes of context, or require real-time text responses.
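Parallelizing many cheap model calls is the pattern that low per-token pricing makes attractive. A sketch using a thread pool, with an invented stand-in function where the real API call would go (the `classify` stub and ticket texts are illustrative, not a real client):

```python
from concurrent.futures import ThreadPoolExecutor

def classify(ticket: str) -> str:
    """Stand-in for a GPT-4o mini call; a real version would hit the
    Chat Completions API. Here we fake the model's answer locally."""
    return "refund" if "refund" in ticket.lower() else "other"

tickets = [
    "I want a refund for my last order",
    "How do I reset my password?",
]

# Fan the per-ticket calls out across threads so I/O-bound API
# requests overlap instead of running back to back.
with ThreadPoolExecutor(max_workers=8) as pool:
    labels = list(pool.map(classify, tickets))

print(labels)  # ['refund', 'other']
```

`pool.map` preserves input order, so results line up with the original tickets even though the calls complete concurrently.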
Performance metrics highlight GPT-4o mini's capabilities
- MMLU: Scores 82%, surpassing Gemini Flash (77.9%) and Claude Haiku (73.8%).
- MGSM: Scores 87% in math reasoning, outperforming Gemini Flash (75.5%) and Claude Haiku (71.7%).
- HumanEval: Achieves 87.2% in coding performance, compared to Gemini Flash (71.5%) and Claude Haiku (75.9%).
- MMMU: Scores 59.4% in multimodal reasoning, higher than Gemini Flash (56.1%) and Claude Haiku (50.2%).
Availability
Developers can access GPT-4o mini via the Assistants API, Chat Completions API, and Batch API. Free, Plus, and Team users in ChatGPT can use it immediately, with Enterprise access starting next week.
READ MORE
Get an insider's look at the future of AI at WebAI's Summer Release event
Learn how companies are leveraging AI to solve real-world business problems:
• Deploy some of the world's largest models locally on distributed networks
• Train and customize models quickly and easily
• Uphold data privacy and IP protection by building and deploying locally

GET YOUR SEAT
partner with us
TRENDING SIGNALS
Language Models
3,310 Likes

LLaMa
356 Likes

Inference
342 Likes

Light LLMs
2,129 Likes

Function Calling
1,458 Likes
Why Build When You Can Deploy Speech AI Instantly?
Not sure whether to build or buy an AI speech recognition system? Our comprehensive guide breaks down the key considerations, from accuracy and internal resources to speed of iteration and data security.
Learn more with the build-or-buy checklist.
Get the checklist
TRENDING REPOS
Memory
14,510 Stars
Mem0 helps store, search, and update personalized memories in LLMs. It supports multi-level memory retention, adaptive personalization, and cross-platform consistency through a developer-friendly API, and integrates with Qdrant for production environments.
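The store/search/update loop Mem0 provides can be pictured with a toy in-memory version. This is a conceptual sketch, not Mem0's actual API; the class, method names, and facts are invented, and real systems match by embeddings rather than keyword overlap:

```python
class ToyMemory:
    """Toy stand-in for a per-user memory layer: store, search,
    and update facts. Not the real Mem0 API."""

    def __init__(self):
        self._facts: dict[str, list[str]] = {}

    def add(self, user_id: str, fact: str) -> None:
        self._facts.setdefault(user_id, []).append(fact)

    def search(self, user_id: str, query: str) -> list[str]:
        # Keyword overlap stands in for vector similarity search.
        words = set(query.lower().split())
        return [f for f in self._facts.get(user_id, [])
                if words & set(f.lower().split())]

    def update(self, user_id: str, old: str, new: str) -> None:
        facts = self._facts.get(user_id, [])
        self._facts[user_id] = [new if f == old else f for f in facts]

mem = ToyMemory()
mem.add("alice", "prefers vegetarian food")
mem.update("alice", "prefers vegetarian food", "prefers vegan food")
print(mem.search("alice", "what food does she like"))
```

The point is the lifecycle: memories are written, revised as preferences change, and retrieved by relevance at prompt time.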
Agents
13,602 Stars
Crawlee is a web scraping and browser automation library for Python for building reliable crawlers. It extracts data for AI, LLMs, RAG, or GPTs and downloads HTML, PDF, JPG, PNG, and other files from websites. It works with BeautifulSoup, Playwright, or raw HTTP, runs in both headful and headless modes, and supports proxy rotation.
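At the core of any crawler, Crawlee included, is a frontier queue plus a visited set. A toy sketch over an in-memory "site" (the page graph is invented; a real crawler fetches links over HTTP and this is not Crawlee's API):

```python
from collections import deque

# Toy "site": page -> outgoing links, standing in for fetched HTML.
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/api"],
    "/blog": ["/"],
    "/docs/api": [],
}

def crawl(start: str) -> list[str]:
    """Breadth-first crawl that deduplicates already-seen pages,
    so cycles (like /blog -> /) don't loop forever."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in SITE.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

print(crawl("/"))  # ['/', '/docs', '/blog', '/docs/api']
```

Libraries like Crawlee layer retries, concurrency, and proxy rotation on top of exactly this queue-and-visited pattern.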
Code Manager
27,413 Stars
LLaMA-Factory helps you efficiently fine-tune 100+ language models using techniques such as LoRA and QLoRA along with various optimization algorithms. It supports scalable resources, faster inference, and detailed experiment monitoring, delivering up to 3.7x faster training and improved GPU memory efficiency.
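LoRA keeps fine-tuning cheap by freezing the base weight matrix and training only a low-rank update. A small sketch of the parameter savings (the layer size and rank are illustrative, not from the source):

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Trainable parameters in a full weight update vs. a LoRA update.

    LoRA freezes W (d_out x d_in) and trains two small matrices,
    B (d_out x r) and A (r x d_in), so the learned update is B @ A.
    """
    full = d_out * d_in
    lora = d_out * rank + rank * d_in
    return full, lora

# A hypothetical 4096x4096 projection with rank-8 adapters:
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora)  # 16777216 65536 -> 256x fewer trainable params
```

That 256x reduction in trainable parameters is where the GPU-memory and speed gains of LoRA-style fine-tuning come from.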
TUTORIAL
Notebooks: Using Mistral NeMo with 60% less memory
Unsloth has released two free notebooks for Mistral NeMo 12B, enabling 2x faster finetuning with 60% less memory. Mistral's latest free LLM is the largest multilingual open-source model that fits on a free Colab GPU.
Additionally, Unsloth has uploaded 4-bit pre-quantized base and instruction-tuned models for 8x faster downloads, identified and fixed several bugs, and published a blog post detailing the findings in the new Unsloth AI release.
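The memory savings of 4-bit weights follow from simple arithmetic. A sketch that ignores activations, optimizer state, and quantization metadata:

```python
def weight_bytes(n_params: float, bits: int) -> float:
    """Approximate bytes needed for model weights alone,
    given parameter count and bits per parameter."""
    return n_params * bits / 8

n = 12e9  # Mistral NeMo's 12B parameters
fp16 = weight_bytes(n, 16) / 1e9
int4 = weight_bytes(n, 4) / 1e9
print(f"fp16: {fp16:.0f} GB, 4-bit: {int4:.0f} GB")  # fp16: 24 GB, 4-bit: 6 GB
```

Dropping from 16-bit to 4-bit weights cuts the download and the on-GPU footprint by roughly 4x, which is why the 12B model can squeeze onto a free Colab GPU.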
LAST WEEK'S GREATEST HITS