All tags
Model: "reka-flash-3"
not much happened today
gemini-3.1-flash voxtral-tts cohere-transcribe gpt-5.4-mini gpt-5.4-nano glm-5-turbo reka-edge reka-flash-3 google-deepmind mistral-ai cohere openai zai reka-ai voice vision function-calling context-windows multimodality text-to-speech low-latency human-preference automatic-speech-recognition model-benchmarking cost-efficiency hallucination-detection multi-agent-systems open-source git-worktrees logan_kilpatrick sundar_pichai guillaume_lample aidan_gomez jay_alammar giffmana andrew_curran
Google launched Gemini 3.1 Flash Live, a realtime voice and vision agent model with 2x longer conversation memory, supporting 70 languages and 128k context. Mistral AI released Voxtral TTS, a low-latency, open-weight text-to-speech model supporting 9 languages and competitive with ElevenLabs. Cohere introduced Cohere Transcribe, an audio model with 14-language support and top English ASR leaderboard performance at 5.42 WER. OpenAI released smaller multimodal variants GPT-5.4 mini and GPT-5.4 nano with 400k context, noted for cost-competitiveness but high verbosity and hallucination rates. Other releases include GLM-5-Turbo by Zai, Reka Edge and Flash 3 on OpenRouter, and new multi-agent UX tooling Cline Kanban for orchestrating CLI coding agents.
The new OpenAI Agents Platform
reka-flash-3 o1-mini claude-3-7-sonnet llama-3-3-70b sonic-2 qwen-chat olympiccoder openai reka-ai hugging-face deepseek togethercompute alibaba ai-agents api model-releases fine-tuning reinforcement-learning model-training model-inference multimodality voice-synthesis gpu-clusters model-distillation performance-optimization open-source sama reach_vb
OpenAI introduced a comprehensive suite of new tools for AI agents, including the Responses API, Web Search Tool, Computer Use Tool, File Search Tool, and an open-source Agents SDK with integrated observability tools, marking a significant step towards the "Year of Agents." Meanwhile, Reka AI open-sourced Reka Flash 3, a 21B parameter reasoning model that outperforms o1-mini and powers their Nexus platform, with weights available on Hugging Face. The OlympicCoder series surpassed Claude 3.7 Sonnet and much larger models on competitive coding benchmarks. DeepSeek built a 32K GPU cluster capable of training V3-level models in under a week and is exploring AI distillation. Hugging Face announced Cerebras inference support, achieving over 2,000 tokens/s on Llama 3.3 70B, 70x faster than leading GPUs. Reka's Sonic-2 voice AI model delivers 40ms latency via the Together API. Alibaba's Qwen Chat enhanced its multimodal interface with video understanding up to 500MB, voice-to-text, guest mode, and expanded file uploads. Sama praised OpenAI's new API as "one of the most well-designed and useful APIs ever."