Topic: "model-speedup"
not much happened today
nemotron-3-ultra nemotron-3.5-asr claude-opus-4 mythos-preview nvidia anthropic togethercompute baseten modal vllm_project fireworksai_hq ollama wandb cline primeintellect nousresearch mixture-of-experts long-context model-quantization agentic-ai streaming-speech asr low-precision-training benchmarking recursive-self-improvement code-generation model-speedup piotrz_zelasko
NVIDIA released Nemotron 3 Ultra, a fully open 550B MoE model with 55B active parameters and a 1M-token context window, optimized for long-running agent tasks with up to a 5x speedup and a 30% cost reduction. It features a hybrid Mamba/attention design, LatentMoE, and native MTP, and was pretrained on 20T tokens in the NVFP4 low-precision format. Reported benchmarks include a 47.7 Intelligence Index score and 400+ output tokens/sec. The model is supported across major serving platforms. Additionally, Nemotron 3.5 ASR is an open streaming ASR model with 0.6B parameters that supports 40 language-locale combinations at sub-100ms latency and is designed for voice agents.
Anthropic highlighted early signs of recursive self-improvement (RSI) in AI, with Claude models authoring 80%+ of merged code and engineers shipping 8x more code. Claude Opus 4 achieved a 3x speedup on training scripts, while Mythos Preview reached a ~52x speedup and gave better research suggestions than humans 64% of the time.
not much happened today
gemma-3-270m canary-1b parakeet-tdt-0.6b nemotron-nano-v2 qwen-image-edit dino-v3 nvidia alibaba tencent meta-ai-fair ibm datology synthetic-data multilingual-asr self-supervised-learning vision model-efficiency training-data data-augmentation model-speedup domain-transfer demishassabis adrgrondin rasbt reach_vb ctnzr clementdelangue natolambert _akhaliq itspaulai mervenoyann xenovacom tomaarsen pratyushmaini code_star leavittron k_schuerholt giffmana
Gemma 3 270M, an ultra-small model optimized for edge and mobile use, was released and is gaining adoption. NVIDIA launched two open multilingual ASR models, Canary 1B and Parakeet-TDT 0.6B, trained on 1 million hours of data and released under CC-BY licensing, plus the efficient Nemotron-Nano v2 9B model with significant speedups. Alibaba's Qwen-Image-Edit offers bilingual text editing and semantic image transformations. Tencent Hunyuan introduced a controllable game-world video generator trained on over 1 million gameplay recordings. Meta's DINOv3 presents a scalable self-supervised vision backbone with strong domain transfer capabilities. IBM quietly released efficient English embedding models under a commercial-friendly license. The BeyondWeb synthetic data paper shows significant training speed and performance gains over prior datasets. An analysis of the HRM architecture suggests that its performance improvements stem largely from data augmentation and scaffolding rather than from a novel architecture. "Models and datasets are openly licensed and available on Hugging Face."