Company: "deepseek"

    not much happened today
    not much happened today
    not much happened today
    not much happened today
    not much happened today
    not much happened today
    not much happened today
    not much happened today
    DeepSeek v4
    not much happened today
    not much happened today
    Anthropic accuses DeepSeek, Moonshot, and MiniMax of "industrial-scale distillation attacks".
    Qwen3.5-397B-A17B: the smallest Open-Opus class, very efficient model
    not much happened today
    Apple picks Google's Gemini to power Siri's next generation
    not much happened today
    OpenAI GPT Image-1.5 claims to beat Nano Banana Pro, #1 across all Arenas, but completely fails Vibe Checks
    not much happened today
    MCP -> Agentic AI Foundation, Mistral Devstral 2
    OpenRouter's State of AI - An Empirical 100 Trillion Token Study
    not much happened today
    not much happened today
    Anthropic Claude Sonnet 4.5, Claude Code 2.0, new VS Code Extensions
    GDPVal finding: Claude Opus 4.1 within 95% of AGI (human experts in top 44 white collar jobs)
    Qwen3-Next-80B-A3B-Base: Towards Ultimate Training & Inference Efficiency
    not much happened today
    Cohere Command A Reasoning beats GPT-OSS-120B and DeepSeek R1 0528
    DeepSeek V3.1: 840B token continued pretrain, beating Claude 4 Sonnet at 11% of its cost
    Databricks' $100B Series K
    Kimi K2 - SOTA Open MoE proves that Muon can scale to 15T tokens/1T params
    not much happened today
    not much happened today
    Gemini 2.5 Pro/Flash GA, 2.5 Flash-Lite in Preview
    Chinese Models Launch - MiniMax-M1, Hailuo 2 "Kangaroo", Moonshot Kimi-Dev-72B
    not much happened today
    Mary Meeker is so back: BOND Capital AI Trends report
    not much happened today
    ChatGPT Codex, OpenAI's first cloud SWE agent
    not much happened today
    not much happened today
    Cursor @ $9b, OpenAI Buys Windsurf @ $3b
    not much happened today
    not much happened today
    Qwen 3: 0.6B to 235B MoE full+base models that beat R1 and o1
    not much happened today
    Google's Agent2Agent Protocol (A2A)
    not much happened today
    not much happened today
    not much happened today
    >$41B raised today (OpenAI @ 300b, Cursor @ 9.5b, Etched @ 1.5b)
    not much happened today
    Halfmoon is Reve Image: a new SOTA Image Model from ex-Adobe/Stability trio
    not much happened today
    The new OpenAI Agents Platform
    not much happened today
    DeepSeek's Open Source Stack
    Anthropic's $61.5B Series E
    not much happened today
    The Ultra-Scale Playbook: Training LLMs on GPU Clusters
    not much happened today
    not much happened today
    s1: Simple test-time scaling (and Kyutai Hibiki)
    How To Scale Your Model, by DeepMind
    o3-mini launches, OpenAI on "wrong side of history"
    Mistral Small 3 24B and Tulu 3 405B
    not much happened today
    not much happened today
    DeepSeek #1 on US App Store, Nvidia stock tanks -17%
    TinyZero: Reproduce DeepSeek R1-Zero for $30
    Bespoke-Stratos + Sky-T1: The Vicuna+Alpaca moment for reasoning
    DeepSeek R1: o1-level open weights model and a simple recipe for upgrading 1.5B models to Sonnet/4o level
    not much happened today
    not much happened today
    PRIME: Process Reinforcement through Implicit Rewards
    not much happened to end the year
    not much happened today
    not much happened today
    Qwen with Questions: 32B open weights reasoning model nears o1 in GPQA/AIME/Math500
    LMSys killed Model Versioning (gpt 4o 1120, gemini exp 1121)
    DeepSeek-R1 claims to beat o1-preview AND will be open sourced
    Common Corpus: 2T Open Tokens with Provenance
    DeepSeek Janus and Meta SpiRit-LM: Decoupled Image and Expressive Voice Omnimodality
    o1 destroys Lmsys Arena, Qwen 2.5, Kyutai Moshi release
    DataComp-LM: the best open-data 7B model/benchmark/dataset
    FlashAttention 3, PaliGemma, OpenAI's 5 Levels to Superintelligence
    Mozilla's AI Second Act
    Gemini Nano: 50-90% of Gemini Pro, <100ms inference, on device, in Chrome Canary
    There's Ilya!
    Gemini launches context caching... or does it?
    Snowflake Arctic: Fully Open 10B+128x4B Dense-MoE Hybrid LLM
    OpenAI's Instruction Hierarchy for the LLM OS
    Ring Attention for >1M Context
    Qwen 1.5 Released
    Adept Fuyu-Heavy: Multimodal model for Agents
    12/25/2023: Nous Hermes 2 Yi 34B for Christmas
    12/15/2023: Mixtral-Instruct beats Gemini Pro (and matches GPT3.5)