MiniMax ships M3, a Chinese open-weight model claiming frontier coding at one-twentieth the attention cost

A 1M-token sparse-attention model lands above GPT-5.5 on vendor-run SWE-Bench Pro, below Claude Opus 4.8, with weights still withheld

AI · active · The long game · What they're not saying · 7 perspectives · June 18, 2026 14:00Z · Updated June 29, 2026

Summary

Chinese lab MiniMax released M3, an open-weight model that pairs a 1M-token context window with native multimodal input and agentic computer use. On the lab's own runs it scored 59.0% on the SWE-Bench Pro coding benchmark, above OpenAI's GPT-5.5 (58.6%) and Google's Gemini 3.1 Pro (54.2%), but it trails Anthropic's Claude Opus 4.8, shipped a week earlier, at a reported 69.2%. The headline engineering claim is MiniMax Sparse Attention (MSA), which selects only the relevant key-value blocks and is said to cut per-token compute at full context to one-twentieth of the prior generation's; the architecture was independently verified around June 18. The catch: the promised open weights had not been published at release, and the training code and inference operators stayed closed.

The split

US ML press split between the capability story and the caveats. MarkTechPost foregrounded MSA's efficiency; Tech Times stressed that the benchmarks are vendor-run and that M3 sits below Opus 4.8. Outside the US, the framing shifted toward two points that US writeups soft-pedalled. India's Open Source For You stressed that "open-weight" is not open-source when training code and inference operators are withheld. Italian developer coverage stressed that China's 2017 National Intelligence Law obliges MiniMax to assist state intelligence work, an obligation that covers any prompt routed through its API. That governance angle, not the SWE-Bench number, is what the launch hype omits.

By the numbers

59.0%, M3's vendor-run SWE-Bench Pro score (GPT-5.5: 58.6%, Gemini 3.1 Pro: 54.2%).
69.2%, Claude Opus 4.8's reported SWE-Bench Pro score, ahead of M3.
1M tokens, M3 context window.
1/20, per-token compute at full context under MSA versus the prior generation.
>9x / 15x, claimed prefill and decoding speedups under MSA versus the prior generation.
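
The gaps behind these figures are simple arithmetic, sketched below using only the scores reported above. The final calculation rests on an illustrative assumption that is not a claim from MiniMax: if per-token attention cost scaled linearly with the number of keys attended, one-twentieth of a 1M-token context would correspond to about 50,000 keys. The names `gaps_to` and `equivalent_keys` are hypothetical.

```python
# SWE-Bench Pro scores as reported in the coverage (percent).
scores = {
    "Claude Opus 4.8": 69.2,
    "MiniMax M3": 59.0,
    "GPT-5.5": 58.6,
    "Gemini 3.1 Pro": 54.2,
}


def gaps_to(model, table):
    """Percentage-point lead (+) or deficit (-) of `model` against each other entry."""
    base = table[model]
    return {
        name: round(base - score, 1)
        for name, score in table.items()
        if name != model
    }


gaps = gaps_to("MiniMax M3", scores)

# Illustrative assumption only: cost is proportional to keys attended per token.
CONTEXT_TOKENS = 1_000_000
COMPUTE_DIVISOR = 20  # the "one-twentieth" claim
equivalent_keys = CONTEXT_TOKENS // COMPUTE_DIVISOR

print(gaps)
print(equivalent_keys)
```

On these numbers M3 leads GPT-5.5 by 0.4 points and Gemini 3.1 Pro by 4.8, and trails Claude Opus 4.8 by 10.2.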

Why it matters

Cheap long-context coding from an open-weight Chinese model pressures Western labs on price and pushes more inference toward Chinese infrastructure. But the combination of withheld code, vendor-run benchmarks and a legal duty to assist Chinese state intelligence shifts the adoption question from capability to trust, especially for any team routing source code through the API.

What to watch

Whether MiniMax actually publishes the M3 weights and a technical report.
Independent benchmark reruns versus the vendor numbers.
Enterprise and government bans or carve-outs over the API's data exposure.
Whether DeepSeek, Qwen or others match the sparse-attention efficiency.

Perspectives by region · 4

▸ ML engineering press

MarkTechPost · United States · en · June 1, 2026 15:00Z

Technical writeup of M3's MiniMax Sparse Attention (MSA), which selects relevant key-value blocks to cut per-token compute to one-twentieth at 1M-token context, with native multimodal input and computer use for agentic coding.

“MSA cuts per-token compute to one-twentieth at 1M-token context, with over 9x faster prefill and 15x faster decoding than the prior generation.”
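
How block selection of this kind can reduce work is easiest to see in a toy example. The sketch below is not MiniMax's implementation, whose inference operators remain closed. It assumes a generic top-k scheme in which each block of keys is summarised by its mean and only the best-scoring blocks are attended to. The function name `block_sparse_attention`, the mean-pooling relevance proxy, and the `block_size` and `top_k` parameters are all illustrative assumptions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(query, keys, values, block_size, top_k):
    """Attend to only the top_k highest-scoring key-value blocks (toy sketch).

    Returns the attention output and the number of positions attended.
    """
    n = len(keys)
    dim = len(query)

    # Split positions into contiguous blocks.
    blocks = [
        range(start, min(start + block_size, n))
        for start in range(0, n, block_size)
    ]

    # Assumed relevance proxy: score each block by query . mean(keys in block).
    def summary(block):
        return [sum(keys[i][d] for i in block) / len(block) for d in range(dim)]

    scores = [dot(query, summary(block)) for block in blocks]

    # Keep the top_k blocks, then restore positional order.
    chosen = sorted(range(len(blocks)), key=lambda j: scores[j], reverse=True)[:top_k]
    selected = [i for j in sorted(chosen) for i in blocks[j]]

    # Ordinary scaled dot-product softmax attention, restricted to selected positions.
    logits = [dot(query, keys[i]) / math.sqrt(dim) for i in selected]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)

    out_dim = len(values[0])
    output = [
        sum(w * values[i][d] for w, i in zip(weights, selected)) / total
        for d in range(out_dim)
    ]
    return output, len(selected)
```

With 80 keys in blocks of 4 and `top_k=1`, the query attends to 4 of 80 positions, one-twentieth of the dense case; setting `top_k` to the number of blocks recovers ordinary dense attention. This toy still reads every key to build the block summaries, so it illustrates the selection step rather than a full cost model.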

▸ skeptical benchmark scrutiny

Tech Times · United States · en · June 18, 2026 14:00Z

Reports independent verification of the MSA architecture on June 18 while flagging that M3's 59.0% SWE-Bench Pro is vendor-run, that it trails Anthropic's Claude Opus 4.8 at 69.2%, and that promised open weights had not shipped.

“M3's 59.0% on SWE-Bench Pro beats GPT-5.5 but trails Claude Opus 4.8's 69.2%; the scores are company-run and the weights are still withheld.”

▸ European developer view

Pasquale Pillitteri · Italy · it · June 2, 2026 08:00Z

European framing of M3 as a Chinese open-weights challenger to GPT-5.5, weighing the appeal of cheap long-context coding against China's 2017 National Intelligence Law obligations on any prompt sent to MiniMax's API.

“China's 2017 intelligence law obliges MiniMax to assist state intelligence work, an obligation that applies to every prompt sent to its API, wherever the user sits.”

▸ Indian open-source community

Open Source For You · India · en · June 2, 2026 10:00Z

Indian open-source view stressing that M3 is open-weight, not open-source: training code and inference operators stayed closed, so the 'open' label oversells what was actually released.

“M3 is open-weight, not open-source: MiniMax withheld training code and inference operators, stopping short of a full open commitment.”
