Tencent's New Hy3 AI Model Is the Most Efficient Chinese LLM No One's Talking About

Tencent quietly dropped its most capable AI model yet on Thursday, and the benchmark numbers are hard to ignore. Hy3 preview, the company's first model after a full infrastructure rebuild, went open-source today across GitHub, Hugging Face, and ModelScope.

It’s also available on Tencent Cloud’s official website, under a paid plan.

Hy3 preview packs 295 billion total parameters (a rough measure of a model’s potential breadth of knowledge), but only 21 billion are active at any given time. That's the appeal of a Mixture-of-Experts architecture: the model routes each token to a small, specialized subset of its "expert" sub-networks instead of running everything at once. Less compute, lower cost, roughly similar output quality. It also supports up to 256,000 tokens of context, which is enough to swallow a full-length novel in a single prompt.
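To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. It is a toy illustration, not Tencent's implementation: the four scalar "experts", the gate scores, and the choice of k=2 are all assumptions made up for the example. Only the 295 billion and 21 billion figures come from the article.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, a=a: a * x for a in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, 0.3, 1.5]  # assumed router output for one token

chosen = route_top_k(gate_scores, k=2)
output = moe_forward(1.0, experts, gate_scores, k=2)

# Share of parameters active per token, using the figures in the article.
active_fraction = 21 / 295

print("experts used:", [i for i, _ in chosen])
print("blended output:", round(output, 3))
print(f"active share of parameters: {active_fraction:.1%}")
```

In this toy run only two of the four experts are ever called. By the article's numbers, Hy3 preview activates about 7% of its parameters per token, which is where the compute savings come from.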

The model was built to balance three things Tencent says it stopped sacrificing for each other: capability breadth, honest evaluation, and cost-efficiency. Its previous flagship, Hy2, had over 400 billion parameters. Tencent explicitly walked that back, arguing that 295 billion is the sweet spot where reasoning fully matures but the cost of adding more parameters stops paying off.

A smaller parameter count doesn't mean the model is worse. Better-trained models with fewer parameters quite frequently outperform bigger generalist ones.

On coding, the improvement is dramatic. SWE-bench Verified is a benchmark that tests whether a model can actually fix real bugs from GitHub repositories: not toy problems, but production code. Hy2 scored 53.0%. Hy3 preview scores 74.4%. That's a jump of roughly 40% in relative terms (21.4 percentage points) in one generation, landing it in range of Claude Opus 4.6 (80.8%) and just behind GLM-5 (77.8%) and Kimi-K2.5 (76.8%). Terminal-Bench 2.0, which measures autonomous task execution in a real command-line environment, went from 23.2% to 54.4%, also a massive leap.
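Because the scores are themselves percentages, a generational "jump" can be read two ways: in percentage points or relative to the old score. The short sketch below works through both for the two coding benchmarks, using only the Hy2 and Hy3 preview scores quoted above. The helper name `improvement` is just an illustrative choice.

```python
def improvement(old, new):
    """Return (percentage-point gain, relative gain in %) rounded to 1 decimal."""
    points = new - old
    relative = points / old * 100
    return round(points, 1), round(relative, 1)

# (Hy2 score, Hy3 preview score), as quoted in the article.
scores = {
    "SWE-bench Verified": (53.0, 74.4),
    "Terminal-Bench 2.0": (23.2, 54.4),
}

results = {name: improvement(old, new) for name, (old, new) in scores.items()}

for name, (points, relative) in results.items():
    print(f"{name}: +{points} points, +{relative}% relative")
# SWE-bench Verified: +21.4 points, +40.4% relative
# Terminal-Bench 2.0: +31.2 points, +134.5% relative
```

Read this way, the Terminal-Bench 2.0 score more than doubled, while the SWE-bench Verified gain is the roughly 40% relative improvement.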

That makes the model an interesting choice for people building with agents. Agents run on complex sets of instructions that involve memories, skills, and tool calls. Models often miss one of those pieces, which can ruin a workflow or produce poor results. That’s why agentic capabilities matter more and more to AI developers as this area becomes the most hyped thing in the industry. It’s also why the model was immediately made available on Openclaw.

Search and browsing agents—where models must retrieve, filter, and synthesize information from the open web without human guidance—also improved sharply. On BrowseComp, a benchmark tracking complex web research tasks, Hy3 preview reached 67.1% (up from Hy2's 28.7%). On WideSearch, it hit 70.2%, outperforming GLM-5 and Kimi-K2.5 but trailing Claude Opus 4.6's 77.2%.

In reasoning, the model topped every Chinese competitor on Tsinghua University's math PhD qualifying exam (Spring 2026), scoring 88.4 on avg@3, the average of three runs. That's a real-world exam, not a curated dataset: the kind of evaluation Tencent says it's prioritizing to avoid benchmark gaming. The model also scored 87.8 on CHSBO 2025 (China's national high school biology olympiad), the highest among Chinese models in that category.

Hy3 preview started training in late January 2026 and launched Thursday—under three months from cold start to open-source release. Unusually fast for a frontier-class model. Tencent attributes it to a February infrastructure overhaul led by Yao Shunyu, its chief AI scientist, who pushed a full rebuild of the pretraining and reinforcement learning stack.

This is a very different approach from what Chinese AI labs were doing a year ago, when DeepSeek's R1 shocked the industry with its cost-efficiency.

Hy3 still trails OpenAI and Google DeepMind's flagships, but by the size-to-performance ratio, Hy3 preview is hard to dismiss: the agent benchmark composite shows it in the "optimal zone" with ~295 billion parameters, ahead of DeepSeek-V3.2 (600 billion+) and matching Kimi-K2.5 (over 1 trillion parameters) at a fraction of the compute cost.

Hunyuan models have already been deployed across Yuanbao, CodeBuddy, WorkBuddy, QQ, and Tencent Docs. On CodeBuddy and WorkBuddy, first-token latency dropped 54%, end-to-end generation time fell 47%, and the model successfully ran agent workflows as long as 495 steps. Tencent Cloud is offering API access at approximately $0.18 per million input tokens and $0.59 per million output tokens, with personal Token Plan packages starting at around $4.10 per month.
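For a rough sense of what those API prices mean in practice, the sketch below estimates the bill for a long agent run. The per-million-token prices are the approximate figures quoted above. The token counts per step are purely hypothetical assumptions, since the article gives no usage numbers, and the calculation ignores any caching discounts or plan bundles.

```python
# Approximate Tencent Cloud prices quoted in the article (USD per million tokens).
INPUT_USD_PER_M = 0.18
OUTPUT_USD_PER_M = 0.59

def api_cost_usd(input_tokens, output_tokens):
    """Estimate cost in USD for a given number of input and output tokens."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000

# Hypothetical workload: a 495-step agent run (the longest cited in the article),
# ASSUMING 8,000 input tokens and 500 output tokens per step.
steps = 495
assumed_input_per_step = 8_000
assumed_output_per_step = 500

cost = api_cost_usd(steps * assumed_input_per_step,
                    steps * assumed_output_per_step)
print(f"estimated cost for {steps} steps: ${cost:.2f}")
```

Under those assumed token counts, the whole 495-step run comes to under a dollar. Real costs depend entirely on how much context each step carries.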


Selected Articles by Decrypt
