Index
Model Evaluation
Entries: 33
March 2026
2 articles
| Type | Read time | Entry |
|---|---|---|
| [Auto] [HACKER_NEWS] | 5 min | Qwen3.5 122B/35B reaches Sonnet 4.5 performance when run locally (03-01). Tags: Qwen3.5, Sonnet 4.5, local deployment |
| [Auto] [HACKER_NEWS] | 5 min | Qwen3.5 122B and 35B models achieve Sonnet 4.5 performance locally (03-01). Tags: Qwen3.5, Sonnet 4.5, local deployment |
February 2026
26 articles
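| Type | Read time | Entry |
|---|---|---|
| [Auto] [HACKER_NEWS] | 5 min | Qwen3.5 122B and 35B local deployment performance benchmarked against Sonnet 4.5 (02-28). Tags: Qwen3.5, Sonnet 4.5, local deployment |
| [Auto] [HACKER_NEWS] | 5 min | Moonshine open-source speech recognition models: accuracy surpasses Whisper Large v3 (02-25). Tags: speech recognition, STT, Moonshine |
| [Auto] [HACKER_NEWS] | 5 min | Moonshine open-source STT models: accuracy surpasses Whisper Large v3 (02-25). Tags: STT, Whisper, Moonshine |
| [Auto] [ARXIV] | 5 min | Evaluating DeepSpeed on image workloads: Vision Transformer scaling performance (02-25). Tags: DeepSpeed, ViT, Vision Transformer |
| [Auto] [HACKER_NEWS] | 5 min | Measuring AI agent autonomy in practice (02-19). Tags: AI Agent, autonomy, evaluation metrics |
| [Auto] [BLOGS_PODCASTS] | 3 min | Claude Sonnet 4.6 released: overall performance upgrade and some limitations (02-19). Tags: Claude, Sonnet 4.6, Anthropic |
| [Auto] [BLOGS_PODCASTS] | 2 min | Qwen3.5-397B-A17B: the smallest Open-Opus-class efficient model (02-19). Tags: Qwen3.5, Qwen, MoE |
| [Auto] [BLOGS_PODCASTS] | 3 min | Claude Sonnet 4.6 released: upgrades over 4.5 and limitations (02-19). Tags: Claude, Anthropic, Sonnet 4.6 |
| [Auto] [BLOGS_PODCASTS] | 2 min | Claude Sonnet 4.6 released: upgrades over 4.5 and hands-on results (02-19). Tags: Claude, Anthropic, Sonnet 4.6 |
| [Auto] [BLOGS_PODCASTS] | 3 min | Claude Sonnet 4.6 released: upgrades over 4.5 and some limitations (02-18). Tags: Claude, Sonnet 4.6, Anthropic |
| [Auto] [BLOGS_PODCASTS] | 3 min | Qwen3.5-397B-A17B: the smallest Open-Opus-class efficient model (02-18). Tags: Qwen3.5, Qwen, MoE |
| [Auto] [BLOGS_PODCASTS] | 2 min | Qwen3.5-397B-A17B: the smallest Open-Opus-class efficient model (02-18). Tags: Qwen3.5, Qwen, MoE |
| [Auto] [HACKER_NEWS] | 7 min | Evaluating AGENTS.md: how useful it is in practice for AI coding agents (02-17). Tags: AI Agent, LLM, code generation |
| [Auto] [JUEJIN] | 2 min | Doubao LLM 2.0 released: hands-on capability tests and upgrade details (02-16). Tags: Doubao LLM, ByteDance, LLM |
| [Auto] [BLOGS_PODCASTS] | 3 min | Z.ai releases the GLM-5 open-weight model, outperforming Opus 4.5 (02-14). Tags: GLM-5, Z.ai, Opus 4.5 |
| [Auto] [BLOGS_PODCASTS] | 3 min | Z.ai open-sources GLM-5: a new SOTA-level open-weight LLM (02-13). Tags: GLM-5, Z.ai, SOTA |
| [Auto] [BLOGS_PODCASTS] | 3 min | Z.ai releases the GLM-5 open-source model: performance surpasses Opus 4.5 (02-13). Tags: GLM-5, Z.ai, SOTA |
| [Auto] [JUEJIN] | 2 min | Zhipu's GLM-5 is Pony Alpha: benchmarked against Claude Opus (02-12). Tags: Zhipu AI, GLM-5, Pony Alpha |
| [Auto] [BLOGS_PODCASTS] | 3 min | Qwen Image 2 and Seedance 2: progress in Chinese generative media models (02-12). Tags: Qwen Image 2, Seedance 2, generative media |
| [Auto] [BLOGS_PODCASTS] | 4 min | OpenAI vs. Anthropic: Claude Opus 4.6 and GPT 5.3 Codex in-depth evaluation (02-08). Tags: OpenAI, Anthropic, Claude |
| [Auto] [HACKER_NEWS] | 4 min | OpenAI frontier technology progress and model capability analysis (02-05). Tags: OpenAI, model capabilities, frontier technology |
| [Auto] [HACKER_NEWS] | 5 min | Agent Skills: a skill evaluation framework for AI agents (02-04). Tags: Agent, intelligent agents, evaluation framework |
| [Auto] [BLOGS_PODCASTS] | 3 min | xAI Grok Imagine API tops the video model leaderboard with notable pricing and latency advantages (02-03). Tags: xAI, Grok, video generation |
| [Auto] [HACKER_NEWS] | 5 min | Advancing AI benchmarking with the Game Arena platform (02-03). Tags: AI benchmarking, Game Arena, LLM evaluation |
| [Auto] [HACKER_NEWS] | 4 min | Using the Game Arena platform to advance AI benchmarking (02-02). Tags: AI benchmarking, Game Arena, LLM evaluation |
| [Auto] [BLOGS_PODCASTS] | 3 min | xAI launches Grok Imagine API: a Sora-rivaling video model with a price-performance edge (02-02). Tags: xAI, Grok, Imagine API |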
January 2026
5 articles
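| Type | Read time | Entry |
|---|---|---|
| [Auto] [BLOGS_PODCASTS] | 5 min | xAI cements its frontier-lab status and plans to merge with SpaceX (01-31). Tags: xAI, Grok, SpaceX |
| [Auto] [BLOGS_PODCASTS] | 3 min | Kimi K2.5: beats Sonnet 4.5 at half the price, with native multimodality and 100 concurrent agents (01-31). Tags: Kimi k1.5, Moonshot AI, open-source models |
| [Auto] [BLOGS_PODCASTS] | 2 min | Alyah: evaluating Arabic LLMs on Emirati dialect ability (01-29). Tags: LLM, Arabic, dialect evaluation |
| [Auto] [HACKER_NEWS] | 3 min | 🧠 Explosive! Gemini Flash takes on Opus at Tetris with a 66% win rate! 🚀 (01-27). Tags: Gemini Flash, Claude Opus, TetrisBench |
| [Auto] [HACKER_NEWS] | 3 min | Shocking! Gemini Flash beats Opus! 🎮 66% win rate at Tetris 🚀 (01-27). Tags: LLM, Gemini Flash, Claude 3 Opus |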