
{"id":8697,"date":"2026-06-02T03:22:53","date_gmt":"2026-06-01T19:22:53","guid":{"rendered":"https:\/\/infernews.com\/blog\/arxiv-2026-dmoe-dllms-with-learnable-block-experts\/"},"modified":"2026-06-02T03:22:53","modified_gmt":"2026-06-01T19:22:53","slug":"arxiv-2026-dmoe-dllms-with-learnable-block-experts","status":"publish","type":"post","link":"https:\/\/infernews.com\/blog\/arxiv-2026-dmoe-dllms-with-learnable-block-experts\/","title":{"rendered":"dMoE: Helping Diffusion Language Models Escape Expert Explosion"},"content":{"rendered":"<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/infernews.com\/blog\/wp-content\/uploads\/2026\/06\/pasted-849545fa563b.jpg\" alt=\"Overview\"><\/figure>\n<p>Diffusion large language models (dLLMs) have emerged in recent years as an alternative to autoregressive models, and they support parallel decoding natively. Pairing them with a Mixture-of-Experts (MoE) architecture to scale model capacity, however, runs into an awkward wall: a dLLM processes many interrelated tokens in a single forward pass, while conventional MoE selects experts for each token independently. The number of unique experts activated per inference step therefore balloons, and memory bandwidth quickly becomes the bottleneck.<\/p>\n<p>The core idea of dMoE is fairly intuitive: rather than deciding which experts to use token by token, make one unified decision at the block level. It first aggregates the expert distributions of the tokens within a block into a single distribution, then uses that block-level distribution to guide routing for the whole block. This change cuts the number of unique activated experts from about 69.5 to 14.6, reduces memory usage by roughly 76% to 80%, and speeds up end-to-end latency by 1.14\u00d7 to 1.66\u00d7.<\/p>\n<p>In terms of quality retention, dMoE preserves about 99.11% of the original model's performance across multiple reasoning and general benchmarks. On MATH500, for example, the score dips only from 72.0% to 71.0% while the number of activated experts drops from 70 to 14.1, a favorable trade.<\/p>\n<p>dMoE is built directly on LLaDA-2.0-mini without changing the backbone architecture, so it can be applied readily to other masked dLLMs. Model weights named dMoE-16B have also been released on Hugging Face. For researchers and engineers who want to try dLLMs but are constrained by GPU resources, this project is a low-barrier entry point; for teams working on model efficiency, the block-level routing design offers a useful reference direction.<\/p>\n<p><strong>Key Takeaways<\/strong><\/p>\n<ul>\n<li><strong>Block-level expert routing<\/strong>: MoE decisions are made per block rather than per token, sharply reducing the number of activated experts.<\/li>\n<li><strong>Lower memory and bandwidth pressure<\/strong>: Unique experts drop from about 69.5 to 14.6, and memory usage falls by 76%\u201380%.<\/li>\n<li><strong>Noticeable speedup<\/strong>: End-to-end inference latency improves by 1.14\u00d7 to 1.66\u00d7.<\/li>\n<li><strong>Almost no loss in quality<\/strong>: About 99.11% of the original performance is retained across multiple benchmarks.<\/li>\n<li><strong>Plug-and-play design<\/strong>: Built on LLaDA-2.0-mini, it applies to other masked dLLMs without architectural changes.<\/li>\n<\/ul>\n<p><strong>GitHub:<\/strong> <a href=\"https:\/\/github.com\/fscdc\/dMoE\" rel=\"noopener noreferrer\">https:\/\/github.com\/fscdc\/dMoE<\/a><\/p>\n<p><strong>Project:<\/strong> <a href=\"https:\/\/fscdc.github.io\/dMoE\/\" rel=\"noopener noreferrer\">https:\/\/fscdc.github.io\/dMoE\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>dMoE introduces block-level MoE routing for diffusion large language models, sharply reducing the number of experts activated at inference time while preserving both speed and quality.<\/p>\n","protected":false},"author":8,"featured_media":8696,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ai_generated_summary":"","wpai_meta_description":"","footnotes":""},"categories":[133,127,197],"tags":[],"class_list":["post-8697","post","type-post","status-publish","format-standard","hentry","category-133","category-127","category-framework"],"_links":{"self":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/8697","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/comments?post=8697"}],"version-history":[{"count":0,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/8697\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media\/8696"}],"wp:attachment":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media?parent=8697"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/categories?post=8697"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/tags?post=8697"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}