
{"id":9502,"date":"2026-06-22T14:42:50","date_gmt":"2026-06-22T06:42:50","guid":{"rendered":"https:\/\/infernews.com\/blog\/official-repo-for-perceptiondlm-codebase\/"},"modified":"2026-06-22T14:44:43","modified_gmt":"2026-06-22T06:44:43","slug":"official-repo-for-perceptiondlm-codebase","status":"publish","type":"post","link":"https:\/\/infernews.com\/blog\/official-repo-for-perceptiondlm-codebase\/","title":{"rendered":"PerceptionDLM: Speeding Up Multi-Region Image Captioning"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/infernews.com\/blog\/wp-content\/uploads\/2026\/06\/teaser-5.jpg\" alt=\"icon\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Many Multimodal Large Language Models (MLLMs) still rely on autoregressive (AR) generation for region captioning, producing one caption at a time: however many masks an image has, each region is decoded in turn. PerceptionDLM takes a clear direction: it switches to a Multimodal Diffusion Language Model that outputs captions for multiple regions within a single denoising process, aiming to fix the way multi-region perception latency grows linearly with the number of regions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is an open-source project centered on <strong>models plus a benchmark<\/strong>: the core is PerceptionDLM and PerceptionDLM-Base, accompanied by ParaDLC-Bench, PerceptionDLM-Data, and the Bee \/ Honey series of training data recipes. The authors single out autoregressive region captioning as the main bottleneck of the earlier paradigm, and therefore add efficient prompting and structured attention masking, so that parallel generation is not just a concept but is realized at both the sequence level and the token level.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Judging from the public materials, the project is best understood and tested through the models, datasets, and evaluation suite already released on Hugging Face; those who want to reproduce the results can also set things up by following the training data recipes and the Training and Evaluation workflows. For most development teams, the most useful takeaway is not the installation details but its demonstration of how a diffusion VLM handles describing multiple regions at once, a task that DLMs have rarely taken on before.<\/p>\n\n\n<div\n    \tclass=\"align wp-block-vpb-video\"    id='vpbpVideoPlayer-1'\n    data-attributes='{&quot;source&quot;:&quot;https:\\\/\\\/msalab-pku.github.io\\\/projects\\\/PerceptionDLM\\\/assets\\\/videos\\\/demo_1.mp4&quot;,&quot;repeat&quot;:true,&quot;autoplay&quot;:true,&quot;muted&quot;:true,&quot;align&quot;:&quot;&quot;,&quot;poster&quot;:&quot;&quot;,&quot;controls&quot;:{&quot;play-large&quot;:true,&quot;restart&quot;:false,&quot;rewind&quot;:true,&quot;play&quot;:true,&quot;fast-forward&quot;:true,&quot;progress&quot;:true,&quot;current-time&quot;:true,&quot;duration&quot;:false,&quot;mute&quot;:true,&quot;volume&quot;:true,&quot;pip&quot;:false,&quot;airplay&quot;:false,&quot;settings&quot;:true,&quot;download&quot;:false,&quot;fullscreen&quot;:true},&quot;width&quot;:&quot;100%&quot;,&quot;radius&quot;:&quot;0px&quot;,&quot;resetOnEnd&quot;:false,&quot;autoHideControl&quot;:true,&quot;isSetup&quot;:false}'\n><\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>A single denoising pass can describe multiple masked regions at once; the authors report up to 3.4\u00d7 throughput speedup in dense multi-region scenarios<\/li>\n\n\n\n<li>PerceptionDLM-Base reportedly outperforms LLaDA-V on 15 of 16 multimodal benchmarks<\/li>\n\n\n\n<li>ParaDLC-Bench measures not only caption quality but also inference efficiency<\/li>\n\n\n\n<li>Code, model weights, training data recipes, and the evaluation suite are public, so the barrier to reproduction is lower than with a paper-only release<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">It is best suited to visual understanding, image annotation, automated data curation, or research teams that need to look at many regions at once. The limitations are equally clear: the public information so far mainly emphasizes benchmarks and throughput gains, so memory requirements, latency distribution, and deployment cost in typical product settings still need real-world measurement. Related models include PerceptionDLM, PerceptionDLM-Base, and its backbone LLaDA-8B-Instruct, with LLaDA-V as the comparison baseline.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>GitHub:<\/strong> <a href=\"https:\/\/github.com\/MSALab-PKU\/PerceptionDLM\" rel=\"noopener noreferrer\">https:\/\/github.com\/MSALab-PKU\/PerceptionDLM<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Project page:<\/strong> <a href=\"https:\/\/msalab-pku.github.io\/projects\/PerceptionDLM\/index.html\" rel=\"noopener noreferrer\">https:\/\/msalab-pku.github.io\/projects\/PerceptionDLM\/index.html<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Hugging Face collection:<\/strong> <a 
href=\"https:\/\/huggingface.co\/collections\/MSALab\/perceptiondlm-model-zoo\" rel=\"noopener noreferrer\">https:\/\/huggingface.co\/collections\/MSALab\/perceptiondlm-model-zoo<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>PerceptionDLM turns multi-region captioning into parallel generation; the point is not accuracy alone but bringing speed into the comparison as well.<\/p>\n","protected":false},"author":8,"featured_media":9501,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ai_generated_summary":"","footnotes":""},"categories":[133,176,33,119,161,76,127,149,195,199],"tags":[],"class_list":["post-9502","post","type-post","status-publish","format-standard","hentry","category-133","category-176","category-stablediffusion","category-119","category-161","category-76","category-127","category-149","category-195","category-dataset-"],"_links":{"self":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9502","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/comments?post=9502"}],"version-history":[{"count":2,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9502\/revisions"}],"predecessor-version":[{"id":9505,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9502\/revisions\/9505"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media\/9501"}],"wp:attachment":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media?parent=9502"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infernews.co
m\/blog\/wp-json\/wp\/v2\/categories?post=9502"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/tags?post=9502"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}