
{"id":8618,"date":"2026-05-30T09:03:21","date_gmt":"2026-05-30T01:03:21","guid":{"rendered":"https:\/\/infernews.com\/blog\/a-minimal-and-elegant-framework-amp-tutorial-for-real-time-interactive-world-mod\/"},"modified":"2026-05-30T09:03:21","modified_gmt":"2026-05-30T01:03:21","slug":"a-minimal-and-elegant-framework-amp-tutorial-for-real-time-interactive-world-mod","status":"publish","type":"post","link":"https:\/\/infernews.com\/blog\/a-minimal-and-elegant-framework-amp-tutorial-for-real-time-interactive-world-mod\/","title":{"rendered":"minWM: From Video Generation to World Models"},"content":{"rendered":"<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/infernews.com\/blog\/wp-content\/uploads\/2026\/05\/pasted-cf3c760c2105.jpg\" alt=\"Repository image for shengshu-ai\/minWM\"><\/figure>\n<p>minWM has a clear purpose: rather than releasing yet another model, it breaks down the entire pipeline for building a video world model, so that you can convert a bidirectional T2V (Text-to-Video) or TI2V (Text-and-Image-to-Video) base model into an action-conditioned video world model step by step. For newcomers to the field, a complete path like this is more helpful than a release of weights alone or an isolated code snippet.<\/p>\n<p>The point of the project is not \u201cinstall and run\u201d but working through its data processing, training, distillation, and inference stages one at a time. It publishes the full data \u2192 training \u2192 inference pipeline and provides example data, runnable scripts, Claude Skills, and beginner-oriented background notes, so you can follow the standard workflow once and then change the backbone, data distribution, or control method to suit your needs.<\/p>\n<p>The problem it addresses is that a high-quality video generation model is not necessarily an interactive world model. Low latency, causal rollout, and responsiveness to inputs such as camera trajectories require a full set of mechanisms: camera control, autoregressive training, few-step distillation, and streaming inference. minWM modularizes these stages and connects them using Causal Forcing, Causal Forcing++, Teacher Forcing, and asymmetric DMD.<\/p>\n<ul>\n<li>Supports 4-step DMD inference and mentions multi-GPU sequence parallelism<\/li>\n<li>Camera trajectories can be controlled with pose strings or JSON files<\/li>\n<li>Provides debug-world-model, which catalogs common failure modes such as loss NaN, jitter, and camera drift<\/li>\n<li>Provides integrate-new-backbone, which demonstrates how to plug in a new video DiT<\/li>\n<li>Reference backbones include Wan2.1-T2V-1.3B and HY1.5-TI2V-8B; HY Action2V, HY TI2V, and Wan Action2V are also mentioned<\/li>\n<\/ul>\n<p>What is new about the project is that it covers both \u201chow to train\u201d and \u201chow to adapt\u201d. Besides supporting different backbones and condition injection methods, it writes the team's accumulated debugging experience and its Claude collaboration workflow into the project, so researchers and engineers see not only the results but also where common errors come from.<\/p>\n<p>Its target is real-time interactive video world models, and it includes experimental analysis of camera trajectory quality, controllability training steps, and minimal batch-size requirements. The public information leans toward framework and workflow, however, so if you are looking for single-model benchmark comparisons, treat the project instead as a working foundation for building, reproducing, and extending world models.<\/p>\n<p><strong>GitHub:<\/strong> <a href=\"https:\/\/github.com\/shengshu-ai\/minWM\" rel=\"noopener noreferrer\">https:\/\/github.com\/shengshu-ai\/minWM<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>minWM is not a single model but a World Model framework that spans data to inference. It pays particular attention to beginners while leaving room for research and customization.<\/p>\n","protected":false},"author":8,"featured_media":8617,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ai_generated_summary":"","wpai_meta_description":"","footnotes":""},"categories":[133,165,116,157,120,76,127,149,141,186,197],"tags":[],"class_list":["post-8618","post","type-post","status-publish","format-standard","hentry","category-133","category-165","category-agentic","category-157","category-120","category-76","category-127","category-149","category-141","category-186","category-framework"],"_links":{"self":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/8618","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/comments?post=8618"}],"version-history":[{"count":0,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/8618\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media\/8617"}],"wp:attachment":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media?parent=8618"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/categories?post=8618"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/tags?post=8618"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}