
{"id":8923,"date":"2026-06-07T20:20:34","date_gmt":"2026-06-07T12:20:34","guid":{"rendered":"https:\/\/infernews.com\/blog\/streamchar\/"},"modified":"2026-06-07T20:20:34","modified_gmt":"2026-06-07T12:20:34","slug":"streamchar","status":"publish","type":"post","link":"https:\/\/infernews.com\/blog\/streamchar\/","title":{"rendered":"StreamChar: A New Route for Long-Horizon Character Audio-Video Generation"},"content":{"rendered":"<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/infernews.com\/blog\/wp-content\/uploads\/2026\/06\/backbone-ffcc13cf9c6c.jpg\" alt=\"StreamChar architecture overview\"><\/figure>\n<p>StreamChar is a research showcase project focused on Long-Horizon Streaming Character Audio-Video Generation, that is, generating character audio and video over long durations in a streaming fashion. According to the project page, its core components are Decoupled LLM orchestration, a joint audio\u2013video DiT denoising backbone, Sink-Chunk Memory, and Online Rollout Distillation.<\/p>\n<p>The project targets the continuity and stability of character audio-video content over longer outputs. When a typical generation pipeline is stretched out, the content tends to break apart, the character's state becomes inconsistent, or the audio and the visuals drift out of sync; StreamChar appears to be designed for exactly this class of long-sequence generation problems.<\/p>\n<p>For now, using the project mostly means viewing research results and demos rather than following a complete, productized workflow. The page links to the Paper (arXiv) and demo videos, so a sensible order is to watch the demos for output quality first, then read the paper to understand the overall method and how the system is split up.<\/p>\n<p>The main points of its technical direction are fairly clear: decouple the LLM's orchestration from the underlying audio-video generation, let a Streaming DiT Backbone handle continuous generation, and add Sink-Chunk Memory to support long-horizon context. Online Rollout Distillation indicates that the team has a training arrangement aimed at the efficiency or stability of streaming generation, but the page summary gives no further figures.<\/p>\n<ul>\n<li>Focuses on Long-Horizon Streaming Character Audio-Video Generation<\/li>\n<li>Combines Decoupled LLM orchestration with joint audio\u2013video DiT denoising<\/li>\n<li>Uses Sink-Chunk Memory to handle long-sequence context<\/li>\n<li>Provides research demo videos, output at native resolution<\/li>\n<li>Suited to people following character generation, streaming generation, and multimodal research<\/li>\n<\/ul>\n<p>If you work on generative AI, virtual characters, digital humans, or video synthesis, this project is a useful reference. As for performance and evaluation, the page currently shows only method names, the paper link, and demos, with no explicit benchmark scores; the safer approach is to treat it as a research direction worth tracking and consult the paper for the full experimental details.<\/p>\n<p><strong>Project:<\/strong> <a href=\"https:\/\/humanaigc.github.io\/StreamChar_page\/\" rel=\"noopener noreferrer\">https:\/\/humanaigc.github.io\/StreamChar_page\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>StreamChar targets long-horizon streaming character audio-video generation. Its key idea is to handle language orchestration, the memory mechanism, and the audio-video generation pipeline separately.<\/p>\n","protected":false},"author":8,"featured_media":8922,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ai_generated_summary":"","footnotes":""},"categories":[171,120,141],"tags":[],"class_list":["post-8923","post","type-post","status-publish","format-standard","hentry","category-171","category-120","category-141"],"_links":{"self":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/8923","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/comments?post=8923"}],"version-history":[{"count":0,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/8923\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media\/8922"}],"wp:attachment":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media?parent=8923"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/categories?post=8923"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/tags?post=8923"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}