
{"id":9079,"date":"2026-06-14T01:16:58","date_gmt":"2026-06-13T17:16:58","guid":{"rendered":"https:\/\/infernews.com\/blog\/rethinking-action-interface-for-agentic-spatial-reasoning\/"},"modified":"2026-06-14T01:20:30","modified_gmt":"2026-06-13T17:20:30","slug":"rethinking-action-interface-for-agentic-spatial-reasoning","status":"publish","type":"post","link":"https:\/\/infernews.com\/blog\/rethinking-action-interface-for-agentic-spatial-reasoning\/","title":{"rendered":"SpatialClaw: Driving Spatial Reasoning Agents with Code"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/infernews.com\/blog\/wp-content\/uploads\/2026\/06\/pasted-8198bedcb70c.jpg\" alt=\"SpatialClaw logo\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">SpatialClaw is a <strong>training-free spatial reasoning framework<\/strong>. Its focus is not on adding more tools but on rethinking how the agent calls them. It treats code as the action interface: a Vision-Language Model agent writes Python cells one at a time, inspects intermediate results inside a single persistent Jupyter kernel, and then adjusts its next step. The goal is to handle spatial understanding problems in 3D, 4D, and video scenes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">What is new in this project is that it avoids both single-shot execution of a whole program and rigid tool-call schemas. The agent submits only one cell at a time and can combine SAM3 segmentation, Depth-Anything-3 reconstruction, geometry utilities, and scientific computing libraries such as NumPy, SciPy, and Matplotlib, so the analysis looks more like step-by-step verification than a one-shot guess at the answer.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you want to test it, a good fit is multi-view images, video clips, or questions that require judging position, distance, occlusion, or motion relationships. The documentation also states explicit hardware requirements for serving the models: the FP8 variants need Linux and an NVIDIA Hopper (H100) or newer GPU. On an A100 or L40S, you can switch to the AWQ or GPTQ Int4 entries listed in models.json, which keep the same served_name, so the model configuration does not need to change. This also reflects how actively NVIDIA has invested in robotics and world-model projects in recent years.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">On results, the published data report <strong>59.9% average accuracy<\/strong> across 20 spatial reasoning benchmarks, <strong>11.2 percentage points<\/strong> above the previous best spatial agent. More importantly, the result was reportedly obtained with the same system prompt, tool set, and hyperparameters across six VLM backbones, which suggests the gain does not necessarily depend on benchmark-specific tuning.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A <strong>spatial reasoning agent framework<\/strong> that addresses the inflexibility of VLMs when judging 3D\/4D relationships<\/li>\n\n\n\n<li>The core method uses <strong>code as the action interface<\/strong>, executing and revising the analysis step by step<\/li>\n\n\n\n<li>Supported perception modules include <strong>SAM3 segmentation<\/strong>, <strong>Depth-Anything-3 reconstruction<\/strong>, and geometry utilities<\/li>\n\n\n\n<li>Published results cover 20 benchmarks, with an average accuracy of <strong>59.9%<\/strong><\/li>\n\n\n\n<li>Related model families include <strong>Qwen3.5, Qwen3.6, and Gemma4<\/strong>, ranging from 26B to 397B parameters<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This project is especially relevant to people studying computer-use agents, spatial intelligence, or robot perception, and to anyone who wants to compare tool-augmented agents with VLM reasoning pipelines. If what you care about is not chat performance but whether a model can observe a scene, call tools, and revise its inferences step by step, SpatialClaw shows a fairly convincing route.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>GitHub:<\/strong> <a href=\"https:\/\/github.com\/NVlabs\/SpatialClaw\" rel=\"noopener noreferrer\">https:\/\/github.com\/NVlabs\/SpatialClaw<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Project:<\/strong> <a href=\"https:\/\/spatialclaw.github.io\/\" rel=\"noopener 
noreferrer\">https:\/\/spatialclaw.github.io\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>SpatialClaw turns code into the agent's action interface and targets complex spatial reasoning. It posts a clearly leading average score across 20 benchmarks.<\/p>\n","protected":false},"author":8,"featured_media":9078,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ai_generated_summary":"","footnotes":""},"categories":[133,179,116,76,149],"tags":[],"class_list":["post-9079","post","type-post","status-publish","format-standard","hentry","category-133","category-nvidia","category-agentic","category-76","category-149"],"_links":{"self":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9079","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/comments?post=9079"}],"version-history":[{"count":1,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9079\/revisions"}],"predecessor-version":[{"id":9082,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9079\/revisions\/9082"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media\/9078"}],"wp:attachment":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media?parent=9079"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/categories?post=9079"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/tags?post=9079"}],"curies":[{"name":"wp","href":"https:\/\/ap
i.w.org\/{rel}","templated":true}]}}