
{"id":9187,"date":"2026-06-16T21:46:27","date_gmt":"2026-06-16T13:46:27","guid":{"rendered":"https:\/\/infernews.com\/blog\/coda-bench-is-a-benchmark-for-code-agents-on-data-intensive-tasks\/"},"modified":"2026-06-16T21:48:13","modified_gmt":"2026-06-16T13:48:13","slug":"coda-bench-is-a-benchmark-for-code-agents-on-data-intensive-tasks","status":"publish","type":"post","link":"https:\/\/infernews.com\/blog\/coda-bench-is-a-benchmark-for-code-agents-on-data-intensive-tasks\/","title":{"rendered":"When AI Coding Assistants Meet Mountains of Data: What Does CoDA-Bench Test?"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/infernews.com\/blog\/wp-content\/uploads\/2026\/06\/logo-b595421c0291.jpg\" alt=\"CoDA-Bench\"\/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Existing evaluations of AI coding agents fall broadly into two categories. One focuses on software engineering tasks (for example SWE-Bench and Terminal-Bench) and tests only the code itself. The other focuses on data analysis (for example DS-1000, DA-Code, and DataSciBench) but lays the required data out in plain view for the agent to read. The DataLab team at Renmin University of China argues that evaluating \"code\" and \"data\" separately is out of step with real development: in practice, engineers often have to work out for themselves which data is useful in an environment cluttered with files, and only then write code to process it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To address this, they propose CoDA-Bench (Code and Data-intensive Benchmark). It builds a Linux sandbox based on the Kaggle ecosystem: each task environment contains about 980 files on average, and the 1,009 tasks span 31 topic communities. An agent must first find the needle in a haystack of semantically similar files, then integrate heterogeneous data and write analysis code to produce a final answer.<\/p>\n\n\n<div\n    \tclass=\"align wp-block-vpb-video\"    id='vpbpVideoPlayer-1'\n    data-attributes='{&quot;source&quot;:&quot;https:\\\/\\\/coda-bench.github.io\\\/assets\\\/demo_video_compressed.mp4&quot;,&quot;repeat&quot;:true,&quot;autoplay&quot;:true,&quot;muted&quot;:true,&quot;align&quot;:&quot;&quot;,&quot;poster&quot;:&quot;&quot;,&quot;controls&quot;:{&quot;play-large&quot;:true,&quot;restart&quot;:false,&quot;rewind&quot;:true,&quot;play&quot;:true,&quot;fast-forward&quot;:true,&quot;progress&quot;:true,&quot;current-time&quot;:true,&quot;duration&quot;:false,&quot;mute&quot;:true,&quot;volume&quot;:true,&quot;pip&quot;:false,&quot;airplay&quot;:false,&quot;settings&quot;:true,&quot;download&quot;:false,&quot;fullscreen&quot;:true},&quot;width&quot;:&quot;100%&quot;,&quot;radius&quot;:&quot;0px&quot;,&quot;resetOnEnd&quot;:false,&quot;autoHideControl&quot;:true,&quot;isSetup&quot;:false}'\n><\/div>\n\n\n<p class=\"wp-block-paragraph\">After testing several leading agents, the team found that even the best-performing system reached a success rate of only 61.1%, which exposes the lack of an effective link between \"data discovery\" and \"code execution\" in current models. That gap points the way for the next round of research: future agents will need not only to write code but also to navigate messy file systems on their own.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you work on agentic AI or data analysis automation, or want to test how well an LLM combines reasoning and programming in a complex environment, this open-source benchmark offers a realistic touchstone. The full task set has been released on HuggingFace, and the evaluation can be run in one step via Docker.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key points:<br>\n&#8211; <strong>Fixing the old paradigm<\/strong>: Moves beyond the SWE-Bench and DS-1000 practice of testing code and data separately, and evaluates both in a single environment.<br>\n&#8211; <strong>Realistically sized sandbox<\/strong>: About 980 files per task, simulating the messy, large-scale data environments found on Kaggle.<br>\n&#8211; <strong>Integrated capabilities<\/strong>: Tests four aspects together: data exploration, file navigation, cross-format integration, and code generation.<br>\n&#8211; <strong>Sobering results<\/strong>: The top agent reaches only about 61.1% success on the full task set, leaving clear room for improvement.<br>\n&#8211; <strong>Fully open source<\/strong>: Includes 1,009 tasks, data from 31 communities (about 43 GB), and a Docker evaluation pipeline.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>GitHub:<\/strong> <a href=\"https:\/\/github.com\/ruc-datalab\/CoDA-Bench\" rel=\"noopener noreferrer\">https:\/\/github.com\/ruc-datalab\/CoDA-Bench<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Paper:<\/strong> <a href=\"https:\/\/arxiv.org\/pdf\/2606.15300\" rel=\"noopener noreferrer\">https:\/\/arxiv.org\/pdf\/2606.15300<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>CoDA-Bench, published by a team at Renmin University of China, drops AI agents into a Linux sandbox holding hundreds of files and tests their combined ability to find files, write code, and run analyses in realistic data-intensive scenarios.<\/p>\n","protected":false},"author":8,"featured_media":9186,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ai_generated_summary":"","footnotes":""},"categories":[133,116,24,162,168,159,76,146,189,196,197,199],"tags":[],"class_list":["post-9187","post","type-post","status-publish","format-standard","hentry","category-133","category-agentic","category-aisoftware","category-ai-productions","category-linux","category-vibe-coding","category-76","category-146","category-189","category-196","category-framework","category-dataset-"],"_links":{"self":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9187","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/comments?post=9187"}],"version-history":[{"count":1,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9187\/revisions"}],"predecessor-version":[{"id":9190,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/9187\/revisions\/9190"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media\/9186"}],"wp:attachment":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media?parent=9187"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/categories?post=9187"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/tags?post=9187"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}