
{"id":3618,"date":"2024-11-18T05:18:27","date_gmt":"2024-11-17T21:18:27","guid":{"rendered":"https:\/\/infernews.com\/?p=3618"},"modified":"2024-12-18T14:38:25","modified_gmt":"2024-12-18T06:38:25","slug":"scaling-laws-ai-%e7%84%a1%e6%b3%95%e8%b7%a8%e8%b6%8a%e9%80%99%e6%a2%9d%e7%b7%9a%ef%bc%8c%e6%88%91%e5%80%91%e4%b8%8d%e7%9f%a5%e9%81%93%e7%82%ba%e4%bb%80%e9%ba%bc%e3%80%82","status":"publish","type":"post","link":"https:\/\/infernews.com\/blog\/scaling-laws-ai-%e7%84%a1%e6%b3%95%e8%b7%a8%e8%b6%8a%e9%80%99%e6%a2%9d%e7%b7%9a%ef%bc%8c%e6%88%91%e5%80%91%e4%b8%8d%e7%9f%a5%e9%81%93%e7%82%ba%e4%bb%80%e9%ba%bc%e3%80%82\/","title":{"rendered":"Scaling Laws &#8211; AI can&#8217;t cross this line, and we don&#8217;t know why."},"content":{"rendered":"\n<p>This video explores the neural scaling laws that govern the performance of AI models. Research has found that a model's error rate follows a power-law relationship with compute, model size, and dataset size, which means that performance improvements follow a simple mathematical rule that depends little on model architecture. Work from OpenAI and DeepMind
has confirmed that these laws hold across a wide range of applications, but model performance eventually reaches a floor, and that floor is related to the intrinsic entropy of natural language. Finally, the video examines the \u201cmanifold hypothesis\u201d, which attempts to explain theoretically why performance follows a power law: the model is learning a low-dimensional manifold embedded in a high-dimensional data space, and the amount of data and the model size determine the resolution at which the model captures that manifold, which in turn affects performance. Although the theory agrees with experimental results to some extent, it does not yet amount to a complete theory of AI.<\/p>\n\n\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"lyte-wrapper\" title=\"AI can&amp;#039;t cross this line and we don&amp;#039;t know why.\" style=\"width:853px;max-width:100%;margin:5px auto;\"><div class=\"lyMe\" id=\"WYL_5eqRuVp65eY\" itemprop=\"video\" itemscope itemtype=\"https:\/\/schema.org\/VideoObject\"><div><meta itemprop=\"thumbnailUrl\" content=\"https:\/\/infernews.com\/blog\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2F5eqRuVp65eY%2Fhqdefault.jpg\" \/><meta itemprop=\"embedURL\" content=\"https:\/\/www.youtube.com\/embed\/5eqRuVp65eY\" \/><meta itemprop=\"duration\" content=\"PT24M7S\" \/><meta itemprop=\"uploadDate\" content=\"2024-09-13T18:09:57Z\" \/><\/div><div id=\"lyte_5eqRuVp65eY\" 
data-src=\"https:\/\/infernews.com\/blog\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2F5eqRuVp65eY%2Fhqdefault.jpg\" class=\"pL\"><div class=\"tC\"><div class=\"tT\" itemprop=\"name\">AI can&#039;t cross this line and we don&#039;t know why.<\/div><\/div><div class=\"play\"><\/div><div class=\"ctrl\"><div class=\"Lctrl\"><\/div><div class=\"Rctrl\"><\/div><\/div><\/div><noscript><a href=\"https:\/\/youtu.be\/5eqRuVp65eY\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/infernews.com\/blog\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2F5eqRuVp65eY%2F0.jpg\" alt=\"AI can&amp;#039;t cross this line and we don&amp;#039;t know why.\" width=\"853\" height=\"460\" \/><br \/>Watch this video on YouTube<\/a><\/noscript><meta itemprop=\"description\" content=\"Have we discovered an ideal gas law for AI? Head to https:\/\/brilliant.org\/WelchLabs\/ to try Brilliant for free for 30 days and get 20% off an annual premium subscription. Welch Labs Imaginary Numbers Book! https:\/\/www.welchlabs.com\/resources\/imaginary-numbers-book Welch Labs Posters: https:\/\/www.welchlabs.com\/resources Support Welch Labs on Patreon! https:\/\/www.patreon.com\/welchlabs Special thanks to Patrons: Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti, Brian Henry, Tim Palade, Petar Vecutin Learn more about WelchLabs! 
https:\/\/www.welchlabs.com TikTok: https:\/\/www.tiktok.com\/@welchlabs Instagram: https:\/\/www.instagram.com\/welchlabs REFERENCES A Neural Scaling Law from the Dimension of the Data Manifold: https:\/\/arxiv.org\/pdf\/2004.10802 First 2020 OpenAI Scaling Paper: https:\/\/arxiv.org\/pdf\/2001.08361 GPT-3 Paper: https:\/\/arxiv.org\/pdf\/2005.14165 Second 2020 OpenAI Scaling Paper: https:\/\/arxiv.org\/pdf\/2010.14701 Google DeepMind \u201cChinchilla Scaling\u201d Paper: https:\/\/arxiv.org\/abs\/2203.15556 Nice summary of Chinchilla Scaling: https:\/\/www.lesswrong.com\/posts\/6Fpvch8RR29qLEWNH\/chinchilla-s-wild-implications GPT-4 Technical Report: https:\/\/arxiv.org\/pdf\/2303.08774 Nice Neural Scaling Laws Summary: https:\/\/www.lesswrong.com\/posts\/Yt5wAXMc7D2zLpQqx\/an-140-theoretical-models-that-predict-scaling-laws Explaining Neural Scaling Laws: https:\/\/arxiv.org\/pdf\/2102.06701 High Cost of Training GPT-4: https:\/\/www.wired.com\/story\/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over\/ Nvidia V100 FLOPs: https:\/\/lambdalabs.com\/blog\/demystifying-gpt-3 Nvidia V100 Original Price: [https:\/\/www.microway.com\/hpc-tech-tips\/nvidia-tesla-v100-price-analysis\/#:~:text=Tesla GPU model,Key Points](https:\/\/www.microway.com\/hpc-tech-tips\/nvidia-tesla-v100-price-analysis\/#:~:text=TeslaGPUmodel,KeyPoints) Great paper on scaling up training infrastructure: https:\/\/arxiv.org\/pdf\/2104.04473 Eight Things to Know about LLMs: https:\/\/arxiv.org\/abs\/2304.00612 Emergent Properties of LLMs: https:\/\/arxiv.org\/abs\/2206.07682 Theoretical Motivation for Cross Entropy (Section 6.2): https:\/\/www.deeplearningbook.org\/ Some papers that appear to pass the compute efficient frontier https:\/\/arxiv.org\/pdf\/2206.14486 https:\/\/arxiv.org\/abs\/2210.11399 Leaked GPT-4 training info https:\/\/patmcguinness.substack.com\/p\/gpt-4-details-revealed https:\/\/www.semianalysis.com\/p\/gpt-4-architecture-infrastructure 
https:\/\/epochai.org\/blog\/tracking-large-scale-ai-models\"><\/div><\/div><div class=\"lL\" style=\"max-width:100%;width:853px;margin:5px auto;\"><\/div><figcaption><\/figcaption><\/figure>","protected":false},"excerpt":{"rendered":"<p>This video explores the neural scaling laws that govern the performance of AI models. Research has found that a model's error rate follows a power-law relationship with compute, model size, and dataset size, which means that performance improvements follow a simple mathematical rule that depends little on model architecture. Work from OpenAI and DeepMind has confirmed that these laws hold across a wide range of applications, but model performance eventually reaches a floor, and that floor is related to the intrinsic entropy of natural language. Finally, the video examines the \u201cmanifold hypothesis\u201d, which attempts to explain theoretically why performance follows a power law: the model is learning a low-dimensional manifold embedded in a high-dimensional data space, and the amount of data and the model size determine the resolution at which the model captures that manifold, which in turn affects performance. Although the theory agrees with experimental results to some extent, it does not yet amount to a complete theory of AI.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"googlesitekit_rrm_CAowvqSiDA:productID":""
,"footnotes":""},"categories":[27],"tags":[],"class_list":["post-3618","post","type-post","status-publish","format-standard","hentry","category-paper"],"_links":{"self":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/3618","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/comments?post=3618"}],"version-history":[{"count":0,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/3618\/revisions"}],"wp:attachment":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media?parent=3618"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/categories?post=3618"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/tags?post=3618"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}