
{"id":3284,"date":"2024-08-21T22:16:54","date_gmt":"2024-08-21T14:16:54","guid":{"rendered":"https:\/\/infernews.com\/?p=3284"},"modified":"2024-08-21T22:16:55","modified_gmt":"2024-08-21T14:16:55","slug":"%e5%b5%8c%e5%85%a5embedding%e5%b0%8d%e6%96%bc-rag-%e7%9a%84%e9%87%8d%e8%a6%81","status":"publish","type":"post","link":"https:\/\/infernews.com\/blog\/%e5%b5%8c%e5%85%a5embedding%e5%b0%8d%e6%96%bc-rag-%e7%9a%84%e9%87%8d%e8%a6%81\/","title":{"rendered":"The Importance of Embeddings for RAG"},"content":{"rendered":"\n<p>Embeddings are crucial to RAG systems but are often overlooked. This video covers embedding costs, storage considerations, and ways to reduce storage requirements with techniques such as dimensionality reduction and quantization. Learn how these methods can improve speed and cut costs without sacrificing much performance.<\/p>\n\n\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio\"><div class=\"lyte-wrapper\" title=\"The Hidden Cost of Embeddings in RAG and how to Fix it\" style=\"width:853px;max-width:100%;margin:5px auto;\"><div class=\"lyMe\" id=\"WYL_Ra8n_9wnHFs\" itemprop=\"video\" itemscope itemtype=\"https:\/\/schema.org\/VideoObject\"><div><meta itemprop=\"thumbnailUrl\" content=\"https:\/\/infernews.com\/blog\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FRa8n_9wnHFs%2Fhqdefault.jpg\" \/><meta itemprop=\"embedURL\" content=\"https:\/\/www.youtube.com\/embed\/Ra8n_9wnHFs\" \/><meta itemprop=\"duration\" content=\"PT15M45S\" \/><meta itemprop=\"uploadDate\" content=\"2024-08-20T09:45:02Z\" \/><\/div><div id=\"lyte_Ra8n_9wnHFs\" 
data-src=\"https:\/\/infernews.com\/blog\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FRa8n_9wnHFs%2Fhqdefault.jpg\" class=\"pL\"><div class=\"tC\"><div class=\"tT\" itemprop=\"name\">The Hidden Cost of Embeddings in RAG and how to Fix it<\/div><\/div><div class=\"play\"><\/div><div class=\"ctrl\"><div class=\"Lctrl\"><\/div><div class=\"Rctrl\"><\/div><\/div><\/div><noscript><a href=\"https:\/\/youtu.be\/Ra8n_9wnHFs\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/infernews.com\/blog\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FRa8n_9wnHFs%2F0.jpg\" alt=\"The Hidden Cost of Embeddings in RAG and how to Fix it\" width=\"853\" height=\"460\" \/><br \/>Watch this video on YouTube<\/a><\/noscript><meta itemprop=\"description\" content=\"Embeddings are crucial for a production-ready RAG system but often get overlooked. I cover the costs, storage considerations, and ways to reduce storage requirements using techniques like dimensionality reduction and quantization. Learn how these methods can improve speed and save costs without compromising too much on performance. LINKS: Blogpost: https:\/\/huggingface.co\/blog\/embedding-quantization \ud83d\udcbb RAG Beyond Basics Course: https:\/\/prompt-s-site.thinkific.com\/courses\/rag Let&#039;s Connect: \ud83e\uddbe Discord: https:\/\/discord.com\/invite\/t4eYQRUcXB \u2615 Buy me a Coffee: https:\/\/ko-fi.com\/promptengineering |\ud83d\udd34 Patreon: https:\/\/www.patreon.com\/PromptEngineering \ud83d\udcbcConsulting: https:\/\/calendly.com\/engineerprompt\/consulting-call \ud83d\udce7 Business Contact: engineerprompt@gmail.com Become Member: http:\/\/tinyurl.com\/y5h28s6h \ud83d\udcbb Pre-configured localGPT VM: https:\/\/bit.ly\/localGPT (use Code: PromptEngineering for 50% off). 
Signup for Newsletter, localgpt: https:\/\/tally.so\/r\/3y9bb0 00:00 Introduction to Embeddings in RAG Systems 00:47 Understanding Embedding Costs 01:17 Storage Costs and Considerations 03:32 Reducing Storage Needs 03:41 Dimensionality Reduction Techniques 04:24 Matryoshka Representation Learning 05:14 Precision Reduction Techniques 06:28 Quantization Study by Hugging Face 10:07 Implementing Quantization in Your Pipelines 12:56 Using Open Source Vector Stores 15:01 Conclusion and Final Thoughts All Interesting Videos: Everything LangChain: https:\/\/www.youtube.com\/playlist?list=PLVEEucA9MYhOu89CX8H3MBZqayTbcCTMr Everything LLM: https:\/\/youtube.com\/playlist?list=PLVEEucA9MYhNF5-zeb4Iw2Nl1OKTH-Txw Everything Midjourney: https:\/\/youtube.com\/playlist?list=PLVEEucA9MYhMdrdHZtFeEebl20LPkaSmw AI Image Generation: https:\/\/youtube.com\/playlist?list=PLVEEucA9MYhPVgYazU5hx6emMXtargd4z\"><\/div><\/div><div class=\"lL\" style=\"max-width:100%;width:853px;margin:5px auto;\"><\/div><figcaption><\/figcaption><\/figure>","protected":false},"excerpt":{"rendered":"<p>Embeddings are crucial to RAG 
systems but are often overlooked. This video covers embedding costs, storage considerations, and ways to reduce storage requirements with techniques such as dimensionality reduction and quantization. Learn how these methods can improve speed and cut costs without sacrificing much performance.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"googlesitekit_rrm_CAowvqSiDA:productID":"","footnotes":""},"categories":[109,27],"tags":[110,62],"class_list":["post-3284","post","type-post","status-publish","format-standard","hentry","category-rag","category-paper","tag-embedding","tag-rag"],"_links":{"self":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/3284","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/comments?post=3284"}],"version-history":[{"count":0,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/posts\/3284\/revisions"}],"wp:attachment":[{"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/media?parent=3284"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/categories?post=3284"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infernews.com\/blog\/wp-json\/wp\/v2\/tags?post=3284"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}