❝Think RAG (Retrieval-Augmented Generation) is already powerful? Not so fast. Today you'll learn a trick called contextual compression that transforms your RAG system: higher efficiency, more accurate answers, less memory, and a boss who's impressed!
At its core, a RAG system is "retrieve first, generate second". You ask a question, the system searches the knowledge base, fishes out the related content, and hands it to a large language model to generate the answer.
Sounds great, but in practice it often goes like this:
For example:
You ask "What are the ethical issues in AI decision-making?", and the retrieved passages contain "the history of AI", "the advantages of AI", and "the disadvantages of AI". The content actually about ethics may be only a third of it.
So what do you do?
Don't panic. Today we're going to talk about contextual compression!
❝Contextual compression means using an LLM, after RAG retrieval, to trim away irrelevant content and keep only the parts most relevant to the question.
The benefits: shorter context, more accurate answers, and fewer tokens (and less memory) spent.
Compression isn't one-size-fits-all. There are three common flavors:
Selective (selective retention)
Keep only the sentences or paragraphs directly relevant to the question, quoted verbatim with no rewriting.
Summary (summary compression)
Condense the relevant content into a concise, information-dense summary.
Extraction (sentence extraction)
Extract only the sentences from the original that carry the key information, listed one by one.
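In practice, the three flavors differ mainly in the system prompt handed to the LLM. A minimal sketch; the prompt wording and the `build_messages` helper here are my own assumptions, not taken from the original post:

```python
# Hypothetical system prompts for the three compression flavors.
# The exact wording is an illustrative assumption.
COMPRESSION_PROMPTS = {
    "selective": (
        "Keep only the sentences or paragraphs from the text that are "
        "directly relevant to the user's question. Quote them verbatim; "
        "do not rewrite anything."
    ),
    "summary": (
        "Condense the parts of the text relevant to the user's question "
        "into a brief, information-dense summary."
    ),
    "extraction": (
        "Extract, sentence by sentence, only the sentences from the text "
        "that contain information relevant to the user's question."
    ),
}

def build_messages(chunk, query, compression_type):
    """Assemble the chat messages for one compression call."""
    return [
        {"role": "system", "content": COMPRESSION_PROMPTS[compression_type]},
        {"role": "user", "content": f"Question: {query}\n\nText:\n{chunk}"},
    ]
```

Swapping the system prompt is the only thing that changes between flavors; the rest of the pipeline stays identical.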
Different scenarios call for different flavors.
Before jumping into the code, let's look at the overall flow!
# 1. Document processing
text = extract_text_from_pdf(pdf_path)
chunks = chunk_text(text, size=1000, overlap=200)
embeddings = create_embeddings(chunks)

# 2. Build the vector store
vector_store = SimpleVectorStore()
for chunk, emb in zip(chunks, embeddings):
    vector_store.add_item(chunk, emb)

# 3. User question
query = "What are the ethical issues in AI decision-making?"
query_emb = create_embeddings(query)
top_chunks = vector_store.similarity_search(query_emb, k=10)

# 4. Contextual compression
compressed_chunks = []
for chunk in top_chunks:
    compressed, ratio = compress_chunk(chunk, query, compression_type="summary")
    compressed_chunks.append(compressed)

# 5. Generate the answer
context = "\n---\n".join(compressed_chunks)
answer = generate_response(query, context)
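The `SimpleVectorStore` above isn't defined in the post; a minimal in-memory stand-in with cosine-similarity search (my own sketch, not the post's actual class) might look like:

```python
import math

class SimpleVectorStore:
    """Minimal in-memory vector store with cosine-similarity search."""

    def __init__(self):
        self.items = []  # list of (text, embedding) pairs

    def add_item(self, text, embedding):
        self.items.append((text, embedding))

    @staticmethod
    def _cosine(a, b):
        # Cosine similarity between two equal-length vectors.
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def similarity_search(self, query_emb, k=5):
        # Rank stored chunks by similarity to the query embedding,
        # highest first, and return the top-k chunk texts.
        ranked = sorted(self.items,
                        key=lambda item: self._cosine(item[1], query_emb),
                        reverse=True)
        return [text for text, _ in ranked[:k]]
```

A real system would use a proper vector database or an approximate-nearest-neighbor index, but this is enough to make the pipeline above runnable end to end.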
The core compression function, as pseudocode:
def compress_chunk(chunk, query, compression_type):
    # Pick a system prompt based on compression_type,
    # then have the LLM keep only the query-relevant content.
    return compressed_chunk, compression_ratio
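Filling in that pseudocode: here is one way `compress_chunk` could look, with the LLM call injected as a plain callable so the function stays testable. The prompt wording and the `llm(system_prompt, user_prompt) -> str` interface are assumptions of this sketch, not the post's actual implementation:

```python
SYSTEM_PROMPTS = {
    "selective": "Keep only the passages directly relevant to the question, verbatim.",
    "summary": "Summarize only the content relevant to the question, concisely.",
    "extraction": "List, verbatim, only the sentences relevant to the question.",
}

def compress_chunk(chunk, query, compression_type="summary", llm=None):
    """Compress one retrieved chunk down to its query-relevant content.

    `llm` is any callable (system_prompt, user_prompt) -> str, e.g. a thin
    wrapper around your chat-completion API of choice.
    """
    system_prompt = SYSTEM_PROMPTS[compression_type]
    user_prompt = f"Question: {query}\n\nText:\n{chunk}"
    compressed = llm(system_prompt, user_prompt).strip()
    # Fraction of characters removed; 0.4 means the chunk shrank by 40%.
    compression_ratio = 1 - len(compressed) / len(chunk)
    return compressed, compression_ratio
```

Reporting the compression ratio alongside the text makes it easy to spot chunks that barely shrank, which usually means they were already on-topic.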
Suppose we have a white paper, 《AI伦理白皮书.pdf》 (an AI ethics white paper), and we ask it:
❝"What are the ethical issues of AI in decision-making?"
Original chunk (1,000 characters, excerpt):
Many AI systems, particularly deep learning models, are "black boxes," making it difficult to understand how they arrive at their decisions. Enhancing transparency and explainability is crucial for building trust and accountability.
Privacy and Security
AI systems often rely on large amounts of data, raising concerns about privacy and data security. Protecting sensitive information and ensuring responsible data handling are essential.
Job Displacement
The automation capabilities of AI have raised concerns about job displacement, particularly in industries with repetitive or routine tasks. Addressing the potential economic and social impacts of AI-driven automation is a key challenge.
...
After Selective compression (654 characters):
Many AI systems, particularly deep learning models, are "black boxes," making it difficult to understand how they arrive at their decisions. Enhancing transparency and explainability is crucial for building trust and accountability.
Establishing clear guidelines and ethical frameworks for AI development and deployment is crucial.
Protecting sensitive information and ensuring responsible data handling are essential.
Addressing the potential economic and social impacts of AI-driven automation is a key challenge.
As AI systems become more autonomous, questions arise about control, accountability, and the potential for unintended consequences.
After Summary compression (514 characters):
The ethical concerns surrounding the use of AI in decision-making include:
- Lack of transparency and explainability in AI decision-making processes
- Privacy and data security concerns due to reliance on large amounts of data
- Potential for job displacement, particularly in industries with repetitive or routine tasks
- Questions about control, accountability, and unintended consequences as AI systems become more autonomous
- Need for clear guidelines and ethical frameworks for AI development and deployment
After Extraction compression (335 characters):
Many AI systems, particularly deep learning models, are "black boxes," making it difficult to understand how they arrive at their decisions. Enhancing transparency and explainability is crucial for building trust and accountability.
Establishing clear guidelines and ethical frameworks for AI development and deployment is crucial.
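Plugging the character counts from the examples above into a quick comparison:

```python
original_len = 1000  # characters in the original chunk
compressed_lens = {"selective": 654, "summary": 514, "extraction": 335}

# Fraction of the original chunk each flavor kept.
retention = {name: length / original_len
             for name, length in compressed_lens.items()}
for name, kept in retention.items():
    print(f"{name}: kept {kept:.0%} of the original chunk")
```

Extraction is the most aggressive here, Selective the gentlest; which trade-off is right depends on how much you trust the LLM not to drop something important.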
In one sentence:
❝"RAG + contextual compression = a smarter, more efficient, cheaper AI retrieval system!"
Give contextual compression a try!
It will make your RAG system like a programmer after a successful diet: still substantial, but lean!
def rag_with_compression(pdf_path, query, k=10, compression_type="summary"):
    # 1. Document processing
    vector_store = process_document(pdf_path)
    # 2. Retrieval
    query_emb = create_embeddings(query)
    top_chunks = vector_store.similarity_search(query_emb, k)
    # 3. Compression
    compressed_chunks = batch_compress_chunks(top_chunks, query, compression_type)
    # 4. Generate the answer
    context = "\n---\n".join(compressed_chunks)
    answer = generate_response(query, context)
    return answer
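`batch_compress_chunks` isn't shown in the post. A straightforward sketch, with the per-chunk compressor injected as a callable so the helper stays self-contained; the `compress_fn(chunk, query, compression_type) -> (text, ratio)` signature is an assumption of this sketch:

```python
def batch_compress_chunks(chunks, query, compression_type="summary",
                          compress_fn=None):
    """Compress each retrieved chunk, dropping any that compress to nothing.

    `compress_fn(chunk, query, compression_type)` -> (compressed_text, ratio).
    """
    compressed = []
    for chunk in chunks:
        text, _ratio = compress_fn(chunk, query, compression_type)
        if text:  # an entirely irrelevant chunk may compress to ""
            compressed.append(text)
    return compressed
```

Dropping empty results matters: a chunk that is completely off-topic should vanish from the context rather than leave a blank separator behind.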
RAG isn't the finish line; compression is where the real gains are!
Next time you build AI retrieval, add a contextual compression pass and make your system smarter, more efficient, and better at understanding you!
In the next post we'll cover multi-document RAG compression and multi-turn conversation compression. Stay tuned!
Updated: 2025-07-02