GraphRAG improves on standard RAG in several important ways, especially for global sense-making questions over large text corpora.
Here is a summary of the key points:
Handling Global Questions:
Limitations of RAG: RAG excels at retrieving specific pieces of information but struggles to answer global questions that require summarizing an entire corpus, such as identifying its main themes.
GraphRAG Solution: GraphRAG addresses this by treating the problem as a query-focused summarization (QFS) task. It builds a comprehensive understanding of the entire corpus by summarizing groups of related entities.
Scalability:
QFS Constraints: Standard QFS methods fail to scale to the large quantities of text that typical RAG systems index.
Scalability of GraphRAG: GraphRAG scales with both the generality of user questions and the volume of source text. It achieves this by first building an entity knowledge graph and then generating summaries over groups of related entities.
Structured Summarization:
Graph-Based Text Indexing: GraphRAG uses a two-stage graph-based text indexing procedure, which encompasses:
Building a knowledge graph of entities from source documents.
Pre-generating community summaries of closely related entities.
This structured approach enables more systematic and comprehensive answers to queries; a toy sketch of the two-stage flow follows.
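The sketch below illustrates the two-stage control flow. Every helper here is a deliberately trivial stand-in (capitalized words as "entities", neighbor lists as "community summaries") so the flow runs as-is; real versions of each stage are detailed in the pipeline steps later in this post.

```python
# Toy sketch of GraphRAG's two-stage indexing; every helper is a trivial
# stand-in so the control flow runs as-is.

def build_knowledge_graph(chunks: list[str]) -> list[tuple[str, str]]:
    # Stage 1 stand-in: treat adjacent capitalized words as related entities.
    edges = []
    for chunk in chunks:
        names = [w.strip(".,") for w in chunk.split() if w[:1].isupper()]
        edges += list(zip(names, names[1:]))
    return edges

def summarize_communities(edges: list[tuple[str, str]]) -> dict[str, str]:
    # Stage 2 stand-in: one "community" per entity, "summarized" by listing
    # its neighbors; a real system runs community detection plus an LLM.
    neighbors: dict[str, set[str]] = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {e: f"{e} relates to {', '.join(sorted(ns))}" for e, ns in neighbors.items()}

summaries = summarize_communities(build_knowledge_graph(
    ["Alice met Bob in Paris.", "Bob visited Carol."]))
print(summaries["Bob"])  # -> "Bob relates to Alice, Carol, Paris"
```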
Better Quality of Answers:
Partial and Final Responses: Given a query, GraphRAG uses community summaries to generate partial responses, which are then combined and summarized into a single coherent final answer.
Steps in the GraphRAG Pipeline:
1. Source Docs to Text Chunks
Text Extraction and Chunking:
Source documents are split into text chunks. The chunk size matters: it determines the trade-off between the number of LLM calls and the quality of each call's context, and smaller chunks yield better recall of entity references.
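To make this concrete, here is a minimal token-based chunker using the tiktoken library. The 600-token chunk size and 100-token overlap are illustrative defaults chosen for this sketch, not values prescribed by GraphRAG:

```python
import tiktoken  # OpenAI's tokenizer library

def chunk_text(text: str, chunk_size: int = 600, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size tokens.

    Smaller chunks mean more LLM extraction calls but better recall of
    entity references within each chunk.
    """
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    step = chunk_size - overlap  # stride between chunk start positions
    return [enc.decode(tokens[i:i + chunk_size]) for i in range(0, len(tokens), step)]
```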
2. Text Chunks to Element Instances
Entity and Relationship Extraction:
Entities and relationships are identified and extracted from the text chunks using LLM prompts. This includes multi-part prompts that elicit entity names, types, and descriptions, as well as the relationships between entities.
Over several rounds of "gleanings," the LLM is re-prompted to review its previous output and extract any entities it missed.
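A sketch of this extraction-plus-gleaning loop is shown below, using the OpenAI Python client. The prompt wording, the model name, and the two-gleaning limit are illustrative assumptions for this sketch, not GraphRAG's actual prompts:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EXTRACT_PROMPT = (
    "From the text below, list entities as (name | type | description) and "
    "relationships as (source | target | description), one per line.\n\nText:\n{chunk}"
)
GLEAN_PROMPT = "Some entities may have been missed. Extract any remaining ones in the same format."

def extract_elements(chunk: str, max_gleanings: int = 2) -> list[str]:
    # First pass: multi-part extraction prompt over one text chunk.
    messages = [{"role": "user", "content": EXTRACT_PROMPT.format(chunk=chunk)}]
    results = []
    for _ in range(1 + max_gleanings):
        reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
        content = reply.choices[0].message.content
        results.append(content)
        # Gleaning round: feed the answer back and ask for anything missed.
        messages += [{"role": "assistant", "content": content},
                     {"role": "user", "content": GLEAN_PROMPT}]
    return results
```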
3. Element Instances to Element Summaries
Element Summaries from Extracted Elements:
The extracted entities and relationships are summarized into coherent blocks of descriptive text. This step ensures that even when the same entity is extracted in different forms, its descriptions can be aggregated and understood as a single entity.
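The sketch below illustrates the aggregation half of this step: per-instance descriptions are grouped under a normalized entity name so that variant spellings collapse into one entity. In a full system, an LLM would then rewrite each merged block into a single coherent summary:

```python
from collections import defaultdict

def merge_entity_descriptions(instances: list[tuple[str, str]]) -> dict[str, str]:
    """Aggregate per-instance descriptions into one block per entity.

    Normalizing names (here: simple case-folding) lets variants such as
    "GraphRAG" and "GRAPHRAG" collapse into a single entity.
    """
    merged: dict[str, list[str]] = defaultdict(list)
    for name, description in instances:
        merged[name.casefold()].append(description)
    return {name: " ".join(descs) for name, descs in merged.items()}

print(merge_entity_descriptions([
    ("GraphRAG", "A graph-based RAG pipeline."),
    ("GRAPHRAG", "Summarizes entity communities."),
]))  # both instances land under the single key "graphrag"
```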
4. Element Summaries to Graph Communities
Graph Construction and Community Detection:
The summarized elements form a graph in which nodes represent entities and edges represent relationships. Community detection algorithms partition this graph into communities of closely associated nodes.
The hierarchical nature of these communities allows efficient summarization at different levels of granularity.
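The following sketch builds a small entity graph with networkx and partitions it into communities. The GraphRAG paper uses hierarchical Leiden for this step; networkx's built-in Louvain method is used here as a readily available stand-in:

```python
import networkx as nx

# Build the entity graph: nodes are entities, edges are extracted relationships.
G = nx.Graph()
G.add_edges_from([
    ("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol"),   # one tight cluster
    ("Dave", "Erin"), ("Erin", "Frank"), ("Dave", "Frank"),   # another cluster
    ("Carol", "Dave"),                                        # weak bridge between them
])

# Partition the graph into communities of closely associated nodes.
# GraphRAG uses hierarchical Leiden; Louvain is a close, built-in stand-in.
communities = nx.community.louvain_communities(G, seed=42)
for i, members in enumerate(communities):
    print(f"Community {i}: {sorted(members)}")
```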
5. Graph Communities to Community Summaries
Generating Community Summaries:
Each community, beginning at the leaf level, is summarized. Prioritization ensures that the most relevant element summaries are included within the token limit of the LLM context window.
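A minimal sketch of this prioritization, under assumptions of my own: element summaries arrive ranked by some relevance proxy (node degree, say) and are greedily packed until the token budget runs out. Word count stands in for a real tokenizer to keep the example dependency-free:

```python
def build_community_summary(element_summaries: list[tuple[int, str]],
                            token_budget: int = 2000) -> str:
    """Pack the highest-priority element summaries into a token budget.

    Each item is (priority, text); priority might be the entity's node degree.
    Word count approximates token count for this dependency-free sketch.
    """
    used = 0
    selected = []
    for priority, text in sorted(element_summaries, reverse=True):
        cost = len(text.split())
        if used + cost > token_budget:
            continue  # skip anything that would overflow the context window
        selected.append(text)
        used += cost
    return "\n".join(selected)

# With a budget of 4 "tokens", only the higher-priority summary fits.
print(build_community_summary([(5, "Alice founded Acme."), (1, "Acme is in Paris.")],
                              token_budget=4))
```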
6. Community Summaries to Community Answers to Global Answer
Query Processing and Answer Generation:
Community summaries are used to generate intermediate answers to the user query. The summaries are shuffled and divided into chunks so they can be processed in parallel.
Each chunk produces a partial response along with a helpfulness score. The partial answers are then aggregated, and the highest-scoring ones are combined into the final global answer.
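This map-reduce flow can be sketched as follows. The llm() helper, the prompt wording, and the JSON scoring format are all hypothetical placeholders; the stub merely lets the sketch run end to end:

```python
import json
import random

def llm(prompt: str) -> str:
    # Hypothetical LLM call, stubbed so the sketch runs without an API key.
    return '{"answer": "stub", "score": 50}' if "JSON" in prompt else "stub"

def answer_query(query: str, community_summaries: list[str],
                 chunk_size: int = 3, top_k: int = 5) -> str:
    """Map-reduce sketch of GraphRAG's global answer generation."""
    random.shuffle(community_summaries)  # spread relevant info across chunks
    chunks = [community_summaries[i:i + chunk_size]
              for i in range(0, len(community_summaries), chunk_size)]

    # Map: each chunk yields a partial answer plus a 0-100 helpfulness score.
    partials = []
    for chunk in chunks:
        reply = llm(f"Answer '{query}' using only:\n" + "\n".join(chunk) +
                    '\nReply as JSON: {"answer": ..., "score": 0-100}')
        parsed = json.loads(reply)
        if parsed["score"] > 0:  # drop partial answers scored as unhelpful
            partials.append((parsed["score"], parsed["answer"]))

    # Reduce: combine the highest-scoring partials into one global answer.
    best = [answer for _, answer in sorted(partials, reverse=True)[:top_k]]
    return llm(f"Combine these partial answers to '{query}':\n" + "\n".join(best))
```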