1. Abstract

Background:

  1. summarization → Traditional RAG works well for specific questions (“When was Company X founded?”), but it struggles with broad, global ones (“What are the main ideas in all these documents?”). Such questions require summarizing the whole dataset rather than retrieving a few passages, a task known as query-focused summarization (QFS).
  2. scalability → Prior QFS methods, meanwhile, do not scale to the quantities of text indexed by typical RAG systems.
  3. GraphRAG → To get both scalability and summarization, GraphRAG combines knowledge graph generation with query-focused summarization.

Given a question, each community summary is used to generate a partial response, before all partial responses are again summarized in a final response to the user.
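The query-time step described above is a map-reduce over community summaries. A minimal sketch, assuming a generic `llm` callable that takes a prompt and returns text; the function name `answer_globally` and the prompt wording are illustrative and are not taken from the paper:

```python
from typing import Callable, List


def answer_globally(
    question: str,
    community_summaries: List[str],
    llm: Callable[[str], str],  # hypothetical: any prompt -> text function
) -> str:
    """Answer a global question from community summaries (map, then reduce)."""
    # Map step: one partial response per community summary.
    # These calls are independent, so they could run in parallel.
    partial_responses = [
        llm(
            f"Question: {question}\n"
            f"Community summary: {summary}\n"
            "Answer the question using only this summary."
        )
        for summary in community_summaries
    ]

    # Reduce step: summarize all partial responses into one final response.
    combined = "\n".join(f"- {p}" for p in partial_responses)
    return llm(
        f"Question: {question}\n"
        f"Partial answers:\n{combined}\n"
        "Combine these into a single final answer."
    )
```

With N community summaries this makes N + 1 LLM calls: N for the partial responses and one for the final summary.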

1. Introduction

GraphRAG contrasts with vector RAG (text embeddings) in its ability to answer queries that require global sensemaking over the entire data corpus.

2. Background

3. Methods

The high-level data flow of the GraphRAG approach and pipeline:

<strong>Community detection</strong> is used to partition the graph index into groups of elements (nodes, edges, covariates) that the LLM can summarize in parallel at both indexing time and query time.
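To make "partitioning the graph into groups" concrete, the sketch below scores a candidate partition by modularity, the quantity that Leiden-style algorithms try to maximize. It does not implement Leiden itself; the `modularity` helper and the toy graph are assumptions made for illustration, and the graph is treated as undirected and unweighted:

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def modularity(
    edges: Iterable[Tuple[Hashable, Hashable]],
    community_of: Dict[Hashable, int],
) -> float:
    """Modularity Q = sum over communities c of (L_c / m - (d_c / 2m)^2).

    m   = total number of edges
    L_c = number of edges with both endpoints inside community c
    d_c = sum of the degrees of the nodes in community c
    """
    edges = list(edges)
    m = len(edges)
    intra = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph: two triangles joined by a single bridge edge (c, d).
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
lumped = {node: 0 for node in split}

q_split = modularity(toy_edges, split)    # about 0.357
q_lumped = modularity(toy_edges, lumped)  # 0.0
```

The two-triangle split scores higher than putting every node in one community, which is why a modularity optimizer would prefer it. Each resulting group can then be handed to the LLM for summarization independently of the others.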


Entities & Relationships → Knowledge Graph

| Component | Purpose | Typical technique (as described or implied) |
| --- | --- | --- |
| LLM extraction | Identify entities/relations/claims | Prompt-based, few-shot examples |
| Entity matching | Merge identical names | Exact string match (default), fuzzy possible |
| Graph construction | Store nodes/edges | Simple adjacency list or NetworkX graph |
| Edge weighting | Track frequency of relationships | Count duplicates |
| Aggregation & summarization | Produce node/edge descriptions | LLM summarization |
| Community detection | Find clusters | Leiden algorithm (modularity optimization) |
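The entity matching, graph construction, and edge weighting rows can be sketched together in plain Python. This is a minimal illustration under stated assumptions: the `(source, target, description)` tuple format, the sample data, and the `build_graph` helper are all hypothetical, and edges are treated as undirected:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Relation = Tuple[str, str, str]  # (source entity, target entity, description)

# Hypothetical output of the LLM extraction step.
extracted: List[Relation] = [
    ("Company X", "Alice", "Alice founded Company X"),
    ("Alice", "Company X", "Alice is the CEO of Company X"),
    ("Alice", "Bob", "Alice hired Bob"),
]


def build_graph(relations: Iterable[Relation]):
    """Build a weighted adjacency list from extracted relationships."""
    weights: Dict[Tuple[str, str], int] = defaultdict(int)
    descriptions: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for source, target, description in relations:
        # Entity matching: exact string match, so identical names share a node.
        # Sorting the pair makes the edge undirected (an assumption here).
        key = tuple(sorted((source, target)))
        # Edge weighting: every duplicate relationship increments the count.
        weights[key] += 1
        # Descriptions are kept so an LLM can later summarize each edge.
        descriptions[key].append(description)

    adjacency: Dict[str, Dict[str, int]] = defaultdict(dict)
    for (u, v), weight in weights.items():
        adjacency[u][v] = weight
        adjacency[v][u] = weight
    return dict(adjacency), dict(descriptions)


adjacency, descriptions = build_graph(extracted)
```

Here the two Alice and Company X relationships collapse into one edge of weight 2, and both descriptions are retained as input for the later aggregation and summarization step.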
