Paper
Enhancing RAG With Hierarchical Text Segmentation
George Babakhanov
Aug 10, 2025 · 2 min read
Exploring a new method for improving RAG retrieval accuracy through better text chunking and semantic clustering.
Key Takeaways
- Arbitrary cut-offs in chunking can make retrieval inefficient: the resulting vector embeddings may not align well with queries, losing precision and top-k ranking quality
- RAG and GraphRAG serve different purposes: GraphRAG excels when text is rich in entity relationships, while plain RAG can be better for straightforward semantic search
- Chunk size is critical: bigger chunks contain more information but lose focus, potentially missing relevant results. Testing different sizes is a must
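A quick way to see the cut-off problem from the first takeaway. The chunker and sample text below are my own toy illustration, not from the paper:

```python
def chunk_fixed(text: str, size: int) -> list[str]:
    """Naive character-based chunking with no regard for sentence boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

text = "SpaceX was founded by Elon Musk. The company builds reusable rockets."
for chunk in chunk_fixed(text, 40):
    print(repr(chunk))
# The first chunk ends mid-word ("...The com"): an arbitrary cut-off whose
# embedding blurs two half-sentences together and matches queries poorly.
```

Semantic segmentation, as used in the paper, aims to cut at meaning boundaries instead of fixed character offsets.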
Summary
This paper presents an improved method for retrieving the right chunks to enhance LLM output. The approach:
- Segment text using a supervised model
- Cluster chunks using semantic relationships while preserving sequential order (cohesive clustering)
Each chunk gets both segment-level and cluster-level embeddings, giving the retriever more chances to match queries accurately. The benchmarks (NarrativeQA, QuALITY, QASPER) showed improved results, except at a chunk size of 2048, where chunks become too broad to stay focused.
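A minimal sketch of how I understand the dual-embedding idea. The function names, the mean-vector cluster summary, and the max-of-two-scores retrieval rule are my assumptions, not the paper's exact method:

```python
import numpy as np

def cohesive_clusters(seg_vecs: np.ndarray, cluster_size: int) -> np.ndarray:
    """Group only ADJACENT segments (preserving sequential order) and
    summarize each cluster by the mean of its members' vectors."""
    return np.stack([seg_vecs[i:i + cluster_size].mean(axis=0)
                     for i in range(0, len(seg_vecs), cluster_size)])

def dual_retrieve(query: np.ndarray, seg_vecs: np.ndarray,
                  cluster_size: int = 2, top_k: int = 2) -> list[tuple[float, int]]:
    """Score the query against BOTH segment-level and cluster-level vectors;
    each segment keeps the better of its two scores, giving the retriever
    more chances to match."""
    cluster_vecs = cohesive_clusters(seg_vecs, cluster_size)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    scored = [(max(cos(query, v), cos(query, cluster_vecs[i // cluster_size])), i)
              for i, v in enumerate(seg_vecs)]
    return sorted(scored, reverse=True)[:top_k]
```

With toy 2-D vectors, a query near [1, 0] ranks the first segment highest; a segment can also surface purely because its cluster-level summary matched, which is the extra chance to match described above.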
What I Need to Learn More About
- LSTM (Long Short-Term Memory): Used for semantic segmentation; a recurrent neural network variant designed to retain long-range context
- LangChain: Library for building AI apps, need hands-on experience
- Stochastic Gradient Descent: Formula used in training, want to understand applications
- Cliques Q: Mentioned but not fully understood
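For the SGD item, the core idea is stepping the weights against the gradient of the loss, one sample at a time. A toy sketch recovering w ≈ 2 from data generated by y = 2x; the learning rate and data are made up for illustration:

```python
import random

# Toy data from y = 2x; SGD should recover w close to 2.
data = [(x, 2.0 * x) for x in range(1, 6)]
w, lr = 0.0, 0.01

for epoch in range(200):            # an epoch = one full pass over the data
    random.shuffle(data)
    for x, y in data:
        grad = 2 * (w * x - y) * x  # d/dw of the squared error (w*x - y)^2
        w -= lr * grad              # the SGD update rule
print(round(w, 3))  # → 2.0
```

The loop also illustrates the "Epoch" term below: each outer iteration is one pass over the same training data.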
Terms Learned
- Supervised Model: Trained on labeled input-output pairs; the model adjusts its parameters to predict the correct outputs
- Unsupervised Model: No output labels; the model finds its own patterns and clusters
- Segment-level: Breaking content into groups such as sentences, phrases, or multi-word chunks (not single words)
- Cluster-level: Grouping semantically similar segments, using average vectors to create summary vectors for faster search
- Entity: Important elements in text that can be used for improved search
- Relationships (GraphRAG): Connections between entities (e.g., Elon → Founded → SpaceX)
- Cohesive Structure: Preserves original structure by grouping only adjacent chunks to maintain context
- Epoch: One full pass of the model over the training data; too many epochs can cause overfitting
Written by
George Babakhanov
George Babakhanov is an engineer working at the intersection of artificial intelligence, systems, and real-world infrastructure. He builds reliable AI-driven systems, from model training and automation pipelines to fault-tolerant software and hardware integration. His work focuses on making complex systems understandable, deployable, and useful.
