Paper
Enhancing RAG With Hierarchical Text Segmentation
George Babakhanov
Aug 10, 2025 · 2 min read
Exploring a new method for improving RAG retrieval accuracy through better text chunking and semantic clustering.
Key Takeaways
- Arbitrary cut-offs in chunking can make retrieval inefficient: the resulting vector embeddings may not align well with queries, losing precision and top-k ranking quality
- RAG and GraphRAG serve different purposes: GraphRAG excels when text is rich in entity relationships, while plain RAG can be better for straightforward semantic search
- Chunk size is critical: bigger chunks contain more information but lose focus, potentially missing relevant results. Testing different sizes is a must
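A quick way to see the cut-off problem from the first takeaway. The chunker and sample text below are my own toy illustration, not from the paper:

```python
def chunk_fixed(text: str, size: int) -> list[str]:
    """Naive character-based chunking with no regard for sentence boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

text = "SpaceX was founded by Elon Musk. The company builds reusable rockets."
for chunk in chunk_fixed(text, 40):
    print(repr(chunk))
# The first chunk ends mid-word ("...The com"): an arbitrary cut-off whose
# embedding blurs two half-sentences together and matches queries poorly.
```

Semantic segmentation, as used in the paper, aims to cut at meaning boundaries instead of fixed character offsets.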
Summary
This paper presents an improved method for retrieving the right chunks to enhance LLM output. The approach:
- Segment text using a supervised model
- Cluster chunks using semantic relationships while preserving sequential order (cohesive clustering)
Each chunk gets both segment-level and cluster-level embeddings, giving the retriever more chances to match queries accurately. The benchmarks (NarrativeQA, QuALITY, QASPER) showed improved results, except at a chunk size of 2048, where chunks become too broad to stay focused.
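A minimal sketch of how I understand the dual-embedding idea. The function names, the mean-vector cluster summary, and the max-of-two-scores retrieval rule are my assumptions, not the paper's exact method:

```python
import numpy as np

def cohesive_clusters(seg_vecs: np.ndarray, cluster_size: int) -> np.ndarray:
    """Group only ADJACENT segments (preserving sequential order) and
    summarize each cluster by the mean of its members' vectors."""
    return np.stack([seg_vecs[i:i + cluster_size].mean(axis=0)
                     for i in range(0, len(seg_vecs), cluster_size)])

def dual_retrieve(query: np.ndarray, seg_vecs: np.ndarray,
                  cluster_size: int = 2, top_k: int = 2) -> list[tuple[float, int]]:
    """Score the query against BOTH segment-level and cluster-level vectors;
    each segment keeps the better of its two scores, giving the retriever
    more chances to match."""
    cluster_vecs = cohesive_clusters(seg_vecs, cluster_size)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    scored = [(max(cos(query, v), cos(query, cluster_vecs[i // cluster_size])), i)
              for i, v in enumerate(seg_vecs)]
    return sorted(scored, reverse=True)[:top_k]
```

With toy 2-D vectors, a query near [1, 0] ranks the first segment highest; a segment can also surface purely because its cluster-level summary matched, which is the extra chance to match described above.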
What I Need to Learn More About
- LSTM (Long Short-Term Memory): Used for semantic segmentation; a recurrent neural network variant designed to retain long-range context
- LangChain: Library for building AI apps, need hands-on experience
- Stochastic Gradient Descent: Formula used in training, want to understand applications
- Cliques Q: Mentioned but not fully understood
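For the SGD item, the core idea is stepping the weights against the gradient of the loss, one sample at a time. A toy sketch recovering w ≈ 2 from data generated by y = 2x; the learning rate and data are made up for illustration:

```python
import random

# Toy data from y = 2x; SGD should recover w close to 2.
data = [(x, 2.0 * x) for x in range(1, 6)]
w, lr = 0.0, 0.01

for epoch in range(200):            # an epoch = one full pass over the data
    random.shuffle(data)
    for x, y in data:
        grad = 2 * (w * x - y) * x  # d/dw of the squared error (w*x - y)^2
        w -= lr * grad              # the SGD update rule
print(round(w, 3))  # → 2.0
```

The loop also illustrates the "Epoch" term below: each outer iteration is one pass over the same training data.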
Terms Learned
- Supervised Model: Trained on labeled input-output pairs; the model adjusts its parameters to predict the correct outputs
- Unsupervised Model: No output labels; the model finds its own patterns and clusters
- Segment-level: Breaking content into groups such as sentences, phrases, or multi-word chunks (not single words)
- Cluster-level: Grouping semantically similar segments, using average vectors to create summary vectors for faster search
- Entity: Important elements in text that can be used for improved search
- Relationships (GraphRAG): Connections between entities (e.g., Elon → Founded → SpaceX)
- Cohesive Structure: Preserves original structure by grouping only adjacent chunks to maintain context
- Epoch: One full pass of the model over the training data; too many epochs can cause overfitting
Written by
George Babakhanov
George Babakhanov is an engineer working at the intersection of artificial intelligence, systems, and real-world infrastructure. He builds reliable AI-driven systems, from model training and automation pipelines to fault-tolerant software and hardware integration. His work focuses on making complex systems understandable, deployable, and useful.
