Paper

Enhancing RAG With Hierarchical Text Segmentation

George Babakhanov
Aug 10, 2025 · 2 min read

Exploring a new method for improving RAG retrieval accuracy through better text chunking and semantic clustering.

Key Takeaways

  1. Weird cut-offs in chunking make retrieval inefficient: vector embeddings may not align well with queries, hurting precision and top-k ranking
  2. RAG and GraphRAG serve different purposes: GraphRAG excels when the text has dense entity relationships, while plain RAG can be better for semantic search
  3. Chunk size is critical: bigger chunks carry more information but lose focus, potentially missing relevant results. Testing different sizes is a MUST
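The cut-off problem in the first takeaway is easy to see with a naive fixed-size chunker (a minimal sketch, not the paper's method; the sample text is made up):

```python
def chunk_text(text: str, chunk_size: int) -> list[str]:
    """Split text into fixed-size character chunks (naive, cut-off-prone)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

text = "RAG retrieval depends heavily on how the source text is chunked."
small = chunk_text(text, 16)  # small[0] == "RAG retrieval de" -- cuts "depends" mid-word
large = chunk_text(text, 64)  # one broad chunk: more context, less focus
```

A chunk ending mid-word or mid-sentence embeds to a vector that represents neither the sentence it started nor the one it cut off, which is exactly what hurts top-k ranking.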

Summary

This paper presents an improved method for retrieving the right chunks to enhance LLM output. The approach:

  1. Segment text using a supervised model
  2. Cluster chunks using semantic relationships while preserving sequential order (cohesive clustering)

Each chunk gets both segment-level and cluster-level embeddings, giving the retriever more chances to match queries accurately. The benchmarks (NarrativeQA, QuALITY, QASPER) showed improved results, except at a chunk size of 2048, where the larger chunks lose focus.
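The dual-embedding idea can be sketched as scoring each chunk by the better of its two vectors (a toy sketch, not the paper's pipeline; the 2-D vectors and chunk texts are made up):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, chunks, top_k=1):
    """Score each chunk by the best of its segment- and cluster-level embeddings."""
    scored = []
    for chunk in chunks:
        score = max(cosine(query_vec, chunk["segment_vec"]),
                    cosine(query_vec, chunk["cluster_vec"]))
        scored.append((score, chunk["text"]))
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_k]]

# Hypothetical 2-D embeddings, just to show the mechanics
chunks = [
    {"text": "launch dates", "segment_vec": [1.0, 0.1], "cluster_vec": [0.9, 0.4]},
    {"text": "rocket fuel",  "segment_vec": [0.2, 1.0], "cluster_vec": [0.5, 0.9]},
]
retrieve([1.0, 0.0], chunks)  # -> ["launch dates"], via its segment embedding
```

A query that misses a chunk's own embedding can still match the broader cluster-level vector, which is where the extra recall comes from.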

What I Need to Learn More About

  • LSTM (Long Short-Term Memory): Used here for semantic segmentation; a recurrent neural network variant
  • LangChain: Library for building AI apps, need hands-on experience
  • Stochastic Gradient Descent: Optimization method used in training; want to understand its applications
  • Cliques Q: Mentioned but not fully understood
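As a starting point on SGD: the "stochastic" part means each update uses one randomly sampled data point instead of the whole dataset. A minimal sketch (my own toy example, fitting a single parameter w to minimize the mean squared error (w - x)^2; all values are made up):

```python
import random

def sgd(data, lr=0.05, epochs=100, seed=0):
    """Minimize the mean of (w - x)^2 over the data, updating w
    from one randomly sampled point at a time."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        for _ in range(len(data)):
            x = rng.choice(data)  # the "stochastic" part: one sample per step
            grad = 2 * (w - x)    # d/dw of (w - x)^2
            w -= lr * grad        # step against the gradient
    return w

w = sgd([2.0, 4.0, 6.0])  # converges near the data mean, 4.0
```

The minimizer of the mean squared error is the data mean, so w wanders around 4.0; with a fixed learning rate it never settles exactly, which is why real training schedules decay the rate.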

Terms Learned

  • Supervised Model: Trained using labeled input-output pairs, model adjusts to predict correct outputs
  • Unsupervised Model: No output labels, model finds its own patterns and clusters
  • Segment-level: Breaking content into groups (sentences, phrases, or multi-word chunks, not single words)
  • Cluster-level: Grouping semantically similar segments, using average vectors to create summary vectors for faster search
  • Entity: Important elements in text that can be used for improved search
  • Relationships (RAG): Connections between entities (e.g., Elon → Founded → SpaceX)
  • Cohesive Structure: Preserves original structure by grouping only adjacent chunks to maintain context
  • Epoch: Number of times the model sees the same training data; too many can cause overfitting
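The cohesive-structure and cluster-level terms above fit together: merge only adjacent segments whose embeddings are similar, then average each cluster into a summary vector. A toy sketch (the threshold and 2-D vectors are made up, not from the paper):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def cohesive_clusters(vectors, threshold=0.8):
    """Group only ADJACENT segment vectors (preserving sequential order)
    when a segment is similar enough to the one before it."""
    clusters = [[vectors[0]]]
    for prev, cur in zip(vectors, vectors[1:]):
        if cosine(prev, cur) >= threshold:
            clusters[-1].append(cur)  # extend the current cluster
        else:
            clusters.append([cur])    # topic shift: start a new cluster
    return clusters

def summary_vector(cluster):
    """Average the member vectors into one cluster-level summary vector."""
    dim = len(cluster[0])
    return [sum(v[i] for v in cluster) / len(cluster) for i in range(dim)]

segs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # third segment changes topic
clusters = cohesive_clusters(segs)           # two clusters: first two segs, then the third
```

Because only neighbors can merge, the original reading order survives, and the averaged summary vectors give the retriever a cheap coarse-grained index to search first.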
Written by
George Babakhanov
George Babakhanov is an engineer working at the intersection of artificial intelligence, systems, and real-world infrastructure. He builds reliable AI-driven systems, from model training and automation pipelines to fault-tolerant software and hardware integration. His work focuses on making complex systems understandable, deployable, and useful.