Measuring Information Distortion in Hierarchical Ultra-Long Novel Reconstruction: The Optimal Expansion Ratio
Abstract
A two-stage novel generation framework (outline -> section outline -> manuscript) is widely used in long-novel generation (e.g., DOME, Plan&Write, LongWriter), but its use in reconstructing ultra-long novels (>1M words) has received little study. Building on recent text compression methods (LLMZip, LLM2Vec), we conduct an information-theoretic analysis to quantify semantic distortion under different compression-expansion ratios, and we examine how outline length affects information preservation. Experiments on ultra-long novels show that the optimal compression-expansion ratio significantly reduces semantic distortion compared with non-optimal ratios.