Researchers from the Shanghai AI Laboratory and Tsinghua University have developed NeedleBench, a novel framework to evaluate the retrieval and reasoning capabilities of large language models (LLMs) in exceedingly long contexts (up to 1 million tokens). The tool is critical for real-world applications such as legal document analysis, academic research, and business intelligence, which rely…
