FocusLLM, developed by researchers from Tsinghua University and Xiamen University, is designed to extend the context length of language models. It divides a long text into chunks, then uses parallel decoding to extract the relevant information from each chunk and integrate it. This approach handles texts of up to 400K tokens at reduced computational cost. FocusLLM outperforms other methods on long-text comprehension tasks while maintaining low perplexity and high training efficiency, which makes it well suited to long-context applications.
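The chunk-then-process-in-parallel idea can be illustrated with a toy sketch. This is not the FocusLLM implementation: `extract_from_chunk` is a hypothetical stand-in for a per-chunk model call, and treating the last few tokens as a "local context" that every chunk is compared against is an assumption of this sketch. All function names and parameters here are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def extract_from_chunk(chunk, local_context):
    """Hypothetical stand-in for a per-chunk model call.

    A real system would run the model on the chunk; this toy version simply
    keeps the chunk tokens that also appear in the local context.
    """
    wanted = set(local_context)
    return [tok for tok in chunk if tok in wanted]


def process_long_text(tokens, local_context_size, chunk_size):
    """Chunk the earlier tokens and process every chunk independently.

    Assumption: the final local_context_size tokens act as the local context,
    and everything before them is the long "memory" to be chunked.
    """
    split = max(len(tokens) - local_context_size, 0)
    memory, local_context = tokens[:split], tokens[split:]
    chunks = split_into_chunks(memory, chunk_size)
    # Chunks do not depend on each other, so they can be handled in parallel.
    with ThreadPoolExecutor() as pool:
        extracted = list(
            pool.map(lambda c: extract_from_chunk(c, local_context), chunks)
        )
    return chunks, extracted, local_context


tokens = ["a", "b", "x", "c", "d", "y", "e", "x", "y"]
chunks, extracted, local_context = process_long_text(
    tokens, local_context_size=2, chunk_size=3
)
```

Because each chunk is processed on its own, the work per chunk depends on the chunk size rather than on the full text length, which is the intuition behind the reduced computational cost described above.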