VirProtRAG: Literature-grounded viral protein function annotation with retrieval-augmented generation
Viruses play indispensable roles in ecosystems and human health, yet deciphering their molecular functions remains challenging. Many viral proteins remain incompletely annotated or poorly characterized. Existing tools typically predict functional categories without linking them to verifiable evidence, which undermines the credibility of functional interpretation. Here, we present VirProtRAG, a viral protein function annotation framework that integrates information retrieval with evidence-grounded knowledge generation. It introduces three task-adapted components: a hybrid retrieval module combining keyword-based and semantic dense retrieval to maximize literature coverage, synonym-expanded and rank-aware ret