Scientific Citation Retrieval

Information Retrieval, NLP, Ensembles, Embeddings

Built with

Python, PyTorch, Transformers, BM25, scikit-learn

This was a competition task: given a query paper, retrieve the 100 papers it is most likely to cite from a corpus of 20,000, scored on NDCG@10.
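For readers unfamiliar with the metric, here is a minimal sketch of NDCG@10. It assumes binary relevance (a retrieved paper is either cited by the query paper or not); the competition's exact gain definition is not stated here, and the function names are my own for illustration.

```python
import math

def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k positions."""
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 has discount 1.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """NDCG@k with binary gains (assumption: 1 if cited, 0 otherwise)."""
    gains = [1.0 if doc_id in relevant_ids else 0.0 for doc_id in ranked_ids]
    # The ideal ranking places every relevant paper first.
    ideal = [1.0] * min(len(relevant_ids), k)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Because only the top 10 positions count, the ordering at the head of the 100-paper list matters far more than recall further down.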

No single retriever wins everywhere, so I built an ensemble of eight. Dense models (SPECTER2, SciNCL, MiniLM) capture meaning; sparse methods (BM25 over titles, abstracts, full text, and individual sections, plus TF-IDF) capture exact terms and rare entities; a citation-context pass reads how papers actually reference each other.

The eight rankings are merged with weighted reciprocal rank fusion, with domain and venue boosting on top. I tuned the fusion weights by coordinate descent against held-out relevance judgements, then submitted the final pipeline to the public leaderboard.
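As a rough illustration of the fusion and tuning steps, the sketch below implements weighted reciprocal rank fusion and a simple coordinate descent over the weights. It is not the pipeline's actual code: the smoothing constant k=60 is the conventional default rather than a value taken from this project, the weight grid and sweep count are made up, `evaluate` is a hypothetical callable that scores a fused ranking (for example, NDCG@10 averaged over held-out queries), and the domain and venue boosting is left out.

```python
from collections import defaultdict

def weighted_rrf(rankings, weights, k=60, top_n=100):
    """Fuse ranked lists: score(d) = sum over retrievers of w / (k + rank)."""
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        w = weights.get(name, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += w / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]

def coordinate_descent(rankings, evaluate, grid=(0.0, 0.5, 1.0, 2.0), sweeps=3):
    """Tune one weight at a time, keeping any change that improves the score."""
    weights = {name: 1.0 for name in rankings}
    best = evaluate(weighted_rrf(rankings, weights))
    for _ in range(sweeps):
        improved = False
        for name in list(rankings):
            for candidate in grid:
                trial = dict(weights)
                trial[name] = candidate
                score = evaluate(weighted_rrf(rankings, trial))
                if score > best:
                    best, weights, improved = score, trial, True
        if not improved:
            break  # a full sweep changed nothing, so stop early
    return weights, best
```

A toy usage example with two retrievers and a single query:

```python
rankings = {"dense": ["a", "b", "c"], "sparse": ["c", "a", "d"]}
fused = weighted_rrf(rankings, {"dense": 1.0, "sparse": 1.0})
tuned, score = coordinate_descent(rankings, lambda r: 1.0 if r[0] == "c" else 0.0)
```

Coordinate descent only finds a local optimum, but with a handful of weights and a cheap fusion step it is easy to rerun from different starting points.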