Retrieval usually runs in two stages: a fast retriever pulls a broad candidate set, then a reranker scores each candidate against the query with a more precise (and more expensive) model and reorders them. Because only the top few candidates reach the prompt, a better ordering raises the quality of the context fed into a RAG system, which in turn tends to improve its answers.
Open-source cross-encoder rerankers drop into existing vector-search pipelines as that second stage.
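
The two-stage flow can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's API: `first_stage`, `rerank`, and `toy_pair_score` are hypothetical names, the first stage is a simple token-overlap retriever standing in for vector search, and `toy_pair_score` (Jaccard overlap) stands in for a real cross-encoder so the example runs without downloading a model.

```python
from typing import Callable, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokenizer; good enough for a toy example."""
    return set(text.lower().split())


def first_stage(query: str, corpus: List[str], k: int) -> List[str]:
    """Cheap, recall-oriented retriever (stand-in for vector search):
    rank documents by the number of tokens they share with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return ranked[:k]


def toy_pair_score(query: str, doc: str) -> float:
    """Stand-in for a cross-encoder (an assumption for illustration only):
    Jaccard overlap, which penalizes long, unfocused documents."""
    q, d = tokenize(query), tokenize(doc)
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_pair: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Second stage: score every (query, candidate) pair and reorder."""
    scored = [(doc, score_pair(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


corpus = [
    "rerankers improve retrieval",
    "how do vector databases store embeddings and how do they improve retrieval latency at scale",
    "a recipe for sourdough bread",
    "retrieval with sparse indexes",
]
query = "how do rerankers improve retrieval"

candidates = first_stage(query, corpus, k=3)                 # broad candidate set
top = rerank(query, candidates, toy_pair_score, top_n=2)     # precise reordering
for doc, score in top:
    print(f"{score:.3f}  {doc}")
```

Here the first stage ranks the long database document highest because it shares the most tokens with the query, and the second stage moves the short, focused document to the top. In a real pipeline, `score_pair` would wrap an actual cross-encoder; one option is the `CrossEncoder` class in the sentence-transformers library, whose `predict` method accepts (query, document) pairs. Keeping the scorer as a plain callable is what lets the reranker be added to an existing retriever without changing the first stage.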