Interesting that you think it is a strong assumption
On this point, I think that it mainly disadvantages non-English-language models.
The SOTA for English models is updated most often because English is the language favored by research, so downloads of old models shift to the new ones. For other languages, downloads remain on the old models and are therefore no longer counted after one year.
So if, for example, I take the RAG example but in French rather than English, I will have to use a French embedding model. To create one, it turns out that even though more recent models exist (https://huggingface.co/almanach/moderncamembert-base was released eight months ago, for instance), people turn to the best-known one, CamemBERT (https://huggingface.co/almanach/camembert-base), to fine-tune. The latter was published on Hugging Face at the end of 2019, yet it is expected to reach around 10 million downloads this year. These downloads (and basically all those from January 2021 onwards) are not taken into account in their study (and therefore not counted for encoders, if we stick with the encoders vs. decoders comparison).
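To make that concrete, here is a minimal sketch of the kind of setup people typically reach for; the use of sentence-transformers and the pooling choice are my assumptions for illustration, only the camembert-base model id comes from the link above:

```python
# Illustrative sketch only: starting from the 2019 camembert-base checkpoint
# (rather than a newer French model) to build an embedding model for RAG.
from sentence_transformers import SentenceTransformer, models

# Wrap CamemBERT as a sentence-embedding backbone (mean pooling is an assumption here).
word_embedding_model = models.Transformer("almanach/camembert-base", max_seq_length=512)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Every fresh environment that runs this pulls camembert-base from the Hub,
# i.e. a download of a checkpoint well past the one-year counting window.
embeddings = model.encode(["Un exemple de phrase en français."])
print(embeddings.shape)
```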
And from conversations with HF Fellows specializing in other languages (Spanish, Korean, Dutch), this is a very common phenomenon.
Otherwise, indeed Figure 7 of the paper is better suited than Figure 6 for our discussion 😅