A Comparative Analysis of Static Word Embeddings for Hungarian
This repository contains static word embedding models extracted from the following BERT-based models:
Each model is provided in three static embedding variants:
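For background, one common way to obtain a static vector from a contextual model is to average a word's contextualized representations over several sentences. The sketch below illustrates this generic idea with Hugging Face transformers and the base model listed at the bottom of this card; it is not necessarily the exact extraction procedure used for these embeddings, and the helper function and example sentences are hypothetical.

```python
# Generic sketch only: average a word's contextual vectors over example
# sentences to obtain a static embedding. Not necessarily the exact
# extraction procedure used for the models in this repository.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "FacebookAI/xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

def static_vector(word, sentences):
    """Average the last-hidden-state vectors of the word's subword span
    across all sentences in which that span occurs."""
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    collected = []
    for sentence in sentences:
        enc = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
        ids = enc["input_ids"][0].tolist()
        # Locate the word's subword sequence and pool its vectors.
        for i in range(len(ids) - len(word_ids) + 1):
            if ids[i:i + len(word_ids)] == word_ids:
                collected.append(hidden[i:i + len(word_ids)].mean(dim=0))
    return torch.stack(collected).mean(dim=0) if collected else None

vec = static_vector("kutya", ["A kutya ugat.", "Egy kutya fut a parkban."])
```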
These embeddings were developed and evaluated as part of the paper: A Comparative Analysis of Static Word Embeddings for Hungarian by Máté Gedeon. They can be used for intrinsic tasks (e.g., word analogies) and extrinsic tasks (e.g., POS tagging, NER) in Hungarian NLP applications.
The paper can be found here: https://arxiv.org/abs/2505.07809
The corresponding GitHub repository: https://github.com/gedeonmate/hungarian_static_embeddings
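As a hedged usage example, the sketch below loads one of the static embedding files with gensim and runs a word-analogy query. The file name, its word2vec text format, and the Hungarian example words are assumptions for illustration; check the GitHub repository above for the actual file names and formats.

```python
# Usage sketch (assumption: the vectors are available in word2vec text
# format; the file name below is hypothetical).
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("hu_static.vec", binary=False)

# Intrinsic evaluation: a word-analogy query, király - férfi + nő ≈ királynő
# ("king" - "man" + "woman" ≈ "queen").
print(vectors.most_similar(positive=["király", "nő"], negative=["férfi"], topn=5))

# Extrinsic use: look up one fixed vector per token as input features,
# e.g. for a POS tagger or NER model.
tokens = ["A", "kutya", "ugat"]
features = [vectors[t] for t in tokens if t in vectors]
```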
If you use these models, code, or any part of the accompanying materials in your research, please cite:
@article{Gedeon_2025,
title={A Comparative Analysis of Static Word Embeddings for Hungarian},
volume={17},
ISSN={2061-2079},
url={http://dx.doi.org/10.36244/ICJ.2025.2.4},
DOI={10.36244/icj.2025.2.4},
number={2},
journal={Infocommunications Journal},
publisher={Infocommunications Journal},
author={Gedeon, Máté},
year={2025},
pages={28--34}
}
Base model: FacebookAI/xlm-roberta-base