
About

I am István Üveges, a computational linguist and Machine Learning specialist with a PhD in linguistics and a background in computer science. I focus on natural language processing, where I design and deploy Machine Learning models and build end-to-end pipelines for text analysis.
I have practical experience in large-scale text classification, sentiment and emotion detection, plain language identification, and domain-specific NLP in areas such as legal and financial texts. I also work on core NLP challenges including corpus building, linguistic annotation, morphological analysis, and spell-checking systems.
My technical toolkit covers Python, PyTorch, Hugging Face, scikit-learn, spaCy, FastAPI, PostgreSQL, and Docker. I am skilled at developing data pipelines, optimizing model performance, and creating interactive demos that bring NLP solutions into real-world use.
Alongside my engineering work, I lead the Tech & AI section of a professional blog, where I write accessible articles on current technology topics. This allows me to bridge research, practical applications, and public understanding of AI and Machine Learning.
I am passionate about applying NLP and ML to practical problems, and I am looking to contribute as a Machine Learning engineer, especially in NLP-driven projects.
Currently at the HUN-REN Centre for Social Sciences (POLTEXTLAB).
Focus areas
- Applied NLP for specialized domains – experience with legal, political, and financial texts, building models for classification, sentiment and emotion analysis, and plain language detection.
- Core NLP tasks – corpus building, linguistic annotation, morphological analysis, and spell-checking systems, combined with modern ML-based approaches.
- Model development and evaluation – including fine-tuning of transformer-based models and LLMs, with a focus on explainability, robustness, multilingual NLP, and data augmentation.
- LLM-based applications – experience with retrieval-augmented generation (RAG) pipelines and domain-specific adaptation of large language models.
Tech stack
- Programming & ML frameworks – Python, PyTorch, Hugging Face, scikit-learn, spaCy.
- APIs and deployment – FastAPI for serving models, Docker for containerization, PostgreSQL for data storage, and basic front-end solutions for interactive demos.
- Workflow – building data pipelines, fine-tuning transformer models and LLMs, and optimizing end-to-end NLP systems from preprocessing to deployment.
Publications & Outreach
- Lead editor of the Tech & AI section at a professional blog – writing accessible articles on AI and Machine Learning for a broader audience.
- Author of peer-reviewed publications on NLP, sentiment analysis, plain language, and applied Machine Learning.
- Actively bridging research, engineering practice, and public understanding of new technologies.