About

I am István Üveges, a computational linguist and Machine Learning specialist with a PhD in linguistics and a background in computer science. I focus on natural language processing, where I design and deploy Machine Learning models and build end-to-end pipelines for text analysis.

I have practical experience in large-scale text classification, sentiment and emotion detection, plain language identification, and domain-specific NLP in areas such as legal and financial texts. I also work on core NLP challenges including corpus building, linguistic annotation, morphological analysis, and spell-checking systems.

My technical toolkit covers Python, PyTorch, and Hugging Face, plus scikit-learn, spaCy, FastAPI, PostgreSQL, and Docker. I am skilled at developing data pipelines, optimizing model performance, and creating interactive demos that bring NLP solutions into real-world use.

In addition to engineering work, I lead the Tech & AI section of a professional blog, where I write accessible articles on current technology topics. This allows me to bridge research, practical applications, and public understanding of AI and Machine Learning.

I am passionate about applying NLP and ML to practical problems, and I am looking to contribute as a Machine Learning engineer, especially in NLP-driven projects.

Currently at HUN-REN Centre for Social Sciences (POLTEXTLAB).

Download short CV (PDF) | Get in touch | Read the blog

Focus areas

Applied NLP for specialized domains – experience with legal, political, and financial texts, building models for classification, sentiment and emotion analysis, and plain language detection.
Core NLP tasks – corpus building, linguistic annotation, morphological analysis, and spell-checking systems, combined with modern ML-based approaches.
Model development and evaluation – including fine-tuning of transformer-based models and LLMs, with a focus on explainability, robustness, multilingual NLP, and data augmentation.
LLM-based applications – experience with retrieval-augmented generation (RAG) pipelines and domain-specific adaptation of large language models.

Tech stack

Programming & ML frameworks – Python, PyTorch, Hugging Face, scikit-learn, spaCy.
APIs and deployment – FastAPI for serving models, Docker for containerization, PostgreSQL for data storage, and basic front-end solutions for interactive demos.
Workflow – building data pipelines, fine-tuning transformer models and LLMs, and optimizing end-to-end NLP systems from preprocessing to deployment.

Publications & Outreach

Lead editor of the Tech & AI section at a professional blog – writing accessible articles on AI and Machine Learning for a broader audience.
Author of peer-reviewed publications on NLP, sentiment analysis, plain language, and applied Machine Learning.
Actively bridging research, engineering practice, and public understanding of new technologies.

Workflow

📂 Data → 🧹 Preprocessing → 🤖 Model Training → 🚀 Deployment → 💻 Demo