Quoc-Huy Trinh
Deep Generative Model · Computer Vision · NLP · Multimodal
Hi 👋, I'm Huy — a Vietnamese researcher and a Master's student in Computer Science at Aalto University. I'm passionate about exploring new ideas at the frontier of deep learning and computer vision, and I love building things just to see how they work. Outside of research, I enjoy reading, soccer, music, and films.
On the industry side, I was a Lead Machine Learning Engineer at SpexAI GmbH (acquired by Nynomic AG), where I led the development and delivery of an AI analysis system for plants. Before that, as a Data Scientist at VNG Corporation, I built a FaceID system, led its Face Security feature, and developed license plate detection and recognition as well as multi-object tracking for surveillance cameras.
My research is supervised by Prof. Minh-Triet Tran, Prof. Ulas Bagci, Prof. Sebastian Szyller, Prof. Bo Zhao, Dr. Debesh Jha, and MSc Hai-Dang Nguyen.
Research Interests
News
Education
Aalto University, Espoo, Finland
Prof. Bo Zhao
University of Science, VNU-HCM, Vietnam
MSc Hai-Dang Nguyen
Le Hong Phong High School for the Gifted
Selected Publications
Industry Experience
Lead Machine Learning Engineer — Spex A.I GmbH (Mar 2024 – May 2026)
Machine Learning Engineer — Spex A.I GmbH (Oct 2021 – Mar 2024)
Senior Research Scientist — SongGen (Mar 2024 – May 2025)
Data Scientist — VNG Corporation (2022 – 2024)
Founder — Aeyes – Smart Glasses for Blind (2020 – Present)
Software Developer — Microbox (2021 – 2022)
BaoData (2020)
Academic Experience
Research Intern — Xu Lab, Carnegie Mellon University (Jun 2025 – Present)
Investigating the spatial reasoning of 3D multimodal LLMs over volumetric medical data (SpatialMed).
Research Assistant — TAC Lab, Aalto University (Present)
Exploring in-context attribution and how it interacts with the knowledge stored in a model's weights.
Research Assistant — Bagci Lab, Northwestern University (Aug 2024 – Present)
Studying knowledge distillation for efficient medical image segmentation (SAM-EG), and the grounding and position-reasoning capabilities of multimodal LLMs (PRS-MED).
Research Intern — Empathic Computing Lab (2022 – 2023)
Performed multimodal analysis of EEG signals to study human cognition.
Research Intern — AIOZ (2021 – 2022)
Academic Service
Conference Reviewer
Journal Reviewer
Awards
Open Source
SongGen-AI / LLambada
Open-source music generation project.
