Quoc-Huy Trinh

Deep Generative Model  ·  Computer Vision  ·  NLP  ·  Multimodal

Hi 👋, I'm Huy — a Vietnamese researcher and a Master's student in Computer Science at Aalto University. I'm passionate about exploring new ideas at the frontier of deep learning and computer vision, and I love building things just to see how they work. Outside of research, I enjoy reading, soccer, music, and films.

On the industry side, I was a Lead Machine Learning Engineer at SpexAI GmbH (acquired by Nynomic AG), where I led the development and delivery of an AI analysis system for plants. Before that, as a Data Scientist at VNG Corporation, I built a FaceID system and led its Face Security feature, and also developed license plate detection & recognition and multi-object tracking for surveillance cameras.

My research is supervised by Prof. Minh-Triet Tran, Prof. Ulas Bagci, Prof. Sebastian Szyller, Prof. Bo Zhao, Dr. Debesh Jha, and M.Sc. Hai-Dang Nguyen.

Research Interests

Deep Generative Model  ·  Computer Vision  ·  NLP  ·  Audio Generation  ·  Medical Image Analysis  ·  Trustworthy AI

News

Jun 2026
🎉 Huy will present PRS-MED and Firebolt-VL at the CVPR Workshop 2026.
Mar 2026
New preprint: Beyond Medical Diagnostics: How Medical Multimodal Large Language Models Think in Space.

Education

Aalto University, Espoo, Finland

M.Sc.  ·  Major: Computer Science  ·  Minor: Machine Learning, Data Science and Artificial Intelligence  ·  GPA: 4.82 / 5  ·  Top 5%
Thesis: In-Context Attribution for Large Language Model
Supervisors: Prof. Sebastian Szyller, Prof. Bo Zhao

University of Science, VNU-HCM, Vietnam

B.Sc. (Honor Program)  ·  Major: Information Technology  ·  GPA: 3.58 / 4.0  ·  Top 15%
Thesis: Pose Knowledge Guidance for Person Re-Identification
Supervisors: Prof. Minh-Triet Tran, M.Sc. Hai-Dang Nguyen

Le Hong Phong High School for the Gifted

Selected Publications

2026
PRS-MED: Position Reasoning Segmentation in Medical Imaging
Quoc-Huy Trinh, Minh-Van Nguyen, Jung Zeng, Debesh Jha*, Ulas Bagci*
Firebolt-VL: Efficient Vision-Language Understanding with Cross-Modality Modulation
Quoc-Huy Trinh, Mustapha Abdullahi, Bo Zhao, Debesh Jha
2025
CMATalk: Cross Modality Alignment for Talking Head Generation
Xuan-Nam Cao, Quoc-Huy Trinh, Minh-Triet Tran
NeIn: Telling What You Don't Want
Nhat-Tan Bui, Dinh-Hieu Hoang, Quoc-Huy Trinh, Minh-Triet Tran, Truong Nguyen, Susan Gauch
2024
Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges
Debesh Jha, Vanshali Sharma, Debapriya Banik, Quoc-Huy Trinh, et al.
SAM-EG: Segment Anything Model with Edge Guidance framework for efficient Polyp Segmentation
Quoc-Huy Trinh, Hai-Dang Nguyen, Bao-Tram Nguyen Ngoc, Debesh Jha, Ulas Bagci, Minh-Triet Tran
PDGS: Pose-Guided Deep Supervision for Mitigating Clothes-Changing in Person Re-Identification
Quoc-Huy Trinh, Nhat-Tan Bui, Dinh-Hieu Hoang, Phuoc-Thao Vo Thi, Hai-Dang Nguyen, Debesh Jha, Ulas Bagci, Ngan Le, Minh-Triet Tran
KDAS: Knowledge distillation Framework via Attention Supervision for Polyp Segmentation
Quoc-Huy Trinh, Minh-Van Nguyen, Phuoc-Thao Vo Thi
Pose Knowledge Distill Guidance: Effective Pose guide learning for Person Re-Identification
Quoc-Huy Trinh, Phuoc-Thao Vo Thi, Minh-Triet Tran, Hai-Dang Nguyen
ICMR 2024 — ACM Best Paper Oral
2023
SpeechSyncNet: Speech to Talking Landmark via the fusion of prior frame landmark and the audio
Xuan-Nam Cao, Quoc-Huy Trinh, Van-Son Ho, Minh-Triet Tran
An objective validation of polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 transparency challenges
Debesh Jha, Vanshali Sharma, Debapriya Banik, Quoc-Huy Trinh, et al.
Graph for Transformer Feature: A New Approach for Face Anti-Spoofing
Quoc-Huy Trinh, Hieu Nguyen, Van Nguyen, Xuan-Mao Nguyen, Hai-Dang Nguyen
M2UNet: MetaFormer Multi-scale Upsampling Network for Polyp Segmentation
Quoc-Huy Trinh, Nhat-Tan Bui, Trong-Hieu Nguyen Mau, Minh-Van Nguyen, Hai-Minh Phan, Minh-Triet Tran, Hai-Dang Nguyen
Meta-Polyp: a baseline for efficient Polyp segmentation
Quoc-Huy Trinh
PEFNet: Positional Embedding Feature for Polyp Segmentation
Trong-Hieu Nguyen-Mau, Quoc-Huy Trinh, Nhat-Tan Bui, Phuoc-Thao Vo Thi, Minh-Van Nguyen, Xuan-Nam Cao, Minh-Triet Tran, Hai-Dang Nguyen
2022
EfficientNet for Brain-Lesion Classification — International Workshop BrainLes 2021
Quoc-Huy Trinh, Trong-Hieu Nguyen Mau, Radmir Zosimov, Minh-Van Nguyen
2021
SHREC 2021: Retrieval of Cultural Heritage Objects
Ivan Sipiran, Patrick Lazo, Cristian Lopez, ..., Quoc-Huy Trinh, et al.

Industry Experience

Lead Machine Learning Engineer — SpexAI GmbH (Mar 2024 – May 2026)

Machine Learning Engineer — SpexAI GmbH (Oct 2021 – Mar 2024)

Senior Research Scientist — SongGen (Mar 2024 – May 2025)

Data Scientist — VNG Corporation (2022 – 2024)

Founder — Aeyes – Smart Glasses for Blind (2020 – Present)

Software Developer — Microbox (2021 – 2022)

BaoData (2020)

Academic Experience

Research Intern — Xu Lab, Carnegie Mellon University (Jun 2025 – Present)

Investigating the spatial reasoning of 3D multimodal LLMs over volumetric medical data (SpatialMed).

Research Assistant — TAC Lab, Aalto University (Present)

Exploring in-context attribution and how it interacts with the knowledge stored in a model's weights.

Research Assistant — Bagci Lab, Northwestern University (Aug 2024 – Present)

Studying knowledge distillation for efficient medical image segmentation (SAM-EG), and the grounding and position-reasoning capabilities of multimodal LLMs (PRS-MED).

Research Intern — Empathic Computing Lab (2022 – 2023)

Performed multimodal analysis of EEG signals to study human cognition.

Research Intern — AIOZ (2021 – 2022)

Academic Service

Conference Reviewer

NeurIPS, CVPR, ICCV, ECCV, WACV, MICCAI, ACM MM, ICME, MMM, IJCNN

Journal Reviewer

TMI, TCSVT

Awards

2024  ·  Scholarship awarded by ICME 2024
2024  ·  Best Paper Award — AI-SIPM @ ICMR 2024
2023  ·  Third Prize — Vietnamese National Invention 2023
2022  ·  Second Prize — Vietnamese Invention for Society
2022  ·  Excellent Paper Award — ICMV 2022
2022  ·  Best Paper Award — RIVF 2022
2022  ·  First Prize — Innocity
2022  ·  Top 1 — Zoo Hackathon
2021  ·  Second Prize — Makethon (no first prize awarded)
2021  ·  Second Prize — Software for Student (no first prize awarded)
2021  ·  Top 2 Best Project — TensorFlow Community
2021  ·  Second Prize — Vietnamese Edtech Startup 2021
2021  ·  Top 2 Medical Track — MediaEval 2021
2021  ·  Top 3 — KO Hackathon
2020  ·  Top 3 Medical Track — MediaEval 2020
2020  ·  Third Prize — Vietnam National Talent Youth in Computer Science
2019  ·  Third Prize — Vietnam National Science and Engineering Fair
2018  ·  Bronze Medal — Robotacon

Open Source

SongGen-AI / LLambada

Open-source music generation project.