About me

I am Haiwen Huang (Chinese name: 黄海文). I have been a PhD student at the University of Tübingen since Jan. 2022, co-supervised by Prof. Andreas Geiger and Dr. Dan Zhang. I am also an ELLIS PhD student.

My PhD research focuses on improving the generalization of vision and multimodal models, enabling them to perform robustly across diverse tasks and domains. For example, I have developed methods that leverage 3D priors to enhance 2D object detection in GOOD (ICLR 2023) and use self-distillation to upsample features in Vision Foundation Models in LoftUp (coming soon). I have also contributed to building more reliable evaluations of open-vocabulary generalization in RENOVATE (NeurIPS 2024).

I envision my work through a three-tiered “research pyramid”: (1) Method Development, (2) Performance Evaluation, and (3) Normative Meta-Evaluation. My PhD projects focus on the first two levels, but my long-term goal is to address all three to ensure that AI truly benefits society.

Research Pyramid

🔴 I am currently seeking internships starting from Sept/Oct 2025 where I can explore methods for understanding and evaluating the generalization and capabilities of VLMs, as well as the scientific principles behind designing robust benchmarks for them.

Before my PhD studies, I completed my MSc in CS at the University of Oxford, advised by Prof. Yarin Gal, and my undergraduate studies in Mathematics at Peking University, advised by Prof. Bin Dong. I also worked for a year as a researcher at Megvii (previously known as Face++), with Xinyu Zhou as my group leader, studying large-scale annotation and OoD detection methods.

My CV is here.

Highlighted Research

Teaching