Ya Gao

I am a third-year PhD student at the Aalto University School of Science, supervised by Pekka Marttinen. My research focuses on large language models (LLMs), particularly post-training, reasoning and agency, foundation-model self-improvement, and applications in healthcare.

Previously, I obtained my Master's degree in Machine Learning from Aalto University and my Bachelor's degree in Computer Science from Sichuan University.

Email  |  CV  |  Google Scholar

Research

Recent progress in LLM capabilities has been driven by the massive scaling of pre-training data. However, high-quality human data is a finite resource, and relying solely on imitation learning inherently caps model performance at the level of the human demonstrators. My research aims to move beyond this ceiling by studying how models can continuously evolve and improve themselves with only limited human supervision.

Specifically, I focus on (1) using self-generated reasoning trajectories and multi-model interaction to bootstrap performance without extensive human labeling, and (2) analyzing learning dynamics within reasoning traces (e.g., how supervision format, reasoning depth, and data composition affect model behavior) to enable more effective self-improvement. For real-world impact, I am interested in translating these advances into healthcare applications.

* indicates equal contribution, + indicates advisory role

Edit Knowledge, Not Just Facts via Multi-Step Reasoning over Background Stories
Ya Gao, Kalle Kujanpää, Pekka Marttinen, Harri Valpola, Alexander Ilin
Under review, 2026
paper

We treat knowledge internalization as a reasoning problem and propose a training procedure that introduces knowledge as background stories, enforces its use through multi-hop reasoning, and distills the resulting behavior.

Memento No More: Coaching AI Agents to Master Multiple Tasks via Hints Internalization
Minttu Alakuijala*, Ya Gao*, Georgy Ananov, Samuel Kaski, Pekka Marttinen, Alexander Ilin, Harri Valpola
Under review, 2025
paper | code

Through context distillation and efficient use of corrective human feedback, we train LLM agents to internalize knowledge and skills for multiple tasks without relying on ever-expanding prompts or prior demonstrations.

Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models
Zhengtao Zou, Ya Gao+, Jiarui Guan, Bin Li, Pekka Marttinen
Under review, 2025
paper

We propose a low-overhead framework that steers LVLMs toward visually grounded generation with a single-pass intervention, addressing the trade-off between effectiveness and efficiency.

Leveraging Large Language Models for Digital Phenotyping: Detecting Depressive State Changes for Patients with Depressive Episodes
Yunhao Yuan*, Ya Gao*, Hans Moen, Erkki Isometsä, Pekka Marttinen, Talayeh Aledavood
Under review, 2025
paper

We evaluate the potential of LLMs in analyzing digital phenotyping data to predict changes in depression severity among individuals experiencing major depressive episodes.

Query-Guided Self-Supervised Summarization of Nursing Notes
Ya Gao, Hans Moen, Saila Koivusalo, Miika Koskinen, Pekka Marttinen
ML4H, 2024
paper | code

We introduce a novel query-guided self-supervised domain adaptation approach for abstractive nursing note summarization. The method uses patient-related clinical queries for guidance, and hence does not need reference summaries for training.


Design and source code from Jon Barron's website.