Juhan Bae 배주한

Last updated · 05.22.2026

About

I am currently a research scientist in industry. I completed my PhD in machine learning at the University of Toronto in 2025, and received my HBSc in Computer Science and Statistics from the same university in 2019. My research focuses on training data attribution: tracing how individual training examples influence a model's predictions.

Selected Publications

01

Studying Large Language Model Generalization with Influence Functions

Roger Grosse*, Juhan Bae*, Cem Anil*, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez, Evan Hubinger, Kamilė Lukošiūtė, Karina Nguyen, Nicholas Joseph, Sam McCandlish, Jared Kaplan, Samuel Bowman

02

Training Data Attribution via Approximate Unrolled Differentiation

Juhan Bae, Wu Lin, Jonathan Lorraine, Roger Grosse

03

Better Training Data Attribution via Better Inverse Hessian-Vector Products

Andrew Wang, Elisa Nguyen, Runshi Yang, Juhan Bae, Sheila A. McIlraith, Roger Grosse

04

What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions

Sang Keun Choe, Hwijeen Ahn*, Juhan Bae*, Kewen Zhao*, Minsoo Kang, Youngseog Chung, Adithya Pratapa, Willie Neiswanger, Emma Strubell, Teruko Mitamura, Jeff Schneider, Eduard Hovy, Roger Grosse, Eric Xing

06

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Laura Ruis, Maximilian Mozes, Juhan Bae, Siddhartha Kamalakara, Dwaraknath Gnaneshwar, Acyr Locatelli, Robert Kirk, Tim Rocktäschel, Edward Grefenstette, Max Bartolo

07

Influence Functions for Scalable Data Attribution in Diffusion Models

Bruno Mlodozeniec, Runa Eschenhagen, Juhan Bae, Alexander Immer, David Krueger, Richard Turner

08

IF-Guide: Influence Function-Guided Detoxification of LLMs

Zachary Coalson, Juhan Bae, Nicholas Carlini, Sanghyun Hong