On the job market

I'm seeking full-time research scientist roles starting early 2026. If there's a good fit, please contact me at {last_name}{first_name[0]}@umich.edu.

At a very high level, my work aims to build verifiable, reliable, and trustworthy reasoning systems. Here's a summary of my relevant work and interests:

I am a final-year PhD candidate at the University of Michigan in Ann Arbor, advised by Lu Wang and Honglak Lee. I'm currently an intern at LG AI Research working on web agents. I am interested in test-time techniques, reasoning agents, and LLM post-training. I have been fortunate to work with many amazing people in different places. In summer 2024, I was part of the CodeGen team at Cohere led by Matthias Gallé, where I worked on large-scale model merging. In summer 2023, I was at AI2, working with Iz Beltagy and Hao Peng on training models to cite their pretraining data. In 2021, I was at Amazon AWS, working with Kathy McKeown. Prior to that, I was an intern at Naver Labs Europe, where I worked on controllable text generation and energy-based models with Hady Elsahar and Marc Dymetman.

Fun fact: I play the piano, write, and produce my own music.

( 🐦 Twitter / 💼 LinkedIn / 🎓 Scholar / 💻 Github / 📄 CV )



Selected Works

(For a complete list, visit my Google Scholar)

Process Reward Models That Think

Muhammad Khalifa, Rishabh Agarwal, Lajanugen Logeswaran, Jaekyeom Kim, Hao Peng, Moontae Lee, Honglak Lee, Lu Wang

2025

[Paper] [Code]

ThinkPRM visualization

A Distributional Approach to Controlled Text Generation

Muhammad Khalifa*, Hady Elsahar*, Marc Dymetman*

ICLR 2021

[Paper] [Code] [Blog]

Distributional Control visualization

GRACE: Discriminator-Guided Chain-of-Thought Reasoning

Muhammad Khalifa, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Lu Wang

EMNLP Findings 2023

[Paper] [Code]

GRACE visualization

Learning to Reason via Program Generation, Emulation, and Search

Nathaniel Weir*, Muhammad Khalifa*, Linlu Qiu, Orion Weller, Peter Clark

NeurIPS 2024

[Paper]

COGEX visualization

Source-Aware Training Enables Knowledge Attribution in Language Models

Muhammad Khalifa, David Wadden, Emma Strubell, Honglak Lee, Lu Wang, Iz Beltagy, Hao Peng

COLM 2024

[Paper]

Source-Aware Training visualization

If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs

Muhammad Khalifa, Yi-Chern Tan, Arash Ahmadian, Tom Hosking, Honglak Lee, Lu Wang, Ahmet Üstün, Tom Sherborne, Matthias Gallé

arXiv 2024

[Paper]

Model Merging visualization