Prasanna Mayilvahanan

I'm a fourth-year Ph.D. student at the Max Planck Institute for Intelligent Systems, Germany, and currently a Research Intern at Cohere, working on exploration methods for LLM reasoning. In summer 2024, I was a Research Intern at Apple MLR. Before my Ph.D., I completed my master's at USI Lugano / ETH Zurich and my bachelor's at IIT Guwahati.

Email  /  CV  /  Blog  /  Scholar  /  Twitter  /  Github

profile photo

Research

My research spans benchmarking LLM reasoning capabilities, the robustness of LLMs and VLMs, and the foundations of representation learning. I am particularly interested in developing methods that enable models to discover patterns beyond their training distribution. My current focus is on exploration-driven RL methods that improve generalization. Selected papers are highlighted.

MATH-Beyond: A Benchmark for RL to Expand Beyond the Base Model
Prasanna Mayilvahanan, Ricardo Dominguez-Olmedo, Thaddäus Wiedemer, Wieland Brendel
arXiv preprint, 2025
arXiv / code / data

MATH-B is a diagnostic benchmark for distinguishing whether post-training methods merely sharpen the base model's existing reasoning modes or genuinely discover novel solution paths beyond the base model's reach.

LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws
Prasanna Mayilvahanan*, Thaddäus Wiedemer*, Sayak Mallick, Matthias Bethge, Wieland Brendel
ICML, 2025
arXiv / project page

We find that training setups that differ substantially in architecture, tokenizer, optimizer, and other choices, but are trained on the same data to the same training loss, consistently achieve matching downstream performance across diverse tasks.

In Search of Forgotten Domain Generalization
Prasanna Mayilvahanan*, Roland S. Zimmermann*, Thaddäus Wiedemer, Evgenia Rusak, Attila Juhos, Matthias Bethge, Wieland Brendel
ICLR, 2025   (Spotlight)
arXiv / project page

CLIP's high performance on style-centric domain shifts is significantly influenced by the presence of such images in its training set.

Does CLIP's Generalization Performance Mainly Stem from High Train-Test Similarity?
Prasanna Mayilvahanan*, Thaddäus Wiedemer*, Evgenia Rusak, Matthias Bethge, Wieland Brendel
ICLR, 2024
arXiv / project page

CLIP's ability to generalize to standard OOD benchmarks does not mainly stem from exact duplicates and near-duplicates in its training dataset.

Compositional Generalization from First Principles
Thaddäus Wiedemer*, Prasanna Mayilvahanan*, Matthias Bethge, Wieland Brendel
NeurIPS, 2023
arXiv

We introduce a theoretical framework for analyzing the compositional generalization of neural networks in the regression setting.

Representation Learning for the Clustering of Multi-Omics Data
Gautier Viaud, Prasanna Mayilvahanan, Paul-Henry Cournède
IEEE/ACM TCBB, 2022
paper

We provide a neural network-based representation learning and clustering method for multi-omics data integration.


* denotes equal contribution
Template from Jon Barron's website.