Nicholas Roberts

/ Academic Bio

I am a Ph.D. candidate in CS at the University of Wisconsin–Madison, where I am advised by Fred Sala and work alongside my many talented colleagues in the Sprocket Lab.

This past Fall, I interned with the Llama Generative AI team at Meta in London, where I worked with Dieuwke Hupkes on language model pretraining and evaluation. This past Summer, I interned at Together AI in San Francisco with Tri Dao, where I worked on hybrid language models. In Summer 2023, I interned with the Physics of AGI research group at Microsoft Research led by Sébastien Bubeck, where I also worked on language models.

Before starting my Ph.D., I had the pleasure of working with Ameet Talwalkar and Zack Lipton during my M.S. in the Machine Learning Department at Carnegie Mellon University. As an undergraduate, I was extremely fortunate to work with both Sanjoy Dasgupta and Gary Cottrell at the University of California, San Diego. Prior to all of this, I was a community college student at Fresno City College, where I was lucky enough to learn calculus, linear algebra, and C++ from Greg Jamison.

In 2025, I received an honorable mention for the Jane Street Graduate Research Fellowship, and in 2023, I was named an MLCommons Rising Star. I have also been awarded the Prove AI and UnifyID AI Fellowships in 2021 and 2019, respectively.

/ Research Interests

My research is motivated by the need to accelerate foundation model (FM) adoption toward solving humanity’s most challenging problems. Doing so is a long-term effort requiring substantial community involvement. The goal of my Ph.D. research is to take steps towards realizing this high-impact vision, categorized roughly into three sub-topics:

The science of scaling laws,
Automation for improving FMs beyond naive scaling, and
Determining how FMs interact with data.

While furthering these directions for language, I have had the unique opportunity to pretrain LLMs at industrial scales. To accelerate adoption of FMs beyond language, I have also worked on a wide array of problems from different scientific domains, including solving PDEs, protein folding, and climate modelling.

/ Contact

Email: nick11roberts [at] cs [dot] wisc [dot] edu
Office: CS Dept. 5378, 1210 W Dayton St, Madison, WI 53706

/ News

Excited to release two papers from my internship at Meta!
- We discover a new scaling law phenomenon! Compute optima for knowledge favor larger models, whereas those for reasoning favor more data.
- We release Meta MLGym: a framework for benchmarking and developing agents for AI research!
I am extremely fortunate to have received an honorable mention for the Jane Street Graduate Research Fellowship! Thank you, Jane Street!
At NeurIPS ’24, we improve Weak Supervision benchmarking and show that WS is stronger than was thought in prior work.
I moved to London for my internship with the Llama pretraining team 🦙 at Meta, working on scaling laws and skills!
I moved to San Francisco to intern at Together AI 🐍, working on hybrid LLMs!

/ Publications

Fresh off the Press

Compute Optimal Scaling of Skills: Knowledge vs Reasoning
Nicholas Roberts, Niladri Chatterji, Sharan Narang, Mike Lewis, Dieuwke Hupkes.
ACL Findings 2025.
[arXiv]

Reward Models Enable Scalable Code Verification by Trading Accuracy for Throughput
Gabriel Orlanski, Nicholas Roberts, Aws Albarghouthi, Frederic Sala.
Preprint.
[arXiv]

R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training
Albert Ge, Tzu-Heng Huang, John Cooper, Avi Trost, Ziyi Chu, Satya Sai Srinath Namburi GNVV, Ziyang Cai, Kendall Park, Nicholas Roberts, Frederic Sala.
ICML 2025 DIG-BUGS Workshop (oral).
ICML 2025 DataWorld Workshop.
[arXiv]

Conference Publications

Pretrained Hybrids with MAD Skills
Nicholas Roberts, Samuel Guo, Zhiqi Gao, Satya Sai Srinath Namburi GNVV, Sonia Cromp, Chengjun Wu, Chengyu Duan, Frederic Sala.
COLM 2025.
[arXiv]

MLGym: A New Framework and Benchmark for Advancing AI Research Agents
Deepak Nathani, Lovish Madaan, Nicholas Roberts, Nikolay Bashlykov, Ajay Menon, Vincent Moens, Amar Budhiraja, Despoina Magka, Vladislav Vorotilov, Gaurav Chaurasia, Dieuwke Hupkes, Ricardo Silveira Cabral, Tatiana Shavrina, Jakob Foerster, Yoram Bachrach, William Yang Wang, Roberta Raileanu.
COLM 2025.
[arXiv]

Stronger Than You Think: Benchmarking Weak Supervision on Realistic Tasks
Tianyi Zhang*, Linrong Cai*, Jeffrey Li, Nicholas Roberts, Neel Guha, Frederic Sala.
NeurIPS 2024.
[Paper] [arXiv]

Geometry-Aware Adaptation for Pretrained Models
Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala.
NeurIPS 2023.
[Paper] [arXiv]

Skill-it! A data-driven skills framework for understanding and training language models
Mayee Chen, Nicholas Roberts, Kush Bhatia, Jue Wang, Ce Zhang, Frederic Sala, Christopher Ré.
NeurIPS 2023 (spotlight).
[Paper] [arXiv]

Generative Modeling Helps Weak Supervision (and Vice Versa)
Benedikt Boecking, Nicholas Roberts, Willie Neiswanger, Stefano Ermon, Frederic Sala, Artur Dubrawski.
ICLR 2023.
[Paper] [arXiv] [Code]

AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels
Nicholas Roberts*, Xintong Li*, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala.
NeurIPS 2022.
[Paper] [arXiv] [Code] [Blog]

NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks
Renbo Tu*, Nicholas Roberts*, Mikhail Khodak, Junhong Shen, Frederic Sala, Ameet Talwalkar.
NeurIPS 2022.
[Paper] [arXiv] [Code] [Website] [Blog]

Lifting Weak Supervision To Structured Prediction
Harit Vishwakarma, Nicholas Roberts, Frederic Sala.
NeurIPS 2022.
[Paper] [arXiv]

Universalizing Weak Supervision
Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts, Frederic Sala.
ICLR 2022.
[Paper] [arXiv]

Rethinking Neural Operations for Diverse Tasks
Nicholas Roberts*, Mikhail Khodak*, Tri Dao, Liam Li, Christopher Ré, Ameet Talwalkar.
NeurIPS 2021.
[Paper] [arXiv] [Code] [Software Package] [Talk]

Preliminary version: Searching for Convolutions and a More Ambitious NAS
Nicholas Roberts*, Mikhail Khodak*, Tri Dao, Liam Li, Maria-Florina Balcan, Christopher Ré, Ameet Talwalkar.
AAAI 2021 Workshop on Learning Network Architecture During Training (plenary talk).

Learning from Discriminative Feature Feedback
Sanjoy Dasgupta, Akansha Dey, Nicholas Roberts, Sivan Sabato.
NeurIPS 2018.
[Paper]

Journal Publications

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Aarohi Srivastava, ..., Nicholas Roberts (276), ..., (442 authors).
Transactions on Machine Learning Research (TMLR) 2023 (Finalist for Outstanding Certification).
ICLR 2025.
[arXiv] [Code]

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole, ..., Nicholas Roberts (85), ..., (128 authors).
Northern European Journal of Language Technology (NEJLT) 2023.
[arXiv] [Code]

AutoML Decathlon: Diverse Tasks, Modern Methods, and Efficiency at Scale
Nicholas Roberts*, Samuel Guo*, Cong Xu*, Ameet Talwalkar, David Lander, Lvfang Tao, Linhang Cai, Shuaicheng Niu, Jianyu Heng, Hongyang Qin, Minwen Deng, Johannes Hog, Alexander Pfefferle, Sushil Ammanaghatta Shivakumar, Arjun Krishnakumar, Yubo Wang, Rhea Sukthanker, Frank Hutter, Euxhen Hasanaj, Tien-Dung Le, Mikhail Khodak, Yuriy Nevmyvaka, Kashif Rasul, Frederic Sala, Anderson Schneider, Junhong Shen, Evan Sparks.
PMLR NeurIPS 2022 Competition Track.
[Paper] [Website] [Submission Site] [Code] [Blog]

Small Molecule Accurate Recognition Technology (SMART) to Enhance Natural Products Research
Chen Zhang*, Yerlan Idelbayev*, Nicholas Roberts, Yiwen Tao, Yashwanth Nannapaneni, Brendan M. Duggan, Jie Min, Eugene C. Lin, Erik C. Gerwick, Garrison W. Cottrell, William H. Gerwick.
Scientific Reports 2017.
[Paper]

Poster: Small Molecule Accurate Recognition Technology (SMART): A Digital Frontier to Reshape Natural Product Research
Chen Zhang*, Yerlan Idelbayev*, Nicholas Roberts (presenter), Yiwen Tao, Yashwanth Nannapaneni, Brendan M. Duggan, Jie Min, Eugene C. Lin, Erik C. Gerwick, Garrison W. Cottrell, William H. Gerwick.
Best Spotlight Presentation Award: Applied Machine Learning Days 2018.

Workshop Publications and Preprints

Tabby: Tabular Adaptation for Language Models
Sonia Cromp, Satya Sai Srinath Namburi GNVV, Catherine Cao, Mohammed Alkhudhayri, Samuel Guo, Nicholas Roberts, Frederic Sala.
NeurIPS 2024 Table Representation Learning Workshop.
[Paper]

MoRe Fine-Tuning with 10x Fewer Parameters
Wenxuan Tan, Nicholas Roberts, Tzu-Heng Huang, Jitian Zhao, John Cooper, Samuel Guo, Chengyu Duan, Frederic Sala.
ICML 2024 Efficient Systems for Foundation Models (ES-FoMo) Workshop.
ICML 2024 Workshop on Foundation Models in the Wild.
[arXiv]

Understanding Neural Architecture Search by its Architecture Parameters
Nicholas Roberts, Yingyu Liang, Frederic Sala.
Midwest Machine Learning Symposium 2023.

ScriptoriumWS: A Code Generation Assistant for Weak Supervision
Tzu-Heng Huang, Harit Vishwakarma, Catherine Cao, Spencer Schoenberg, Nicholas Roberts, Frederic Sala.
ICLR 2023 Deep Learning for Code Workshop.

AutoML for Climate Change: A Call to Action
Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White.
NeurIPS 2022 Tackling Climate Change with Machine Learning Workshop.
[arXiv]

Decoding and Diversity in Machine Translation
Nicholas Roberts, Davis Liang, Graham Neubig, Zachary C. Lipton.
NeurIPS 2020 Resistance AI Workshop.
[arXiv]

A Simple Setting for Understanding Neural Architecture Search with Weight-Sharing
Mikhail Khodak, Liam Li, Nicholas Roberts, Maria-Florina Balcan, Ameet Talwalkar.
ICML 2020 AutoML Workshop.
[Paper]

Weight-Sharing Beyond Neural Architecture Search: Efficient Feature Map Selection and Federated Hyperparameter Tuning
Mikhail Khodak*, Liam Li*, Nicholas Roberts, Maria-Florina Balcan, Ameet Talwalkar.
MLSys 2020 On-Device Intelligence Workshop.
[Paper]

Deep Connectomics Networks: Neural Network Architectures Inspired by Neuronal Networks
Nicholas Roberts, Dian Ang Yap, Vinay U. Prabhu.
NeurIPS 2019 Real Neurons and Hidden Units Workshop.
[arXiv]

Using Deep Siamese Neural Networks to Speed up Natural Products Research
Nicholas Roberts, Poornav S. Purushothama, Vishal T. Vasudevan, Siddarth Ravichandran, Chen Zhang, William H. Gerwick, Garrison W. Cottrell.
NeurIPS 2019 Workshop on Machine Learning and the Physical Sciences.
[Paper]

Grassmannian Packings in Neural Networks: Learning with Maximal Subspace Packings for Diversity and Anti-Sparsity
Dian Ang Yap, Nicholas Roberts, Vinay U. Prabhu.
NeurIPS 2019 Workshop on Bayesian Deep Learning.
NeurIPS 2019 Workshop on Information Theory and Machine Learning.
[arXiv]

Model Weight Theft With Just Noise Inputs: The Curious Case of the Petulant Attacker
Nicholas Roberts, Vinay U. Prabhu, Matthew McAteer.
ICML 2019 Workshop on Security and Privacy of Machine Learning (spotlight).
[arXiv]

/ Education

University of Wisconsin–Madison

Ph.D. Computer Science
Mathematics minor
August 2021 - Present

Member of the Wisconsin Triathlon Team
Member of the Hoofer Sailing Club
Scratch Club volunteer (teaching CS to Madison area 4th-5th graders)

Carnegie Mellon University

M.S. Machine Learning
August 2019 - May 2021

MSML Student Committee Leader
MSML Admissions Committee Member

University of California, San Diego

B.S. Computer Science
Mathematics minor
CSE Honors Program
September 2015 - March 2019
Magna Cum Laude
CSE Highest Distinction

Tutor for Data Science 10 (Principles of Data Science)
Tutor for Data Science 20 (Data Structures for Data Science)
Data analyst, Triton Engineering Student Council
Machine learning workshop facilitator, Data Science Student Society at UCSD
House leader, Tau Beta Pi, California Psi

Fresno City College

Computer Science
Leon S. Peters Honors Program
August 2013 - May 2015

Tutor for CIT 65 (Android Application Development)
Mathematics Tutor
Computer Science Tutor
President/Founder, Google Developer Group Fresno City College
Treasurer, Science and Engineering Club

/ Industry Experience

Meta AI

research scientist intern 🦙
(London, UK)

Llama Generative AI pretraining team with Dieuwke Hupkes
Discovered a new phenomenon: knowledge and reasoning skills have different scaling behavior
Evangelized the AUP score, originally used in the AutoML Decathlon, for use within Meta MLGym
Technologies used: Python, PyTorch

Together AI

research intern 🐍
(San Francisco, CA)

Research with Tri Dao
Analyzed the role of specific attention heads for long-range retrieval tasks in hybrid LLMs
Investigated the mechanisms that make hybrid LLMs good at in-context recall
Technologies used: Python, PyTorch, HuggingFace

Microsoft Research

research intern 🦄
(Redmond, WA)

Physics of AGI research group led by Sébastien Bubeck
Developed activation function search techniques for large-scale LLM pretraining
Developed learning curve extrapolation techniques to ablate architectural choices in transformers
Technologies used: Python, PyTorch, HuggingFace

Amazon AI

applied scientist intern 
(Seattle, WA)

AWS Transcribe research group led by Katrin Kirchhoff
Researched and developed methods for hypothesis rescoring in ASR systems using neural language modeling
Identified areas for improvement in many existing ASR systems when recognizing rare or zero-shot entities
Technologies used: Python, PyTorch, RWTH ASR, Kaldi, AWS

UnifyID

ai fellow 
 machine learner intern 
(Redwood City, CA)

UnifyID research lab fearlessly led by Vinay Uday Prabhu
Researched various ways in which research from network neuroscience could be applied to deep learning
Developed a novel model extraction attack against deep learning models for computer vision using just noise inputs
Technologies used: Python, Keras, PyTorch, TensorFlow, MATLAB, AWS

Intuit

software engineering intern 
(Mountain View, CA)

Intuit Technology Futures research group
Researched and implemented a novel deep learning model for controllable text generation as a service within Intuit
Developed a system for proposing alternative candidate sentences for Intuit content writers using deep learning
Investigated the use of dynamic topic models for customer support tickets to gain actionable insights over time
Technologies used: Python, PyTorch, TensorFlow, Gensim, Keras

Altum

applied scientist intern 
(La Jolla, CA)

Developed language model to extract NLP features from text data regarding cryptocurrency trading
Investigated unsupervised learning techniques for extracting sentiment data in real time from online forums
Technologies used: Python, PyTorch

Teradata

software engineering intern 
(San Diego, CA)

Developed open source Spark-Teradata connector forked from Databricks’ connector for AWS Redshift in Scala
Designed and implemented Teradata stored procedures in Java to mimic Redshift’s UNLOAD and COPY using S3
Improved training methodology and architecture of deep learning time series model used internally
Implemented system for updating the time series dataset and fine-tuning the deep learning model
Technologies used: Scala, Java, Maven, Teradata SQL, AWS, TensorFlow, Flask

The Comeback Community

volunteer full stack developer 
(remote/Fresno, CA)

Developed site in Go, gohtml, and CSS on Google App Engine
Mentored new developers in web development
Technologies used: Go, Google App Engine, gohtml, HTML5, CSS3, JavaScript

Skqrl

software engineering intern 
(La Jolla, CA)

Developed web crawler to compile needfinding and product data using Scrapy and Selenium
Designed and implemented an extensible product search solution built to handle future user search needs
Technologies used: Python, Scrapy, Selenium, Django, MySQL, JavaScript

ModSpot

software engineering intern 
(remote)

Implemented new user account, edit profile, and login designs in Objective-C for iOS application
Refactored analytics code for gathering statistics on app usage, helping designers make more informed choices
Technologies used: Objective-C, Cocoa Touch, Flurry Analytics

/ Fun

Extracurricular interests:

Aspiring triathlete
Sailing
Pottery
Guitar
Interior design
A budding interest in plants
Longboard construction and woodworking

Extra-Extracurricular interests:

Into vintage lever espresso machines, and proud owner of a 1974 Olympia Cremina (sans asbestos, of course).
Ordained Dudeist priest. I can legally officiate weddings in the US. If you’d like to book me for your wedding, please feel free to reach out.
Head Researcher of Margarita Machine Lounge Therapy at Vacation Inc. Check out our selection of luxury sunscreens today! And use my referral link!¹
Unofficial Toyota Prius land speed record holder at Bonneville Speedway.
Trying to get into iceboat racing. Personal goal: become the fastest iceboat racer to ever hail from Fresno, CA (a low bar, indeed).
Did I mention I’m into espresso? I added a microcontroller to a cheap espresso machine to elevate it to the level of a $4,000 Decent.

Credentials:

Photograph of me shredding, circa 2009:

Corn, circa long ago, colorized:

Mixtape:

¹Don’t forget to check out Poolsuite FM: the ultra-summer music player for the Macintosh Computer; transporting you to a virtual vacation where the sun never sets.