Nicholas Roberts
Ph.D. Candidate
Department of Computer Sciences
University of Wisconsin–Madison
Advisor: Frederic Sala

I am a Ph.D. candidate in Computer Science at the University of Wisconsin–Madison, advised by Fred Sala in the Sprocket Lab, where I develop methods for efficient foundation model training and adaptation.

I have completed research internships at Meta’s Llama team (working on scaling laws with Dieuwke Hupkes), Together AI (hybrid language models with Tri Dao), and Microsoft Research (Physics of AGI group with Sébastien Bubeck). I received an honorable mention for the Jane Street Graduate Research Fellowship (2025) and was named an MLCommons Rising Star (2023).

My academic path began at Fresno City College before earning my B.S. at UC San Diego (working with Sanjoy Dasgupta and Gary Cottrell) and M.S. at Carnegie Mellon University (working with Ameet Talwalkar and Zack Lipton).

My research develops principled methods for training and adapting foundation models to scientific domains. I focus on three interconnected areas:

  1. discovering optimal data-compute tradeoffs during pretraining,
  2. automating model improvement beyond naive scaling, and
  3. understanding how foundation models interact with specialized data.

Through scaling laws research with Meta, I have shown that knowledge acquisition and reasoning tasks require fundamentally different compute allocation strategies—challenging the assumption of uniform scaling. I complement this with automated ML techniques that adapt general models to domains like protein folding and climate modeling, where labeled data is scarce and expert knowledge is critical.

My goal is enabling scientists to leverage state-of-the-art AI without requiring extensive ML expertise, accelerating discovery across fields from biology to physics.

Email: nick11roberts [at] cs [dot] wisc [dot] edu
Office: TBD, moving to the new building!

Fresh off the Press

Conference Publications

  • MLGym: A New Framework and Benchmark for Advancing AI Research Agents
    Deepak Nathani, Lovish Madaan, Nicholas Roberts, Nikolay Bashlykov, Ajay Menon, Vincent Moens, Amar Budhiraja, Despoina Magka, Vladislav Vorotilov, Gaurav Chaurasia, Dieuwke Hupkes, Ricardo Silveira Cabral, Tatiana Shavrina, Jakob Foerster, Yoram Bachrach, William Yang Wang, Roberta Raileanu.
    COLM 2025.
    [arXiv]

Journal Publications

  • Small Molecule Accurate Recognition Technology (SMART) to Enhance Natural Products Research
    Chen Zhang*, Yerlan Idelbayev*, Nicholas Roberts, Yiwen Tao, Yashwanth Nannapaneni, Brendan M. Duggan, Jie Min, Eugene C. Lin, Erik C. Gerwick, Garrison W. Cottrell, William H. Gerwick.
    Scientific Reports 2017.
    [Paper]

    Poster: Small Molecule Accurate Recognition Technology (SMART): A Digital Frontier to Reshape Natural Product Research
    Chen Zhang*, Yerlan Idelbayev*, Nicholas Roberts (presenter), Yiwen Tao, Yashwanth Nannapaneni, Brendan M. Duggan, Jie Min, Eugene C. Lin, Erik C. Gerwick, Garrison W. Cottrell, William H. Gerwick.
    Best Spotlight Presentation Award: Applied Machine Learning Days 2018.

Workshop Publications and Preprints

  • MoRe Fine-Tuning with 10x Fewer Parameters
    Wenxuan Tan, Nicholas Roberts, Tzu-Heng Huang, Jitian Zhao, John Cooper, Samuel Guo, Chengyu Duan, Frederic Sala.
    ICML 2024 Efficient Systems for Foundation Models (ES-FoMo) Workshop.
    ICML 2024 Workshop on Foundation Models in the Wild.
    [arXiv]

  • AutoML for Climate Change: A Call to Action
    Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White.
    NeurIPS 2022 Tackling Climate Change with Machine Learning Workshop.
    [arXiv]

University of Wisconsin–Madison

Ph.D. Computer Science
Mathematics minor
August 2021 - Present

Carnegie Mellon University

M.S. Machine Learning
August 2019 - May 2021
  • MSML Student Committee Leader
  • MSML Admissions Committee Member

University of California, San Diego

B.S. Computer Science
Mathematics minor
CSE Honors Program
September 2015 - March 2019
Magna Cum Laude and CSE Highest Distinction honors

Fresno City College

Computer Science
Leon S. Peters Honors Program
August 2013 - May 2015
  • Tutor for CIT 65 (Android Application Development)
  • Mathematics Tutor
  • Computer Science Tutor
  • President/Founder, Google Developer Group Fresno City College
  • Treasurer, Science and Engineering Club

Meta AI

Research Scientist Intern 🦙
(London, UK)
  • Llama Generative AI pretraining team with Dieuwke Hupkes
  • Discovered a new phenomenon: knowledge and reasoning skills have different scaling behavior
  • Incorporated the AUP score, originally used in our AutoML Decathlon competition, into Meta MLGym
  • Technologies used: Python, PyTorch

Together AI

Research Intern 🐍
(San Francisco, CA)
  • Research with Tri Dao
  • Analyzed the role of specific attention heads for long-range retrieval tasks in hybrid LLMs
  • Investigated the mechanisms that make hybrid LLMs good at in-context recall
  • Technologies used: Python, PyTorch, HuggingFace

Microsoft Research

Research Intern 🦄
(Redmond, WA)
  • Physics of AGI research group led by Sébastien Bubeck
  • Developed activation function search techniques for large-scale LLM pretraining
  • Developed learning curve extrapolation techniques to ablate architectural choices in transformers
  • Technologies used: Python, PyTorch, HuggingFace

Amazon AI

Applied Scientist Intern
(Seattle, WA)
  • AWS Transcribe research group led by Katrin Kirchhoff
  • Researched and developed methods for hypothesis rescoring in ASR systems using neural language modeling
  • Identified areas for improvement in many existing ASR systems when recognizing rare or zero-shot entities
  • Technologies used: Python, PyTorch, RWTH ASR, Kaldi, AWS

UnifyID

AI Fellow
Machine Learner Intern
(Redwood City, CA)
  • UnifyID research lab led by Vinay Uday Prabhu
  • Researched various ways in which research from network neuroscience could be applied to deep learning
  • Developed a novel model extraction attack against deep learning models for computer vision using just noise inputs
  • Technologies used: Python, Keras, PyTorch, TensorFlow, MATLAB, AWS

Intuit

Software Engineering Intern
(Mountain View, CA)
  • Intuit Technology Futures research group
  • Researched and implemented a novel deep learning model for controllable text generation as a service within Intuit
  • Developed a system for proposing alternative candidate sentences for Intuit content writers using deep learning
  • Investigated the use of dynamic topic models for customer support tickets to gain actionable insights over time
  • Technologies used: Python, PyTorch, TensorFlow, Gensim, Keras

Altum

Applied Scientist Intern
(La Jolla, CA)
  • Developed a language model to extract NLP features from text data about cryptocurrency trading
  • Investigated unsupervised learning techniques for extracting sentiment data in real time from online forums
  • Technologies used: Python, PyTorch

Teradata

Software Engineering Intern
(San Diego, CA)
  • Developed an open-source Spark-Teradata connector in Scala, forked from Databricks’ connector for AWS Redshift
  • Designed and implemented Teradata stored procedures in Java to mimic Redshift’s UNLOAD and COPY using S3
  • Improved training methodology and architecture of deep learning time series model used internally
  • Implemented a system for updating the time series dataset and fine-tuning the deep learning model
  • Technologies used: Scala, Java, Maven, Teradata SQL, AWS, TensorFlow, Flask

The Comeback Community

Volunteer Full Stack Developer
(Remote)
  • Developed site in Go, gohtml, and CSS on Google App Engine
  • Mentored new developers in web development
  • Technologies used: Go, Google App Engine, gohtml, HTML5, CSS3, JavaScript

Skqrl

Software Engineering Intern
(La Jolla, CA)
  • Developed web crawler to compile needfinding and product data using Scrapy and Selenium
  • Designed and implemented an extensible product search solution designed to handle future user search needs
  • Technologies used: Python, Scrapy, Selenium, Django, MySQL, JavaScript

ModSpot

Software Engineering Intern
(Remote)
  • Implemented new user account, edit profile, and login designs in Objective-C for iOS application
  • Refactored analytics code for gathering statistics on app usage, helping designers make more informed choices
  • Technologies used: Objective-C, Cocoa Touch, Flurry Analytics

Extracurricular interests:

  • Aspiring triathlete
  • Sailing
  • Pottery
  • Guitar
  • Interior design
  • A budding interest in plants
  • Longboard construction and woodworking

Extra-Extracurricular interests:

Credentials:

Photograph of me shredding, circa 2009:

Corn, circa long ago, colorized:

Mixtape:

1. Don’t forget to check out Poolsuite FM: the ultra-summer music player for the Macintosh Computer; transporting you to a virtual vacation where the sun never sets.