Hi, I'm Gabe Sarch. I'm a Postdoctoral Research Fellow in Princeton Language and Intelligence (PLI) at Princeton University. I completed my Ph.D. at Carnegie Mellon University in Machine Learning (MLD & NI) in 2025, where I was fortunate to be advised by Drs. Katerina Fragkiadaki and Mike Tarr. Previously, I held research positions at Microsoft Research and Yutori AI, and received the National Science Foundation Graduate Research Fellowship.
Human learning and reasoning are active processes: we move, probe, and explore to make sense of dynamic, multimodal environments. I aim to build agents whose learned strategies generalize to complex, novel tasks.
This work has two threads: (i) developing algorithms for generalist agents that learn to reason and act from interactive, multimodal experience, and (ii) reverse-engineering agentic intelligence in humans and animals.
“Grounded Reinforcement Learning for Visual Reasoning” accepted at NeurIPS 2025. See you in San Diego!
Gave a talk on grounded reinforcement learning for visual reasoning at the PLI Seminar.
Started as a Postdoctoral Research Fellow with Princeton Language and Intelligence (PLI) at Princeton University.
Serving on the AISTATS 2025 Program Committee.
Completed my Ph.D. in Machine Learning and Neural Computation at Carnegie Mellon University.
“Out of Sight, Not Out of Context? Egocentric Spatial Reasoning in VLMs Across Disjoint Frames” appearing at EMNLP 2025.
Released the ViGoRL preprint on grounded reinforcement learning for multimodal reasoning. Code, models, and datasets are all open-source here.
“Reanimating Images using Neural Representations of Dynamic Stimuli” accepted as an oral presentation at CVPR 2025.
“Multimodal Interactive Contextualized Real World Task Assistance from a Single Demonstration” published in ACL Findings 2025.
“VLM Agents Generate Their Own Memories” received a NeurIPS 2024 Spotlight recognition.
Joined Yutori as Technical Staff (AI) to build multimodal model infrastructure through Dec 2024.
“Towards Unified 2D-3D Visual Scene Understanding Foundation Models” spotlighted at CVPR 2024.
Started a research internship at Microsoft Research working on the MICA real-time assistance system.
Gave an invited talk on task planning with LLMs at Carnegie Mellon’s Search-based Planning Laboratory.
Presented “Open-Ended Instructable Embodied Agents” at CMU Catalyst’s LLM Agents Seminar.
Completed an M.S. in Machine Learning Research at Carnegie Mellon University.
“HELPER-X: A Unified Instructable Embodied Agent” presented at the ICLR 2024 LLM Agents Workshop.
Won the Embodied AI Workshop Rearrangement Challenge at CVPR 2023.
“Open-Ended Instructable Embodied Agents with Memory-Augmented LLMs” published in EMNLP Findings 2023.
“Brain Dissection: fMRI-trained Networks Reveal Spatial Selectivity” accepted at NeurIPS 2023.
“3D View Prediction Models of the Dorsal Visual Stream” presented at CCN 2023.
“Beyond Fixation: detailed characterization of neural selectivity in free-viewing primates” published in Nature Communications 2023.
Delivered the brAIn Seminar talk “Spatial Processing During Natural Scene Viewing.”
Gave an invited lecture in CMU’s Biologically Intelligent Exploration course on evidence-based decision making.
Runner-up in the Amazon Alexa Prize SimBot Embodied Dialogue Challenge.
“TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Common Sense Priors” accepted at ECCV 2022.
“Move to See Better: Self-Improving Embodied Object Detection” accepted at BMVC 2021.
Awarded the NSF Graduate Research Fellowship to support graduate research through 2025.
Began Ph.D. studies in Machine Learning and Neural Computation at Carnegie Mellon University.
Awarded the University of Rochester Center for Visual Science Research Fellowship.
Grounded Reinforcement Learning for Visual Reasoning. GH Sarch, S Saha, N Khandelwal, A Jain, MJ Tarr, A Kumar, K Fragkiadaki. NeurIPS 2025.
Grounding Task Assistance with Multimodal Cues from a Single Demonstration. GH Sarch, B Kumaravel, S Ravi, V Vineet, A Wilson. ACL 2025 Findings.
Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models. GH Sarch, Y Wu, MJ Tarr, K Fragkiadaki. EMNLP 2023 Findings.
🔥 [NEW!] In the ICLR 2024 Workshop on LLM Agents: HELPER-X achieves few-shot SoTA on four embodied AI benchmarks (ALFRED, TEACh, DialFRED, and the Tidy Task) using a single agent, with just simple modifications to the original HELPER.
Brain Dissection: fMRI-trained Networks Reveal Spatial Selectivity in the Processing of Natural Images. GH Sarch, MJ Tarr, K Fragkiadaki*, L Wehbe*. NeurIPS 2023.
Move to See Better: Towards Self-Improving Embodied Object Detection. GH Sarch*, Z Fang*, A Jain*, AW Harley, K Fragkiadaki. BMVC 2021.