Publications

This is a list of my academic writings, including research papers, lecture notes, articles and slides for lectures, talks.

Code Researcher: Deep Research Agent for Large Systems Code and Commit History [arXiv]

Ramneet Singh, Sathvik Joel, Abhav Mehrotra, Nalin Wadhwa, Ramakrishna B Bairi, Aditya Kanade, Nagarajan Natarajan

See abstract.

Large Language Model (LLM)-based coding agents have shown promising results on coding benchmarks, but their effectiveness on systems code remains underexplored. Due to the size and complexities of systems code, making changes to a systems codebase is a daunting task, even for humans. It requires researching about many pieces of context, derived from the large codebase and its massive commit history, before making changes. Inspired by the recent progress on deep research agents, we design the first deep research agent for code, called Code Researcher, and apply it to the problem of generating patches for mitigating crashes reported in systems code. Code Researcher performs multi-step reasoning about semantics, patterns, and commit history of code to gather sufficient context. The context is stored in a structured memory which is used for synthesizing a patch. We evaluate Code Researcher on kBenchSyz, a benchmark of Linux kernel crashes, and show that it significantly outperforms strong baselines, achieving a crash-resolution rate of 58%, compared to 37.5% by SWE-agent. On an average, Code Researcher explores 10 files in each trajectory whereas SWE-agent explores only 1.33 files, highlighting Code Researcher’s ability to deeply explore the codebase. Through another experiment on an open-source multimedia software, we show the generalizability of Code Researcher. Our experiments highlight the importance of global context gathering and multi-faceted reasoning for large codebases.

INTERLEAVE: A Faster Symbolic Algorithm for Maximal End Component Decomposition [CAV'25 Paper, Master Thesis, GitHub]

Suguman Bansal, Ramneet Singh (Author list in alphabetic order of last name)

See abstract.

This paper presents a novel symbolic algorithm for the Maximal End Component (MEC) decomposition of a Markov Decision Process (MDP). The key idea behind our algorithm INTERLEAVE is to interleave the computation of Strongly Connected Components (SCCs) with eager elimination of redundant state-action pairs, rather than performing these computations sequentially as done by existing state-of-the-art algorithms. Even though our approach has the same complexity as prior works, an empirical evaluation of INTERLEAVE on the standardized Quantitative Verification Benchmark Set demonstrates that it solves 19 more benchmarks (out of 379) than the closest previous algorithm. On the 149 benchmarks that prior approaches can solve, we demonstrate a 3.81x average speedup in runtime.

PolyVer: A Compositional Approach for Polyglot System Modeling and Verification [arXiv]

Pei-Wei Chen, Shaokai Lin, Adwait Godbole, Ramneet Singh, Elizabeth Polgreen, Edward A. Lee, Sanjit A. Seshia

See abstract.

Several software systems are polyglot; that is, they comprise programs implemented in a combination of programming languages. Verifiers that directly run on mainstream programming languages are currently customized for single languages. Thus, to verify polyglot systems, one usually translates them into a common verification language or formalism on which the verifier runs. In this paper, we present an alternative approach, PolyVer, which employs abstraction, compositional reasoning, and synthesis to directly perform polyglot verification. PolyVer constructs a formal model of the original polyglot system as a transition system where the update functions associated with transitions are implemented in target languages such as C or Rust. To perform verification, PolyVer then connects a model checker for transition systems with language-specific verifiers (e.g., for C or Rust) using pre/post-condition contracts for the update functions. These contracts are automatically generated by synthesis oracles based on syntax-guided synthesis or large language models (LLMs), and checked by the language-specific verifiers. The contracts form abstractions of the update functions using which the model checker verifies the overall system-level property on the polyglot system model. PolyVer iterates between counterexample-guided abstraction-refinement (CEGAR) and counterexample-guided inductive synthesis (CEGIS) until the property is verified or a true system-level counterexample is found. We demonstrate the utility of PolyVer for verifying programs in the Lingua Franca polyglot language using the UCLID5 model checker connected with the CBMC and Kani verifiers for C and Rust respectively.

Lecture Slides on Incorrectness Logic [Google Drive, GitHub]

See details.

These are slides based on Prof. Peter O’Hearn’s POPL'19 Paper on Incorrectness Logic, presented over two 1.5-hour lectures for the course COL731: Advanced Compiler Techniques offered by Prof. Sorav Bansal at IITD in Sem I, 2023-24.

The slides do not cover Section 5 of the paper because we decided to skip it in class. Most of the content borrows from the paper, except a few new examples, typo-fixes and reorganisation. I am deeply grateful to Prof. O’Hearn for writing a clear, accessible and interesting paper (and of course, for the content in these slides).

Slides on Compositional Reasoning for Weak Memory Models [Google Drive, GitHub]

See details.

These slides cover the following two papers on Compositional Reasoning for Weak Memory Models:

  1. Coughlin, Nicholas, Kirsten Winter, and Graeme Smith. “Compositional Reasoning for Non-Multicopy Atomic Architectures.” Formal Aspects of Computing, December 14, 2022, 3574137.

  2. Coughlin, Nicholas, Kirsten Winter, and Graeme Smith. “Rely/Guarantee Reasoning for Multicopy Atomic Weak Memory Models.” In Formal Methods, edited by Marieke Huisman, Corina Păsăreanu, and Naijun Zhan, 292–310. Lecture Notes in Computer Science. Cham: Springer International Publishing, 2021.

Mrunmayi Bhalerao and I presented these as part of the course Special Module in Concurrency : Formal Verification of Concurrent Programs (COV889) offered by Prof. Subodh Sharma at IIT Delhi, Sem II 2022-23. Most of the content in the slides borrows from the papers. We acknowledge and thank the authors for this content, and for writing these wonderful papers.

Lecture Notes on λ-definability ≡ Recursiveness, the Scott-Curry Theorem (Undecidability of β-equality in the untyped λ-Calculus) [Google Drive]

See details.

These notes were written for the course COL832: Proofs and Types, offered by Prof. S. Arun-Kumar at IITD in Sem I, 2022-23. In this lecture, we have two main goals:

  1. Complete a proof of the equivalence between the class of partial recursive functions and the class of λ-definable partial numeric functions.

  2. Prove the Scott-Curry Theorem (which implies undecidability of β-equality in the untyped λ-Calculus).

We start by defining the class of partial recursive functions and the notion of λ-definability and then look at two theorems proving the equivalence. Then, we define an encoding for the λ-terms to Church numerals, and use that to prove the Second Recursion Theorem. Using the Second Recursion Theorem, we prove the Scott-Curry Theorem, and end with a corollary showing the undecidability of β-equality in the λ-Calculus.

Report on “Program Synthesis Meets Smart Contracts” [Google Drive]

See details.

This was written as part of an Independent Study I did on Automatic Synthesis of Smart Contracts with Prof. Subodh Sharma in Sem II, 2021-22.

The article is about applications of program synthesis techniques to the field of smart contracts. We first discuss program synthesis and outline three general challenges that any program synthesis system has to solve. Then, we give a brief introduction to smart contracts, highlighting the aspects useful for their formal specification, modelling and synthesis. Next, we illustrate one important application area, namely optimisation, through discussing the paper “Synthesis of Super-Optimized Smart Contracts Using Max-SMT”. We explain how optimisation is framed as a synthesis problem, and examine how the paper deals with two important challenges in program synthesis - target specification and program space representation. We end with some directions for future work, discussing broad challenges in applying program synthesis to smart contracts.