Welcome to Gaurav’s Homepage!
Note: This webpage was last updated on 10/20/2025.
About me
Hi folks, welcome to my personal homepage! I’m a second-year MS (thesis) student in Computer Science at Virginia Tech, where I am fortunate to be advised by Dr. Xuan Wang. I am also affiliated with the Sanghani Center for Artificial Intelligence and Data Analytics.
My research interests span natural language processing, small and large language models, and agentic systems and their real-world applications, particularly at the intersection of reasoning and efficiency. I have published papers at the EMNLP main conference and in top-tier (Q1) journals. Most recently, I spent Summer 2025 at Dell Technologies’ Global Office of the CTO, where I designed an agentic AI approach to autonomous resource allocation and received 3 Inspire Recognition Awards for positioning Dell PowerEdge as “AI-native” infrastructure.
Prior to joining Virginia Tech, I received my Bachelor’s degree in Computer Science from Manipal University Jaipur in July 2023. During my Bachelor’s program, I was fortunate to be supervised by Dr. Nitesh Pradhan and worked with Dr. Vijaypal Singh Dhaka and Dr. Mahesh Jangid. I was also the President’s Gold Medalist for Excellence in Research. After graduating, I worked at Dell Technologies for one year as a Machine Learning Engineer. Before Dell, I spent six months on Swiggy’s Applied Research (Computer Vision) team.
Research Interests
I work on improving small language models in reasoning—pushing lightweight LMs to think deeper, act smarter, and collaborate like expert teams. My research spans natural‑language processing, complex reasoning, and model efficiency, all aimed at creating efficient, low‑cost AI systems. My current focus areas include:
Complex Reasoning in Small Language Models & Multi‑Agent Self‑Evolution: How far can carefully designed prompting, multi-agent debate, and iterative fine‑tuning push models with only a few billion parameters? I study emergent reasoning, chain‑of‑thought, and which facets of reasoning are kept or lost after compression—revealing when and why small models succeed or fail (see ThinkSLM). I also design systems where multiple LMs critique, refine, and distill each other’s outputs. Iteratively fine‑tuning the resulting “debate traces” lets a single model self‑evolve without human‑labeled data (see DEBATE, TRAIN, EVOLVE).
Benchmark-Free Evaluation & Overthinking in Language Models: Evaluating language models fairly is becoming harder as static benchmarks risk contamination by training data. I developed BeyondBench, an evaluation framework using algorithmic problem generation that creates mathematically grounded problems on the fly—each from a combinatorial space larger than 10^15 unique instances. I also study the accuracy-efficiency tradeoff: Do language models waste cognitive cycles on problems that humans solve almost reflexively? My work on Overthinking in LLMs reveals that reasoning models generate ~18× more tokens while sometimes achieving lower accuracy, with extended reasoning budgets yielding diminishing returns.
Agentic AI with Small Language Models: I envision a future where efficient, specialized AI agents powered by small language models can autonomously collaborate to solve complex real-world problems. My work bridges the gap between research and production by building systems that not only reason better but also act smarter—orchestrating multiple agents, managing resources intelligently, and delivering practical impact at scale with minimal computational overhead.
News
- [Oct. 8, 2025] New preprint on benchmarking the accuracy-efficiency tradeoff in language models for basic math reasoning!
- [Sep. 30, 2025] Released BeyondBench—the Benchmark-Free Reasoning Evaluation of LLMs Leaderboard!
- [Sep. 29, 2025] New preprint on benchmark-free evaluation of reasoning in language models!
- [Aug. 20, 2025] 🎉 Two of my papers (ThinkSLM and DEBATE, TRAIN, EVOLVE) got accepted to EMNLP 2025 Main Conference!
- [Aug. 13, 2025] 🎉 Received 3 Inspire Recognition Awards from Dell Technologies Global CTO’s office for my internship work in Agentic AI for Autonomous Resource Allocation!
- [Jul. 5, 2025] New preprint on basic math reasoning and overthinking in LLMs!
- [May. 27, 2025] Started my summer internship @ Dell Office of the CTO—Digital Skills Research!
- [May. 21, 2025] New preprint on self-evolution of language model reasoning!
- [May. 10, 2025] Released the LLMThinkBench Leaderboard! It currently covers 17 open-source and 4 proprietary models!
- [May. 4, 2025] Released the DataSense framework for data visualization and story generation using language models; install it with pip install datasense!
- [Apr. 22, 2025] Released ThinkSLM, the SLM Reasoning Leaderboard!
- [Apr. 5, 2025] Released the LLMThinkBench framework for evaluating basic math reasoning and overthinking in language models; install it with pip install llmthinkbench!
- [Feb. 17, 2025] New preprint on the reasoning abilities of small language models.
- [Oct. 9, 2024] Accepted a Summer 2025 internship offer at Dell Technologies as an AI Research Intern in the Global Office of the CTO (Round Rock, TX)!
- [Sep. 4, 2024] Joined Wang’s Group to work on reasoning, small language models, and large language models!
- [Aug. 6, 2024] Began my M.S. in Computer Science at Virginia Tech!
Honors and Awards
- 🏆 3 Inspire Recognition Awards for positioning Dell PowerEdge as “AI-native” infrastructure, Dell Technologies (2025)
- 🥇 President’s Gold Medal for Excellence in Research, Manipal University Jaipur (2023)
- 🥈 Runner-up, Dell IT Development Program (ITDP) FY’23 Hackathon, Dell Technologies (2023)
- 🪙 Ranked 13/473 globally in Bitgrit Generative AI Competition, Bitgrit (2023)
- 🪙 Ranked 117/26,008, Amazon ML Challenge 2023, Amazon (2023)
- 🥇 Three-time recipient of the Student Excellence Award for publishing research, MUJ (2022 - 2023)
- 🥇 Best Research Project, Computer Science Department, Manipal University Jaipur (2022)
- 🥉 All India Grand Finalist, Precision Health Challenge 2021-22 Hackathon, Wipro GE Healthcare (2022)
- 🥉 All India Grand Finalist, India Automobile Hackathon, NEC and Mitsubishi (2022)
- 🥉 All India Grand Finalist, HACKBATTLE: Impact Through Data Hackathon, T-Systems (2022)
- 🥉 3rd Position, “Hack2Hire” Hackathon, Dell Technologies (2021)
- 🥇 Best Senior Hack, NPSiHacks, Devfolio (2021)
- 🪙 Kaggle 3X Expert (Top 20% in Competitions, Top 1% in Titanic, Digit Recognizer) (2020 - 2023)
