Abstract
This paper explores fundamental questions in an era when AI coding
assistants such as GitHub Copilot are widely adopted: what do developers truly
value and criticize in AI coding assistants, and what does this reveal about
their needs and expectations in real-world software development? Unlike
previous studies that conducted observational research in controlled and
simulated environments, we analyze extensive, first-hand user reviews of AI
coding assistants, which capture developers' authentic perspectives and
experiences drawn directly from their actual day-to-day work contexts. We
identify 1,085 AI coding assistants from the Visual Studio Code Marketplace.
Although they account for only 1.64% of all extensions, we observe a surge in
these assistants: over 90% of them have been released within the past two years. We
then manually analyze user reviews sampled from 32 AI coding assistants with
sufficient installations and reviews, constructing a comprehensive taxonomy of
user concerns and feedback about these assistants. We manually annotate the
attitude each review expresses toward specific aspects of the coding
assistants, yielding nuanced insights into user satisfaction and
dissatisfaction regarding specific features, concerns, and overall tool
performance. Building on these findings, including the observation that users
demand not only intelligent suggestions but also context-aware, customizable,
and resource-efficient interactions, we propose five practical implications and
suggestions to guide the enhancement of AI coding assistants toward better
satisfying user needs.
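As an illustration of the aspect-level annotation described above, the minimal sketch below shows one way such records could be represented; the field names, label set, and example values are our own assumptions rather than the paper's actual annotation schema.

```python
# Illustrative sketch only: a possible record structure for aspect-level
# attitude annotations of marketplace reviews. Field names and the attitude
# label set are assumptions, not the authors' actual coding scheme.
from dataclasses import dataclass
from typing import Literal

Attitude = Literal["positive", "negative", "neutral"]

@dataclass
class AspectAnnotation:
    extension: str      # marketplace extension the review belongs to (hypothetical id)
    review_id: str      # identifier of the sampled user review (hypothetical)
    aspect: str         # taxonomy category the review mentions
    attitude: Attitude  # annotated attitude toward that aspect

# A single review may yield several annotations, one per aspect it mentions.
example = [
    AspectAnnotation("some-ai-assistant", "r-001", "suggestion quality", "positive"),
    AspectAnnotation("some-ai-assistant", "r-001", "resource usage", "negative"),
]
```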
Abstract
This study investigates the implementation and efficacy of DeputyDev, an
AI-powered code review assistant developed to address inefficiencies in the
software development process. Code review is often highly inefficient: it is
time-consuming, feedback is inconsistent, and review quality frequently falls
short of expectations. Using our
telemetry data, we observed that at TATA 1mg, pull request (PR) processing
exhibited significant inefficiencies, with average pick-up and review times of
73 and 82 hours, respectively, resulting in a 6.2-day closure cycle. The review
cycle was marked by prolonged iterative communication between the reviewing and
submitting parties. Research from the University of California, Irvine
indicates that interruptions can lead to an average of 23 minutes of lost
focus, critically affecting code quality and timely delivery. To address these
challenges, we developed DeputyDev's PR review capabilities, which provide
automated, contextual code reviews. We conducted a rigorous double-controlled
A/B experiment involving over 200 engineers to evaluate DeputyDev's impact on
review times. The results demonstrated a statistically significant reduction in
both average per-PR (23.09%) and average per-line-of-code (40.13%) review
durations. After we implemented safeguards to exclude outliers, DeputyDev has
been effectively rolled out across the entire organisation. Additionally, it
has been made available to external companies as a Software-as-a-Service (SaaS)
solution, currently supporting the daily work of numerous engineering
professionals. This study explores the implementation and effectiveness of
AI-assisted code reviews in improving development workflow timelines and code quality.
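For readers who want a concrete sense of the reported metrics, the hedged sketch below shows one straightforward way a percent reduction in mean review duration could be computed from two experiment arms; the durations, arm names, and the per-line-of-code note are purely illustrative assumptions and do not reproduce the study's actual data or statistical methodology.

```python
# Hypothetical sketch: computing the relative reduction in mean review time
# between a control arm (no DeputyDev) and a treatment arm (with DeputyDev).
# All numbers below are invented for illustration.
from statistics import mean

control_hours   = [80.0, 95.5, 70.2, 110.0, 60.8]   # review durations without DeputyDev
treatment_hours = [61.0, 72.4, 55.9, 84.3, 47.1]    # review durations with DeputyDev

def percent_reduction(baseline, treated):
    """Relative reduction of the treated mean versus the baseline mean, in percent."""
    return (mean(baseline) - mean(treated)) / mean(baseline) * 100

print(f"Reduction in mean review time: {percent_reduction(control_hours, treatment_hours):.2f}%")

# A per-line-of-code variant would divide each duration by the PR's changed
# lines before averaging, mirroring the per-LOC metric mentioned above.
```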