Generative AI tools are increasingly being integrated into professional workflows and business applications. This raises questions of how such tools will affect the productivity of workers and labor demand. This paper presents results of early experimental research on these questions for software developers and discusses the results in the context of the broader automation literature. The technology industry and software development in particular were early adopters of these tools, and as such may offer a preview of impacts across the economy. We also highlight several policy implications, including the need for education and upskilling as well as ways to encourage AI development to complement and not substitute for labor. We intend this paper to be illustrative of expected changes to work and policy implications arising from the proliferation of generative AI tools.
Generative AI uses deep learning models trained on very large datasets — including text, images, video, audio — to yield systems capable of content generation, preliminary reasoning, and flexible question-answering. These systems have demonstrated professional-level performance on a number of educational and vocational exams as well as qualitative results that impress professional software developers. These capabilities and outputs have led many to speculate about their implications for knowledge work.
Software development and the technology industry are early adopters of generative AI technologies. Prominent tools include GitHub Copilot (note that one of the authors is a GitHub employee), Amazon CodeWhisperer, and Replit Ghostwriter, among others. AI has demonstrated uses across the software development lifecycle: brainstorming applications, suggesting code, writing documentation, running tests against a specification, and scanning code for vulnerabilities before it is released and deployed. Some applications can produce outputs useful across multiple stages in the lifecycle, for example taking an image input, translating that into a specification, and subsequently generating code. Such applications in software development have large implications for the productivity of developers and for society at large.
Increasing developer productivity
Recent studies have aimed to determine the productivity effects of generative AI tools in experimental settings (Peng et al. 2023; Vaithilingam et al. 2022; Mozannar et al. 2022; Campero et al. 2022; Noy and Zhang 2023). Although this literature as it pertains to software development largely focuses on a single tool — GitHub Copilot, an AI-pair-programming application that suggests code completions as a developer writes software code or plain text comments — we expect that these studies provide valuable insights into the potential effects and benefits of AI tools in software development generally.
Similarly, Vaithilingam et al. (2022) conducted a randomized controlled experiment with GitHub Copilot, recruiting participants from two research institutions. They developed a within-subject study where each participant completed two tasks: one with access to Copilot, and another without. Unlike Peng et al. (2023), Vaithilingam et al. (2022) did not find any statistically significant difference in completion time between the treated and control conditions. However, participants in the study still favored using Copilot in their programming workflow. One possible explanation for this discrepancy is the participants’ lack of instruction about Copilot, as some participants faced difficulties in understanding, editing, and debugging the code snippets that Copilot provided. One conclusion from this research is the importance of familiarity with these tools and a potential learning effect: as developers use AI tools, the productivity impact can increase.
Campero et al. (2022) investigate the impact of GPT-3 on the performance of programmers and non-programmers writing HTML code, in two separate experiments. In the first experiment, human programmers with expertise in HTML are asked to perform coding tasks with and without GPT-3. In the second experiment, the same design is used with 50 individuals with no programming experience. The study reveals that GPT-3 significantly enhances human performance in both experiments, with programmers achieving a 27% speed improvement and non-programmers, who could not complete the task without GPT-3, achieving performance as high as that of human programmers.
Mozannar et al. (2022) take a different approach and examine how software developers interact with Copilot in an experimental setting. Using detailed telemetry data, they develop a taxonomy of common programmer activities and classify them into two categories: those that require interacting with Copilot, such as “verifying suggestions” and “waiting for suggestions,” and those that do not require interaction with Copilot, such as “writing new functionality” and “thinking about new code to write.” To understand how developers allocate their time across these activities, they conducted a user study with 21 programmers solving coding tasks with Copilot. The main finding of the paper is that nearly half of the participants’ time was spent explicitly interacting with Copilot as developers double-checked and edited Copilot suggestions.
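To make the time-allocation measure concrete, the following is a minimal sketch of how a share of session time spent interacting with an assistant could be computed from labeled activity segments. The segment durations, the `SEGMENTS` list, and the `time_shares` helper are hypothetical and chosen only for illustration; they are not data or code from the study, and only the four activity labels come from the text above.

```python
from collections import defaultdict

# Hypothetical labeled segments: (activity label, duration in seconds).
# Durations are made up for illustration and are not taken from the study.
SEGMENTS = [
    ("verifying suggestions", 120.0),
    ("waiting for suggestions", 30.0),
    ("writing new functionality", 90.0),
    ("thinking about new code to write", 60.0),
]

# Activities counted as interaction with the assistant, following the
# two-way split described in the text.
INTERACTION = {"verifying suggestions", "waiting for suggestions"}


def time_shares(segments):
    """Return the fraction of total time in each of the two categories."""
    totals = defaultdict(float)
    for activity, seconds in segments:
        key = "interaction" if activity in INTERACTION else "no_interaction"
        totals[key] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no time recorded")
    return {key: value / grand_total for key, value in totals.items()}


shares = time_shares(SEGMENTS)
print(shares)
```

With real telemetry, the same aggregation would be run per participant and then summarized across the sample.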
The effects of generative AI tools are not limited to software developers, and evidence from other occupations can help us understand the implications. In an online experiment, Noy and Zhang (2023) show that college-educated workers who were exposed to ChatGPT completed a writing task faster (by 0.8 standard deviations) and with higher quality of output (by 0.4 standard deviations). Echoing the results by Peng et al. (2023) and Campero et al. (2022), inequality between workers decreases, as ChatGPT compresses the productivity distribution by benefiting low-ability workers more.
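Effects reported "in standard deviations" express a difference in group means relative to the spread of outcomes. The sketch below shows one common convention, dividing the difference in means by the control group's standard deviation. This is an assumption for illustration; the cited study may standardize differently, and the completion times below are invented, not study data.

```python
from statistics import mean, stdev


def standardized_effect(treated, control):
    """Difference in means, in units of the control group's sample SD."""
    control_sd = stdev(control)
    if control_sd == 0:
        raise ValueError("control group has no variation")
    return (mean(treated) - mean(control)) / control_sd


# Hypothetical task completion times in minutes (illustrative only).
control_minutes = [30.0, 26.0, 34.0, 28.0, 32.0]
treated_minutes = [27.0, 25.0, 30.0, 26.0, 29.0]

effect = standardized_effect(treated_minutes, control_minutes)
print(round(effect, 2))  # negative value means the treated group was faster
```

A narrower spread of outcomes in the treated group than in the control group would be one sign of the compression in the productivity distribution described above.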
Considering impacts on the labor market
Earlier AI and automation technologies have primarily impacted occupations with a large proportion of routine tasks, such as machine operators and administrative assistants. In contrast, generative AI tools have the potential to impact various non-routine task occupations, such as teachers and designers (Felten et al. 2023). Software development provides an early example to demonstrate the impact of the new AI technologies on knowledge workers.
Productivity shocks through technological innovations are not new for software development. The history of computers has seen a number of waves of innovation, where new technologies transformed how software gets built. Punch cards, compilers, high-level programming languages, open source packages, integrated development environments, and code scanning are but a few examples of these technological advances. Over the years, these advances have played a major role in several trends in the technology sector: decreasing costs, increasing demand, and developers being both better off and greater in number.
First, costs associated with software and computing have fallen over time. For example, the U.S. Consumer Price Index for Information Technology, Hardware and Services has fallen 92.9% since it was introduced in 1988. Second, at the same time, demand for software has increased tremendously as digitization transforms more industries and other facets of life. The significant increase in demand for software, coupled with the declining costs, has led to an increase in demand for software developers. Third, the data show that software developers have fared well during these transitions. From 1999 to 2021, for example, Bureau of Labor Statistics (BLS) data show that the number of people working in Computer and Mathematical Occupations increased by 77.7%. Their median wage also increased by 15.4% in real terms during this period. The BLS projects that this demand for developers will continue to grow in the years ahead.
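For readers less familiar with the phrase "in real terms", the sketch below shows the standard adjustment: nominal wage growth is deflated by growth in the price level over the same period. The `real_growth` helper and the input figures are hypothetical and serve only to illustrate the arithmetic; they are not the BLS series behind the numbers above.

```python
def real_growth(nominal_growth: float, price_growth: float) -> float:
    """Convert nominal growth to real growth; inputs are fractions (0.10 = 10%)."""
    return (1.0 + nominal_growth) / (1.0 + price_growth) - 1.0


# Hypothetical inputs: wages up 80% in nominal terms while prices rose 50%.
example = real_growth(0.80, 0.50)
print(f"{example:.1%}")
```

Note that real growth is not simply the nominal rate minus the inflation rate; the ratio form matters when changes accumulate over long periods.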
Thus, the potential of generative AI to greatly increase developer productivity does not necessarily promise fewer software developers. In general, technological change may either complement or substitute for workers. Regarding substitution, the literature on automation considers how new technologies may displace specific occupational tasks. Workers displaced from these tasks may shift what they do in their daily job and in turn displace other workers, creating ripple effects in the economy. Insofar as developers may spend less time coding, they could spend additional time on other tasks, for example, consulting with customers or analyzing system performance. Acemoglu and Restrepo (2022) have shown that, historically, new technologies have displaced workers specialized in routine tasks. However, the definition of a routine task is technology-dependent. This requires an updated definition for knowledge-work tasks that can be automated with generative AI, building on recent approaches, such as those proposed by Brynjolfsson et al. (2018) for suitability for machine learning broadly and by Eloundou et al. (2023) for large language models, a type of generative AI.
Eloundou et al. (2023) and Felten et al. (2023) offer forward-looking projections of occupations’ exposure to large language models, with both indicating that software and programming-related occupations are highly exposed. The researchers make no claims as to whether exposure will substitute for or complement labor. Such projections might have us expect that developers with some tasks or task-relevant skills impacted by automation would transition to other development tasks and skills. Such shifting of professional specialties may have little impact on wages, as was seen in the sunsetting of Adobe Flash (Horton and Tambe 2020). Or task change may take the form of reduced drudgery or time searching for information. Additional experimental and observational studies of generative AI systems deployed in real-world settings will provide useful data to evaluate initial projections and understand whether AI complements or substitutes software developers.
Not all technologies automate tasks, however; they can also augment existing tasks or simply replace old technologies. In the former, augmenting case, technologies complement labor by increasing the productivity of workers or leading to the creation of entirely new tasks. In the latter case, tasks may already be performed by technology, and further innovation simply represents capital deepening, with new technology supplanting old technology and increasing capital per worker in the economy, but with no net-new automation (Acemoglu and Restrepo 2018). For example, if a generative AI tool were to replace an automated rule-based code scanning system, this would not be expected to change net demand for human labor. One may argue that the integration of generative AI tools like GitHub Copilot into code editors that have previously supported other forms of autocomplete similarly constitutes capital deepening.
We may also expect new tasks to be created as developers work on new categories of software, as has happened in the past with the rise of personal computing, web, mobile, cloud, and more. Such tasks often result in entirely new jobs. For example, Autor et al. (2022) find that 74% of people employed in professional occupations in 2018 worked in job titles that did not exist in 1940. Looking at the history of software development and early productivity results with generative AI, we would expect this to continue: Although what they build and the tools they use will change, developers will likely be in greater demand than ever.
Generative AI-powered tools for software development are a recent invention. If history is any indication, large-scale effects will take time, possibly much longer than those building the technology anticipate (David 1990). Amid these changes, public policy will have an important role in guiding the widespread adoption and beneficial use of generative AI.
Although it is early, there is promise that these tools may boost the professional performance of developers and narrow the gap between best and worst performers. More research is warranted, but if preliminary findings hold, then these tools could empower more people to move into careers in the technology industry, helping to meet growing demand for software developers in many economies.
Software developers are constantly learning. They regularly adopt new technologies, tools, and frameworks in order to stay relevant. Skills that employers expect of software engineers change considerably over time, with more than a 30% change between 2015 and 2022 according to LinkedIn data. Thus, the learning involved in adopting generative AI tools more closely resembles a continuation than a discontinuity. However, there is reason to suspect that generative AI may make more professions resemble software development, where continual learning and adoption of new tools are rewarded and possibly necessary. Employers and policymakers alike should take steps to support workers in lifelong learning. In the short term, employers should encourage their employees to experiment with generative AI tools and to share what they have learned, whether in dedicated brainstorms, hackathons, or other informal settings. Over time, as best practices and curricula on how to effectively use these tools emerge, policymakers should encourage firms to provide time and support to employees in order to build these new skills.
Generative AI may empower more people to write code, even if they don’t work in a technical role. As the economy continues its transformation towards increasing digitization, learning to code remains a valuable skill. Even as AI may generate a greater proportion of software code, the mental model of articulating a specification or prompting for specific functions will help workers make the most of generative AI tools. Policymakers should support efforts to increase educational opportunities to use these tools in school, responsibly and in acknowledgement of their limitations. This starts with ensuring basic digital equity, including that students have adequate access to broadband internet and access to computer science in primary education. Whether formally in classes or informally in hackathons or clubs, expanding the opportunities that students and non-students alike have to use these tools in supervised ways can help adapt the workforce to the age of AI.
Although there is reason to believe that the productivity effects of widespread adoption of these tools may take time to manifest, underlying generative AI technology is expected to continue to advance. Once developers adopt these tools and organizations integrate them into processes, further changes can be expected as the underlying AI models can be swapped out for improved ones. The result could be gradual adoption punctuated by a period of rapid change in professional workflows as future AI models demonstrate capability improvements.
In building new generative AI tools, it is important to do so with an understanding of their possible impacts on users’ livelihoods. Klinova and Korinek (2021) offer an initial framework for firms building AI tools to evaluate their possible effects on employment opportunities and inequality. The Partnership on AI’s Shared Prosperity Initiative is building on this work to help those developing AI tools steer development to benefit workers. Policymakers can shift incentives to support AI development that complements instead of substitutes for labor by, among other things, reversing the status quo in the U.S. of labor being taxed more heavily than capital (Brynjolfsson 2022). In the longer term, novel approaches such as a universal basic income that automatically scales with the non-labor share of national income may be warranted (Korinek and Juelfs 2022). Today, as policymakers weigh responses, job impact projections like those from Eloundou et al. (2023) and Felten et al. (2023), validated with experimental results including those summarized in this paper for software development, can provide information for targeting and prioritizing resources to vulnerable workers.
As with other general purpose technologies before it, generative AI will bring significant economic benefits as it is adopted across society. Effective policy can encourage responsible use of the technology in order to seize these benefits while ensuring that the workforce is ready to adopt and benefit from these innovations. Software developers are early adopters of these technologies and offer policymakers a leading indicator for the future of work.
Acemoglu, Daron, and Pascual Restrepo. 2018. “Artificial Intelligence, Automation, and Work.” In The Economics of Artificial Intelligence: An Agenda, 197–236. University of Chicago Press. https://www.nber.org/books-and-chapters/economics-artificial-intelligence-agenda/artificial-intelligence-automation-and-work.
Acemoglu, Daron, and Pascual Restrepo. 2022. “Tasks, Automation, and the Rise in U.S. Wage Inequality.” Econometrica 90 (5): 1973–2016. https://doi.org/10.3982/ECTA19815.
Autor, David, Caroline Chin, Anna Salomons, and Bryan Seegmiller. 2022. “New Frontiers: The Origins and Content of New Work, 1940–2018.” NBER Working Paper 30389. https://www.nber.org/papers/w30389.
Brynjolfsson, Erik. 2022. “The Turing Trap: The Promise & Peril of Human-Like Artificial Intelligence.” Daedalus 151 (2): 272–87. https://doi.org/10.1162/daed_a_01915.
Brynjolfsson, Erik, Tom Mitchell, and Daniel Rock. 2018. “What Can Machines Learn and What Does It Mean for Occupations and the Economy?” AEA Papers and Proceedings 108 (May): 43–47. https://doi.org/10.1257/pandp.20181019.
Campero, Andres, Michelle Vaccaro, Jaeyoon Song, Haoran Wen, Abdullah Almaatouq, and Thomas W. Malone. 2022. “A Test for Evaluating Performance in Human-Computer Systems.” arXiv. https://doi.org/10.48550/arXiv.2206.12390.
David, Paul A. 1990. “The Dynamo and the Computer: An Historical Perspective on the Modern Productivity Paradox.” The American Economic Review 80 (2): 355–61.
Eloundou, Tyna, Sam Manning, Pamela Mishkin, and Daniel Rock. 2023. “GPTs Are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models.” Preprint, submitted 17 March, 2023. https://doi.org/10.48550/arXiv.2303.10130.
Felten, Edward W., Manav Raj, and Robert Seamans. 2023. “Occupational Heterogeneity in Exposure to Generative AI.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4414065.
Horton, John J., and Prasanna Tambe. 2020. “The Death of a Technical Skill.” https://www.john-joseph-horton.com/papers/schumpeter.pdf.
Klinova, Katya, and Anton Korinek. 2021. “AI and Shared Prosperity.” In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 645–51. Virtual Event USA: ACM. https://doi.org/10.1145/3461702.3462619.
Korinek, Anton, and Megan Juelfs. 2022. “Preparing for the (Non-Existent?) Future of Work.” NBER Working Paper 30172. https://www.nber.org/papers/w30172.
Mozannar, Hussein, Gagan Bansal, Adam Fourney, and Eric Horvitz. 2022. “Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming.” Preprint, submitted 25 October, 2022. https://doi.org/10.48550/arXiv.2210.14306.
Noy, Shakked, and Whitney Zhang. 2023. “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4375283.
Peng, Sida, Eirini Kalliamvakou, Peter Cihon, and Mert Demirer. 2023. “The Impact of AI on Developer Productivity: Evidence from GitHub Copilot.” Preprint, submitted 13 February, 2023. https://doi.org/10.48550/arXiv.2302.06590.
Vaithilingam, Priyan, Tianyi Zhang, and Elena L. Glassman. 2022. “Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models.” In CHI Conference on Human Factors in Computing Systems Extended Abstracts, 1–7. New Orleans LA USA: ACM. https://doi.org/10.1145/3491101.3519665.