AI Has Learned How to Code, And That's a Good Thing for Tech Jobs
September 14, 2023
Although it's by no means advanced enough to replace human developers, AI has nevertheless learned how to code. That makes it the perfect pair programmer for professionals, and even for less technical users.
Way back in 2011, venture capitalist Marc Andreessen said that software is eating the world [1]. Flash-forward to today, and the sentiment he expressed has largely held up. Like a good wine, though, it needs a few more years in the cellar before it fully matures.
There's no denying that much of society has shifted from a hardware-based economy to one wrapped up in software and services. The average Western household now has access to technologies and conveniences which, even fifteen years ago, would have seemed like they came right out of a science fiction novel. Multiple industries have been turned upside down by emerging trends like distributed work and Software-as-a-Service (SaaS).
We've yet to reach the pinnacle of this transition. There's still one massive obstacle to overcome: There simply aren't enough software developers. Interestingly, software itself might be the solution to this problem.
"The number of jobs requiring software developers is increasing at a rate that vastly outpaces the number of skilled professionals entering the market to fill these roles," explains Becks Simpson, Machine Learning Lead at AlleyCorp. "Even for those already in a programmer role, most of their time is not necessarily spent coding new features but rather writing tests, patching security issues, reviewing code, and fixing bugs. These two factors make it even more important to boost the productivity of those in the workforce, and the most recent improvements in AI-driven natural language processing (NLP) models are making that a reality."
Simpson is, of course, referring to the latest iteration of these NLP models, known collectively as generative pre-trained transformers (GPTs). You're likely already familiar with at least one such tool, OpenAI's ChatGPT. In addition to being able to respond to human conversation with surprising accuracy, these new NLP models can translate between multiple languages—and not just human languages.
"By virtue of their size, underlying architecture, and training data and regime, GPTs can translate between many languages, including text to code," Simpson explains. "Embedding this powerful ability into tools that developers can use is already proving invaluable at making developers better at their jobs and unlocking software production for less technical folks."
Too Much Software, Not Enough Programmers
It's a great time to be a software developer: the skills of the profession have never been in higher demand.
You've probably read at least one article about the talent shortage that's been plaguing the IT sector for more than a decade. There simply aren't enough skilled professionals to fill all the new positions that keep opening. The explosive growth of the SaaS space and the global push for digital transformation and distributed work have only exacerbated this.
To put it another way, there are more software-based use cases and business applications than ever, but there aren't enough people to actually build the software.
"A survey by Code.org in 2017 showed an estimated 500,000 open programming roles available in the United States alone," notes Simpson. "Unfortunately, many go unfilled, especially since only 43,000 graduates entered the market that year; and that number continues to decrease. On top of that, the time it takes for a developer to become skilled enough to fill most development jobs ranges from 3 to 5 years.
"According to the US Bureau of Labor Statistics, by the time a programmer is ready to enter a more senior role, the number of available roles will have increased by 28 percent," she continues. "Even once a team of developers is in place within a company, the challenges don't stop. Requirements for building software, particularly in terms of quality, security, and speed of delivery, have grown increasingly complex."
Because of this increased complexity, developers spend progressively less time on actual development and more time addressing security issues, testing existing code, and fixing bugs. Senior developers have even more responsibilities layered on top, as they must also mentor junior colleagues and perform code reviews. As a result, most developers spend only around 30–40 percent of their time developing new features or optimizing existing code [2].
That's the bad news. The good news is that there may be a solution in artificial intelligence (AI). AI is a nearly perfect fit for the bulk of this additional work, and by pairing AI with human developers, we could finally help software get over its nasty case of indigestion.
Why Developers and Machines Are a Match Made in Heaven
While we're still a very long way from sentient robots, deep learning-based NLP has made remarkable strides over the past several years. Older forms of NLP were highly specialized and had to be trained for a single specific purpose, like language translation or sentiment analysis. GPT-based models, on the other hand, are pre-trained on a massive quantity of data, which gives them a general language processing proficiency that can then be fine-tuned for specific purposes.
Among other things, this has led to the emergence of several AI-driven tools intended to help developers improve both productivity and code quality.
"The models these tools use can parse code to identify bugs and flaws, effectively performing some of the more tedious parts of a code review," says Simpson. "Per AI-News, a few such tools released recently, like CodeGuru and DeepCode, were able to find vulnerabilities that were difficult for humans to identify as well as find that 50 percent of the pull requests studied had issues [3]. Additionally, modern NLP techniques improve developers' code quality and speed up development by helping auto-complete sections of code, monitor their code output for errors, and even auto-generate unit tests."
One of the leading names in this space is OpenAI, developer of ChatGPT. Its Codex model can parse code with a surprising degree of accuracy. It's even capable of generating code based on prompts from a human user.
"Codex's capabilities come from the data it was trained with, both natural language fragments and a vast amount of code," Simpson continues. "A preliminary study from GitHub on its performance showed that for a mundane task like writing an HTTP server, leveraging AI alongside a developer reduced time to completion by half. The fact that the model underpinning this tool can auto-complete entire code sections from a single comment also makes coding vastly more accessible to beginners and less technical folks."
This lowers the barriers to software development: AI translation capabilities let novice users turn plain-language prompts into code in many widely used languages, from general-purpose ones like Python, JavaScript, and C++ to query languages like SQL.
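To make the idea of comment-driven completion concrete, here is a small hand-written sketch of the kind of result a developer might hope to get for the "mundane" HTTP server task mentioned above. It is not actual output from Codex or any other tool. The prompt comment, the `HelloHandler` class, and the `make_server` helper are all hypothetical names chosen for illustration, and the example uses only Python's standard `http.server` module.

```python
# Prompt-style comment a developer might type (hypothetical):
# "Create an HTTP server that answers every GET request with 'Hello, world!'"

from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Reply to any GET request with a short plain-text greeting."""

    def do_GET(self):
        body = b"Hello, world!"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to keep the example quiet.
        pass


def make_server(port=8000):
    """Build (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), HelloHandler)
```

Calling `make_server().serve_forever()` would start the server on port 8000. Even for a snippet this small, a developer would still want to review details such as the bind address and the response headers before relying on it.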
Granted, there are still some significant limitations to this entire process. For example, the language models can't make anything too complicated. While static websites, simple functions, and basic code translation are all well within the realm of possibility, anything more sophisticated is currently too complex to produce through AI.
The models aren't always 100 percent accurate, either.
"Often, the code produced by tools like Codex is mostly correct but still requires some intervention from an experienced developer," says Simpson. "In this sense, the models can boost the productivity of a human coding tutor in that the human can take over when the AI runs into issues. This can also greatly increase the productivity of junior developers while decreasing the amount of supervision and senior input they require."
Bridging the Gap Between Human and Machine
Codex, and the developer tools built on top of it, are powered by OpenAI's GPT family of models. Trained on text taken from across the internet, ranging from open-source repositories and comments on social media to blog posts and eBooks, GPT was originally designed to support more accurate, realistic language generation. However, the training had an unexpected and rather fascinating side effect: GPT was also able to generate code, a realization that led to the development of Codex.
But how exactly did this happen? And what is it about GPT that makes it so versatile? A few things, according to Simpson:
- The amount of data used for training.
- The fact that the models were trained in a multitask setting and a self-supervised fashion—a significant departure from the supervised, single-task training undergone by most neural networks.
- A larger number of parameters, allowing them to learn more nuanced patterns and understand more complex relationships in their data.
- Their state-of-the-art underlying architecture.
"Most neural networks are made to perform a single task, and as such, take specifically labeled data to learn how to do that task," Simpson explains. "GPT-3, by contrast, was trained to predict the next word in a sequence, so the data didn't need labeling. This is the backbone of many tasks like translation, text generation, and question answering."
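The point about unlabeled data can be illustrated with a toy sketch. The function below, `next_word_examples`, is a hypothetical helper written for this article and not part of any GPT tooling. It assumes simple whitespace splitting, whereas real models work on subword tokens and far longer contexts. What it shows is that the training "labels" fall out of the raw text itself.

```python
def next_word_examples(text, context_size=3):
    """Turn raw text into (context, next_word) training pairs.

    No human labeling is needed: the target for each context is simply
    the word that follows it in the original text.
    """
    words = text.split()
    examples = []
    for i in range(1, len(words)):
        # Keep at most `context_size` preceding words as the model input.
        context = words[max(0, i - context_size):i]
        examples.append((context, words[i]))
    return examples


pairs = next_word_examples("software is eating the world")
for context, target in pairs:
    print(" ".join(context), "->", target)
```

Every sentence on the internet can be sliced this way, which is why the volume of usable training data is so much larger than for tasks that depend on hand-labeled examples.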
Owing largely to their increased sophistication, transformer models such as GPT-3 and GPT-4 outperform older NLP models in several key ways. Instead of processing language word by word, they process entire sequences at once and use an attention function to weigh how strongly each word relates to every other word. The result is a larger model that can learn more and no longer loses track of relationships between distant words. It also avoids the main drawback of older recurrent networks, which had to read text one step at a time and were therefore hard to parallelize.
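For readers curious what an attention function looks like, the following is a minimal sketch of scaled dot-product attention in plain Python. It is a simplification for illustration only: production transformers add learned projections, multiple attention heads, and masking, and the toy vectors here are made-up numbers. The function names `softmax` and `attention` are this example's own.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over a whole sequence.

    Each argument is a list of vectors, one per word. Returns one
    output vector per query, plus the attention weights used.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this word against every word in the sequence.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Toy 2-dimensional vectors for a three-word sentence (made-up numbers).
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(vectors, vectors, vectors)
```

Each pass through the loop depends only on the inputs, not on the result for any other word, so all words can be processed in parallel. That independence is what separates this approach from step-by-step recurrent models.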
"Since the GPT-3 and GPT-4 are available through an application programming interface provided by OpenAI, they can be incorporated into other AI for coding products, which democratizes access to coding even further," Simpson adds.
Conclusion
Demand for software developers greatly outstrips supply and will likely continue to do so for the foreseeable future.
But this no longer has to be a bottleneck for developing and distributing new software. Through AI, developers can considerably increase both their output and the quality of their code. This, I would argue, is exactly what AI was always intended for—not as a replacement for humans, but as a partner to them.
"With the recent dramatic improvements in AI-based NLP models, the dream of an AI-powered pair programmer for human developers is becoming a reality," Simpson concludes. "With such models embedded in their everyday tools, programmers stand to gain a great deal, while even junior developers and less technical folks can benefit from the text-to-code capabilities now available. Software may not be able to eat the world alone, but AI can certainly help."
Author's Notes:
[1] Marc Andreessen. “Why Software Is Eating the World.” The Wall Street Journal, August 22, 2011. https://www.wsj.com/articles/SB10001424053111903480904576512250915629460.
[2] Chris Grams. “How Much Time Do Developers Spend Actually Writing Code?” The New Stack, October 28, 2021. https://thenewstack.io/how-much-time-do-developers-spend-actually-writing-code/.
[3] Ryan Daws. “DeepCode Provides AI Code Reviews for over Four Million Developers.” AI News, July 21, 2020. https://www.artificialintelligence-news.com/2020/07/21/deepcode-ai-code-reviews-four-million-developers/.