The parallels of AI and open source in software development

Aug 25, 2023 | 4 min read

Table of Contents

The beginnings of open source
Open source parallels with AI
Rise of AI in Software Development webinar series

The front-page news about generative artificial intelligence (GAI) taking over software development from poor human developers has waned a bit. But there is no doubt that the technology will continue to transform the software development space over time. With AI come challenges that human managers need to address. Much as open source did when it ate the software world, AI demands particular consideration in the software development and security spaces. As is always the case with history, there are lessons to be learned.

The beginnings of open source

Freely exchanged source code goes back to software’s earliest days. Richard Stallman wrote the first GPL license in the late ’80s, and Christine Peterson coined the term “open source” in 1998. In the early 2000s, developers began to incorporate freely available open source into software they were writing for their corporate employers. It was all very grassroots and under the radar.

In 2008, the Free Software Foundation sued Cisco, which had by then acquired Linksys, claiming improper use of GPL-licensed Linux code. This highly publicized case prompted lawyers throughout the tech world to start wrapping their heads around these inside-out license terms. The Cisco case was ultimately settled out of court, but only after shining a light on a big challenge with this new approach to development.

By 2010, the cat was well out of the bag. Both the usage and the supply of open source were exploding. Between 2009 and 2015, the Black Duck® KnowledgeBase grew 10-fold to 1.5M open source components. On the usage side, by 2019, over half the code in an average “proprietary” application was open source. Lawyers scrambled to understand the legal risks of this ever-expanding open source usage. Bewildered companies had to run to catch up with their developers, who insisted on increasing their reliance on open source as they demonstrated its great productivity benefits.

In 2014, a new sort of open source risk hit the radar when a Codenomicon (now part of Black Duck) engineer discovered Heartbleed, a security vulnerability in OpenSSL affecting roughly half a million web servers around the globe. This was the first of a number of named vulnerabilities to surface in open source components.

By the end of the decade, many companies had put open source program offices in place along with corporate open source policies, processes, education, and tools. In the 2020s, open source is fully mainstream, comprising more than 75% of an average application. Managing it is as important as ever, a point underscored by the rise in software bill of materials (SBOM) requirements.

Open source parallels with AI

Enter generative AI for software development. “Those who cannot remember the past are condemned to repeat it,” said George Santayana. It all feels quite similar, but with a compressed timeline. The roots of AI go back as far as the early days of open source (and software, for that matter). However, there was very little adoption of GAI in software until 2023. GitHub Copilot showed up in Visual Studio just over a year ago to little fanfare. And then the debut of ChatGPT in November 2022 seemed to set the world on fire.

Timeline aside, there are more similarities than differences in a comparison of the impact of GAI and open source on software development. First, the impetus: faster, better, cheaper. Developers are always under pressure to get more done more quickly. This was the attraction of open source: faster development by not reinventing the wheel. GAI purports to create a new wheel for you. This made corporate heads spin as much as had the earlier news that developers were leveraging millions of free software components with downloadable source.

As with open source, adoption of GAI in software has been grassroots and under the radar. Upon hearing of this new technology and the potential of having machines write code, many boardrooms realized they would have to consider its future impact on their software. Only then did they learn their developers had already been leveraging AI-generated code for months.

“No, stop,” was the reaction of many companies. And with good reason. In April 2023, unwitting Samsung engineers exposed sensitive data to ChatGPT. Around the same time as Samsung’s issues, a high-profile lawsuit added to corporate concerns. The case is a class action suit against several companies behind GitHub Copilot, alleging software piracy. There’s a question as to whether it is legally kosher to use these tools, at least in cases where they seem to copy and paste problematically licensed code verbatim.

So, as with open source, companies are caught between the demonstrated benefits of a new way to develop software and its demonstrated risks. The lesson of open source is that the answer lies in governance and management. Every organization needed strategy, policies, process, and tools to use open source safely, and they needed to invest in educating developers about the risks, lest they circumvent controls—developers are clever. GAI, a seemingly unstoppable component of future software development, requires similar treatment.

GAI’s adoption speed makes it particularly challenging. Because these measures have to be put in place amid uncertainty, including pending lawsuits, companies need to monitor developments and adapt. What is clear now, though, is that software development organizations need to track GAI use and be mindful of the limitations of the technology. They also need to use the most modern tools to test the quality and security of generated code and to ensure they are not infringing other parties’ IP.

Rise of AI in Software Development webinar series

Our webinar series continues exploring this topic, discussing GAI, the associated legal issues and risks, and mitigation strategies with leading legal minds and practitioners. This series covers