GenAI is writing your code. Are you managing the risks?

Generative AI is delivering transformative benefits for software development. Companies need to take seven steps to fully capture the benefits and manage the significant and growing risks.

Oct 26, 2023

Executive Summary

Generative AI (GenAI) is a transformational application of artificial intelligence to create new content and modify existing content across multiple formats – including not only human languages but also code. GenAI for code has already seen explosive adoption growth in the software development lifecycle, given the benefits to individual Engineers and their organizations.

These benefits also come with six sets of risks relating to developer productivity, ethics, public policy, security, legal/intellectual property issues, and regulations. 

There are seven steps the C-Suite should take today to capture the full benefits, address current risks, and be ready for an increasingly fraught legal and regulatory environment.

Part 1: the Rise of GenAI Code

As anyone with internet access has observed, Generative AI (GenAI) usage has skyrocketed. 

This cutting-edge form of artificial intelligence can produce new content based on powerful neural networks combing through massive data sets. 

The use of GenAI doubled over the last five years, only to grow by 500% in the last year alone. 

The power of GenAI has led to explosive usage growth. McKinsey's 2022 survey found that usage had doubled over the prior five years. But by 2023, ChatGPT, one of the leading GenAI tools, had grown 500% in less than a year and is now used by more than 80% of the Fortune 500.

Readers will most likely have already experienced GenAI in the documents they have read, whether they know it or not. Generative AI can rapidly speed up outlining, drafting, editing, and more. If you're unsure, consider this piece your first confirmed exposure: Sema's research team used Perplexity.ai to conduct a literature review and summarize findings; the chatbot also cited and returned the referenced source material. This reduced research time by more than 500 percent. 

GenAI does not just work on human languages, however. It can produce imagery, audio, data – and code. GenAI can suggest new code, provide a code review, create tests, and assist with software maintenance.

As a result, generalist GenAI tools like ChatGPT and specialist tools like GitHub Copilot have had significant, immediate adoption among Engineering teams. HeavyBit found that 63% of developers are using GenAI in coding tasks. Sonatype found that 97% of personnel in DevOps and Security are now using GenAI. 20,000 organizations now use GitHub Copilot, a GenAI tool focused on engineers.

Part 2: the Benefits of GenAI Code

GenAI's use with coding in particular and the software development lifecycle (SDLC) in general is exploding because of the significant benefits to developers and organizations. 

For Engineers, the deeply technical advisory firm ThoughtWorks neatly summarized the three primary benefits:

  • It can reduce the need for repetitive tasks, freeing up time for higher order problems
  • It can facilitate more complete planning and ideation
  • It reduces search time for information, a key component of Engineers’ workflow

As Sema's Co-Founder and Chief Scientist, Brendan Cody-Kenny, observed:

“GenAI's core benefit is that it reduces detailed development work, so developers are less likely to get stuck "in the weeds." This speeds up development but, more crucially, gives back brain space to work at a higher level, ultimately increasing innovation and value creation.”

These improvements are not theoretical but are already being realized. A GitHub Copilot survey found that 60-75% of developers reported improved satisfaction when using the product. Specifically, 73% reported increases in staying in a flow state, and 87% reported preserving mental energy by avoiding repetitive tasks. Sema's qualitative interviews with developers reveal delight at the relatively high accuracy of GenAI results, relief at reducing time on unpleasant tasks, and excitement for faster code improvements such as more unit testing.

With these benefits to Engineers, it is no surprise that the benefits to software and tech-enabled companies – that is to say, almost all of the Fortune 10000 – are also significant. Estimates of increased developer productivity from GenAI range from a conservative 10-30% to 35-50%, and up to 55%.

$900BN
Estimated Economic impact of GenAI on software development and IT

As a result, McKinsey estimated that two of the six corporate functions with the largest potential for disruption are code-related: software development and corporate IT. Their study estimated GenAI's total economic impact on these two functions to be almost $900 Billion.

Part 3: the Six Risks from GenAI Code

As with any new technology, the benefits from GenAI come hand in glove with significant risks. Six categories are worth noting here: risks related to productivity, ethics, public policy, security, legal/intellectual property issues, and regulations. 

First, the risk to productivity. If organizations do not implement GenAI well, it can have a negative impact on overall Engineering quality and effectiveness. 

One of the biggest and potentially least obvious risks is developers' guilt about using GenAI, which leads to incomplete adoption. As one developer from our interviews put it:

"I know I should use my company's approved GenAI tool. The company bought me a license, and the CEO and the CTO both announced that experimenting with it was mandatory for devs. I've never heard the CEO announce a developer tool before. But I still feel like I'm cheating the company when I use ChatGPT… they are paying me to write code." 

Second, from an ethics perspective, the development community has raised concerns that companies are taking advantage of open-source code, which was the basis of training these models, without sufficient consent or compensation. 

Third, from a public policy perspective, GenAI could negatively affect software developer employment rates. 

This third risk may come to fruition. But previous innovations in technology generally and software development in particular have not yet dented the demand for software engineers. 

Sema is bullish on the long-term demand for software developers. As much as software has eaten the world, we believe that only a small portion of the potential gains from software have yet been realized. We expect companies to reinvest the productivity gains from GenAI into building more, better products faster, delivering greater economic value from sales growth rather than by realizing cost reductions.

Fourth, GenAI adds security risk by introducing undetected vulnerabilities, whether accidental or from bad actors. 

As Microsoft's Chief Security Advisor Terence Jackson wrote:

“AI generation provides a novel extension of the entire attack surface, introducing new attack vectors that hackers may exploit. Through generative AI, attackers may generate new and complex types of malware, phishing schemes, and other cyber dangers that can avoid conventional protection measures. Such assaults may have significant repercussions like data breaches, financial losses, and reputational risks.” 

Fifth, GenAI introduces legal and intellectual property risks. 

One flavor of the intellectual property risk comes from data leakage – developers are sharing the company's proprietary code with unauthorized GenAI tools to get assistance with a coding problem. As a deep dive into one such event observed, "employees effectively leaked corporate secrets that could be included in the chatbot's future responses to other people around the world." 

Sema has collected evidence that GenAI bots are returning company-specific confidential information: suggested code includes terminology bespoke to specific organizations.
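To make the leakage scenario concrete, here is a minimal sketch of a check a team might run on a snippet before it is pasted into an external GenAI tool. It is an illustration only, not a tool described in this paper: the pattern list, the ACME_INTERNAL marker, and both function names are hypothetical, and a real deployment would use terms and secret formats specific to the organization.

```python
import re

# Hypothetical, organization-specific markers. A real list would be
# maintained by the company's Security and Legal teams.
SENSITIVE_PATTERNS = [
    r"\bACME_INTERNAL\b",            # made-up internal codename
    r"(?i)\bconfidential\b",         # confidentiality notices in comments
    r"(?i)api[_-]?key\s*=\s*\S+",    # hard-coded credentials
]


def find_sensitive(text, patterns=SENSITIVE_PATTERNS):
    """Return the patterns (in list order) that match anywhere in `text`."""
    return [p for p in patterns if re.search(p, text)]


def safe_to_share(text, patterns=SENSITIVE_PATTERNS):
    """True when no sensitive pattern matches the snippet."""
    return not find_sensitive(text, patterns)
```

A simple pattern check like this catches only known markers; it does not replace training or access controls.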

A second risk comes from lawsuits and other enforcement efforts. Developers are already suing the companies behind GenAI tools, arguing that as the creators of the code used to train the models, they have an ownership claim in the resulting code.

With respect to these security and legal/intellectual property risks, in certain respects, GenAI is not so different from previous innovations. 

Consider the near-universal adoption of Open Source. Open Source code facilitates faster, more reliable development, but it also generates security and intellectual property risks. Our recent study of codebases from $1T worth of software teams found that:

  1. Most corporate codebases already contain 5% or more – sometimes even 25% – of code directly copied in from a third-party source.
  2. Average Enterprise codebases have 1200-2000 high-risk security warnings from Open Source code and 850-2300 high-risk intellectual property warnings.
  3. Organizations had 77 times more hidden intellectual property risk than known IP risk (the former from in-file third-party code, the latter, referenced code).

There is a fundamental difference between GenAI and Open Source, however. To illustrate this, recall using Perplexity.ai to research this whitepaper. That AI tool cites its results so the writers can directly review the source material to validate the results. Similarly, when Open Source code is used, the overwhelming majority of developers follow best practices and "cite" the original code by including its license file.

This "traceability" is much harder for AI-generated code: it is far more difficult to trace GenAI code back to its sources, or even to identify whether it was generated at all. This means that the intellectual property risk is especially fraught. We expect GenAI legal compliance risk to rise to the same order as Open Source legal compliance risk and to become a significant focus for insurers.

As the National Law Review wrote: "Many companies and organizations, particularly those that develop valuable software products, cannot risk having open source code inadvertently included in their proprietary products or inadvertently disclosing proprietary code through insecure generative AI coding tools." 

The sixth and final risk is regulatory. The US government has already passed a series of AI-regulating rules and policies, with 94 more introduced in the current 118th Congress. The majority of US states have passed or are considering regulation.

Outside the US, the European Union and the UK are preparing new regulations, and China has already adopted a comprehensive approach.

Fundamentally, GenAI regulation is only at its onset, and companies should prepare now for what is coming. Regulation will almost certainly involve a patchwork of state and national requirements, which could include significant usage disclosure requirements and, potentially, industry-specific usage prohibitions.

The practical conundrum is clear: if it is difficult to specifically demonstrate how code was generated through GenAI, users of GenAI in the software development process could face significant regulatory risks after the fact.

Part 4: What the C-Suite Needs To Do Now To Mitigate the Risk from GenAI Code

Company leadership needs to take seven steps now to capture the benefits and minimize the risks of GenAI in software development.  

  1. Pick the right GenAI coding assistant. Not all GenAI is created equal, and Engineering, Security and Legal teams should weigh in on which tool is suitable.
  2. Block non-approved tools. Organizations that manage application access would be wise to exclude alternative GenAI tools.
  3. Train, support, and reinforce appropriate developer usage. This is important to realize the productivity and developer satisfaction gains: recall the Developer interviewed who was hesitant to use GenAI despite the C-Suite direction. It is also critical to minimizing the intellectual property and security risks, for example by preventing the sharing of sensitive code with the GenAI tools.
  4. Ensure security tooling and investment are sufficiently rigorous for the company’s size and stage. As noted above, despite 100% of Enterprise organizations having some security tooling, the average number of security warnings is in the thousands. Developers need proper tooling and dedicated roadmap time to remediate the findings. We know one leading enterprise software organization that made significant improvements by implementing security tooling and carrying out remediation over one quarter.
  5. Measure the level of GenAI code in the codebase. Not only should the overall level of GenAI code be measured, but it should also be broken out by product, team and developer, and should even cover GenAI code included in adopted third-party code. In other words, create a GBOM (TM) – a GenerativeAI Bill of Materials (TM) – to complement the existing Software Bill of Materials (SBOM) for Open Source.
  6. Monitor regulatory and legal trends carefully. As noted above, we expect GenAI legal issues to mirror and potentially exceed the risks associated with Open Source. Companies need to be thinking now about how they’ll prepare for potential litigation and regulatory requirements.
  7. Be a good Open Source steward. This addresses both the legal / intellectual property risk and the ethical risk. This includes abiding by the licensing terms of Open Source projects used today, and also contributing back to Open Source projects. That contribution can take the form of financial support and/or dedicated Engineering time to support the project, at a level proportional to the organization’s stage and Open Source usage. We believe these good-faith efforts today will mitigate the legal risk tomorrow.
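The breakdown described in step 5 can be sketched in a few lines. This is an illustration under stated assumptions, not Sema's GBOM format or implementation: the FileRecord fields, the genai_share and build_gbom functions, and the idea that a detector has already attributed a line count to GenAI are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """One scanned source file. All fields are hypothetical inputs."""
    product: str
    team: str
    developer: str
    total_lines: int
    genai_lines: int           # lines a detector attributed to GenAI (assumed)
    third_party: bool = False  # True for adopted third-party code


def genai_share(records, key):
    """Return {group: fraction of lines attributed to GenAI}, grouped by `key`."""
    totals = defaultdict(lambda: [0, 0])  # group -> [genai_lines, total_lines]
    for rec in records:
        group = getattr(rec, key)
        totals[group][0] += rec.genai_lines
        totals[group][1] += rec.total_lines
    return {g: (gen / tot if tot else 0.0) for g, (gen, tot) in totals.items()}


def build_gbom(records):
    """Summarize GenAI share overall and by product, team, developer, and origin."""
    total = sum(r.total_lines for r in records)
    genai = sum(r.genai_lines for r in records)
    return {
        "overall": genai / total if total else 0.0,
        "by_product": genai_share(records, "product"),
        "by_team": genai_share(records, "team"),
        "by_developer": genai_share(records, "developer"),
        # Keys are True (third-party code) and False (first-party code).
        "by_third_party": genai_share(records, "third_party"),
    }
```

Calling build_gbom on the records from a codebase scan yields one summary that can be tracked over time alongside an SBOM. The quality of the numbers depends entirely on the accuracy of the upstream detection, which this sketch takes as given.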

Conclusion

As much as GenAI usage in code has exploded in the past 18 months, we are only at the beginning of the transformation of software development through GenAI.

By taking action now, companies can reap the rewards for their teams and their business outcomes while minimizing the known and unknown risks.

Sema is the leader in comprehensive codebase scans with over $1T of enterprise software organizations evaluated. It is now accepting pre-orders for its GenAI code detection tool, AI Code Monitor.
