Fake AI-generated bug reports are driving open source developers crazy

Artificial intelligence is not only flooding social media with garbage, it is apparently also affecting the open source programming community. Much as fact-checking tools like X's Community Notes struggle to keep up with a deluge of misinformation, contributors to open source projects lament the time wasted evaluating and debunking bug reports created with AI code generation tools.

The Register reported today on concerns raised recently by Seth Larson in a blog post. Larson is the security developer-in-residence at the Python Software Foundation, and he says he has noticed an uptick in "very low quality, spammy, LLM-hallucinated security reports for open source projects."

“These reports appear on their face to be potentially legitimate and therefore require time to refute,” Larson added. This could be a big problem for the open source projects (e.g. Python, WordPress, Android) that power much of the Internet, as they are often maintained by small groups of unpaid contributors. Legitimate bugs in ubiquitous code libraries can be dangerous because they have a very wide potential impact if exploited. Larson said he has seen only a relatively small number of these AI-generated reports so far, but the number is increasing.

Another developer, Daniel Stenberg, called out a bug submitter for wasting his time with a report he believed was generated using AI:

You submitted what appears to be an obvious AI “report” in which you say there is a security issue, probably because an AI tricked you into believing that. You then waste our time by not telling us that an AI did this for you and then continue the discussion with even more crappy answers – apparently also generated by the AI.

Code generation is an increasingly popular use case for large language models, although many developers are still torn about how useful they actually are. Programs like GitHub Copilot or ChatGPT’s own code generator can be very effective in producing scaffolding, the basic skeleton code to start any project. They can also be useful for finding functions in a programming library that a developer may not be familiar with.

But as with any language model, they will hallucinate and produce incorrect code. Code generators are probabilistic tools that guess what you want to write next based on the code you give them and what they have seen before. Developers must still fundamentally understand the programming language they are working with and know what they are trying to build, just as essays written with ChatGPT must be reviewed and edited by hand.

Platforms like HackerOne offer bounties for successful bug reports, which may encourage some people to have ChatGPT scan a codebase for flaws and then submit whatever erroneous findings the LLM returns.

Spam has always existed on the Internet, but AI makes it much easier to generate. It seems possible we will end up needing more technologies, akin to the CAPTCHAs on login screens, to combat it. An unfortunate situation, and a big waste of everyone’s time.