Monday, June 24, 2024

Developers are really not happy that OpenAI gets to feed their Stack Overflow posts into ChatGPT

  • For developers, Stack Overflow is a vital resource.
  • Over the years, the Q&A platform has helped them work through the ups and downs of programming.
  • Concerns are now being raised after OpenAI struck a deal to use its data to train ChatGPT.

You can bet most developers have turned to Stack Overflow at some point in their careers.

Since 2008, the site has served as a near-indispensable Q&A resource for tech workers trying to find solutions to programming problems, improve their working knowledge, or simply connect with others navigating the software business.

According to Stack Overflow, a new question is asked on the site, on average, every 14 seconds, with almost 60 million questions and answers tallied to date. In 2021, the company was acquired by European investment group Prosus for $1.8 billion.

That makes it a resource with serious value, and one that developers are understandably protective of.

It’s perhaps little surprise, then, that some Stack Overflow users have started to kick up a fuss now that the wealth of information they’ve contributed to the site over the years has become the target of a data-hungry company: OpenAI.

OpenAI deal stirs controversy

On Monday, the ChatGPT maker and Stack Overflow announced a partnership to give OpenAI’s users and customers “the accurate and vetted data foundation that AI tools need” to solve their problems.

OpenAI noted it would “surface validated technical knowledge from Stack Overflow directly in ChatGPT” to give users “easy access to trusted, attributed, accurate, and highly technical knowledge and code backed by the millions of developers” on the almost 16-year-old site.

For OpenAI, the deal is a no-brainer.

Its AI models, like GPT-4, benefit hugely from being trained on as much data as possible. If trained on highly technical, specialized data like that found on Stack Overflow, the models should do better when responding to ChatGPT users’ programming prompts.

Stack Overflow is also seeking to benefit from the partnership by using OpenAI’s models in the development of OverflowAI. The product, unveiled in July 2023, was the company’s attempt to integrate generative AI features into its services.

Stack Overflow CEO Prashanth Chandrasekar introduces OverflowAI. Image credit: Stack Overflow

However, some developers dedicated to Stack Overflow have started to vent their frustrations.

On Mastodon, an open-source social media service, one Stack Overflow user shared that they tried to delete their “highest-rated answers” on the site to protest the deal with OpenAI.

“Stack Overflow does not let you delete questions that have accepted answers and many upvotes because it would remove knowledge from the community,” the user named Ben wrote. “So instead I changed my highest-rated answers to a protest message.”

The user said that within an hour, their edits had been reverted and their account suspended for seven days.

A screenshot of Stack Overflow’s response, shared by Ben, said: “You have recently removed or defaced content from your posts. Please note that once you post a question or answer to this site, those posts become part of the collective efforts of others who have also contributed to that content.”

Ben continued on Mastodon, suggesting that this was “a reminder that anything you post on any of these platforms can and will be used for profit,” and that “it’s just a matter of time until all your messages on Discord, Twitter etc. are scraped, fed into a model and sold back to you.”

Meanwhile, on the Stack Overflow user forum, another user based in Europe asked “where is the opt-out option so my answers don’t get used by OpenAI?” while raising the question of whether the European Union’s data-privacy rules would allow them to remove their responses from the site.

Users also weighed in on the deal on X.

Emily Bender, professor at the University of Washington, criticized the partnership on Thursday, writing: “I would like to remind the world that you actually don’t have to get into bed with OpenAI. StackOverflow was a beacon of resistance, but I guess their principles were for sale after all.”

In a post on Wednesday, Gergely Orosz, author of The Pragmatic Engineer newsletter, wrote: “What is your reaction, as a dev, when you realize your efforts to help other devs with their problems (by answering questions on StackOverflow) is now a way for StackOverflow to sell this data for OpenAI train ChatGPT to perform better?”

Stack Overflow did not respond to a request for comment.

People are clearly upset about this deal, but it’s not the first time OpenAI’s use of data produced by others has raised concerns.

The company faces several lawsuits from creators, such as artists and authors, who claim OpenAI is using their work without permission to pursue profit.

With Stack Overflow and OpenAI now publicly working together, developers will have to consider how comfortable they are with ChatGPT using their insights.
