OpenAI has had a system for watermarking text generated by ChatGPT, along with a tool to detect this watermark, ready for about a year, according to a report by The Wall Street Journal. However, the company is internally divided over whether to release it. While watermarking appears to be a responsible measure, it might negatively impact the company’s financial performance.
Put simply, the watermarking technique makes small adjustments to how the model predicts which words and phrases come next, producing a pattern that a detector can later recognize. For a more detailed explanation of this kind of approach, Google’s description of Gemini’s text watermarking provides further insight.
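OpenAI has not published its exact scheme, but the general idea behind statistical text watermarking described in public research can be sketched: pseudorandomly derive a "green list" of favored tokens from the preceding context, nudge the model's sampling toward it, and detect the watermark later by counting how often green tokens appear. The toy vocabulary, function names, and 50/50 split below are illustrative assumptions, not OpenAI's implementation.

```python
import hashlib
import random

# Toy vocabulary standing in for a real model's token set (assumption).
VOCAB = [f"tok{i}" for i in range(1000)]

def green_list(prev_token: str, fraction: float = 0.5) -> set:
    # Pseudorandomly partition the vocabulary, seeded by the previous token,
    # so a detector can recompute the same "green" set without the model.
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * fraction)))

def watermark_score(tokens: list) -> float:
    # Fraction of tokens drawn from their context's green list:
    # roughly 0.5 for ordinary text, close to 1.0 for watermarked text.
    hits = sum(t in green_list(p) for p, t in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)
```

A generator that always samples from the green list yields text whose score is near 1.0, while human-written text hovers around the base rate; the detector only needs the seeding rule, not the model's weights.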
The introduction of a detection method for AI-generated text could be particularly useful for educators seeking to prevent students from using AI for their assignments. The Wall Street Journal notes that watermarking does not compromise the quality of the chatbot’s text output. A company-commissioned survey revealed that global support for an AI detection tool outweighed opposition by a margin of four to one.
Following the Wall Street Journal’s story, OpenAI acknowledged its work on text watermarking in a blog post update spotted by TechCrunch. The company claims its watermarking method is highly accurate (“99.9% effective,” according to documents seen by the Journal) and resistant to localized tampering such as paraphrasing. However, it also concedes that rewording the output with another model can easily strip the watermark, making it trivial for bad actors to circumvent. The company is additionally concerned that the tool could stigmatize the use of AI, particularly among non-native English speakers.
OpenAI is also wary of a user backlash: nearly 30 percent of surveyed ChatGPT users said they would use the software less if watermarking were implemented. Despite these concerns, some employees maintain that watermarking is effective; given user sentiment, though, the Journal reports that some have suggested exploring methods that would be less controversial among users, even if unproven.
In its blog post update, OpenAI says it is in the early stages of exploring the embedding of metadata instead, a method that might prove less controversial among users. While it is still “too early” to know how well the approach will work, the company notes that because the metadata would be cryptographically signed, it would produce no false positives.
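The blog post does not detail the metadata scheme, but the zero-false-positive property of a cryptographic signature can be illustrated: text carrying a valid tag from the provider's key verifies exactly, while anything else simply fails verification rather than being misclassified. The HMAC construction and key below are illustrative assumptions (a real deployment would likely use asymmetric signatures), not OpenAI's design.

```python
import hmac
import hashlib

# Illustrative provider-held key; asymmetric keys would let anyone verify
# without being able to forge tags.
SECRET_KEY = b"provider-held signing key"

def sign(text: str) -> str:
    # Provider attaches a MAC computed over the exact output text.
    return hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()

def verify(text: str, tag: str) -> bool:
    # Verification either matches exactly or fails -- no statistical guesswork,
    # hence no false positives (though edited text or stripped metadata
    # produces a miss, not a wrong accusation).
    return hmac.compare_digest(sign(text), tag)
```

The trade-off implied here is the mirror image of watermarking: statistical detection degrades gracefully under edits but can misfire, while signed metadata never misfires but is lost the moment the tag is removed or the text is changed.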