A watermark for chatbots can spot text written by an AI

For example, since OpenAI’s chatbot ChatGPT was launched in November, students have already started using it to cheat by having it write their essays. The news site CNET has used ChatGPT to write articles, only to have to issue corrections amid accusations of plagiarism. But there is a promising way to spot AI text: building hidden patterns into these systems before they are released, so that the text they generate can be identified later.

In studies, these watermarks have already been shown to identify AI-generated text with near certainty. The first, developed by a team at the University of Maryland, was able to identify text generated by Meta’s open-source language model, OPT-6.7B, using a detection algorithm the team created. The work is described in a paper that has not yet been peer-reviewed, and the code will be available for free around February 15.

AI language models work by predicting and generating one word at a time. After each word, the watermarking algorithm randomly divides the language model’s vocabulary into words on a “green list” and words on a “red list,” and then prompts the model to choose words from the green list.
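
The green-list and red-list split can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the researchers' actual code: the toy vocabulary, the function names green_list and pick_next_word, the idea of seeding the split from a hash of the previous word, the 50/50 split, and the bias value are all hypothetical.

```python
import hashlib
import random

# Toy vocabulary (hypothetical); a real language model has tens of thousands of tokens.
VOCAB = ["flower", "orchid", "garden", "sunset", "melody", "river", "painting", "morning"]


def green_list(previous_word, vocab=VOCAB, fraction=0.5):
    """Return the pseudo-random 'green' subset of the vocabulary for this context."""
    # Assumption: seed the split from a hash of the previous word, so that
    # anyone who knows the scheme can rebuild the same split later.
    digest = hashlib.sha256(previous_word.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    shuffled = sorted(vocab)  # fixed starting order keeps the split reproducible
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])


def pick_next_word(previous_word, scores, bias=2.0):
    """Add a bonus to green-listed words' scores, then pick the top-scoring word."""
    green = green_list(previous_word, list(scores))
    boosted = {word: score + (bias if word in green else 0.0)
               for word, score in scores.items()}
    return max(boosted, key=boosted.get)


if __name__ == "__main__":
    # With equal model scores, the bonus guarantees that a green word wins.
    flat_scores = {word: 0.0 for word in VOCAB}
    print(pick_next_word("beautiful", flat_scores))
```

In this sketch the model is nudged rather than forced: a red-listed word can still win if its original score exceeds every green word's score by more than the bias.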

The more green-listed words a passage contains, the more likely it is that the text was generated by a machine. Text written by a person tends to contain a more random mix of words. For example, following the word “beautiful,” the watermarking algorithm could classify the word “flower” as green and the word “orchid” as red. An AI model with the watermarking algorithm would be more likely to use the word “flower” than “orchid,” explains Tom Goldstein, an assistant professor at the University of Maryland who was involved in the research.
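
Counting green-listed words can be turned into a score, as in the hedged Python sketch below. It assumes the same hypothetical toy vocabulary and hash-seeded split as a stand-in for the real scheme, and it uses a one-proportion z-test as one standard way to measure how far the green count sits above chance. The article does not say which statistic the researchers use, so green_z_score and its parameters are illustrative names only.

```python
import hashlib
import math
import random

# Toy vocabulary (hypothetical); the split below must match the one used when generating.
VOCAB = ["flower", "orchid", "garden", "sunset", "melody", "river", "painting", "morning"]


def green_list(previous_word, vocab=VOCAB, fraction=0.5):
    """Rebuild the pseudo-random 'green' subset that follows previous_word."""
    digest = hashlib.sha256(previous_word.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])


def green_z_score(words, vocab=VOCAB, fraction=0.5):
    """Standard deviations by which the green-word count exceeds pure chance."""
    pairs = list(zip(words, words[1:]))  # (previous word, current word)
    if not pairs:
        raise ValueError("need at least two words to score a passage")
    hits = sum(1 for prev, word in pairs if word in green_list(prev, vocab, fraction))
    n = len(pairs)
    expected = fraction * n                      # green words expected by chance
    spread = math.sqrt(n * fraction * (1 - fraction))
    return (hits - expected) / spread


if __name__ == "__main__":
    # Build a passage that always picks a green word, then score it.
    passage = ["beautiful"]
    for _ in range(29):
        passage.append(min(green_list(passage[-1])))
    print(round(green_z_score(passage), 2))
```

A passage whose words are chosen without regard to the lists should score near zero on average, while a passage that favors green words scores higher the longer it gets, which is why longer texts are easier to classify with confidence.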
