Watermarks in Plaintext
With enough compute, Generative AI text output can carry multiple embedded ciphers without the recipient knowing which ones were applied. One cipher or several can be used, and the more text the AI generates, the more ciphers can be applied.
If a person attempts to use AI-generated text fraudulently, for example by submitting it as a research paper to a professor, the professor could upload or scan the paper into the generative AI service (ChatGPT is assumed here, as the most popular one at present). This new interface would then search the service’s large library of Book Cipher keys to detect one or more ciphers that may have been applied when the text was generated. The larger the number of cipher algorithms and the greater the amount of generated text, the more secure the scheme becomes.
Generative AI embeds multiple text watermarks that can later be identified. The approach is not bulletproof: the person using the generative AI output can try to defeat it by reordering words, substituting different words or synonyms, and even moving paragraphs around.
The output of a ChatGPT scan for ciphers can be used to estimate the probability (%) that generative AI was used to produce the document.
The AI community at large could build a centralized hub where any document can be searched, regardless of which Generative AI bot produced it. If all Generative AI companies participated in the same hub, it would close the gaps between providers and improve the identification of Generative AI output.
Application of a Book Cipher
In the 2004 film National Treasure, a book cipher (called an “Ottendorf cipher”) is discovered on the back of the U.S. Declaration of Independence, using the “Silence Dogood” letters as the key text.
How a “Book Cipher” Works
A book cipher is a cipher in which each word or letter in the plaintext of a message is replaced by some code that locates it in another text, the key. A simple version of such a cipher would use a specific book as the key, and would replace each word of the plaintext by a number that gives the position where that word occurs in that book. For example, if the chosen key is H. G. Wells’s novel The War of the Worlds, the plaintext “all plans failed, coming back tomorrow” could be encoded as “335 219 881, 5600 853 9315”, since the 335th word of the novel is “all”, the 219th is “plans”, etc. This method obviously requires that the sender and receiver have the exact same key book.
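The word-position scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the function names (tokenize, build_index, encode, decode) and the short KEY_TEXT are hypothetical stand-ins, and a real key would be a full book shared by both parties. The sketch also assumes every plaintext word appears somewhere in the key.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(key_text):
    """Map each word to the 1-based position of its first occurrence in the key."""
    index = {}
    for position, word in enumerate(tokenize(key_text), start=1):
        index.setdefault(word, position)
    return index

def encode(plaintext, key_text):
    """Replace each plaintext word with its position in the key text.

    Raises KeyError if a word does not occur in the key.
    """
    index = build_index(key_text)
    return [index[word] for word in tokenize(plaintext)]

def decode(codes, key_text):
    """Look each position back up in the key text."""
    words = tokenize(key_text)
    return " ".join(words[n - 1] for n in codes)

# Hypothetical stand-in for a shared key book.
KEY_TEXT = "all of the plans we made have failed so we are coming back home tomorrow"

codes = encode("all plans failed, coming back tomorrow", KEY_TEXT)
print(codes)                    # [1, 4, 8, 12, 13, 15]
print(decode(codes, KEY_TEXT))  # all plans failed coming back tomorrow
```

Decoding with any other key text yields different words or an error, which is why both sides need exactly the same edition of the key.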
The Book Cipher can also be applied to letters instead of words, which requires fewer words to carry a cipher.
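A letter-based variant might look like the following sketch, which encodes each letter as the 1-based character position of its first occurrence in the key. The names encode_letters and decode_letters and the pangram used as LETTER_KEY are illustrative assumptions only. Because a short key can cover the whole alphabet, far less key material is needed than in the word-based version.

```python
def encode_letters(plaintext, key_text):
    """Encode each letter as the 1-based position of its first occurrence in the key.

    Non-letters in the plaintext are skipped. Raises KeyError if a letter
    never appears in the key.
    """
    key = key_text.lower()
    first_seen = {}
    for position, ch in enumerate(key, start=1):
        if ch.isalpha():
            first_seen.setdefault(ch, position)
    return [first_seen[ch] for ch in plaintext.lower() if ch.isalpha()]

def decode_letters(codes, key_text):
    """Recover the letters by indexing back into the key text."""
    key = key_text.lower()
    return "".join(key[n - 1] for n in codes)

# Hypothetical key: a pangram, so every letter of the alphabet can be encoded.
LETTER_KEY = "the quick brown fox jumps over the lazy dog"

letter_codes = encode_letters("back", LETTER_KEY)
print(letter_codes)                              # [11, 37, 8, 9]
print(decode_letters(letter_codes, LETTER_KEY))  # back
```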
Solution Security
Increase the Number of Book Ciphers
Increasing the number of ciphers in one Generative AI “product” makes it harder to “reverse engineer” or solve the Book Ciphers. The goal is to embed as many ciphers in each Generative AI “product” as is technically possible.
Increase the Complexity of Inserted Book Ciphers
Using a “Word Search”-like approach, the path that identifies the words or letters in the Generative AI “product” need not be read or scanned like English, from left to right. It may instead be scanned for cipher components from top to bottom or from right to left.
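One way to picture the multi-direction scan is to lay the words of the text out in a fixed-width grid and read the grid in several orders. The sketch below is only an illustration under that assumption: the grid width, the helper names (to_grid, reading_orders, find_directions) and the sample text are all hypothetical, and a real detector would need to know or search over the layout that was used when the text was generated.

```python
def to_grid(words, width):
    """Break a flat list of words into rows of at most `width` words."""
    return [words[i:i + width] for i in range(0, len(words), width)]

def reading_orders(grid):
    """Return the grid's words in three different scan orders."""
    width = max(len(row) for row in grid)
    return {
        "left_to_right": [w for row in grid for w in row],
        "right_to_left": [w for row in grid for w in reversed(row)],
        # Walk each column from the top row to the bottom row.
        "top_to_bottom": [row[c] for c in range(width) for row in grid if c < len(row)],
    }

def contains_sequence(stream, target):
    """True if `target` appears as a contiguous run inside `stream`."""
    n = len(target)
    return any(stream[i:i + n] == target for i in range(len(stream) - n + 1))

def find_directions(text, target, width):
    """List the scan directions in which the target word sequence appears."""
    grid = to_grid(text.lower().split(), width)
    return [name for name, stream in reading_orders(grid).items()
            if contains_sequence(stream, target)]

# Hypothetical 3 x 4 layout:
#   red fox ran far
#   big dog sat low
#   old cat hid out
SAMPLE = "red fox ran far big dog sat low old cat hid out"

print(find_directions(SAMPLE, ["fox", "dog", "cat"], width=4))  # ['top_to_bottom']
print(find_directions(SAMPLE, ["sat", "dog", "big"], width=4))  # ['right_to_left']
```

A sequence that is invisible to an ordinary left-to-right read can still be found in another direction, which is the extra complexity this section has in mind.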