With the Internet so bloated with AI-generated nonsense, it's important to recognise the content that is truly human-authored. This doesn't mean that AI hasn't been used to assist, for example with grammar or spell checking, but rather that the content of the article has been carefully crafted by a person, based on that person's research and experience.
GEN are not anti-AI; we use it daily and we have many services based on AI, but there's a problem. The Internet is nothing more than the sum of its content, and there's a lot of content. In recent years companies have started using Large Language Models (LLMs) to generate vast quantities of nonsense. This nonsense is meant to artificially inflate a site's ranking in search engines, because no one reads the articles before they are added to the search engine; they are simply consumed and indexed.
This means some sites are ranking highly and generating considerable traffic based on this spam, which just means end users have a poor experience, or worse, they read and believe the nonsense. This is not limited to articles; reviews are also being auto-generated, where a review site can find a 'thing' and then have the AI write nonsense about the 'thing' and present it as a review. The 'thing' has never been touched; it's all just content theft.
There is a concept in law known as intellectual property, or IP. If you write an article and then publish it, it becomes copyrighted. Copyright is a way of acknowledging the time and effort that went into producing such a fantastic article, and it establishes some rules about how and when the article can be used, duplicated, replicated, quoted and so on. As a general rule, you can reproduce the article, in part or in full, provided the original author is credited. This is not legal advice, find a lawyer for that, but it gives an introduction to content theft.
When you use an LLM to write an article about, for example, a company, where do you think it gets the 'knowledge' from? Yes, the Internet. So the LLM uses web pages from the Internet to generate an article about the company. Given that most of the content on the web is human-generated, published and copyrighted, the LLM is just stealing that content and regurgitating it in another form of words. This goes further than just spamming search engines; it's also a copyright issue, since many of these high-ranking nonsense articles are likely ranking better than the original content they are based on, stealing traffic from that original work.
An LLM can generate garbage, and it's important to understand that it doesn't know it. That is, it cannot comprehend the subject matter it's regurgitating; it simply generates text based on rules about how tokens (words or strings) relate to other tokens. This means the garbage it generates is not proofread by a person before being posted to the Internet. So what if the article was about a brand of medicine, or a special diet, or perhaps a review of a healthcare provider? We know the LLM has never experienced any of these things, yet by simply pulling content from pages on the Internet and rewording it, we have an article.
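To make "tokens relating to tokens" a little more concrete, here is a toy sketch of our own. It is purely illustrative: the word-counting approach below is a deliberately crude stand-in, and real LLMs use neural networks trained on enormous amounts of text rather than raw counts. The point it demonstrates, though, is the same: the program "writes" by repeatedly picking a word it has seen following the previous one, with no idea what any of it means.

```python
import random
from collections import defaultdict

# Toy "next token" predictor: count which word follows which in some source text.
# Illustration only - real LLMs use neural networks, not raw word counts.
source_text = "the cat sat on the mat and the dog sat on the rug"
tokens = source_text.split()

next_words = defaultdict(list)
for current, following in zip(tokens, tokens[1:]):
    next_words[current].append(following)

def generate(start, length=8):
    """Produce text by repeatedly picking a word seen after the current one."""
    word = start
    output = [word]
    for _ in range(length):
        candidates = next_words.get(word)
        if not candidates:
            break
        word = random.choice(candidates)  # no comprehension, just statistics
        output.append(word)
    return " ".join(output)

print(generate("the"))  # e.g. "the dog sat on the mat" - fluent-looking, meaningless
```

The output can look fluent, but this "model" has never understood a single word; it has only observed which words tend to follow which. Production LLMs are vastly more sophisticated, yet the core activity of predicting plausible next tokens from existing text, rather than comprehending a subject, is the same.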
This is how an LLM, famously Google's, told a user who was suffering from depression to go and jump off a bridge. The garbage response was regurgitated from a Reddit post that was used to train the model. Wait, what? Reddit is used to train the model? Yes indeed, the seemingly endless stupidity of Reddit is used to train AI models. You can't make this stuff up.
Take our badge and use it with pride on your website IF you write the articles and content yourself, using your own brain. And in that case, thank you for your content.