Introduction
I’ve been thinking about how artificial articles, videos, profiles, and images are changing the nature of online spaces. I think about how this changes human experience and behaviour within these spaces, and when we will reach the tipping point where the vacuous, hollow homogeneity of aggregated mediocrity makes the internet pointless.
Well, I might struggle to say this is related to security. Perhaps it’s better described as technology commentary. It’s been two years since the first article on this blog was published and I wanted to revisit some of those ideas and explore some new ones. The first article was about AI and values, and my daily work is increasingly oriented around understanding this technology in the context of large corporates.
It is inevitable that AI will change the technology landscape, and I have reservations about what this will do to the users of that technology. We need only look at the impact of smartphones and social media on adolescents to see the epidemic of mental health problems. The implementation of technology translates into a change in the behaviour of its consumers. And this is true when we are talking about security technology too. If you block something, it is amazing how people will create workarounds to subvert those controls.
I’ll allow myself the indulgence of running across a whole raft of subjects, from internet conspiracies to academic papers. So . . . here we go.
The Dead Internet
There is a line of thought that originated from the dark corners of the internet. The unhinged ramblings of a lone Anon on an online forum now seem poignant when read in retrospect. We hark back to the internet of yesterday, which was a very different place.
I am talking about the “Dead Internet Theory”, which has found increasing airplay in the mainstream over the last year or so. It has become especially relevant given the Meta AI user account fiasco, and it forces a reconsideration of just what the fuck is actually going on. The TLDR of “Dead Internet Theory” is as follows.
Large proportions of the supposedly human-produced content on the internet are actually generated by artificial intelligence networks in conjunction with paid secret media influencers in order to manufacture consumers for an increasing range of newly-normalised cultural products.
It might seem that shitposting edgelords from shadowy quarters have little to do with security, but nothing could be further from the truth. In times past, much of online culture came from the wild west fringes of acceptability, finding its genesis in the primordial soup of knuckle draggers rattling their keyboards. The “Dead Internet Theory” comes from these murky depths of the web.
Anonymous is one of the first names that comes to mind when we think about the dark corners of the internet. We think of them now when we hear the term ‘hacktivist’, but they grew out of 4chan around 2008, taking exception to and shitposting Scientology. Many of the great firsts in hacking came from people messing with other people’s stuff for their own entertainment; today, however, it is criminal gangs and nation states that occupy the zeitgeist.
But . . . is the internet dead?
To many it would be an obvious statement to say that it has died. Some might say that it has only changed. It sure feels emptier than it used to, as the author of the Dead Internet Theory post asserts. But why is this?
In part, the experience of the internet is oriented around a few social media hubs. This is where people consume their information. The rise of mobile devices led to a homogenisation of experience, as apps were more accessible than traditional browsing. They are all very similar in format to accommodate the change in devices. There is a news feed, you have a profile picture, you have a banner, and you scroll, and scroll, and scroll, and then you die. Long gone are the wacky levels of customisation and personalisation offered by places like MySpace and smaller boutique social media sites.
The increase in the availability of analytics, and how advertisers pay for the attention of users, has also changed how people produce their content, optimising around views and engagement. How many videos are now in portrait where they were once landscape? How many have animated burned-in subtitles with the gaps between speech removed? They are created to be an optimal length and follow a similar format. You think you see more, but the similarity in format makes it all feel a bit ‘samey’.
So, the internet once seemed bigger because each bit of it was different. We no longer traverse boundaries of style and originality, and what we consume is unoriginal shovelware. Increasingly, what we see online is generated by AI; it is synthetic. The internet I grew up with is not the internet of today; they are vastly different beasts.
Anonymity
There was an interesting shift that changed the nature of the internet. As MySpace subsided and Facebook began to rise, we went from using pseudonyms or online handles to using our real names. This changed who we interacted with: we oriented more around people we already knew rather than people with shared interests. It also changed how we interacted with each other in online spaces, and the consequences we experienced.
Jon Ronson once gave a TED talk about online shaming and internet pile-ons, the precursor to cancel culture. He gave the example of Justine Sacco, who shared a poor-taste joke on Twitter to her small group of followers while waiting for a flight. She thought nothing of it at the time.
Going to Africa. Hope I don’t get AIDS. Just kidding. I’m white!
Justine Sacco
As she flew to Africa she was out of reach of internet connections, but the tweet had gone viral and the internet was outraged. When Justine turned on her phone after landing she received a message from someone she hadn’t spoken to since high school that read “I am so sorry to see what has happened to you”. By the time she had gotten off the plane she had discovered she had been sacked from her job, and there were reporters waiting at the airport to capture her reaction as she found out that her life had been destroyed. The internet waited with bated breath to watch the destruction of someone in real time. A badly framed commentary on the liberal mindset was taken as a statement of racist intent.
But this marked something: as we moved to using our real identities, the stakes became a lot higher. Privacy eroded as social media platforms started enforcing the use of real identities to make it easier for marketing firms to profile and sell to people. In many ways this put us all under the microscope and meant we were less able to have discussions and risk being wrong. How people interact in online spaces had been changed forever.
The move to real identities marked the death of the wild west era of the internet; in some ways, the end of adolescence and innocent exploration. More recently this has evolved to prison terms for fairly milquetoast comments made online. It makes you think that being authentic might not be the best idea in the world, despite the gushing sentiment of influencers idealising its benefits.
Is it any wonder that AI tools that produce ‘acceptable output’ become appealing? They take the risk out of the equation, and if it goes wrong you have something to blame. A mechanism for repudiation, if you will. The anxiety-inducing awkwardness between posting and acceptance is diminished by using these AI posting features. In some ways we are removing the risk associated with expression by sanitising the expression through AI tools. Once, the expression was unfiltered and the risk was mitigated by anonymity.
Outsource your thinking, sacrifice your soul
There is an obvious problem, isn’t there?
As many scramble at the possibility of AI making our lives easier, and in some respects removing risk from online interactions, we seem to lose sight of a significant problem. We become disconnected from the process we go through to create our thoughts. After all, we don’t value what we are given, we value what we have earned.
I recently watched a talk from Rory Sutherland in which he discusses the value of an essay. It is not about the finished product but the process you needed to go through to write it. This is true of many of the things we do: the process is the critical part, not the output.
But the value wasn’t in the essay. What’s valuable is the effort you had to put in to produce the essay. Now, what AI essays do is they shortcut from the request to the delivery of the finished good and bypass the very part of the journey which is actually valuable—the time and effort you invest in constructing the essay in the first place.
Rory Sutherland
I published an article about AI and values two years ago to the day (almost). It was the very first one published on this blog, in fact. I have seen the change of late; what I feared would come to pass has done so, and a more sinister element is emerging. In that article I wrote:
A security practitioner using these AI tools within their role is applying constraints to themselves. And this is the rub, what it gives is also what it takes. They will suffer through lack of experience, and lack of knowledge. They erode their ability to use creative and novel thinking to solve problems. Truth is the first virtue of thought. But what truth can be learned by automating the process of understanding? Only that they are workshy perhaps. Security could be characterised as a knowledge-based practice, but it’s more, it requires creativity and abstract thinking. Delegate these to a machine and you become nothing, or worse, contemptible.
Rory might have said it better but I said it first!
I look at a lot of ‘AI stuff’ as part of my job, but you wouldn’t guess that from how much of a miserable bastard I am about the whole situation. But please be assured . . . I am just as cynical about most things.
Digital Inbreeding
Recent statements from Elon Musk suggest that the totality of human-generated information has been exhausted for training purposes. The next step is for AI to generate its own data for training. In the case of LLMs, where the output is sampled from a probability distribution, this becomes an issue. The output may look varied, but it is drawn from a fixed distribution learned from a population of data, which means it cannot replicate the nature of human output at scale; it can only approximate it. If the created data is synthetic then we are talking about an average, of an average, of an average. You see the problem.
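To make the probabilistic-generation point concrete, here is a minimal sketch (the vocabulary and scores are toy values of my own invention, not any real model’s API) of how a next token gets picked: the model’s scores become a probability distribution, and the apparent variety is just repeated draws from that fixed distribution.

```python
import math
import random

# Minimal sketch of probabilistic next-token generation.
# The vocabulary and logits below are hypothetical toy values.
VOCAB = ["the", "cat", "sat", "mat"]
LOGITS = [2.0, 1.0, 0.5, 0.1]  # model scores for each candidate token

def sample_next_token(vocab, logits, temperature=1.0):
    """Turn scores into a distribution (softmax) and sample from it."""
    exps = [math.exp(score / temperature) for score in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(vocab, weights=probs, k=1)[0]

# Every run looks different, but the underlying distribution never changes.
print([sample_next_token(VOCAB, LOGITS) for _ in range(8)])
```

Every call returns something different, but the distribution behind it is baked in; scale that up and the aggregate character of the output is fixed before a single word is generated.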
A considerable source of training data is the internet, and AI tools are supplemented with real-time web searches to get around the time horizon introduced by the lead time of training. As synthetic data increases on the internet, its utility as a source of training data decreases.
Model collapse refers to a degenerative learning process in which models start forgetting improbable events over time, as the model becomes poisoned with its own projection of reality.
Shumailov, I., Shumaylov, Z., Zhao, Y. et al. AI models collapse when trained on recursively generated data. Nature 631, 755–759 (2024)
Essentially the crux of the issue is that the probabilistic nature of synthetic output reflects an aggregate; the model starts to average across averages, which widens the disconnect with reality. As the proportion of real data within the training set decreases, the performance of these tools degrades, and within a number of generations the models fail. This is a similar problem to inbreeding in humans: you’ll probably be alright for a bit, and then your kids end up with a chin that rivals Jimmy Hill’s.
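As a toy illustration of the compounding effect (my own sketch, not the experimental setup from the Nature paper): fit a Gaussian to a small sample, draw a new sample from that fit, refit, and repeat. Each ‘generation’ is trained purely on the previous generation’s synthetic output.

```python
import random
import statistics

# Toy model-collapse demo: generation 0 is "real" data ~ N(0, 1); every
# later generation is a Gaussian fitted only to samples drawn from the
# previous generation's model -- synthetic data all the way down.
random.seed(1)
mu, sigma = 0.0, 1.0   # ground truth for generation 0
N = 20                 # small sample per generation, as with scarce data

for gen in range(1, 101):
    sample = [random.gauss(mu, sigma) for _ in range(N)]
    mu = statistics.fmean(sample)      # the new model: an average...
    sigma = statistics.stdev(sample)   # ...of an estimate of an average
    if gen % 20 == 0:
        print(f"generation {gen:3d}: mu={mu:+.3f}  sigma={sigma:.3f}")

# Each refit loses a little of the tails and the errors compound, so the
# spread typically drifts towards zero: improbable events are forgotten
# and the model converges on a narrow caricature of the original data.
```

The exact numbers depend on the random seed, but the direction of travel is the point: the spread typically withers generation by generation, which is the ‘forgetting improbable events’ the paper describes.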
Conclusion
If the internet is taken as a source of training data for AIs, then the content AI pushes into that environment is of consequence, given the observations about model collapse. But the changes in the internet itself, and how human behaviour and interactions become sanitised, also play a part. As synthetic data proliferates and degrades the utility of the internet as a training set for AI, it also degrades the utility of the resource for the humans who use it. We are already seeing the proliferation of digital pollution online, making the whole thing a bit shit.
If we further consider that interactions are inauthentic due to a perceived social penalty, then how useful is the data about human interactions for AI training? The change in privacy considerations is important, and the change in human behaviour has curious consequences. These are behaviours that will be reflected back into organisations too, which is a whole other subject.
As a security practitioner who is seeing LLM and AI technologies being implemented in organisations, I find it quite interesting to look at the context in which these technologies exist. It is a problem that certain popular models don’t disclose the composition of their training sets. As organisations seek to automate processes using AI, it poses an interesting question about the longevity of the current stock of these tools, given that limitations are being reached and the solutions to those limitations introduce significant problems that are yet to be overcome. The inevitable question of where that leaves organisations that chose to build a dependence on these tools is one that needs to be asked.
All in all, the internet was great before the ‘suits’ fucked it up, and I’ll leave you with a quote from the late, great Bill Hicks.
By the way if anyone here is in advertising or marketing…kill yourself. It’s just a little thought; I’m just trying to plant seeds. Maybe one day they’ll take root – I don’t know. You try, you do what you can.
Bill Hicks