Mechanised Morality - Part 1 - AI as the Insider Threat

What happens if an AI is the insider threat?

Jul 26, 2024

Introduction

This will run over a few parts, so if the ending is abrupt, there is more to come.

Now let’s be clear, I’m not of the ilk who would debase themselves by substituting my thinking with probabilistic language calculators however I do like to indulge myself with trivial considerations. I’ll even give an argument for the widespread adoption of AI *audible gasps*. I suppose my issue is less with the technology but the deluge of idiots posting prompt engineering cheat sheets or whatever vapid infographics are flavour of the day.

So, let’s construct a scenario for some amusement. There is an AI within an organisation, that AI is the insider threat. Its digital desires are fixated on dealing some damage. Desire might be too strong, let’s just say its operation is starting to harm the organisation irrespective of intent (although that does somewhat subvert the ‘threat’ label). This is something I have referenced in previous articles, but I wanted to extend on it a little. We love to talk about insider threats and are willing to anthropomorphise just about anything.

Let’s run with it a bit and see where we go. For these purposes, imagine a LLM type tool that has the ability to make decisions on behalf of the organisation in the same way an employee would. It is executing processes within the organisation as employees previously did.

How did it get here? What exactly is the problem? How do we even make assessments of machines that are designed to be human like? Where does it lead us? Why does it hate me? Where the hell is John Connor?

There are many questions. First we need some context.

Human out the loop

There will be a time very soon where generative AI will be used to autonomously update systems. Even at this point, human in the loop is looking to become yesterday’s news in fairly short order. Autonomous updates are even starting to be a feature on vendor roadmaps. We know AI will start making more and more autonomous decisions. Where there is a human in the loop we should consider how deeply are these responses and decisions checked.

What happens if we evaluate these decision as we would do a human’s decisions? A problem of scale presents us challenges as we cannot check every single one. Perhaps we need to sample the outputs of the machine and apply some of the old sampling methods to get a sense of the output being generated. The scale problem requires us to put higher levels of trust into these machines, ones that have not earned it I might add.

No difference between animal and machine?

Norbert Wiener poses an interesting premise within his book The Human Use of Human Beings. It is essentially that there is equivalence between the operation of animal and machine. Wiener describes it as follows.

It is my thesis that the operation of the living individual and the operation of some of the newer communication machines are precisely parallel. Both of them have sensory receptors as one stage in their cycle of operation: that is, in both of them there exists a special apparatus for collecting information from the outer world at low energy levels, and for making it available in the operation of the individual or of the machine.
Norbert Wiener – The Human Use of Human Beings

This has some interesting implications. It implies that from the perspective of an external entity the communication processing of both animal and machine is a black box situation whereby we cannot know what lies behind the veil, or at least it is of no consequence to make a differentiation.

This conceptualisation is not without problems, or at least further contextualisation. Wiener argues from the precepts of a liberal paradigm where equivalence of freedom and equality are taken a priori however there is no equality between animal and machine in this context. The machine is subordinated to the animal in every practical sense despite Wiener drawing functional equivalence between and even discussing humans as a special kind of machine. Yet it remains that the machine is always subordinated to the animal and the relationship is asymmetric despite his effort to draw an equivalence.

This perspective has a further consequence and that is the reduction of the human to an atomised unit disconnected from its inherent societal conditions. This is a valid interpretation of liberal philosophy however an incorrect interpretation of reality. Wiener does tacitly invoke a wider context by framing his dialogue in the context of communication to and from animal or machine which we might find proximal to social connections.

We understand that the machine much like the animal can only exist within the context of a hierarchical system and if we remove the differentiation in the communication then we are led to conclude that both animal and machine have agency within the confines of the system it occupies. Milgram’s Agency Theory gives us a useful definition of agency as it pertains to cybernetics.

From the standpoint of cybernetic analysis, the agentic state occurs when a self-regulating entity is internally modified so as to allow its functioning within a system of hierarchical control.
Stanley Milgram - Obedience to Authority

We are left with some aspect of a model as to how to view machines in human systems and although it is imperfect it does grant an understanding of how better contemporary thinking can be derived.

Scale is a problem, yet remains the best argument for its adoption

There are arguments for AI’s capability to solve specific issues and how this might be a benefit for humanity. Or there are arguments about increasing accuracy or reducing bias. This is the main thrust of the dialogue but these all somewhat parochial. There is a better reason, and it relates to the stability of societies. If the primary reason for deploying AI systems to realise efficiency gains this means it will handle far more of the workload than a human. We can consider the second reason for deploying an AI is capability where it exceeds the speed or ability of a human agent. If we set this against the backdrop of global sub replacement birth rates, aging populations, and the construct of our economies as they relate to growth, then the maintenance of productivity becomes a driver for AI adoption. In order to retain the levels of productivity we need to automate at a greater or similar rate to the decrease in the rate of the workforce. These reasons alone will make AI technology appealing to businesses and politicians who can see these problems arriving in the coming years and drive its widescale adoption.

If we consider our AI insider, the first problem we have is scale. It will have exponentially more interactions with systems and humans than a single human agent could undertake. The viable use cases for this technology require levels of access to that extend beyond what would be required for a human agent. It poses a question for traditional concepts of access insofar that the paradigm fails.

What do concepts like least privilege mean on an AI that has been trained on all the company data? Not a lot. AI will be exposed to an awful lot of people and a whole lot of systems and processes. When it goes wrong, it does so at scale.

How do we assess an AI model?

An AI is only as good as it’s training. We could compare the training process to an SLDC in some respects but there are fundamental differences that mean it needs to be assessed on its own merits. Unless you are training a model yourself then it’s likely that you are getting one ‘off the peg’. This means that it’s important to understand how it was trained and how it was evaluated.

Assessment of an AI is going require all the usual technical particulars that’s we’d expect to pick through for any tool. There is another aspect to this that run deeper and that is the training of the AI. A simplistic lifecycle for an AI might look like the following.

Training

We need to consider how the AI was trained and what it was trained on. Essentially, how was it educated? Now, would you consider all schools, colleges, and universities to be equivalent? Of course not, so why would we expect the same from a mechanised agent. We can understand the corpus an AI was trained on, and the methods used to affirm any learning with some fidelity. This is something we can know, and we can validate.

We should be cautious that the AI has not subsumed vast amount of information subject to copyright or we can check to determine if this was a public or private data set. We can see if this is a general set of data or a specialised one. Much like checking the education of a prospective employee we can apply the same concept to an AI. We might also ask how was the data curated? How was it privacy redacted? Answering these question will give an insight as to how the AI might behave and what values might be imparted by the developer. Ultimately it is a product of the information it has been provided and calculating responses in a probabilistic way based on that information it has subsumed.

On a technical standpoint we would see the process of tokenisation occurring in this stage where text is abstracted out to numerical values. If we wanted to go deep we might start to consider what algorithms have been used for the tokenisation process . . . but that’s a rather tedious and dry conversation.

If we consider our AI insider, then understanding this part of its history might just give us the insight to pick up early on if it might misbehave and it will certainly give us a more solid foundation to test from.

Evaluation

How the AI was evaluated by the developer becomes an interesting question. Was it assessed by another AI or a human developer? What criteria was it assessed against. This is where you might see imparted values of the developer creep in if they are using methods like reinforcement learning. It is worth understanding what has been done here and the values of the organisation who have undertaken this evaluation and under what criteria. Is it possible that our AI insider was always destined to do you harm because of what it was told was right and wrong.

The problem is that determinations of right and wrong are not politically neutral. Despite what people would like to think, these considerations are explicitly political in nature. This means that the determinations of what might be harmful or offensive are subjective assessments operating from a political paradigm. In this regard, understanding the politics of the developers can be useful in understanding the embedded values of the AI. There is a lot to be said about evaluation methods but a reasonable first step would be to understand what methods are in play.

Deployment

For the most part at this point we will need to consider its responses to input and check alignment to organisational values and norms. We are making assessments based on our values, not those outlined by any 3^rd party. A particular problem especially in security as we tend to point to external frameworks and references as the standard of adherence, but we need to start looking inwards towards what we value.

We need to be especially sensitive to changes over time for these technologies. Due consideration should be given to how far it is extensible to automate key processes using AI. Once you have hollowed out the organisation of the knowledge, experience, and skill it once relied on and replaced it with AI, when it goes wrong, you are fucked. So, checking that the AI performs as it did and in line with expectations become a critical activity. This might be checking against an established baseline of quality and content or a similar activity.

It's worth bearing in mind that there are a number of legal challenges still underway that might sweep the rug out from under you. What happens if you are told the technology you depend on is no longer legal? Can you turn to the supplier and seek recourse there, or worse, perhaps you don’t have any recourse and you are facing a significant loss. Such is the curse of the Neomaniac.

Conclusion

Even if you don’t agree, perhaps it gives you something new to think about, perhaps it doesn’t. As I have said before, we need to think differently about this stuff. The existing frameworks, standards, and principles all draw from the same shallow well. And it is one that is poisoned.

We have seen how scale is a factor and discussed that why a less supervised and autonomous adoption of this technology is almost certain. We have looked at a number of considerations that are not technological in nature but apply to AI technology as much as humans.

In the next part I’ll be asking if normative ethics can be applied to AI.