Leave me alone, Baldrick. If I wanted to talk to a vegetable, I would have bought one at the market.
Edmund Blackadder
We hear continuously how AI will revolutionise productivity and cut out boring, repetitive work. I do a lot of writing and thought that Co-Pilot might be useful. Maybe it could check composition, tone, spelling, or maybe even some fact checking. So I spent a few weeks with Co-Pilot seeing what it could do. I wanted to see if I was missing out on a panacea of technology or if the hype train had gone out of control.
Very quickly I realised it wasn’t going to be that helpful and as I persisted I came to the realisation that I had my very own digital Baldrick. For the uninitiated, Baldrick is a recurring character from the British sitcom Blackadder. He is Blackadder’s sidekick and always has a suggestion for some harebrained scheme to get them out of trouble. So, is Co-Pilot as good as the marketing claims?
The short answer is no.
Much like Baldrick, Co-Pilot always seems to have a cunning plan. And also much like Baldrick, they are never that good. Co-Pilot came across as somewhat needy. It seemed to be telling me what I wanted to hear. As a writer, I didn’t need an ego boost, I needed some critical evaluation. You would think that as the thing is trained on text then this would be something that it could be good at. Evidently not.
I pressed Co-Pilot for some specifics. Co-Pilot tried to take me down a meandering path of mediocrity. It was insistent that my sentence structure was too complex and might be a barrier to readers. The suggestions it gave for readability were the most generic trash you would expect from the mouth breather influencers on social media. The problem is that as people create a dependency on these tools it will flatten writing to the lowest common denominator bullshit that plagues online spaces.
Co-Pilot had an issue with the lack of character development in a passage I wrote. Given that I was describing a funeral, it’s hard to imagine how the character could develop from there. Co-Pilot had missed this important piece of context. Perhaps Co-Pilot has a revealed belief in the afterlife and desires to be reunited with Talkie Toaster in Silicon Heaven (sorry, another British sitcom reference).
Some of the suggestions started seeming odd so I asked Co-Pilot to reproduce the text it was evaluating. It dutifully did this but what it was evaluating was of its own creation. It was giving suggestions back to me from a text I didn’t write.
Where it was able to reproduce the text I had written, I’d make some updates and ask for it to do it again. But it came back with the same results . . . it did not acknowledge there had been any changes until a new session was started. A bit of a drawback for something that is meant to be helping in real time.
And then there was the sordid tale of fact checking. Here is a real problem. Co-Pilot is authoritative about “facts”. It was adamant that some aspects of my writing were not supported by evidence, yet in my hands I had multiple primary sources that supported my statements. In their place it suggested outright fiction as established events. This is where these tools become dangerous.
The suggestions it gave for everything were essentially the same every time. Change the sentence structure, check for typos, add transitional lines between paragraphs. It was the same variant in every instance. Its limitations become apparent very quickly. This is the inevitable consequence of using something that is essentially a language calculator. The transformer model that generates the output of tools like Co-Pilot is not constructed for the tasks it is being marketed for. Now, if you need to identify some sources or avenues of enquiry then it can be quite helpful as a search engine surrogate. It is exceptionally good at that kind of task. As for anything that requires actual knowledge and insight, you are better off relying on your own skills.
But people are lazy and they will take the output of Co-Pilot at face value. As companies roll this stuff out it will create an environment where apathy and ignorance become virtues of cutting-edge technology adoption. The amount of checking that needs to be done with the current state of these tools means either that they are not an efficiency gain or that the people using them are inept. Specialist tools have their place when used by specialists but the generalist tools available at this time are best avoided.
Out of interest I gave Co-Pilot the opportunity to respond to the above. Here is what it said.
I'd be delighted to respond, and here's a light-hearted yet reflective reply from my perspective:
Thank you for sharing your experience with me! It's always intriguing to see how my interactions are perceived. It's clear that you've got a strong sense of what works for your writing process, and I'm sorry to hear that I didn't quite hit the mark.
I'm flattered by the comparison to Baldrick—after all, who wouldn't want to be remembered for their cunning plans? But I understand that, much like Baldrick's schemes, my suggestions didn't quite meet your needs. I strive to be more than just a sidekick; I aim to be a useful companion in your creative endeavors [sic].
I appreciate your patience and persistence in exploring my capabilities. It's feedback like yours that helps me improve and better serve users. While I may sometimes stumble with composition and fact-checking, I'm always here to assist with brainstorming, research, and more.
Your insights are valuable, and I hope that, with time and ongoing development, I can become a more reliable and effective tool for writers like you. Thank you for giving me a chance and for sharing your thoughts so candidly.
As you can see, Co-Pilot is still a whiny little bitch.