Did Turing Miss the Point? Should He Have Thought of the Limerick Test?
Whose writing was always amusin’,
With British spelling so fine,
And grammar that shined,
Their blog posts were always worth choosin’.
In three seconds, the AI chatbot embedded in the Bing search engine generated this Limerick when I asked it to produce a Limerick about me writing a blog, but to use British spelling. I live about 6 miles away from Ruthin in North Wales and never provided any information about my location, although, of course, general locational information is in my computer somewhere.
I’ve been exchanging Limericks with Arthur, an old school friend, for many years, some quite serious, but generally whimsical. I have developed various strategies for generating them quickly. From that experience I find the Limerick produced by Bing has really frightening qualities. It uses the elisions of amusin’ and choosin’ to rhyme with Ruthin, a device I’ve never used in my Limerick composing. It does have a slightly US songster’s quality but would nonetheless be regarded by many as quite ‘clever.’ It includes my request to use British spelling, although that was an afterthought instruction, not indicated to be part of the Limerick. It also makes use of a cunning off-rhyme with ‘fine’ and ‘shined’. It does this to enable a positive comment about my blog posts. But it is also rather woke: not knowing my gender identity, it cautiously refers to me as ‘their.’ Overall, the Limerick has a slightly cheeky style about it, being recognizably humorous.
We’ve got used to spellcheckers, and to increasingly sensible, if rather intrusive, grammatical suggestions as we type. But the creation of novel material, with style and wit, really does take this involvement of algorithms in our keyboard and screen lives into another dimension. It goes beyond what I had, ignorantly it has to be said, dismissed in a previous blog as just pulling itself up by its bootstraps. It now seems clear that when there are codable rules, which can be derived from neural network scanning of millions of examples, something can be produced that would pass the Turing test of being indistinguishable from a human response. I can see why actors and authors are concerned: the algorithmic digestion of their works, to be spewed out in previously non-existent forms, is a real challenge.
The question, therefore, arises of what makes a work humanly creative. Is there something we need to be thinking about beyond what Turing had in mind? Turing’s test is passed when a person cannot determine whether she is interacting with a computer or with a human being. That, many would claim, has now been achieved. But surely there is something more that is needed to demonstrate humanity? Considering this, I sent Arthur the algorithm’s Limerick. Almost instantly he sent back:
Who will do anything to create laughs.
But between me and you
All he seems to do
Is to create a series of gaffes.
On the face of it, Arthur’s Limerick is indistinguishable in its humanity from Bing’s. But I can spot psychological differences. In his self-effacing way, Arthur is happy to poke fun at himself, even if that may be just to get a good rhyme. He has no gender problems, so refers to himself as he. But the biggest telltale sign to me that Arthur had not just asked Bing for help is that he used a linguistic device that is not that rare in his many Limericks. His line ‘between me and you’ gives him an easy rhyme on the subsequent line (no rhyme intended by me). It is like the way the hands in an old master painting can give away a forgery, or clunky harmonies can reveal a fake Mozart piece. The idiosyncratic details, special to a person, are often more revealing than the overall form.
Here is an opportunity for an intriguing research project: the revision of the Turing test. Topics should be chosen, and the algorithm given the challenge to produce Limericks about them. Human persons should be given the same challenge. Human and computer judges should then be asked to indicate which was written by an automaton and which by a human. How accurately the human judges tell the two apart will be a good measure for this version of the Turing test.
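For anyone tempted to run such a project, the scoring could be sketched roughly as follows. This is only an illustration under my own assumptions: the function names `judge_accuracy` and `chance_probability` are made up for the purpose, the eight labels and guesses are invented rather than real results, and comparing a judge against coin-flip guessing is just one simple way of asking whether they can really tell the difference.

```python
from math import comb


def judge_accuracy(true_authors, guesses):
    """Fraction of Limericks whose author type the judge identified correctly."""
    if len(true_authors) != len(guesses):
        raise ValueError("one guess is needed per Limerick")
    correct = sum(t == g for t, g in zip(true_authors, guesses))
    return correct / len(true_authors)


def chance_probability(n_correct, n_total):
    """Probability of getting at least n_correct right by pure coin-flip guessing."""
    favourable = sum(comb(n_total, k) for k in range(n_correct, n_total + 1))
    return favourable / 2 ** n_total


# Invented example data (an assumption, not real results):
# who actually wrote each of eight Limericks, and what one judge guessed.
true_authors = ["human", "machine", "machine", "human",
                "machine", "human", "human", "machine"]
guesses = ["human", "machine", "human", "human",
           "machine", "machine", "human", "machine"]

accuracy = judge_accuracy(true_authors, guesses)
n_right = round(accuracy * len(true_authors))
print(f"Judge accuracy: {accuracy:.2f}")
print(f"Chance of doing this well by guessing: "
      f"{chance_probability(n_right, len(true_authors)):.3f}")
```

A judge who scores close to one half, or whose score could easily have arisen by guessing, cannot really tell the human from the machine, which is the sense in which the algorithm would have passed this Limerick version of the test.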
There is a problem, though, with evaluating the computer’s effectiveness in distinguishing between the Limericks. It will already have a record of all the Limericks it, or its associates, have written. It can search for these in a microsecond. The challenge is how to stop the algorithm cheating.