This coming weekend at Austin's annual convention of the Association for the Advancement of Artificial Intelligence, a group of AI experts will attend a workshop in hopes of hammering out a new-and-improved version of the Turing Test.
The Turing Test, as a means to determine if a machine is truly thinking, is wholly inadequate in its current form. This was made painfully clear last year when a Russian-designed chatterbot named "Eugene Goostman" convinced 33% of judges that it was human. It managed to do so by taking on the persona of a 13-year-old Ukrainian boy who was not a native English speaker, which made its mistakes and shortcomings more believable.
The incident riled up a number of AI theorists and computer scientists, including Gary F. Marcus, a New York University research psychologist.
"It's too easily gamed," he told io9, "you can pretend to be a 13-year-old boy from Ukraine and joke your way out of answering anything substantive. People within the field have realized the original test proves little, and that there is a need for a better target that might really get at intelligence." The 60-year-old test, he says, has long served as a popular "signpost" for research in AI, but due to its inadequacies, it's in "desperate need for a "refresh."
To that end, Marcus and his colleagues have organized the AAAI workshop, titled "Beyond the Turing Test." Specifically, they're looking to create new Turing-inspired challenges and evaluate the efficacy of existing Turing Test challenges in terms of science and implementation.
"Working together with the AI community, we are hoping to devise a new set of tests, that really advance the field toward genuine intelligence and beyond superficial data crunching," says Marcus.
Here's a taste of what they're hoping to achieve:
With the help of the workshop participants, we envision the support and definition of at least two events. The first, recently sponsored by Nuance, will be the Winograd Schema Challenge, proposed by Hector Levesque, which tests the ability of machines to resolve linguistic antecedents in contexts in which common-sense knowledge is critical. See The Winograd Schema Challenge for details about this challenge. The second, recently suggested by the workshop cochair, Gary Marcus, in an essay in the New Yorker, will focus on the comprehension of novel materials, such as videos, texts, photos, and podcasts. As an example, Marcus suggested a competition in which programs might be asked to watch "any arbitrary TV program or YouTube video and answer questions about its content — Why did Russia invade Crimea? or Why did Walter White consider taking a hit out on Jessie?"
Several leading researchers are going to attend, and a good number have joined the workshop's advisory board, including Guruduth Banavar, Ned Block, Ernest Davis, Oren Etzioni, Ken Forbus, Hiroaki Kitano, Danica Kragic, Leora Morgenstern, Charles Ortiz, Stuart Shieber, Moshe Vardi, and Patrick Winston.
"We have had an amazing response," Marcus told io9. "All the top people in the field that we invited are coming to speak, and we have had a ton of interest and lots of media — which is very rare before an academic workshop."
Ultimately, the experts are hoping that an annual (or semi-annual) Turing Championship will "simultaneously generate public interest and serve as benchmarks that guide important and foundational AI research."