Computers still struggle to pass the Turing test, but that hardly matters where demands on writing quality are low. Today’s sampling of publishing news shows that such places include social networks, professional journalism, and scientific conference proceedings.
The Scientific Bot
In February, science publishers Springer and IEEE were forced to retract over 120 papers from their subscription services. They were all computer-generated gibberish “composed by a piece of software called SCIgen, which randomly combines strings of words to produce fake computer-science papers.” Embarrassingly, neither the publishers nor their readers (were there any?) noticed the fakes until they were tipped off by Cyril Labbé of Joseph Fourier University in Grenoble, France, who had written his own SCIgen detection software.
This is another blow against paywalled science publishing in its losing battle with open access services. After all, diligent editing is supposed to be the primary value of expensive subscriptions. In April, Springer reacted by promising “intensified” editorial processes… and an automatic SCIgen detection system! Evidently they don’t really trust their newly “intensified” editing. Or perhaps they follow the same policy as IEEE, whose statement (cited in a comment) says that “many” conference papers have “peer-review procedures.” In other words, some unknown number of papers in paid subscriptions are not reviewed at all.
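To see how little machinery it takes to “randomly combine strings of words” into something paper-like, here is a minimal sketch of a grammar-based sentence generator. It is a toy under stated assumptions, not SCIgen’s actual code: the grammar, the word lists, and the function names `expand` and `generate_sentence` are all made up for illustration.

```python
import random

# Hypothetical toy grammar for illustration only; SCIgen's real grammar
# is much larger and is not reproduced here.
GRAMMAR = {
    "SENTENCE": [["We", "VERB", "OBJECT", "using", "METHOD", "."]],
    "VERB": [["evaluate"], ["refine"], ["disprove"]],
    "OBJECT": [["the", "ADJ", "NOUN"]],
    "METHOD": [["ADJ", "NOUN"]],
    "ADJ": [["scalable"], ["probabilistic"], ["wearable"]],
    "NOUN": [["epistemologies"], ["archetypes"], ["hash tables"]],
}


def expand(symbol, rng):
    """Recursively expand a symbol; anything not in GRAMMAR is a literal word."""
    if symbol not in GRAMMAR:
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])  # pick one rule at random
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words


def generate_sentence(rng=None):
    """Return one random sentence; pass a seeded random.Random for repeatability."""
    rng = rng or random.Random()
    words = expand("SENTENCE", rng)
    # The last token is the period; attach it without a leading space.
    return " ".join(words[:-1]) + words[-1]


if __name__ == "__main__":
    print(generate_sentence())
```

Every output is grammatical and meaningless, which is roughly the combination that slipped past the publishers.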
The Socializing Bot
Bots for sale on Twitter and other social networks have long been a booming business, and it just keeps getting bigger. Nick Bilton reports that bots are now employed en masse to influence political opinions in Mexico, Syria, and Turkey, where
[…] an investigation found that every political party was controlling bots that were trying to force topics to become trends on social sites that favored one political ideal over another. The bots would also use a political group’s slogan as a hashtag, with the intent of fooling people into believing it was more popular than it really was.
Bots keep evolving to stay ahead of the networks’ spam filters. They use real-sounding names, simulate human wake-sleep cycles, and recycle bits of conversation from real users. Thousands of fake followers cost mere dollars. It seems the only solution would be to abandon online anonymity altogether and tie accounts to some real-life personal identity. More likely human users will further retreat behind privacy guards while the public view is dominated by the steady background noise of bot traffic.
The Journalistic Bot
The sad residues of professional journalism still trying to survive on the Internet are now adding “Around the Web” sections to their pages. These contain article links that look like editorial recommendations for related content, but are in fact paid links generated by bots. The quality of this quasi-advertising is generally pathetic, with titillating headlines leading to weaponized clickbait. The good news is that users seem to be learning to avoid these latest cesspools. The main articles are usually better but no longer necessarily written by humans, either.
In February, the Los Angeles Times made headlines with its robot-written earthquake report. The LA Times uses research bots extensively, although human writers still assemble the final article. Given the formulaic nature of most news reporting, this won’t last, and a company called Narrative Science is already producing purely synthetic journalism. Its software writes many thousands of articles in fields such as sports and finance, for a variety of Internet outlets including Forbes.
Readers can’t tell the difference – not so surprising as the typical baseball or earnings report is a predictable set of phrases decorating a computer-friendly set of numbers. Narrative Science also integrates the necessary domain expertise to add flavor and avoid errors. Co-founder Kristian Hammond claims that in 20 years, “there will be no area in which Narrative Science doesn’t write stories.” Maybe they could write better fake conference papers, too.
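The “predictable set of phrases decorating a computer-friendly set of numbers” can be sketched in a few lines. This is only a toy template filler under my own assumptions, not Narrative Science’s method: the function name `game_recap`, the verb thresholds, and the team names are hypothetical.

```python
def game_recap(home, away, home_runs, away_runs):
    """Turn a final score into a one-sentence recap using stock phrases."""
    if home_runs == away_runs:
        return f"{home} and {away} finished level at {home_runs}-{away_runs}."

    # Work out who won and order the score winner-first.
    winner, loser = (home, away) if home_runs > away_runs else (away, home)
    high, low = max(home_runs, away_runs), min(home_runs, away_runs)
    margin = high - low

    # Hypothetical thresholds: the margin picks the "flavor" verb.
    if margin >= 5:
        verb = "routed"
    elif margin == 1:
        verb = "edged"
    else:
        verb = "beat"

    return f"{winner} {verb} {loser} {high}-{low}."


if __name__ == "__main__":
    # Made-up teams and scores for illustration.
    print(game_recap("Hawks", "Owls", 7, 1))
    print(game_recap("Hawks", "Owls", 2, 3))
```

A production system would need far more phrases and the domain knowledge mentioned above, but the principle is the same: the numbers decide which stock phrase gets used.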