Mathew Sparkes in The Telegraph:
Scientists have developed an algorithm which can analyse a book and predict with 84 per cent accuracy whether or not it will be a commercial success. A technique called statistical stylometry, which mathematically examines the use of words and grammar, was found to be “surprisingly effective” in determining how popular a book would be. The group of computer scientists from Stony Brook University in New York said that a range of factors determine whether or not a book will enjoy success, including “interestingness”, novelty, style of writing, and how engaging the storyline is, but admitted that external factors such as luck can also play a role.
By downloading classic books from the Project Gutenberg archive they were able to analyse texts with their algorithm and compare its predictions to historical information on the success of the work. Everything from science fiction to classic literature and poetry was included. It was found that the predictions matched the actual popularity of the book 84 per cent of the time. They identified several trends common to successful books, including heavy use of conjunctions such as “and” and “but” and large numbers of nouns and adjectives. Less successful work tended to include more verbs and adverbs and relied on words that explicitly describe actions and emotions such as “wanted”, “took” or “promised”, while more successful books favoured verbs that describe thought processes such as “recognised” or “remembered”. To find “less successful” books for their tests, the researchers scoured Amazon for low-ranking books in terms of sales. They also included Dan Brown’s The Lost Symbol, despite its commercial success, because of “negative critiques it had attracted from media”.
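
To make the idea of word-usage features concrete, here is a minimal Python sketch that counts how often a few word categories occur per 1,000 words of a text. It is not the researchers' algorithm: the category names, the `category_rates` function, and the tiny word lists (taken only from the examples quoted above) are assumptions for illustration, and a real stylometric system would use far richer features and a trained classifier.

```python
import re
from collections import Counter

# Illustrative word lists built only from the examples quoted in the article.
# These are assumptions for the sketch, not the study's actual feature set.
CATEGORIES = {
    "conjunctions": {"and", "but"},
    "action_emotion_verbs": {"wanted", "took", "promised"},
    "thought_verbs": {"recognised", "remembered"},
}


def category_rates(text):
    """Return occurrences per 1,000 words for each category in CATEGORIES."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    if total == 0:
        return {name: 0.0 for name in CATEGORIES}
    return {
        name: 1000.0 * sum(counts[w] for w in vocab) / total
        for name, vocab in CATEGORIES.items()
    }


sample = "She remembered the house and the garden, but he wanted more."
rates = category_rates(sample)
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per 1,000 words")
```

Rates like these could then be compared across books labelled as more or less successful, which is roughly the kind of comparison the article describes.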