From Creating Better Content Analysis to Predicting the Next Bestseller

In the last several years, there has been a “data explosion.” With almost half of the world’s population online, posting to social media and creating new video and text posts, there is more content created daily than there was in recorded history prior to the year 2000.

Data expert and author Bernard Marr wrote in his Forbes piece “Big Data Overload: Why Most Companies Can’t Deal With The Data Explosion” that “[P]eople and companies already can’t cope with the data they have today, let alone the data that is around the corner…Experts are predicting a 4,300 percent increase in annual data production by 2020…[I]f a company is already struggling to store and analyse its own data now, it will be drowning in data in the next few years.”

For publishers, the increase in content means not only more submissions but also new data about what their customers want and how to connect with them on a deeper, more meaningful level (and thus sell more books). How can publishers keep up with the influx of data and content, and take advantage of these opportunities for better relationships with their customers, without adding the overhead of new staff to analyse the content and develop systems? That is where predictive technology comes in.

Solutions for What to Do With All That Data

Several companies in the publishing space are offering solutions for how publishers can make the most out of the influx of data and create a smarter business strategy going forward.

Intellogo uses machine learning to better understand a user’s interests or goals. By analysing not only the metadata of a text but also its mood and tone, it can assess the full content more accurately than a search on entered keywords alone. Publishers could use Intellogo’s applications to find comparison titles for acquisitions or to identify backlist books to market against current trends; self-publishing platforms could use them to guard against plagiarism and provide editing tools; and online booksellers could use them to make more precise recommendations to customers.

Inkubate is a members-only social network where publishers and content creators can discover, connect, and collaborate on projects. Inkubate’s ScoreIt!™ Analysis compares a writer’s uploaded manuscript to published books in order to provide accurate comparative titles and to illustrate potential sales, showing how well a comparable author’s titles have sold in the retail trade book market (print and ebooks) over the course of that author’s career. PitchIts!™ lets writers compose a “mini-pitch” that is shared with agents and publishers who have subscribed to that feature.

In the academic space, Knewton offers publisher partners an approach to adaptive learning by measuring a student’s understanding of concepts and tailoring a specific framework for that student’s learning. Knewton’s adaptive learning platform draws on decades of research into psychometrics, item response theory, cognitive learning theory, and intelligent tutoring systems.

Can an Algorithm Predict a Bestseller?

Two books coming out this fall, The Bestseller Code: Anatomy of the Blockbuster Novel by research team Jodie Archer and Matthew L. Jockers (St. Martin’s Press) and Streaming, Sharing, Stealing: Big Data and the Future of Entertainment by professors of information systems at Carnegie Mellon University Michael D. Smith and Rahul Telang (MIT Press), delve into this topic and stress the importance of using technology to optimise the publishing industry.

The authors of The Bestseller Code created an algorithm that identifies the literary elements that predict whether a book will be a bestseller, while the authors of Streaming, Sharing, Stealing delve into how entertainment industries can take advantage of Big Data—for publishers, that means bundling content. As Smith and Telang wrote in their August 26th piece in Publishers Weekly, “Publishers can invest in a platform that provides direct access to customer behaviour and the ability to promote content directly to the right audience. Such a service will allow publishers to continue to create direct connections between the people who matter most: authors and their readers.”

Many critics of predictive technology claim that the companies building algorithms and systems to tackle the mountain of data are eliminating the “romance” of publishing. In fact, these companies are giving publishers a way to consider even more stories that might captivate their imagination and delight readers in the marketplace.

By David Montgomery, CEO, Ingenta

In September 2015, David assumed the role of Ingenta’s CEO. He was previously Chief Technology Officer, where he was responsible for driving all aspects of the company’s IT strategy, including its vision, innovation and roadmap. In addition to defining the technical architecture and development of the company’s core products, David continues to manage their testing, rollout, and ongoing support, working in close collaboration with the company’s customers to ensure that product strategy and development are aligned with client requirements. Prior to Ingenta, David was Managing Director of Software Operations at Inspired Thinking Group (ITG), a Tech Track 100 company, where he was responsible for overseeing software hosting, application management, software development and customer services. Prior to that, he held various senior positions, including Chief Innovation Officer at software company Atex, and spent 10 years as Director of Technology at 5 Fifteen and 9 years as Director of Technology at Anite plc (previously Autofile).