Since my last post on estimating the size of the Long Tail, there's been some more useful discussion of the book tail in particular. Books are one of the toughest industries to analyze, in part because the industry is so archaic and good data is so hard to come by. But it is in sclerotic industries where the Long Tail potential is often largest, so it's worth rolling up the sleeves and digging a bit further into what we know.
The two big statistical questions are:
1) How long is the book tail?
2) How big is it (as a proportion of total book sales), or might it become?
Let's start with the first. Nat Torkington at O'Reilly has an interesting post on the economics of the book business that includes some stats about the head-heavy nature of the book industry (the comments on that post are worth reading, too, especially Peter K's). Here's the important one:
93% of all ISBN's sold fewer than 1,000 units and accounted for 13% of all sales. (By comparison, in the tech book market 85% of titles make up 10% of sales.)
So does that mean that the book tail is only 13% of the business? Well, we generally define the dividing line between head and tail as the average inventory of the largest offline distributor in a market. In this case, that's a large Barnes and Noble superstore, which carries about 100,000 books. According to Nielsen Bookscan, 1.2m titles sold at least one copy last year, so that 100,000 is about 8% of the total books that are actually in the market. This means that 92% of the books in the market aren't in a B&N superstore, which happens to correspond closely to the 93% of titles that sell fewer than 1,000 units (above).
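For anyone who wants to check the head/tail split, here is a quick sketch of the arithmetic in Python. The two inputs are the figures quoted above; the variable names are mine and purely illustrative.

```python
# Figures quoted in the post
bn_superstore_titles = 100_000   # approximate inventory of a large B&N superstore
titles_sold = 1_200_000          # titles that sold at least one copy (Nielsen Bookscan)

# Share of titles in the market that a superstore carries (the "head")
head_share = bn_superstore_titles / titles_sold

# Everything else is, by this definition, the "tail"
tail_share = 1 - head_share

print(f"Head: {head_share:.1%} of titles")  # Head: 8.3% of titles
print(f"Tail: {tail_share:.1%} of titles")  # Tail: 91.7% of titles
```

Rounded to whole numbers, that gives the 8% and 92% used here.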
If that were the end of the story, I'd be comfortable sizing the Tail of books at 13% of the business. But it's not. The data above comes from Nielsen Bookscan, which only tracks new book sales. Yet used books represent 8.4% of the overall business. Most of that is still textbooks, but the online side, which is heavily focused on the Long Tail (especially out-of-print titles), is the fastest growing part and now represents more than a quarter of the business. At current growth rates, that will be a half in a few years.
Including used books, my best guess right now is that sales of titles not available in a B&N superstore represent about 15%-16% of the total business: the 13% of new sales, plus another 2%-3% for used sales. (That's for the industry as a whole; for online retailers such as Amazon, it's higher, between 20% and 30%, as previously discussed.)
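One way to sanity-check the used-book piece is the rough sketch below. It rests on an assumption that is mine and not established by the data: that essentially all online used-book sales count as tail sales. With "more than a quarter" taken at its lower bound of 25%, this gives a floor for the used contribution, which is then simply added to the 13% new-book figure as in the estimate above.

```python
# Figures quoted in the post
used_share_of_business = 0.084   # used books as a share of the overall book business
online_share_of_used = 0.25      # "more than a quarter" of used sales, so this is a lower bound
new_tail_share = 0.13            # tail share of new-book sales (Bookscan)

# ASSUMPTION (illustrative only): every online used sale is a tail sale
used_tail_floor = used_share_of_business * online_share_of_used

# Simple addition, mirroring the back-of-envelope estimate in the post
combined_tail = new_tail_share + used_tail_floor

print(f"Used-book tail, lower bound: {used_tail_floor:.1%}")  # 2.1%
print(f"Combined tail estimate:      {combined_tail:.1%}")    # 15.1%
```

That floor of roughly 2% sits at the low end of the 2%-3% range, and the combined figure at the low end of 15%-16%.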
But that's going to rise, as more previously unavailable books become available again. As Tim O'Reilly notes in this fascinating post, the number of out-of-print titles is huge. There are 32m unique books out there, of which only 4% are in print. The picture looks like this:
Until recently, when a book went out of print, it was essentially unavailable. But now, between all those now-online used bookstores, print-on-demand and the book-scanning efforts of Google, Yahoo, Amazon, Microsoft, the Internet Archive and others, the concept of letting a book disappear is itself going away. Out of print is out of style. And those older books that are now becoming available again will make up a growing part of the tail's volume.
(BTW, the robots that do the scanning are very cool. Check out the video here)
So now we have the beginnings of an answer to our two questions:
Q: How long is the book tail?
A: About 31,900,000 titles long.
Q: How big is it (as a proportion of total book sales), or might it become?
A: Today: about 15%. Tomorrow: more.
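For the curious, here is how the title counts fit together, as a small Python sketch. The inputs are the figures quoted above. Reading the 31,900,000 answer as "all unique books minus what a superstore carries" is my interpretation, consistent with the head/tail dividing line used earlier.

```python
# Figures quoted in the post
unique_books = 32_000_000        # all unique books in existence
in_print_percent = 4             # share of those still in print
bn_superstore_titles = 100_000   # the "head": what a large B&N superstore carries

# Integer arithmetic keeps the counts exact
in_print = unique_books * in_print_percent // 100
out_of_print = unique_books - in_print

# ASSUMPTION: the tail is everything beyond the superstore's shelves
tail_length = unique_books - bn_superstore_titles

print(f"In print:     {in_print:,}")      # In print:     1,280,000
print(f"Out of print: {out_of_print:,}")  # Out of print: 30,720,000
print(f"Tail length:  {tail_length:,}")   # Tail length:  31,900,000
```

Note that the roughly 1.28m in-print titles implied by the 4% figure is in the same ballpark as the 1.2m titles Bookscan saw selling at least one copy.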