In a post a few weeks ago, I hinted at some work I'm doing to quantify how the shape of demand in Long Tail markets differs from that of their bricks-and-mortar competitors.
This isn't as easy as it sounds: it's often hard to do a proper apples-to-apples comparison between the two. For instance, the traditional retail market for music is for CD albums, while the online market is mostly for digital singles, which often include just a few tracks from an album. But there are ways to adjust the data to compensate for those differences, using such techniques as grouping online singles by album and using an average of their rank as a proxy for the album's rank online. It's not good enough to publish yet, but I'm comfortable that it's at least directionally correct at this point.
Here are two data sets that illustrate this offline/online difference in demand patterns. The first is DVDs, comparing Nielsen DVDScan offline point-of-sale data (a proxy for Blockbuster) with Netflix. I've just shown it through the top 100, since that's where the differences are most visible. But the trend continues throughout the catalog--hits are a far larger part of the business offline than they are online. Or, to put it another way, Netflix's demand is more evenly spread between hits and niches than Blockbuster's.
The next chart does the same for music, comparing Nielsen's SoundScan (CD retail) data with Rhapsody downloads. To avoid over-compressing the vertical scale, I've started this one at rank 100, and to make the curve easier to observe, I've only shown the top 5,000 albums. As you can see, online music is even less hit-centric than online DVDs, which is saying something.





I see a problem in the way you calculate the rank position for online single sales. Normally only one or two songs from an album are very popular, whereas the rest of them are not known. If you calculate the rank for a complete album as the average of every song in the album, this will be significantly lowered because of these ~10 'unpopular' tracks, but in fact the popularity of the album is much higher because people buy it just because of the two songs they know.
In other words, in the traditional market, people don't "make an average" of their liking for every song, simply because they don't know them yet.
Posted by: JaviC | May 08, 2006 at 02:17 AM
Two possibly related observations:
(1) The Netflix curve never "goes steep" on the left.
(2) Netflix now advertises on their mailing envelopes - for movies in theaters.
The Netflix curve thus represents only a part of the overall movie-demand curve for consumers. The missing "steep part" is for movies where the consumer gains significant additional pleasure from seeing it before the DVD release, i.e. in a theater. Other parts are purchases of DVDs, borrowing DVDs, watching movies on cable or TV - but none of these are easily measured or likely to deviate from the overall curve. Netflix subscribers love movies, and will pay to go to the theater if the movie is right; movie executives think this way, that's why they advertise on Netflix envelopes.
Then why does Blockbuster have the "steep part"? Consumer-capture, i.e. if you step into Blockbuster you will rent at least one movie. If you can't find a movie you want and haven't seen, you settle for something you already saw and figure you will enjoy again. Look closely, the "steep part" is a muted one because some hits are only worth seeing once. A marketing survey of folks leaving Blockbuster on a Friday night could provide some backup for this; movie executives think this way, that's why mystery movies get a DVD release with "never before seen alternative endings".
Disclaimer: I am a Netflix user, a former Blockbuster user, and a very occasional movie theater user.
Posted by: Dan Theunissen | May 08, 2006 at 07:12 AM
Could the data be normalized and presented as a fraction of total albums/songs sold? That would make it easier to examine the shape of the curves. In the graphs on this post, it's difficult to see anything other than the fact that offline sales simply out-pace online sales at all levels of popularity.
This is obviously true, but it's not the effect you're looking to demonstrate.
Posted by: TWAndrews | May 08, 2006 at 07:12 AM
JaviC,
Actually, we only average the top 5 tracks, because we find that the average album in the top 1,000 only has five tracks in the top 10,000 tracks. It's a rough way to avoid the problem you rightly anticipate.
Chris
Posted by: chris anderson | May 08, 2006 at 07:22 AM
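For readers who want to try the top-five averaging described in the reply above, here is a minimal sketch. It is an illustration only, not the code behind the charts: the function names, the `top_n` parameter, and the sample ranks are all hypothetical.

```python
from collections import defaultdict

def album_rank_proxy(track_ranks, top_n=5):
    """Map each album to the mean rank of its top_n best-ranked tracks.

    track_ranks is an iterable of (album, track_rank) pairs, where a
    lower rank means a more popular track. Assumed input shape.
    """
    by_album = defaultdict(list)
    for album, rank in track_ranks:
        by_album[album].append(rank)

    proxy = {}
    for album, ranks in by_album.items():
        best = sorted(ranks)[:top_n]  # keep only the most popular tracks
        proxy[album] = sum(best) / len(best)
    return proxy

def rank_albums(track_ranks, top_n=5):
    """Order albums from most to least popular by their proxy rank."""
    proxy = album_rank_proxy(track_ranks, top_n)
    return sorted(proxy, key=proxy.get)

# Made-up data: album A has two hits and several obscure tracks.
sample = [
    ("A", 1), ("A", 2), ("A", 500), ("A", 900), ("A", 1200),
    ("A", 8000), ("A", 9000),
    ("B", 10), ("B", 20), ("B", 30), ("B", 40), ("B", 50),
]
print(album_rank_proxy(sample))
print(rank_albums(sample))
```

Capping the average at the best few tracks keeps an album's deep cuts from dragging its proxy rank down, which is the distortion JaviC raised. It is still a rough proxy, since an album bought for one or two hits may deserve a better rank than even a top-five average gives it.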
TWAndrews,
The data *is* presented as fractions of total albums/songs sold for each of those domains. By expressing it as a percentage, there should be no distortions from the different sizes of the two markets. Please look again and tell me if that doesn't make sense.
Chris
Posted by: Chris Anderson | May 08, 2006 at 07:26 AM
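The normalization described in the reply above amounts to dividing each title's sales by the total for its own market. A small sketch, assuming the input is a plain list of unit sales per title (the function name and the numbers are made up for illustration):

```python
def to_shares(unit_sales):
    """Convert unit sales per title into percentage shares, ordered by rank.

    Each market is normalized by its own total, so two markets of very
    different sizes can be drawn on the same vertical scale.
    """
    total = sum(unit_sales)
    if total <= 0:
        raise ValueError("total sales must be positive")
    ranked = sorted(unit_sales, reverse=True)  # rank 1 first
    return [100.0 * units / total for units in ranked]

# Hypothetical markets: offline is 10x larger but more hit-driven.
offline = to_shares([200, 500, 300])
online = to_shares([30, 40, 30])
print(offline)
print(online)
```

After this step the absolute size of each market drops out, and only the shape of the curve remains to compare.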
Jumping in here.. that the two curves are percentages of their respective sales makes sense, but it's hard to believe from the graphs. The tail for online music is *really* *really* flat. At about what rank do the curves cross?
Posted by: Helen Cook | May 08, 2006 at 10:08 AM
Rather than starting the second graph at 100, you might want to consider presenting the full range using a logarithmic scale. I admit this may lose some viewers, but it would give a peek at the interesting left edge of the figure.
Posted by: JD | May 08, 2006 at 11:48 AM
Netflix has, for us, replaced any need to ever go to Blockbuster, but hasn't in any way (I don't think) affected our theater-going frequency. It's still fun to see something when it comes out, on the big screen (we still have a small tv). But the big differences between Netflix and Blockbuster are (to Netflix's advantage): a.) the lack of any deadline in watching the DVD, and as such no silly late fees, etc., and b.) the fact that Netflix has movies that the people working at Blockbuster have never even heard of, like small independent releases, documentaries, etc.
Posted by: David | May 09, 2006 at 08:56 AM
These are the best illustrations of the Long Tail yet, Chris. More convincing than any of your previous graphs.
Posted by: Kevin Kelly | May 09, 2006 at 05:35 PM
Chris,
My mistake. Apparently I was suffering from a complete mental lapse.
Posted by: TWAndrews | May 09, 2006 at 07:48 PM
The music sales curves say very little to me. The scale is way wrong, I can barely say anything about the slope, and the missing information about the beginning of the curve could be deceiving. I could imagine the online sales going almost flat until 10 and peaking more abruptly than the off-line sales.
A log plot would surely give you a good scale, and no, you won't lose audience. Don't ever underestimate the intelligence of the readers please, that only makes for lousy content. As it is, it is a very uninformative graph, with lots of empty space and little density of information. Even worse, it is confusing, as you can see in the comments, with people saying that sales are less, or that the curve is really flat. I am *sure* the curve is not flat, so you need to show that to avoid misleading readers.
The movie sales graph is better, but if you are talking about the long tail I'd like to see it please. I know that the graphs are normalized, but how does it behave far to the right? In fact, you could easily include some research that describes the curves (sorry I haven't read your blog enough to know if you posted about it). I am guessing that a log plot or a log-log plot can show you very interesting information, like an exponential (offline) versus power-law (online) decay of the curves.
Posted by: fercook | May 14, 2006 at 05:48 PM
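The log and log-log idea in the comment above can be tried without plotting anything. A power law is a straight line on log-log axes, while an exponential is a straight line on semi-log axes, so comparing how well a line fits in each space gives a rough hint about which decay is closer. The sketch below is a heuristic under that assumption, not a rigorous power-law test, and every name and number in it is hypothetical.

```python
import math

def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a straight-line fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)

def decay_diagnostics(shares):
    """Compare straight-line fits in log-log and semi-log space.

    shares[i] is the (positive) share of the title at rank i + 1.
    """
    if any(s <= 0 for s in shares):
        raise ValueError("shares must be positive to take logarithms")
    ranks = list(range(1, len(shares) + 1))
    log_ranks = [math.log(r) for r in ranks]
    log_shares = [math.log(s) for s in shares]
    return {
        "power_law": r_squared(log_ranks, log_shares),    # log-log axes
        "exponential": r_squared(ranks, log_shares),      # semi-log axes
    }

# Synthetic curves, only to show the diagnostic separating the two shapes.
power = [r ** -1.0 for r in range(1, 201)]
expo = [math.exp(-0.05 * r) for r in range(1, 201)]
print(decay_diagnostics(power))
print(decay_diagnostics(expo))
```

Real sales data is noisy and often follows neither form cleanly, so a result like this is a reason to look closer at the plot rather than a conclusion.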
Sorry, I forgot to say something: if you are publishing a book, you could certainly use a specialist in presenting quantitative information. I always see nice (although a little overdone, which leads to confusion) graphs in Wired mag. Get somebody to help you if you are not a specialist; it's a shame to see many stories by journalists ruined by lousy (and sometimes deceiving) graphs.
Posted by: fercook | May 14, 2006 at 05:53 PM
Chris, not to pile on, but if as you say, the curves are "fractions of total albums/songs sold for each of those domains", then the long-tail curves simply MUST cross above the short-tail curves somewhere to the right. Yet neither of the curves demonstrates that. Help! Of course, your argument is well-reasoned. But show us the inflection points so the proof will leap off the pages for us!
Posted by: Jorray | June 09, 2006 at 11:17 AM
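The crossing argument in the comment above follows from the normalization: each curve sums to 100% of its own market, so a curve that sits lower at the head must sit higher somewhere further out. A short sketch for locating that rank, assuming both curves are already percentage shares ordered by rank (the function name and sample values are hypothetical):

```python
from itertools import zip_longest

def first_crossover(offline_shares, online_shares):
    """Return the first rank where the online share exceeds the offline one.

    Ranks beyond the end of the shorter catalog count as a share of 0,
    since a title that is not stocked sells nothing. Returns None if the
    online curve never rises above the offline curve.
    """
    pairs = zip_longest(offline_shares, online_shares, fillvalue=0.0)
    for rank, (off, on) in enumerate(pairs, start=1):
        if on > off:
            return rank
    return None

# Made-up shares; each list sums to 100.
print(first_crossover([50, 30, 15, 5], [30, 28, 22, 20]))
print(first_crossover([60, 40], [50, 30, 20]))
```

The second call shows one way a chart can fail to display the crossing: if the online catalog is longer than the offline one, the curves may only cross past the point where the offline catalog runs out, which can be well beyond the range a chart covers.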
I do think we need to see more of the details to know if this comparison is apt. For example, if the Nielsen data is taken week-by-week, then the "top N" movies will be a rotating set of titles that changes through the year. If the Netflix data are reported on a yearly basis, the "top N" will be just the top few movies for the whole year. The differences in the way the data are counted could make a huge difference.
BTW, I don't know that the data are recorded that way, just that they could be.
Posted by: tom s. | March 30, 2007 at 07:07 PM
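The counting concern in the comment above is easy to demonstrate with toy numbers. The sketch below compares the average weekly top-N share with the top-N share computed on annual totals. It says nothing about how the Nielsen or Netflix data are actually compiled; the function names and the three-week example are invented for illustration.

```python
from collections import Counter

def top_n_share(sales, n):
    """Percentage of total sales captured by the n best-selling titles."""
    total = sum(sales.values())
    top = sorted(sales.values(), reverse=True)[:n]
    return 100.0 * sum(top) / total

def weekly_vs_annual(weeks, n):
    """Return (average weekly top-n share, top-n share of annual totals)."""
    weekly_avg = sum(top_n_share(week, n) for week in weeks) / len(weeks)
    annual = Counter()
    for week in weeks:
        annual.update(week)  # add this week's units to each title's total
    return weekly_avg, top_n_share(annual, n)

# Three hypothetical weeks, each dominated by a different new release.
weeks = [
    {"A": 80, "B": 10, "C": 10},
    {"A": 10, "B": 80, "C": 10},
    {"A": 10, "B": 10, "C": 80},
]
print(weekly_vs_annual(weeks, n=1))
```

In this toy case the top title takes 80% of sales in every single week, yet only a third of sales over the whole period, because the number-one slot rotates. Two sources counted over different windows would therefore show different curves even if the underlying demand were identical.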