What's going on here?
Because Amazon provides sales rank information, it's possible to take a guess at how many books they sell on a given topic on a given day. This tool invites you to suggest a keyword or phrase (no quotes necessary, no booleans currently supported) and a level of precision, and it returns guesses at how many books were sold and what they cost.
How does it work?
Amazon is extremely good about letting programmers build applications using their database - they provide an API (application programming interface) that allows programmers to build "web services" using their information. For a given search term, this program searches Amazon, retrieves the sales rank and price of items, and guesses at how many sales those ranks represent. It makes its estimate by using an equation guessed at by Amazon-watchers and some known statistics about actual book sales and their corresponding sales ranks.
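The equation itself isn't spelled out on this page, so what follows is only a rough sketch of one way a rank-to-sales guess could work, not the tool's actual code. It assumes sales fall off smoothly with rank and interpolates in log-log space between a few calibration points. The function names and the calibration numbers are made-up placeholders for illustration, not real Amazon data.

```python
import math

# Hypothetical calibration points: (sales rank, books sold per day).
# Placeholder values for illustration only, not real Amazon statistics.
CALIBRATION = [(1, 3000.0), (100, 300.0), (10_000, 10.0), (1_000_000, 0.01)]


def estimate_daily_sales(rank, points=CALIBRATION):
    """Guess daily sales for a sales rank by log-log interpolation."""
    if rank < 1:
        raise ValueError("rank must be >= 1")
    pts = sorted(points)
    # Clamp to the calibrated range instead of extrapolating past it.
    if rank <= pts[0][0]:
        return pts[0][1]
    if rank >= pts[-1][0]:
        return pts[-1][1]
    for (r0, s0), (r1, s1) in zip(pts, pts[1:]):
        if r0 <= rank <= r1:
            # Position of this rank between the two points, on a log scale.
            t = (math.log(rank) - math.log(r0)) / (math.log(r1) - math.log(r0))
            return math.exp(math.log(s0) + t * (math.log(s1) - math.log(s0)))


def estimate_for_items(items):
    """items is a list of (sales_rank, price); returns (books, dollars) per day."""
    books = 0.0
    dollars = 0.0
    for rank, price in items:
        sold = estimate_daily_sales(rank)
        books += sold
        dollars += sold * price
    return books, dollars
```

Feeding `estimate_for_items` the ranks and prices returned for a search term would give the two headline guesses: books sold and money spent per day.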
How accurate is it?
In absolute terms - number of books sold, their price, etc. - probably not very. We're interpolating from a small number of data points and some of the data is fairly old. There's a good chance we're within an order of magnitude, but a very low chance that we're reporting exact figures.
In relative terms, results should be fairly accurate as we believe the general model relating sales rank to actual sales is pretty accurate. So if we report that term "A" sells twice as many books as term "B", that's probably believable, even though "A" selling 32 and "B" selling 16 may not be.
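To see why relative figures can hold up when absolute ones don't, here is a small sketch under a simplifying assumption: that sales follow a power law, sales = scale * rank^(-exponent). That form, the function names and the example ranks are all assumptions made for illustration. The point is that a badly guessed `scale` shifts both totals by the same factor, so the ratio between two terms doesn't move.

```python
def total_sales(ranks, scale, exponent=1.0):
    """Sum estimated sales over a list of ranks under an assumed power law."""
    return sum(scale * rank ** -exponent for rank in ranks)


def sales_ratio(ranks_a, ranks_b, scale, exponent=1.0):
    """Ratio of term A's estimated sales to term B's."""
    return total_sales(ranks_a, scale, exponent) / total_sales(ranks_b, scale, exponent)


# Hypothetical sales ranks of the books found for two search terms.
term_a = [50, 400, 2_000]
term_b = [100, 800, 4_000]

# Two very different guesses at the scale constant.
for scale in (100.0, 10_000.0):
    a = total_sales(term_a, scale)
    b = total_sales(term_b, scale)
    print(f"scale={scale}: A={a:.2f} B={b:.2f} ratio={a / b:.2f}")
```

The absolute totals differ by a factor of 100 between the two runs, while the A-to-B ratio stays the same. A wrong exponent would change the ratio, which is why the relative results are only "fairly" accurate.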
Why's it so slow?
This script makes multiple calls to Amazon, sometimes dozens for a high precision search on a popular term. For reasons of politeness, the script pauses for a second between searches. Depending on network traffic, the load on Amazon's servers and the phase of the moon, each call to Amazon takes from 1 to 10 seconds. Do the math, and it can take quite a while to complete a search.
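Doing the math explicitly, as a back-of-the-envelope sketch: the one-second pause and the 1 to 10 seconds per call come from the paragraph above, while the function name and the figure of 24 calls (standing in for "dozens") are assumptions for the example.

```python
def search_time_bounds(num_calls, pause=1.0, min_call=1.0, max_call=10.0):
    """Return (fastest, slowest) total seconds for a search of num_calls calls."""
    # One pause between each pair of consecutive calls.
    pauses = pause * max(num_calls - 1, 0)
    return (num_calls * min_call + pauses, num_calls * max_call + pauses)


low, high = search_time_bounds(24)
print(f"24 calls: {low:.0f} to {high:.0f} seconds")  # 24 calls: 47 to 263 seconds
```

So a two-dozen-call search lands somewhere between under a minute and well over four minutes, depending on how Amazon is feeling.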
It doesn't work.
Yeah, well, life is hard. If you get blank results, or a 500 error in the middle of a results page, it means that Amazon refused the connection. Compose a haiku in your head about the experience, reload the form page and try again. If you just get a blank page or a 500 error at the top of the page, it's my fault. Feel free to let me know and I'll try to fix it.
Well, it's fun, don't you think? I think it's really interesting that the Vietnam War outsells the Korean War by a factor of 20, and that both are outsold by the Civil War. Or that Amazon shoppers appear to spend $120,000 on diet books per day.
No, really - why bother writing this tool?
Well, I needed it for my research. I'm interested in media coverage of different nations - my research suggests that American newspapers and TV pay more attention to rich nations than to poor nations. This raises an interesting question - are media outlets meeting their readers' needs, or is there a huge, unmet need for content on developing nations?
Studying Amazon provides an interesting way to answer that question. If we assume Amazon purchasing behavior is a proxy for consumer behavior as a whole (and there are really good reasons why that may not be true), it gives us the chance to compare what media producers provide and what consumers say they want, voting with their dollars.
Who's responsible for this? I'd like to learn more about your research/suggest improvements to this tool/send you a cease and desist letter?
I'm Ethan Zuckerman and I'm a researcher at the Berkman Center for Internet and Society at Harvard Law School, who generously provide server space for odd projects like this. You can read more about my research on media attention on my page on the GAP project - I hope to publish some Amazon results there soon. You can send me email if you've got questions or threats of legal action.