One of my favorite activities in my free time is playing guitar. I already have more guitars in my apartment than someone with my modest skills should reasonably own (eight at the time of writing), so clearly the next step is to buy even more.

Since programming and data analysis are another passion of mine, I set out not only to buy a nice new steel-string guitar but also to learn something from the process: I decided to analyze the market while I was at it.

The Setup

My go-to merchant for musical equipment has been the German company Thomann, currently Europe’s largest online merchant in the field. They list around 500 different models that match my criteria, far too many to look through by hand. So, wanting to play around with some web scraping, I fired up my editor and got to work.

Of course there are some things that I’d like to get from the upgrade:

  • Steel string all the way!
  • Different body shape than the Dreadnought I currently have
  • Within my budget: 400€ to 1000€
  • Integrated pickup
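
Expressed as code, the wishlist above boils down to a simple filter. This is only a minimal sketch: the `Listing` class, its field names and the example entries are made up for illustration and do not reflect how the scraped data is actually stored.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical fields; the real dataset may name or encode these differently.
    name: str
    string_type: str   # e.g. "steel" or "nylon"
    body_shape: str    # e.g. "Dreadnought", "Grand Auditorium"
    price_eur: float
    has_pickup: bool


def matches_wishlist(listing: Listing) -> bool:
    """Return True if a listing satisfies all four wishlist criteria."""
    return (
        listing.string_type.lower() == "steel"
        and listing.body_shape.lower() != "dreadnought"
        and 400 <= listing.price_eur <= 1000
        and listing.has_pickup
    )


# Made-up example listings, purely for illustration.
examples = [
    Listing("Example A", "steel", "Grand Auditorium", 649.0, True),
    Listing("Example B", "steel", "Dreadnought", 499.0, True),
    Listing("Example C", "nylon", "Classical", 450.0, True),
    Listing("Example D", "steel", "Jumbo", 1200.0, False),
]

shortlist = [g.name for g in examples if matches_wishlist(g)]
print(shortlist)
```

Only "Example A" passes all four checks here; the others fail on body shape, string type, or budget and pickup respectively.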

The result is a scraper that goes through the individual listings and pulls information such as the model name, manufacturer, model characteristics and sales data. The source code is hosted on my GitHub.
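
To give a rough idea of what pulling fields out of a listing page can look like, here is a small sketch using only Python's standard library. It is not the actual scraper: the HTML snippet, the class names (`manufacturer`, `model`, `price`) and the helper functions are invented for illustration, and the shop's real pages are structured differently.

```python
from html.parser import HTMLParser

# Invented markup for illustration only; real product pages look different.
SAMPLE_HTML = """
<div class="product">
  <span class="manufacturer">ExampleBrand</span>
  <span class="model">Example Model 100</span>
  <span class="price">649 €</span>
</div>
"""

# Hypothetical CSS class names of the fields we want to extract.
FIELDS = {"manufacturer", "model", "price"}


class ListingParser(HTMLParser):
    """Collect the text of every <span> whose class is one of FIELDS."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current = None  # field name of the span being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in FIELDS:
            self._current = css_class

    def handle_data(self, data):
        if self._current is not None:
            self.record[self._current] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


def parse_listing(html: str) -> dict:
    """Parse one listing snippet into a dict of field name -> text."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.record


def parse_price(text: str) -> float:
    """Turn a price string such as '649 €' into a float (assumes no thousands separator)."""
    return float(text.replace("€", "").replace(",", ".").strip())


record = parse_listing(SAMPLE_HTML)
record["price"] = parse_price(record["price"])
print(record)
```

A real scraper would additionally fetch the pages over HTTP, follow the pagination and be polite about request rates; those parts are left out here on purpose.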

A quick overview of the price distribution yields:
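
One way to get such an overview without any plotting library is to count listings per price bucket. The following is a minimal sketch with made-up prices (not values from the dataset) and an arbitrarily chosen bucket width of 100 €:

```python
from collections import Counter

# Made-up prices in euros, for illustration only.
prices = [419.0, 455.0, 499.0, 529.0, 598.0, 649.0, 649.0, 777.0, 899.0, 999.0]

BUCKET_WIDTH = 100  # arbitrary choice


def price_histogram(values, width=BUCKET_WIDTH):
    """Count how many values fall into each [start, start + width) bucket."""
    return Counter(int(v // width) * width for v in values)


histogram = price_histogram(prices)

# Print a simple text histogram, one row per bucket.
for start in sorted(histogram):
    print(f"{start:>4}-{start + BUCKET_WIDTH - 1:<4} € {'#' * histogram[start]}")
```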

This still leaves a couple hundred potential matches, so I’ll have to dig deeper. This post is the first in a series exploring the dataset.

Buying a Guitar with Python – Part 1: Data Acquisition
