Getting the years of your mp3 library

I've been trying to get my music from Apple Music into mp3s and get all my very old, unordered mp3s sorted out. One thing I wanted to find out was which years my music library covers.

I've got another post about how I listen to music over at Music Setup - MPD, NCMPCPP, MPDScribble, Last.fm.

I used a combination of exiftool, grep, Ruby scripting, and Numbers.app.

The raw data looks like this (the empty line is missing data):

...
2006-07-03
======== /Users/useruser/music-2/Protomen, The/[2005] The Protomen/06 The Stand (Man Or Machine).mp3
2006-07-03
======== /Users/useruser/music-2/Mozart/Requiem - Chœur des Marais/CDM-Requiem_Mozart-05-Rex tremendae.mp3

======== /Users/useruser/music-2/Mozart/Requiem - Chœur des Marais/CDM-Walter-Ave Maria.mp3
2000
...
    

So to get all the data into a file I could script against (in Ruby), I ran the following command against my current music directory:

exiftool -r -s -s -s -year -RecordingTime -ext mp3 ~/music-2 | grep --invert-match '=====' > ~/music-years.txt

music-years.txt ends up looking like this (the empty line is missing data):

...
2006-07-03
2006-07-03

2000
...
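
For anyone who would rather do the filtering in Ruby too, here is a minimal sketch of what the grep step does. The helper name `strip_file_headers` is made up for illustration, and the sample input is just the raw excerpt from above. It only assumes that exiftool's per-file header lines always contain a run of `=====`, which is the same assumption the grep makes.

```ruby
# Hypothetical helper (not part of exiftool or the original workflow):
# drop the "======== <path>" header lines, like `grep --invert-match '====='`.
def strip_file_headers(raw_output)
  raw_output.each_line.reject { |line| line.include?("=====") }.join
end

# Sample input copied from the raw excerpt above.
raw = <<~RAW
  2006-07-03
  ======== /Users/useruser/music-2/Protomen, The/[2005] The Protomen/06 The Stand (Man Or Machine).mp3
  2006-07-03
  ======== /Users/useruser/music-2/Mozart/Requiem - Chœur des Marais/CDM-Requiem_Mozart-05-Rex tremendae.mp3

  ======== /Users/useruser/music-2/Mozart/Requiem - Chœur des Marais/CDM-Walter-Ave Maria.mp3
  2000
RAW

# Prints only the tag values, keeping the blank line for the missing data.
puts strip_file_headers(raw)
```

Like the grep version, this would also drop any tag value that happened to contain `=====`, which seems unlikely for a date field.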
    

I wrote a small Ruby script to count all that data (because I'm not good enough at shell scripting to pull it off there) and spit it out as CSV:

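# Tag values written by exiftool, one per line (blank lines are missing data).
year_lines = File.readlines("/Users/useruser/music-years.txt")

# Pre-fill every year so years with no songs still show up as 0 in the CSV.
year_buckets = Hash.new(0)
(1950..2024).each do |year|
  year_buckets[year] = 0
end

year_lines.each do |year_line|
  # Take the first 19xx or 20xx year on the line, e.g. 2006 from "2006-07-03".
  match = /(19|20)\d{2}/.match(year_line)
  if match
    year_buckets[match[0].to_i] += 1
  else
    # Blank lines and implausible years all land in the -1 bucket.
    year_buckets[-1] += 1
  end
end

# Emit CSV sorted by year; the -1 bucket sorts first.
puts "Year,Count"
year_buckets.keys.sort.each do |year|
  puts "#{year},#{year_buckets[year]}"
end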
    

The regex looks a little weird because I had some music with dates of 1045 which… I probably don't have any music that old in my mp3 library. I bucketed all those anomalies (there were only 3 or 4) and all the unknowns into a -1 bucket for easy analysis.
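
To make the bucketing concrete, here is a small sketch that runs the same pattern against a few sample lines. The constant `YEAR_PATTERN` and the helper `bucket_for` are names invented for this example; the regex itself is the one from the script.

```ruby
# Same pattern as in the script: "19" or "20" followed by two digits.
YEAR_PATTERN = /(19|20)\d{2}/

# Hypothetical helper: the bucket a line would be counted in.
# Returns the matched year as an Integer, or -1 when nothing matches.
def bucket_for(line)
  match = YEAR_PATTERN.match(line)
  match ? match[0].to_i : -1
end

p bucket_for("2006-07-03\n") # => 2006 (the year part of a full date)
p bucket_for("2000\n")       # => 2000
p bucket_for("1045\n")       # => -1 (not a 19xx or 20xx year)
p bucket_for("\n")           # => -1 (missing data)
```

The pattern is not anchored, so it picks up a 19xx or 20xx run anywhere in the line. That is what lets a full date like 2006-07-03 count as 2006.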

Anyway, I finally ended up with a big list of how many of my songs have a year or recording time tag for each year. It looks like this:

Year,Count
-1,1473
...
2001,177
2002,250
2003,182
2004,355
2005,542
2006,1027
2007,319
2008,235
2009,434
2010,642
...
    

So I dropped that into Numbers and graphed it linearly and logarithmically.

Figure: two graphs showing the data distribution across years, peaking from 2005-2013, with much lower counts in other years.
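
I did the graphing in Numbers, but as a rough sketch of what the logarithmic view changes, here is the same idea in Ruby using a few counts from the CSV excerpt. The helper `log_scaled` is a made-up name, and returning nil for empty years is my own assumption about how to handle them, since the log of zero is not a finite number.

```ruby
# Hypothetical helper: base-10 log of a count, or nil for an empty year,
# because log10(0) is not a finite value and cannot be plotted.
def log_scaled(count)
  count.zero? ? nil : Math.log10(count)
end

# A few counts copied from the CSV excerpt.
counts = { 2004 => 355, 2005 => 542, 2006 => 1027, 2007 => 319 }

counts.each do |year, count|
  puts format("%d  linear=%4d  log10=%.2f", year, count, log_scaled(count))
end
```

On the linear scale 2006 towers over its neighbors, while on the log scale the years sit much closer together, which makes the small years easier to compare.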

Here is a collection of general information I used to figure this out: