How I got a gift from WXPN with a bit of code
I received an envelope in the mail a few weeks ago from the radio station WXPN. Inside were stickers and the recent CD from Alvvays. It was a thank-you gift for sending an entry into their weekly Name Chain Game.
Name Chain Game
When I was taking classes for grad school, I’d commute early in the morning and tune into the station. The morning show had a feature called the Name Chain Game, where listeners would send in a series of artist names such that each name led into the next. It was fun to speculate who each next artist would be.
I thought it would be fun to submit my own, and I figured I could use my programming background to do it even better. But first, I needed a list of every artist.
Collecting Data
Before I could start matching names I needed all of the names. While music streaming services do have APIs, they tend to be focused on customer data. They don’t really give you access to their entire catalog. One website, Discogs, does provide a catalog API. To make it easier, they regularly pre-compile XML files for you to download. With roughly 400 megabytes of artists, I figured it would be easy to write a quick script in Node.js.
I quickly realized that Node was running out of memory. Even after raising its memory limit, it would crash before I could read the entire file.
To mitigate this, I wrote a pre-processor which would try to filter out all of the non-essential data and return a small list.
WXPN doesn’t play every type of music out there, and the world of rock, pop, and folk is more than enough. The pre-processor looks at each band’s profile to get an idea of the kind of music they play. To make sure the band has some fame, those without a profile or website are ignored entirely. The band’s name must be at least two words so that it can be chained with others. I also check that the last word isn’t a number, as that produced many false positives the first time I ran it.
This allowed me to go from 7 million artists to just 70 thousand.
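The filtering rules can be sketched roughly as a single predicate. This is a minimal sketch rather than the original pre-processor: the record shape (`name`, `profile`, `urls`), the `GENRE_KEYWORDS` list, and the `keepArtist` name are all assumptions made for illustration, and "without a profile or website" is read here as requiring both to be present.

```javascript
// Hypothetical genre keywords; the original list is not shown in the post.
const GENRE_KEYWORDS = ["rock", "pop", "folk"];

// Decide whether an artist record is worth keeping.
// The record shape { name, profile, urls } is an assumption for illustration.
function keepArtist(artist) {
  const name = (artist.name || "").trim();
  const profile = (artist.profile || "").toLowerCase();
  const urls = artist.urls || [];

  // Ignore artists with no profile text or no website (assumed: both required).
  if (profile === "" || urls.length === 0) return false;

  // The profile should mention at least one genre of interest.
  if (!GENRE_KEYWORDS.some((genre) => profile.includes(genre))) return false;

  // The name needs at least two words so it can link to a neighbour.
  const words = name.split(/\s+/).filter(Boolean);
  if (words.length < 2) return false;

  // Skip names whose last word is purely numeric.
  const lastWord = words[words.length - 1];
  if (/^\d+$/.test(lastWord)) return false;

  return true;
}
```

A plain substring check like this is crude (for example, "pop" also matches "popular"), which is acceptable for a first pass whose results get reviewed by hand anyway.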
The output is written as a series of lines with only the key data separated by vertical bars.
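As an illustration of that line format, here is a small sketch of writing and reading one record. The post does not say which fields were kept or in what order, so the layout used here (`id|name|firstWord|lastWord`) and the helper names `toLine` and `fromLine` are assumptions.

```javascript
// Serialize one artist as a pipe-delimited line.
// Field order (id, name, first word, last word) is an assumption.
function toLine(artist) {
  // A literal "|" inside a name would break the format, so replace it.
  const name = artist.name.replace(/\|/g, " ").trim();
  const words = name.split(/\s+/);
  // Storing the first and last word up front avoids re-parsing on lookup.
  return [artist.id, name, words[0], words[words.length - 1]].join("|");
}

// Parse a line produced by toLine back into an object.
function fromLine(line) {
  const [id, name, firstWord, lastWord] = line.split("|");
  return { id: Number(id), name, firstWord, lastWord };
}
```

With a hypothetical id of 1, `toLine({ id: 1, name: "Andrew Bird" })` gives `1|Andrew Bird|Andrew|Bird`. One record per line keeps the file easy to scan with a line-by-line reader.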
I then wrote a simple query library to let me easily scan through the file from my algorithm. Basically it allows me to stream this file and incrementally obtain results since I couldn’t store the whole thing in memory at once.
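A streaming lookup along these lines might look like the sketch below. It is not the original query library: the function name `findByPrefix`, the file path argument, and the assumption that the artist name sits in the second pipe-delimited field are all illustrative. It relies on Node's `readline` interface, which can be iterated line by line so that only one line is held in memory at a time.

```javascript
const fs = require("node:fs");
const readline = require("node:readline");

// Stream the pipe-delimited file and collect names that start with `prefix`.
// Assumes the artist name is the second field on each line.
async function findByPrefix(filePath, prefix) {
  const lines = readline.createInterface({
    input: fs.createReadStream(filePath, { encoding: "utf8" }),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  const wanted = prefix.toLowerCase();
  const matches = [];
  for await (const line of lines) {
    const name = line.split("|")[1];
    if (name && name.toLowerCase().startsWith(wanted)) {
      matches.push(name);
    }
  }
  return matches;
}
```

The trade-off is that every query rescans the whole file. That keeps memory flat but makes each lookup linear in the size of the dataset.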
Algorithm
I took a brute-force approach, starting with a given artist name. If the artist is “Andrew Bird”, the algorithm takes the last word and searches the smaller dataset for artists that start with “Bird”. It then takes the last word of that artist and keeps going deeper and deeper until no matching artists remain.
Then the search would back up to the previous artist in the chain and try its next match, and that would continue until every single option had been traversed. All of the chains were written to a text file so that I could review them afterwards. I’m sure some improvements to the matching would have made it faster, but I was aiming to generate a definitive list.
The bottom of the script is where I start the main function. I initiate the traversal by passing in an array with one entry, my starting band, and the last word of that band’s name. Each step then queries my list for artists whose first word matches the current last word. This continues until the chain reaches a defined CHAIN_MIN length, at which point it starts writing chains to my output file.
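The traversal described above can be sketched as a recursive depth-first search. This is a simplified illustration, not the original script: it uses a tiny in-memory array in place of the streamed file, derives the last word inside the function instead of passing it in, and the names `traverse`, `lookup`, and `onChain` are assumptions. The CHAIN_MIN and CHAIN_MAX values are set to 3 here only to keep the example small.

```javascript
const CHAIN_MIN = 3; // illustrative values, chosen to keep the demo short
const CHAIN_MAX = 3;

// Last word of an artist name, e.g. "Andrew Bird" -> "Bird".
function lastWord(name) {
  const words = name.trim().split(/\s+/);
  return words[words.length - 1];
}

// Depth-first search over chains.
// findByPrefix(word) returns artist names that start with `word`.
// onChain(chain) is called for every chain of CHAIN_MIN..CHAIN_MAX artists.
function traverse(chain, findByPrefix, onChain) {
  if (chain.length >= CHAIN_MIN) onChain([...chain]);
  if (chain.length >= CHAIN_MAX) return;

  const word = lastWord(chain[chain.length - 1]);
  for (const next of findByPrefix(word)) {
    if (chain.includes(next)) continue; // never reuse an artist in one chain
    chain.push(next);
    traverse(chain, findByPrefix, onChain);
    chain.pop(); // backtrack and try the next candidate
  }
}

// Tiny in-memory stand-in for the filtered dataset.
const ARTISTS = ["Andrew Bird", "Birds of Tokyo", "Tokyo Performance Doll", "Doll Skin"];
const lookup = (word) =>
  ARTISTS.filter((name) => name.toLowerCase().startsWith(word.toLowerCase()));

function main() {
  const chains = [];
  traverse(["Andrew Bird"], lookup, (chain) => chains.push(chain.join(" > ")));
  return chains;
}
```

Calling `main()` on this sample data returns a single chain, `Andrew Bird > Birds of Tokyo > Tokyo Performance Doll`. Prefix matching is what lets "Bird" lead into "Birds of Tokyo", at the cost of also accepting looser matches that need manual review.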
I wrote this to have some flexibility. I wanted to make it easy to configure the start conditions so that it could run in the background without manual intervention.
I executed the script and quickly saw chains getting added to my file. I used streams to save memory.
Once I saw it working, I set CHAIN_MIN and CHAIN_MAX to 11 and the start to Andrew Bird, and started it.
I refreshed the output file every few minutes. It kept going for more than twenty-four hours and produced more than a hundred thousand chains.
Results
I started from the top, going into each child of the chain and sampling some of their music. If I didn’t like it, I could discard that line and every other suggestion along that chain. Some were not found on Spotify, suggesting they had few fans. After an hour, I had come up with my shortlist:
- “Manifest”, Andrew Bird
- “Lanterns”, Birds of Tokyo
- “BRAND NEW STORY”, Tokyo Performance Doll
- “Mark My Words”, Doll Skin
- “Storybook”, Skinny Living
- “Blow the House Down”, Living in a Box
- “Ice Machine”, Box and the Twins
- “Five Seconds”, Twin Shadow
- “A Moment of Happiness”, Shadows of Earth
- “September”, Earth, Wind & Fire
- “Rewind”, Fire in the Radio
Andrew Birds of Tokyo Performance Doll Skinny Living in a Box and the Twin Shadows of Earth, Wind & Fire in the Radio
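The combined phrase above follows a simple rule: each artist's last word is swallowed by the next artist's full name. A minimal sketch of that merge, assuming the chain is held as an array of names (the function name `mergeChain` is hypothetical, and the post does not say whether this step was done by code or by hand):

```javascript
// Merge a chain into one phrase. Each artist's last word is replaced by the
// next artist's full name, so "Andrew Bird" + "Birds of Tokyo" becomes
// "Andrew Birds of Tokyo". Expects at least one name in the array.
function mergeChain(names) {
  return names.reduce((phrase, next) => {
    const words = phrase.split(" ");
    words.pop(); // drop the overlapping last word
    return [...words, next].join(" ");
  });
}

const shortlist = [
  "Andrew Bird", "Birds of Tokyo", "Tokyo Performance Doll", "Doll Skin",
  "Skinny Living", "Living in a Box", "Box and the Twins", "Twin Shadow",
  "Shadows of Earth", "Earth, Wind & Fire", "Fire in the Radio",
];
const phrase = mergeChain(shortlist);
```

Here `phrase` comes out identical to the combined line above, including the looser joins such as "Twins" into "Twin Shadow".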
The number of songs in the chain was an artificial limitation to make things simpler. In theory this could keep going almost indefinitely.
Wrap up
They never did run my suggestion, and that’s entirely understandable. Though I came up with a list of songs that was in-genre, the bands were still very obscure. I had not heard of them before this project, and even their Spotify listens were minimal.
I could’ve kept going, whittling down the number of artists to those who were notable, but I figured I had already accomplished what I set out to do.
I got some stickers and a CD, which was very cool, and I’m definitely going to continue listening in to WXPN.