Poudro's blog
CTO / Data Scientist / Problem Solver - Consultant

Suggestion Tree

This is a fairly old experiment. It has been in this state for at least two years at the time of writing, and I finally thought I should bring it out of the cupboard and into the open. Here is my “Suggestion Tree”.

The idea behind it was to find a “novel” (at the time) way of exploring similarities, artist similarities in particular, using a visual representation on a graph.

There are three sub-experiments on this one page: a crawler, an autocomplete search engine and the visual explorer itself.

Check it out here.

The crawler

The data I used for this was extracted from Last.fm via their API. It’s the same data that was used for my Mapping the universe of Music experiment (and was originally extracted for this :) ).

The search

The work originally done here was also, in part, used for another experiment: Integrating Full-Text search into WordPress using Xapian. All the work relating to Xapian and trigrams in that post was originally devised to give the best user experience for the Tree. I am particularly fond of what happens when you type in “chili” (for “Red Hot Chili Peppers”) or “head” (for “Talking Heads”). Between the search itself and the UX of the search bar, in retrospect, this could have been a blog post in itself. :)
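To give a feel for why trigrams help with partial queries like “chili”, here is a minimal sketch in plain Python. It is not the Xapian setup used on the site: the function names (trigrams, score, suggest) and the scoring rule (the fraction of the query's trigrams found in the artist name) are assumptions made purely for illustration.

```python
def trigrams(text):
    """Return the set of 3-character substrings of the lowercased text."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def score(query, name):
    """Fraction of the query's trigrams that also occur in the name."""
    q = trigrams(query)
    if not q:
        # Queries shorter than 3 characters have no trigrams to match on.
        return 0.0
    return len(q & trigrams(name)) / len(q)


def suggest(query, names, limit=5):
    """Names sharing at least one trigram with the query, best match first."""
    scored = [(score(query, n), n) for n in names]
    scored = [(s, n) for s, n in scored if s > 0]
    # Sort by descending score, then alphabetically to break ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [n for _, n in scored[:limit]]
```

Because matching works on overlapping three-letter pieces rather than whole words, a query can hit the middle of a name, and a slightly misspelled query such as “chilli” still shares some trigrams with the right artist.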

The visual explorer

Once you have searched for an artist in the search field, a set of “bubbles” should appear: the center bubble is the artist you searched for, and surrounding it is a subset of the similar artists as given by Last.fm (again, the snapshot is old and outdated now).

Two main things have already happened at that point. The first is the selection of the subset, which is done using a Binary Search algorithm on all the results and keeps at most 10 of them (this is very similar to what I described in this post). The second is the visualisation itself, which is a force-directed graph.

For each similar artist returned from the AJAX call, the “similarity index” (a number between 0 and 1, from “not very similar” to “almost exactly the same”) between all the returned artists is also returned. This index drives the graph: the strength of the link between two artists is tied directly to it (the wider the line, the more similar the two artists are according to the source). Each link is modelled as a spring using Hooke’s law, where k is determined by the similarity index. Repulsion is the same between all bubbles and is governed by Coulomb’s law.

The graph is updated in real time in the browser, in a canvas. By clicking on another artist, you can reload the graph according to that artist’s similar artists.
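The spring-and-repulsion model can be sketched in a few lines. This is a rough illustration in Python rather than the in-browser canvas code. The layout function, its parameters (stiffness, rest_length, charge, dt), the circular starting positions and the simple "move along the net force" update are all assumptions of mine, not details taken from the original implementation.

```python
import math


def layout(nodes, similarity, steps=1000, dt=0.05,
           stiffness=1.0, rest_length=1.0, charge=1.0):
    """Relax a small graph with spring attraction and uniform repulsion.

    nodes: list of names; similarity: dict mapping (a, b) pairs to a 0..1 index.
    """
    n = len(nodes)
    # Start on a unit circle so the run is deterministic.
    pos = {name: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, name in enumerate(nodes)}
    for _ in range(steps):
        force = {name: [0.0, 0.0] for name in nodes}
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[b][0] - pos[a][0]
                dy = pos[b][1] - pos[a][1]
                d = max(math.hypot(dx, dy), 1e-3)  # avoid dividing by zero
                # Coulomb-style repulsion, identical for every pair.
                f = -charge / (d * d)
                # Hooke-style spring; k scales with the similarity index.
                k = stiffness * similarity.get((a, b), similarity.get((b, a), 0.0))
                f += k * (d - rest_length)
                # f > 0 pulls a towards b, f < 0 pushes them apart.
                fx, fy = f * dx / d, f * dy / d
                force[a][0] += fx
                force[a][1] += fy
                force[b][0] -= fx
                force[b][1] -= fy
        # Overdamped update: each node moves a small step along its net force.
        for name in nodes:
            pos[name][0] += dt * force[name][0]
            pos[name][1] += dt * force[name][1]
    return pos
```

With every pair repelling equally and only the spring constant varying, artists with a high similarity index end up closer to each other than weakly related ones, which is the effect the bubbles show on screen.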

Finally, there it is. The whole story. I’m glad to have finally written about this project. :)

28 Mar 2013