I was going to write this blog post about feature hashing, a massively useful trick when building classifiers and predictive models. It saves the time and complexity of building a feature dictionary and lets the hashed feature vector be smaller than the number of possible features. This works because the number of features that actually occur in a particular data set is usually far smaller than the number of possible features you would get by enumerating them all.
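To make that concrete, here is a minimal sketch of the hashing trick in Python. The bucket count, the use of `hashlib.md5` as the hash, and the sign bit are all my illustrative choices, not anything specific to a particular library:

```python
import hashlib

def hash_features(features, num_buckets=32):
    """Hash {feature_name: value} into a fixed-length vector,
    with no dictionary of all possible feature names ever built."""
    vec = [0.0] * num_buckets
    for name, value in features.items():
        digest = hashlib.md5(name.encode("utf-8")).digest()
        # The hash of the feature name picks the bucket directly.
        index = int.from_bytes(digest[:4], "big") % num_buckets
        # A signed variant flips the sign for half the hash space,
        # which keeps collisions unbiased in expectation.
        sign = 1 if digest[4] % 2 == 0 else -1
        vec[index] += sign * value
    return vec
```

The vector length is fixed up front, so two data sets with wildly different vocabularies still map into the same-sized representation; collisions are the price, and a larger `num_buckets` makes them rarer.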
But instead, I want to write about recommendation systems, where the recommendation algorithm is only one part, and where merchandising, presentation and psychology also play a role.
The Layers of Recommendations
A typical recommendation system has lots of parts, but the three main parts when serving recommendations are the recommendation algorithm, the merchandising layer and the presentation layer. All three are important to the success of the system. The recommendation algorithm provides the basis, identifying the potential items to show to the current user. The merchandising layer allows the website owner to customize the recommendations, applying filters, blacklists and pins to modify the output from the recommendation algorithm to fit a set of desired marketing criteria. Finally, the presentation layer puts the recommendations in front of the user, at which point psychology takes over.
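The three layers compose into a single serving pipeline. A hypothetical sketch, with `score`, `merchandise` and `render` standing in for whatever each layer actually does on a given site:

```python
def recommend(user, catalog, score, merchandise, render):
    """Serve recommendations through the three layers in order."""
    # Layer 1: the recommendation algorithm scores and ranks candidates.
    candidates = sorted(catalog, key=lambda item: score(user, item), reverse=True)
    # Layer 2: the merchandising layer edits the ranked list
    # (filters, blacklists, pins) to fit marketing criteria.
    slate = merchandise(candidates)
    # Layer 3: the presentation layer puts the slate in front of the user.
    return render(slate)
```

The point of the layering is that each stage can be swapped independently: the same merchandising rules can sit on top of different algorithms, which is exactly what makes A/B testing the algorithm in isolation possible.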
So let’s look at each of these parts in turn, and then I’ll discuss some really interesting results we’ve had recently which show how they interact in unexpected ways. (See part two of this piece)
Key Performance Indicators
As described in earlier blog posts, I’ve been developing a set of recommendation algorithms that explicitly target customer Key Performance Indicators (KPIs). So, for example, a customer could specify that they want recommendations that optimize for conversions or, more typically, revenue, and the algorithm would generate the set of items with the highest expected revenue when presented to the current user. This is very different from more typical recommendation algorithms, which tend to find items that the current user would rate highly, or which users similar to the current user have bought — the hope being that the user would then go on to buy one of these items. My new algorithm does not just hope: it makes the KPI explicit.
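The core idea can be sketched in a few lines. This is not the production algorithm, just an illustration of the ranking criterion, with `predict_purchase_prob` standing in for whatever model estimates purchase probability:

```python
def rank_by_expected_revenue(user, items, predict_purchase_prob):
    """Rank items by expected revenue rather than predicted rating.

    Expected revenue of showing an item = P(purchase | user, item) * price.
    """
    scored = [(predict_purchase_prob(user, item) * item["price"], item)
              for item in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored]
```

Note how this can rank a low-probability, high-price item above a likelier but cheaper one — a trade-off a pure rating predictor never sees.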
The merchandising layer is, from a data science viewpoint, a bit of an oddity. We would hope that the data should tell us what the best items are, but often a website will want a little more control over the users’ experience. They may want to ensure that the items shown on a page are alternatives to the current item, or are a mix of alternatives and complementary items. Or they might want to pin things like gift cards at particular times of year.
The presentation layer is the website, and it makes a difference where on the page the recommendations are presented, and in what format. But, beyond discussing best practices with the website owner, there’s not too much we can do on the presentation end of things.
It is the combination of all three of these that is finally shown to the user, and that’s where the psychology comes in.
We’re running A/B tests of the new KPI-optimizing algorithm against a more traditional item-based collaborative filtering algorithm on a number of sites, and we have found some interesting differences. Read my next blog post, later this week, to find out how KPI optimization differs from collaborative filtering.