Most introductory textbooks on supervised learning algorithms examine the algorithm from a single classification point of view, meaning we only need to decide if an item belongs to a certain class or not. Multi-class classification, an instance of the same problem is typically done by training multiple such binary classifiers for each pair of classes. There are various improvements over these techniques, but very few papers examine how to get meaningful probability estimates out of a learning algorithm.
At Sidelines, we use the probability output of some of our classifiers to predict how relevant a sports news article is to a specific team. We then use that input as part of our ranking algorithm. For example, this article is mostly about the Boston Celtics and it should rank high in a Celtics feed, but Mavericks’ fans would also be somewhat interested although it should not rank as high in their feed. Continue reading