Learning to Rank for Web Search
I will give an overview of the machine
learning techniques used to rank web search results in Bing, a commercial
search engine. Learning to rank in this context presents some steep challenges:
how do you rank tens of billions of documents in milliseconds? How do you
train models, given that your quality measure depends only on the ranked order
of the results, which leaves you with an objective function that is either
flat or discontinuous everywhere? How can you incorporate higher-level semantic
information, like clicks? We certainly have boatloads of data. How might
one go about visualizing it?
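
To make the "flat or discontinuous" point concrete, here is a minimal sketch in Python. It assumes NDCG as the quality measure, which is an illustrative choice and not necessarily the measure used in Bing; the helper names (dcg, ndcg) and the toy scores and labels are hypothetical. It shows that nudging a model score without changing the ranked order leaves the measure unchanged, while a tiny change that swaps two documents makes it jump.

```python
import math

def dcg(labels_in_rank_order):
    # Discounted cumulative gain: gain 2^rel - 1, discount log2(position + 1),
    # with positions starting at 1 (enumerate starts at 0, hence pos + 2).
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(labels_in_rank_order))

def ndcg(scores, labels):
    # Rank documents by descending model score, then compare the resulting
    # DCG with the DCG of the ideal (label-sorted) ordering.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    ideal_dcg = dcg(sorted(labels, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical relevance labels for three documents (assumption, toy data).
labels = [2, 0, 1]

base = [0.9, 0.50, 0.4]     # ranked order: doc0, doc1, doc2
nudged = [0.9, 0.45, 0.4]   # score of doc1 moved, ranked order unchanged
swapped = [0.9, 0.39, 0.4]  # doc2 now overtakes doc1

print(ndcg(base, labels))     # same value as the nudged scores
print(ndcg(nudged, labels))   # flat: no change, so the gradient is zero
print(ndcg(swapped, labels))  # discontinuous: jumps once the order changes
```

Because the measure only sees the sort order, its gradient with respect to the scores is zero wherever it is defined, which is why it cannot be optimized directly by gradient descent.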