Deep Search Relevance Ranking in Practice

KDD-2020 Hands-On Tutorials

Abstract

Data science techniques for building industry-scale search engines have long been central to online products in many domains. Search relevance algorithms are key components of products across different fields, including e-commerce, streaming services, and social networks. In this tutorial, we give an introduction to such large-scale search ranking systems, focusing specifically on deep learning techniques. The topics we plan to cover are: (1) an overview of search ranking systems in practice, including popular techniques such as the PageRank algorithm and BM25; (2) an introduction to sequential and language models in the context of search ranking; (3) knowledge distillation approaches for this area. For each of these sessions, we plan to first give an introductory talk and then go through a hands-on tutorial to reinforce the concepts. We plan to cover fundamental concepts using demos, case studies, and hands-on examples, including the latest deep learning methods that have achieved state-of-the-art results in generating the most relevant search results. Moreover, we plan to show example implementations of these methods in Python, leveraging a variety of open-source machine learning libraries as well as real industrial or open-source data.
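As a point of reference for the classical baselines named in the abstract, the following is a minimal sketch of Okapi BM25 scoring in plain Python. It is not taken from the tutorial notebooks: the function name `bm25_scores`, the toy corpus, and the parameter defaults `k1=1.5` and `b=0.75` are illustrative assumptions, and the IDF term uses the common smoothed variant that stays non-negative.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Hypothetical helper for illustration; k1 and b are assumed defaults.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Toy corpus (made up for this sketch), tokenized by whitespace.
docs = [
    "deep learning for search ranking".split(),
    "knowledge distillation of language models".split(),
    "ranking search results with bm25".split(),
]
query = "search ranking".split()
scores = bm25_scores(query, docs)
# Document indices ordered from most to least relevant.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In a real system a lexical score like this typically serves as a first-stage retriever or as one input feature, with learned rankers applied on top of the retrieved candidates.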

Outline

No.	Robust Anomaly Detection	Topics	Notebook	Video
1	Deep Learning Preliminaries for Anomaly Detection	Anomaly Detection; Deep Models for Anomaly Detection		Introduction
2	Autoencoder for Anomaly Detection	Autoencoder; Adversarial Auto-Encoders (AAE); Variational Auto-Encoders (VAE); Wasserstein Auto-Encoders (WAE); Deep Support Vector Data Description (SVDD); One-Class Neural Network (OCNN)	Notebook	1. Setup 2. Slide Talk 3. Code Walk
3	Robust, Deep and Inductive Anomaly Detection	Robust Autoencoder; Experiments	Notebook	1. Setup 2. Slide Talk 3. Code Walk
4	Real-World Use Cases	Motivation; Use Cases; Experiments; Results	Notebook	1. Setup 2. Slide Talk 3. Code Walk

Presenters

Sanjay Chawla

QCRI, Research Director of the Data Analytics department.

Sanjay Chawla is Research Director of QCRI’s Data Analytics department. His research is in data mining and machine learning, with a specialization in spatio-temporal data mining, outlier detection, class-imbalanced classification, and adversarial learning.

Raghav Chalapathy

CSIRO Data61, Research Fellow in the Analytics and Decision Sciences Program.

Dr. Raghav Chalapathy works on research in Structural Health Monitoring and on data analytics for asset management and energy demand forecasting. He holds a PhD in data mining from the University of Sydney. His research interests are machine learning and data mining, with a focus on anomaly detection, deep learning, online learning, anomaly detection under concept drift, Bayesian deep learning, clustering, and predictive modelling. He has been driving several industrial projects investigating machine learning for real-world problems such as damage detection in civil structures (including the iconic Sydney Harbour Bridge) and energy demand forecasting for a multilevel model predictive controller used in optimal scheduling of air conditioning systems with renewable energy resources and thermal storage.

Dr. Khoa Nguyen

CSIRO Data61, Senior Research Scientist in the Analytics and Decision Sciences Program.

Dr. Khoa Nguyen is a research team leader and a senior research scientist in the Analytics and Decision Sciences Program at Data61 (CSIRO). He leads a research team working on predictive analytics for asset management (including Structural Health Monitoring) and energy demand forecasting using machine learning techniques. He holds a PhD in computer science from the University of Sydney. His research interests are machine learning and data mining, with a focus on tensor analysis for data fusion, anomaly detection, forecasting, imbalanced classification, data clustering, and dimensionality reduction. He has been driving several industrial projects that investigate machine learning for real-world problems such as damage detection in civil structures (including the iconic Sydney Harbour Bridge), pothole detection in road pavements, fault detection and diagnosis for HVAC systems, and energy demand forecasting. Clients for these projects include government entities and private companies in the transport and energy sectors (such as Transport for NSW, Transurban, Boeing, the Department of the Environment and Energy, and the Australian Energy Market Operator).