Machine Learning Methods for Human-Computer Interaction

Tutorial

Abstract

In this tutorial, I will give an overview of machine learning methods for pattern recognition, illustrated with case studies taken mostly from haptics applications, and I will lay out the space covered by other methods without reviewing them specifically. I will restrict the discussion to basic statistical pattern recognition methods for supervised learning, namely Bayesian decision theory, linear discriminants, and k-nearest neighbor classifiers, emphasizing the distinction between generative and discriminative approaches. I will close by mentioning commonly used extensions of the introduced methods and by providing resources for participants to follow up with. I will also give some guidelines on parameter selection and optimization for the classifiers, which remains an open research problem in pattern recognition.

Motivation

With the development of on-body computing devices such as smartphones, human-computer interaction (HCI) has become one of the most popular research areas in computer science. Such devices have several built-in sensors that can acquire data about the motion patterns of the user. On a smaller scale, hand gestures provide a means for the user to communicate with the computer. Similar means of communication also exist in haptics. For example, the way a human communicates with a robot through a haptic channel can inform the robot about the user's past and likely future states. Through intelligent processing of human motion data, both at large scale (body motion) and small scale (hand gestures), the computer or robot can either make decisions about the user's past activity or predict his/her future intentions. For this purpose, knowledge of machine learning methods is essential.

Learning Objectives

At the end of the tutorial, I expect that the participants will:

  • have learned the concepts of feature and feature space, and how to extract relevant features from data
  • be familiar with the frequently used features in haptics
  • be aware of the generative and discriminative approaches to pattern recognition, and of their advantages and disadvantages relative to one another
  • have an idea about parametric and non-parametric probability density function (PDF) estimation and how to use PDFs for pattern recognition
  • have an idea about decision boundaries and how they can be used to segment the feature space into different decision regions
  • have learned about the “curse of dimensionality” and available basic methods for dimensionality reduction
  • have knowledge of basic statistical cross-validation methods
  • be able to evaluate and compare classifiers for two-class problems

Target Audience

This tutorial is intended for people in the HCI and/or haptics communities who have little or no background in machine learning. It can also appeal to graduate students who have already taken one or two graduate-level courses in machine learning and pattern recognition but have not yet applied the methods in their research. Roboticists from engineering disciplines with no machine learning background are another potential audience. Participants from neuroscience, psychology/cognitive science, and other relevant disciplines are also welcome. Some knowledge of linear algebra and probability theory is essential, so I will assume a second-year university-level background in these fields.

Tutorial Outline

The feature extraction, reduction, and selection methods, as well as the classifiers to be covered in the tutorial, are listed and briefly explained below. I will support the methods and ideas with examples and demonstrations taken either from real cases in the haptics/robotics/HCI literature or from fictitious scenarios.

1. Introduction and Motivation: Introduction to machine learning and pattern recognition. Pattern recognition as a special case of machine learning. Discussion on possible pattern recognition applications in haptics and HCI.

2. Pattern Recognition Application Examples: Example problems on pattern recognition applications, with emphasis on haptics. Discussion with participants on how these problems can be solved.

3. Features and Feature Extraction: The concepts of “feature” and “feature space.” Temporal vs. spatial data. Signal processing: feature extraction from temporal data (e.g., sensor signals) and spatial data (e.g., images). Examples.
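To make the notion of a feature vector concrete, here is a minimal Python/NumPy sketch (the function name and the particular feature choices are mine, for illustration only) that maps one window of a 1-D sensor signal, such as a single accelerometer axis, to a handful of time- and frequency-domain features:

    import numpy as np

    def window_features(signal, fs):
        """Map one window of a 1-D sensor signal (sampled at fs Hz) to a small feature vector."""
        centered = signal - np.mean(signal)
        feats = {
            "mean": np.mean(signal),                     # average level of the window
            "std": np.std(signal),                       # spread around the mean
            "rms": np.sqrt(np.mean(signal ** 2)),        # overall signal energy
            "zero_crossings": int(np.sum(np.diff(np.sign(centered)) != 0)),
        }
        # Dominant frequency from the magnitude spectrum (DC bin excluded).
        spectrum = np.abs(np.fft.rfft(centered))
        freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
        feats["dominant_freq"] = float(freqs[1:][np.argmax(spectrum[1:])])
        return feats

Stacking such feature vectors, one per window, produces the points that populate the feature space used by the classifiers in the following sections.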

4. The Generative Approach to Pattern Recognition: Review of the multivariate Gaussian probability density. Mean and variance. Conditional probability and the Bayes rule. Bayesian decision theory. Parametric class-conditional probability density function models: the multivariate Gaussian case. Maximum likelihood estimation. Bayesian estimation. Non-parametric models: histogram, kernel density estimation.
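To make the generative approach concrete, here is a minimal Python/NumPy sketch (the class name, method structure, and variable names are mine, for illustration only): it fits a multivariate Gaussian to each class and classifies a new point by maximizing log P(x | class) + log P(class), i.e., by applying the Bayes rule.

    import numpy as np

    class GaussianBayesClassifier:
        """Generative classifier: Gaussian class-conditional densities plus the Bayes rule."""

        def fit(self, X, y):
            self.classes_ = np.unique(y)
            self.params_ = {}
            for c in self.classes_:
                Xc = X[y == c]
                mu = Xc.mean(axis=0)              # estimate of the class mean
                cov = np.cov(Xc, rowvar=False)    # sample covariance (np.cov divides by N-1;
                                                  # the strict ML estimate divides by N)
                prior = Xc.shape[0] / X.shape[0]  # class prior P(class)
                self.params_[c] = (mu, cov, prior)
            return self

        def _log_gaussian(self, X, mu, cov):
            # Log of the multivariate Gaussian density, evaluated row-wise.
            d = X.shape[1]
            diff = X - mu
            mahal = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
            logdet = np.linalg.slogdet(cov)[1]
            return -0.5 * (d * np.log(2 * np.pi) + logdet + mahal)

        def predict(self, X):
            # Pick the class maximizing log P(x | class) + log P(class).
            scores = np.column_stack(
                [self._log_gaussian(X, mu, cov) + np.log(prior)
                 for mu, cov, prior in (self.params_[c] for c in self.classes_)])
            return self.classes_[np.argmax(scores, axis=1)]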

5. The Discriminative Approach to Pattern Recognition: Non-Bayesian classifiers. Concepts of “decision boundary” and “decision region.” The linear discriminant classifier. The nearest neighbor and k-nearest neighbor classifiers.
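By contrast, a discriminative method such as k-nearest neighbor requires no density model at all. A minimal Python/NumPy sketch (the function name and defaults are mine, for illustration):

    import numpy as np

    def knn_predict(X_train, y_train, X_test, k=3):
        """k-nearest neighbor classification with Euclidean distance and majority vote."""
        y_pred = []
        for x in X_test:
            dists = np.linalg.norm(X_train - x, axis=1)   # distance to every training point
            nearest = np.argsort(dists)[:k]               # indices of the k closest points
            labels, counts = np.unique(y_train[nearest], return_counts=True)
            y_pred.append(labels[np.argmax(counts)])      # majority label among the k
        return np.array(y_pred)

With k = 1 this reduces to the nearest neighbor classifier; larger values of k smooth the decision boundary.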

6. Equivalence of the Two Approaches: Cases where the two approaches are equivalent. Quadratic decision boundaries and the Bayesian decision method.
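One standard way to state the connection (the notation below is mine, following common textbook treatments): for Gaussian class-conditional densities, the Bayes-optimal discriminant functions can be written, up to an additive constant common to all classes, as

    g_i(\mathbf{x}) = -\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{\top}\boldsymbol{\Sigma}_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i) - \tfrac{1}{2}\ln|\boldsymbol{\Sigma}_i| + \ln P(\omega_i),

which yield quadratic decision boundaries in general. When all classes share a common covariance \boldsymbol{\Sigma}, the class-independent quadratic term can be dropped and the discriminants reduce to the linear form

    g_i(\mathbf{x}) = \boldsymbol{\mu}_i^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{x} - \tfrac{1}{2}\boldsymbol{\mu}_i^{\top}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_i + \ln P(\omega_i),

i.e., a linear discriminant classifier, which is one case where the two approaches coincide.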

7. Back to Pattern Recognition Application Examples: Brief revisiting of the examples from Section 2, now using the material introduced in the lecture.

8. Feature Reduction and Selection: The “curse of dimensionality.” Principal components analysis. Sequential feature selection methods (greedy search).
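A minimal Python/NumPy sketch of principal components analysis (the function name is mine, for illustration): the data are centered, the covariance matrix is eigendecomposed, and the samples are projected onto the directions of largest variance.

    import numpy as np

    def pca_reduce(X, n_components):
        """Project data onto the top principal components of the covariance matrix."""
        X_centered = X - X.mean(axis=0)
        cov = np.cov(X_centered, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: the covariance matrix is symmetric
        order = np.argsort(eigvals)[::-1]           # sort components by decreasing variance
        components = eigvecs[:, order[:n_components]]
        return X_centered @ components              # reduced-dimensional representation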

9. Statistical Cross Validation: Training and test sets. Repeated random subsampling, K-fold cross validation, leave-one-out cross validation.
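As an illustration of K-fold cross validation, the sketch below (function names are mine; train_and_predict stands for any classifier, e.g., the k-NN sketch above) averages the accuracy over the K held-out folds:

    import numpy as np

    def k_fold_accuracy(X, y, train_and_predict, k=5, seed=0):
        """Estimate classification accuracy by K-fold cross validation."""
        rng = np.random.default_rng(seed)
        folds = np.array_split(rng.permutation(len(y)), k)   # shuffled indices, split into K folds
        accuracies = []
        for i in range(k):
            test_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            y_pred = train_and_predict(X[train_idx], y[train_idx], X[test_idx])
            accuracies.append(np.mean(y_pred == y[test_idx]))
        return np.mean(accuracies)

Setting K equal to the number of samples gives leave-one-out cross validation.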

10. Evaluating Classifier Performance: The two-class confusion matrix. Decision thresholds. Type I and type II errors. Precision-recall and receiver operating characteristics.
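A minimal Python/NumPy sketch of the two-class counts and the derived measures (the function and key names are mine, for illustration):

    import numpy as np

    def two_class_metrics(y_true, y_pred, positive=1):
        """Confusion-matrix counts and derived measures for a two-class problem."""
        tp = np.sum((y_pred == positive) & (y_true == positive))
        fp = np.sum((y_pred == positive) & (y_true != positive))  # type I errors
        fn = np.sum((y_pred != positive) & (y_true == positive))  # type II errors
        tn = np.sum((y_pred != positive) & (y_true != positive))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0             # true positive rate
        fpr = fp / (fp + tn) if (fp + tn) else 0.0                # false positive rate
        return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
                "precision": precision, "recall": recall, "fpr": fpr}

Sweeping the decision threshold and plotting the true positive rate against the false positive rate traces the receiver operating characteristic curve; plotting precision against recall gives the precision-recall curve.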

11. Further Resources: Brief discussion of extensions of the introduced methods: Gaussian mixture models, quadratic and other nonlinear discriminants, support vector machines. Parameter selection and optimization. Combining different classifiers.

 

Duration: 3 hours. There will be a single 15-minute break about 75-90 minutes into the tutorial. If time permits, I will also give a brief overview of artificial neural networks.