DIDI

Discriminative Dimensionality Reduction

Term: 06.2012 - 09.2015
 

Abstract

The amount of electronic data available today increases rapidly, such that people rely
on automated tools which allow them to intuitively scan data volumes for valuable information.
Dimensionality reducing data visualization, which displays high dimensional
data in two or three dimensions, constitutes a popular tool to directly visualize data sets
on the computer screen. Dimensionality reduction is an inherently ill-posed problem,
and the result of a dimensionality reduction tool varies widely depending on the chosen
technique, its parameters, and, for non-deterministic algorithms, even random effects.
Often, the reliability and suitability of the obtained visualization for the task
at hand is not clear at all since a dimensionality reduction tool might focus on irrelevant
aspects or noise in the data. The goal of this project is to enhance dimensionality
reducing data visualization techniques with auxiliary information in the form of
class labels for the data. This way, the visualization can concentrate on the aspects
relevant to the given auxiliary information rather than on potential noise.

Research Questions and Methods

The focus of the project lies on:
  1. The investigation of principled techniques to extend dimensionality reduction tools to class-discriminative visualization.
  2. The experimental and theoretical evaluation and comparison of the approaches.
  3. The extension and adaptation of discriminative dimensionality reduction to deal with large data sets.

Outcomes

Figure panels (left to right): Visualization of a prototype-based classifier | Eigendirections of the Fisher information matrices | Posterior probability

The left image shows a visualization of a high-dimensional classification model together with its training data, based on a discriminative dimensionality reduction. The middle image illustrates, for an artificial data set, the technique for computing discriminative dimensionality reduction mappings using the Fisher metric formalism: path lengths are computed on a Riemannian manifold, and the eigendirections of the metric tensor are shown locally as arrows. The right plot depicts the posterior probability density on which these estimates are based.
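The Fisher metric computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the posterior is a hypothetical two-class logistic model with made-up parameters w and b, the local metric tensor is taken as J(x) = sum_c p(c|x) g_c g_c^T with g_c the gradient of log p(c|x), and the path is a straight line whose length is approximated by a sum over short segments. All function names are invented for this example.

```python
import numpy as np

def posterior(x, w, b):
    """Hypothetical two-class posterior p(c=1 | x) of a logistic model."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def fisher_matrix(x, w, b):
    """Local Fisher information matrix J(x) = sum_c p(c|x) g_c g_c^T,
    where g_c is the gradient of log p(c|x) with respect to x."""
    p = posterior(x, w, b)
    g1 = (1.0 - p) * w   # gradient of log p(c=1 | x)
    g0 = -p * w          # gradient of log p(c=0 | x)
    return p * np.outer(g1, g1) + (1.0 - p) * np.outer(g0, g0)

def fisher_path_length(x_start, x_end, w, b, steps=100):
    """Approximate the Fisher length of the straight line from x_start to
    x_end by summing sqrt(dx^T J dx) over short segments (midpoint rule)."""
    dx = (x_end - x_start) / steps
    length = 0.0
    for t in np.linspace(0.0, 1.0, steps + 1)[:-1]:
        mid = x_start + (t + 0.5 / steps) * (x_end - x_start)
        J = fisher_matrix(mid, w, b)
        length += np.sqrt(max(dx @ J @ dx, 0.0))
    return length

# Assumed toy model: the class boundary is the line x_1 = 0.
w = np.array([2.0, 0.0])
b = 0.0

# A path crossing the class boundary is long in the Fisher metric ...
across = fisher_path_length(np.array([-1.0, 0.0]), np.array([1.0, 0.0]), w, b)
# ... while a path along the boundary changes no class information.
along = fisher_path_length(np.array([0.0, -1.0]), np.array([0.0, 1.0]), w, b)

# Eigendirections of the metric tensor at one point (eigenvalues ascending).
eigvals, eigvecs = np.linalg.eigh(fisher_matrix(np.array([0.2, 0.5]), w, b))
main_direction = eigvecs[:, -1]  # direction in which the posterior changes most
```

In this toy setting the dominant eigendirection is parallel to w, so distances are stretched across the class boundary and collapse along it, which is the effect the arrows in the middle image are meant to convey.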

In the first half of the project, we addressed two of its three main targets. The first is the processing of big data, understood here as data sets with very many instances. Handling such data becomes feasible with the introduction of a method called kernel t-SNE. This technique equips the non-linear dimensionality reduction approach t-SNE with a parametric mapping, which allows visualizations in linear time and hence opens the way towards life-long learning and online visualization.

Second, the principled approach of including supervised information via the metric has been investigated for various methods. The resulting Fisher information metric has been integrated into several of them, including Isomap, MVU, t-SNE, kernel t-SNE, SOM, and GTM. The obtained projections have been used for the specific application of classifier visualization, where several experiments have shown the clear superiority of supervised techniques.

Moreover, recent work has addressed the important question of feature relevance for a given projection. Since the role of individual features in non-linear projections obtained by non-parametric methods is usually unknown, practitioners often prefer rather simple methods such as PCA over more complex and hence more powerful ones. Our recently proposed approach makes it possible to judge the importance of individual features and thus might improve the usefulness of supervised as well as unsupervised non-parametric dimensionality reduction techniques.
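To make the idea of a parametric mapping for t-SNE more concrete, the following Python sketch fits a normalized Gaussian kernel expansion to an existing low-dimensional embedding by least squares and then projects new points with it. Several things here are assumptions for illustration only: the exact form of the mapping, the single shared bandwidth sigma, the function names, and above all the training embedding, which is a placeholder (the first two coordinates of random data) standing in for a real t-SNE embedding of a small training subset.

```python
import numpy as np

def normalized_gaussian_kernel(X, centers, sigma):
    """Row-normalized Gaussian kernel between the rows of X and the centers."""
    sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Subtracting the row minimum leaves the normalized result unchanged
    # but avoids underflow when all distances are large.
    sq_dists = sq_dists - sq_dists.min(axis=1, keepdims=True)
    K = np.exp(-sq_dists / (2.0 * sigma ** 2))
    return K / K.sum(axis=1, keepdims=True)

def fit_kernel_mapping(X_train, Y_train, sigma):
    """Least-squares coefficients A such that K(X_train) @ A approximates Y_train."""
    K = normalized_gaussian_kernel(X_train, X_train, sigma)
    return np.linalg.pinv(K) @ Y_train

def apply_kernel_mapping(X_new, X_train, A, sigma):
    """Project new points; the cost per point is linear in the subset size."""
    return normalized_gaussian_kernel(X_new, X_train, sigma) @ A

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 10))   # small training subset (synthetic)
Y_train = X_train[:, :2].copy()       # PLACEHOLDER for a t-SNE embedding
sigma = 1.0                           # assumed shared kernel bandwidth

A = fit_kernel_mapping(X_train, Y_train, sigma)

X_new = rng.normal(size=(1000, 10))   # further data, mapped without retraining
Y_new = apply_kernel_mapping(X_new, X_train, A, sigma)
Y_check = apply_kernel_mapping(X_train, X_train, A, sigma)
```

Because the coefficients are fitted once on a fixed subset, every additional point is projected independently of all other new points, so the total cost grows linearly with the number of instances. In practice the embedding of the subset would come from an actual t-SNE run, and the bandwidth would need to be chosen with care.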

Publications

Efficient approximations of robust soft learning vector quantization for non-vectorial data

Hofmann D, Gisbrecht A, Hammer B (2015)
Neurocomputing 147: 96–106.
Journal Article | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2695196
 

Learning interpretable kernelized prototype-based models

Hofmann D, Schleif F-M, Paaßen B, Hammer B (2014)
Neurocomputing 141: 84–96.
Journal Article | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2678214
 

Relevance learning for dimensionality reduction

Schulz A, Gisbrecht A, Hammer B (2014)
In: ESANN, 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Verleysen M (Ed); Bruges, Belgium: i6doc.com: 165–170.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2673557
 

Parametric nonlinear dimensionality reduction using kernel t-SNE

Gisbrecht A, Schulz A, Hammer B (2015)
Neurocomputing 147: 71–82.
Journal Article | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2671047
 

Sparse approximations for kernel learning vector quantization

Hofmann D, Hammer B (2013)
In: ESANN.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2625199
 

Applications of discriminative dimensionality reduction

Hammer B, Gisbrecht A, Schulz A (2013)
Presented at the ICPRAM 2013, Barcelona, Spain
Conference Proceeding / Paper | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2622454
 

Using Nonlinear Dimensionality Reduction to Visualize Classifiers

Schulz A, Gisbrecht A, Hammer B (2013)
In: IWANN (1). Rojas I, Joya G, Cabestany J (Eds); Lecture Notes in Computer Science, 7902 Springer: 59–68.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2622456
 

Classifier inspection based on different discriminative dimensionality reductions

Schulz A, Gisbrecht A, Hammer B (2013)
In: Workshop NC^2 2013. TR Machine Learning Reports.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2622467
 

Discriminative probabilistic prototype based models in kernel space

Hofmann D, Gisbrecht A, Hammer B (2012)
In: Workshop NC^2 2012. TR Machine Learning Reports.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2671172
 

Learning vector quantization for (dis-)similarities

Hammer B, Hofmann D, Schleif F-M, Zhu X (2014)
Neurocomputing 131: 43–51.
Journal Article | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2615730
 

Efficient Approximations of Kernel Robust Soft LVQ

Hofmann D, Gisbrecht A, Hammer B (2012)
In: WSOM.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2625238
 

Discriminative Dimensionality Reduction Mappings

Gisbrecht A, Hofmann D, Hammer B (2012)
In: Advances in Intelligent Data Analysis XI - 11th International Symposium, IDA 2012, Helsinki, Finland, October 25-27, 2012. Proceedings. Hollmén J, Klawonn F, Tucker A (Eds); Lecture Notes in Computer Science, 7619 Springer: 126–138.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2625247
 

Kernel Robust Soft Learning Vector Quantization

Hofmann D, Hammer B (2012)
In: Artificial Neural Networks in Pattern Recognition - 5th INNS IAPR TC 3 GIRPR Workshop, ANNPR 2012, Trento, Italy, September 17-19, 2012. Proceedings. Mana N, Schwenker F, Trentin E (Eds); Lecture Notes in Computer Science, 7477 Springer: 14–23.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2625254
 

How to visualize a classifier?

Gisbrecht A, Schulz A, Hammer B (2012)
In: Proceedings of the Workshop - New Challenges in Neural Computation 2012. Villmann T, Schleif F-M (Eds); Machine Learning Reports: 73–83.
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2622449
 

How to Visualize Large Data Sets?

Hammer B, Gisbrecht A, Schulz A (2012)
Presented at the Workshop Advances in Self-Organizing Maps (WSOM), Santiago, Chile
Conference Proceeding / Paper | Published | Quality Controlled | English

Link: http://pub.uni-bielefeld.de/publication/2622453