Session Details

Matrices, Graphs and Metrics

Thursday, 13 December
13:30 – 15:30
Room: Rembrandt & Permeke
Session Chair: Jan Ramon

13:30 Bounded Matrix Low Rank Approximation
Ramakrishnan Kannan, Mariya Ishteva, and Haesun Park

Low-rank matrix approximations such as non-negative matrix factorization (NMF) have been successfully used to solve many data mining tasks. In this paper, we propose a new low-rank matrix approximation called Bounded Matrix Low Rank Approximation (BMA), which imposes a lower and an upper bound on every element of the low-rank matrix that best approximates a given matrix with missing elements. This new approximation models many real-world problems, such as recommender systems, and performs better than other methods, such as singular value decomposition (SVD) or NMF. We present an efficient algorithm for solving BMA based on the coordinate descent method. BMA differs from NMF in that it imposes bounds on the approximation itself rather than on each of the low-rank factors. We show that our algorithm scales to large matrices with missing elements on multicore systems with low memory. We present substantial experimental results illustrating that the proposed method outperforms state-of-the-art algorithms for recommender systems, such as Stochastic Gradient Descent, Alternating Least Squares with regularization, SVD++, and Bias-SVD, on real-world data sets such as Jester, MovieLens, Book-Crossing, Online Dating, and Netflix.
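The contrast between BMA and NMF described in this abstract can be written as a constrained optimization problem. The formulation below is a sketch based only on the abstract; the symbols are assumptions introduced for illustration: A is the m × n input matrix, M is a binary mask marking its observed entries, W (m × k) and H (k × n) are the low-rank factors, and r_min and r_max are the element-wise bounds.

```latex
\min_{W,\,H}\; \bigl\| M \circ (A - WH) \bigr\|_F^2
\quad \text{subject to} \quad
r_{\min} \le (WH)_{ij} \le r_{\max} \quad \text{for all } i, j
```

NMF would instead constrain the factors themselves (W ≥ 0 and H ≥ 0 element-wise) and leave the entries of the product WH unbounded from above.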

13:50 Robust Matrix Completion via Joint Schatten p-Norm and lp-Norm Minimization
Feiping Nie, Hua Wang, Xiao Cai, Heng Huang, and Chris Ding

The low-rank matrix completion problem is a fundamental machine learning problem with many important applications. Standard low-rank matrix completion methods relax the rank minimization problem to trace norm minimization. However, this relaxation may cause the solution to deviate seriously from the original solution. Moreover, most completion methods minimize the squared prediction errors on the observed entries, a criterion that is sensitive to outliers. In this paper, we propose a new robust matrix completion method to address these two problems. The joint Schatten p-norm and lp-norm are used to better approximate the rank minimization problem and to enhance robustness to outliers. Extensive experiments are performed on both synthetic data and real-world applications in collaborative filtering and social network link prediction. All empirical results show that our new method outperforms the standard matrix completion methods.
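For reference, the two norms named in the title have the standard definitions below. This is a sketch of the definitions only, not of the paper's objective; reading the lp-norm as an element-wise norm of a residual matrix E is an assumption, and σ_i denotes the singular values of an m × n matrix X.

```latex
\|X\|_{S_p} = \Bigl( \sum_{i=1}^{\min(m,n)} \sigma_i^{\,p} \Bigr)^{1/p},
\qquad
\|E\|_{p} = \Bigl( \sum_{i,j} |E_{ij}|^{p} \Bigr)^{1/p}
```

With p = 1 the Schatten p-norm is the trace norm, and as p approaches 0 the quantity \|X\|_{S_p}^p approaches the number of non-zero singular values, that is, the rank. For the element-wise norm, p = 2 corresponds to the squared-error criterion, while smaller p gives large residuals less influence.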

14:10 Sequential Alternating Proximal Method for Scalable Sparse Structural SVMs
P. Balamurugan, Shirish Shevade, and T. Ravindra Babu

Structural Support Vector Machines (SSVMs) have recently gained wide prominence in classifying structured and complex objects such as parse trees, image segments, and Part-of-Speech (POS) tags. Typical learning algorithms used in training SSVMs produce model parameters that are vectors residing in a high-dimensional feature space. Such a high-dimensional model parameter vector contains many non-zero components, which often leads to slow prediction and storage issues. Hence there is a need for sparse parameter vectors that contain a very small number of non-zero components. The L1 regularizer and the elastic net regularizer have traditionally been used to obtain sparse model parameters. Although L1-regularized structural SVMs have been studied in the past, the use of the elastic net regularizer for structural SVMs has not yet been explored. In this work, we formulate the elastic net SSVM and propose a sequential alternating proximal algorithm to solve the dual formulation. We compare the proposed method with existing methods for L1-regularized structural SVMs. Experiments on large-scale benchmark datasets show that the proposed dual elastic net SSVM, trained using the sequential alternating proximal algorithm, scales well and yields highly sparse model parameters while achieving comparable generalization performance. Hence the proposed sequential alternating proximal algorithm is a competitive method for obtaining sparse model parameters and comparable generalization performance when elastic net regularized structural SVMs are used on very large datasets.
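To illustrate why an elastic net regularizer produces sparse parameter vectors, the sketch below implements the standard proximal operator of the penalty l1·|w| + (l2/2)·w², applied component-wise. It is not the paper's sequential alternating proximal algorithm, which works on the dual formulation and is not described in this abstract; the function name and the parameter names `l1`, `l2`, and `step` are hypothetical.

```python
import math


def prox_elastic_net(v, l1, l2, step=1.0):
    """Proximal operator of step * (l1*|w| + (l2/2)*w^2), applied per component.

    Illustrative helper (hypothetical name), not the paper's algorithm.
    Soft-thresholding by step*l1 sets small components exactly to zero;
    dividing by (1 + step*l2) then shrinks the surviving components.
    """
    out = []
    for x in v:
        magnitude = max(abs(x) - step * l1, 0.0)
        shrunk = math.copysign(magnitude, x) if magnitude > 0 else 0.0
        out.append(shrunk / (1.0 + step * l2))
    return out


w = prox_elastic_net([3.0, -0.5, 0.2, -2.0], l1=1.0, l2=1.0)
print(w)
```

Components whose magnitude is at most `step * l1` become exactly zero, which is the source of sparsity; the quadratic term only rescales the components that survive.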

14:30 A New Anomaly Detection Algorithm based on Quantum Mechanics
Hao Huang, Hong Qin, Shinjae Yoo, and Dantong Yu

The primary originality of this paper lies in the fact that we have made the first attempt to apply quantum mechanics theory to anomaly (outlier) detection in high-dimensional datasets for data mining. For anomaly detection, we propose the Fermi Density Descriptor (FDD), which represents the probability of measuring a fermion at a specific location. We also quantify and examine the effects of different Laplacian normalizations and choose the best one for anomaly detection. Both theoretical proof and quantitative experiments demonstrate that our proposed FDD is substantially more discriminative and robust than commonly used algorithms.

14:42 A Semi-Definite Positive Linear Discriminant Analysis and its Applications
Deguang Kong and Chris Ding

Linear Discriminant Analysis (LDA) is widely used for dimension reduction in classification tasks. However, the standard LDA formulation is not semi-definite positive (s.d.p.), and thus it is difficult to obtain the globally optimal solution when the standard LDA formulation is combined with other loss functions or graph embedding. In this paper, we present an alternative approach to LDA. We rewrite the LDA criterion as a convex formulation (semi-definite positive LDA, i.e., sdpLDA) using the largest eigenvalue of the generalized eigenvalue problem of standard LDA. We give applications by incorporating sdpLDA as a regularization term into discriminant regression analysis. Another application is to incorporate sdpLDA into standard Laplacian embedding, which uses the supervised information to improve the Laplacian embedding performance. The proposed sdpLDA formulation can be used for multi-class classification tasks. Extensive experimental results on 10 multi-class datasets indicate that the proposed method is promising.

14:54 Towards Active Learning on Graphs: An Error Bound Minimization Approach
Quanquan Gu and Jiawei Han

Active learning on graphs has received increasing interest in recent years. In this paper, we propose a nonadaptive active learning approach on graphs, based on generalization error bound minimization. In particular, we present a data-dependent error bound for a graph-based learning method, namely learning with local and global consistency (LLGC). We show that the empirical transductive Rademacher complexity of the function class for LLGC provides a natural criterion for active learning. The resulting active learning approach selects a subset of nodes on a graph such that the empirical transductive Rademacher complexity of LLGC is minimized. We propose a simple yet effective sequential optimization algorithm to solve this selection problem. Experiments on benchmark datasets show that the proposed method outperforms state-of-the-art active learning methods on graphs.
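For readers unfamiliar with LLGC, the base learner named in this abstract, the sketch below shows its usual iterative form: labels are propagated over the symmetrically normalized affinity matrix S = D^(-1/2) W D^(-1/2) with the update F ← αSF + (1 − α)Y. It illustrates only the learner, not the paper's node selection criterion; the function name, the toy graph, and the values of `alpha` and `iters` are assumptions chosen for illustration.

```python
import math


def llgc(W, Y, alpha=0.9, iters=200):
    """Label propagation in the LLGC style (illustrative sketch).

    W: symmetric affinity matrix with zero diagonal (list of lists);
       every node is assumed to have at least one edge.
    Y: n x k initial label matrix, one-hot rows for labeled nodes and
       all-zero rows for unlabeled nodes.
    Returns the predicted class index for every node.
    """
    n, k = len(W), len(Y[0])
    deg = [sum(row) for row in W]
    # Symmetric normalization: S = D^(-1/2) W D^(-1/2)
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        # F <- alpha * S F + (1 - alpha) * Y
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(k)]
             for i in range(n)]
    return [max(range(k), key=lambda c: F[i][c]) for i in range(n)]


# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
W = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    W[i][j] = W[j][i] = 1.0

# Only node 0 (class 0) and node 5 (class 1) are labeled.
Y = [[0.0, 0.0] for _ in range(6)]
Y[0][0] = 1.0
Y[5][1] = 1.0

labels = llgc(W, Y)
print(labels)
```

In this toy setting each triangle inherits the label of its single labeled node, which shows how strongly the result depends on which nodes are labeled.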

15:06 Rough Set Subspace Error-Correcting Output Codes
Mohammad Ali Bagheri, Qigang Gao, and Sergio Escalera

Among the methods proposed to deal with multi-class classification problems, Error-Correcting Output Codes (ECOC) represent a powerful framework. The key factor in designing any ECOC matrix is the independence of the binary classifiers, without which the ECOC method would be ineffective. This paper proposes an efficient new approach to the ECOC framework in order to improve independence among classifiers. The underlying rationale for our work is that we design a three-dimensional codematrix, where the third dimension is the feature space of the problem domain. Using rough set-based feature selection, a new algorithm named "Rough Set Subspace ECOC" (RSS-ECOC) is proposed. We introduce the Quick Multiple Reduct algorithm in order to generate a set of reducts for a binary problem, where each reduct is used to train a dichotomizer. In addition to creating more independent classifiers, this approach allows ECOC matrices with longer codes to be built. The numerical experiments in this study compare the classification accuracy of the proposed RSS-ECOC with classical ECOC, one-versus-one, and one-versus-all methods on 24 UCI datasets. The results show that the proposed technique increases classification accuracy in comparison with state-of-the-art coding methods.

15:18 Metric Learning From Relative Comparisons by Minimizing Squared Residual
Eric Yi Liu, Zhishan Guo, Xiang Zhang, Vladimir Jojic, and Wei Wang

Recent studies [1]-[5] have suggested using constraints in the form of relative distance comparisons to represent domain knowledge: d(a,b) < d(c,d), where d(·,·) is the distance function and a, b, c, d are data objects. Such constraints are readily available in many problems where pairwise constraints are not natural to obtain. In this paper, we consider the problem of learning a Mahalanobis distance metric from supervision in the form of relative distance comparisons. We propose a simple yet effective algorithm that minimizes a convex objective function corresponding to the sum of squared residuals of constraints. We also extend our model and algorithm to promote sparsity in the learned metric matrix. Experimental results suggest that our method consistently outperforms existing methods in terms of clustering accuracy. Furthermore, the sparsity extension leads to more stable estimation when the dimension is high and only a small amount of supervision is given.
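A relative comparison d(a,b) < d(c,d) under a Mahalanobis metric can be checked as in the sketch below. It shows only how a metric matrix changes which constraints hold; the paper's objective and learning algorithm are not given in this abstract and are not reproduced. The function names, the example points, and the matrix `M_reweighted` are hypothetical, and squared distances are compared because squaring preserves the ordering of non-negative distances.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD matrix M."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    n = len(diff)
    return sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))


def satisfies(a, b, c, d, M):
    """True if the relative comparison d(a, b) < d(c, d) holds under M."""
    return mahalanobis_sq(a, b, M) < mahalanobis_sq(c, d, M)


# Hypothetical data objects: a and b differ along the first feature,
# c and d differ along the second feature.
a, b = (0.0, 0.0), (2.0, 0.0)
c, d = (0.0, 0.0), (0.0, 1.0)

identity = [[1.0, 0.0], [0.0, 1.0]]      # plain Euclidean distance
M_reweighted = [[0.1, 0.0], [0.0, 1.0]]  # down-weights the first feature

print(satisfies(a, b, c, d, identity))      # constraint violated
print(satisfies(a, b, c, d, M_reweighted))  # constraint satisfied
```

Under the Euclidean metric the constraint is violated because a and b are farther apart than c and d; down-weighting the first feature reverses the comparison, which is the kind of adjustment a learned metric makes to agree with the supervision.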