Conference Paper Supervised Machine Learning for Tensor Structured Models with Scaled Latent Trace Norm Regularization

Wimalawarne, Kishan

In machine learning, the structure of the data and the structure of the relationships among learning problems can play an important role. As the popularity of machine learning increases, increasingly complex data structures are becoming available and need to be analysed. In this thesis we study the importance of learning while preserving the structure of data, and better ways of modelling relationships among related learning tasks. We focus on higher-dimensional arrays, or tensors, which are frequently found in many application domains. As with the rank of a matrix, one of the important features of a tensor is its multilinear rank. Estimating and exploiting the multilinear rank would allow us to build good learning models for tensors, especially if the tensor is low rank. Yet, compared to matrices, exploiting the low-rankness of a tensor is difficult due to the high-dimensional structure of tensors. To understand their limitations, we look into existing low-rank-inducing tensor norms, such as the overlapped trace norm and the latent trace norm, that have previously been used to regularize learning models. We find that both of these norms share a limitation: they do not consider the rank relative to the mode dimensions. We propose a new norm, called the scaled latent trace norm, which explicitly takes the rank relative to the mode dimensions into account when used as a regularizer. The first problem investigated in this thesis is the fundamental question of identifying the optimal way to learn with tensor data. We challenge the common approach of converting data into vectors in order to use ordinary vector-based learning models. Using simple regression and classification models, we demonstrate that learning directly with tensor data, without converting it to vectors, and applying low-rank regularization methods can outperform existing vector-based learning models.
To do this, we extend regression and classification models with different tensor norms: the overlapped trace norm, the latent trace norm and the scaled latent trace norm. Our theoretical analysis, based on excess risk bounds for each of these tensor norms, allows us to infer how the excess risk for each norm is related to the multilinear rank of the weight tensor. We propose to solve these regularized tensor learning problems using the alternating direction method of multipliers, a state-of-the-art optimisation method. Through toy and real-world experiments, we demonstrate that our theoretical results match our experimental results and that learning directly with tensors is better than vectorised learning. The second topic studied in this thesis is multilinear multitask learning. Here we investigate how to arrange multiple related tasks in tensor format so that information sharing among the tasks leads to better performance on the individual tasks. To study multilinear multitask learning in depth, we extend it with latent trace norm and scaled latent trace norm regularization. We derive excess risk bounds to show how the multilinear rank of the task weight tensor is related to the excess risk for each of the tensor norm regularizations. We propose to solve multilinear multitask learning problems using the alternating direction method of multipliers. Through experiments on toy and real-world problems, we show that the scaled latent trace norm tends to give better performance. We believe that the research described in this thesis can lead to further interesting research directions. After stating our conclusions, we outline several possible future investigations that may be of interest to other researchers.
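The quantities named in the abstract can be made concrete with a small sketch. The abstract itself gives no formulas, so what follows assumes the definitions commonly used in the tensor literature: the multilinear rank is the tuple of ranks of the mode-wise unfoldings, and the overlapped trace norm is the sum of the trace (nuclear) norms of those unfoldings. The latent and scaled latent trace norms are defined through an infimum over decompositions of the tensor and are not computed here; the last function only shows the per-mode scaling by the square root of the mode dimension that, under the same assumption, the scaled variant applies. All function names are hypothetical and chosen for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-k unfolding: rows index the chosen mode, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def multilinear_rank(tensor, tol=1e-8):
    """Tuple of matrix ranks of the unfoldings, one entry per mode."""
    return tuple(
        int(np.linalg.matrix_rank(unfold(tensor, k), tol=tol))
        for k in range(tensor.ndim)
    )


def overlapped_trace_norm(tensor):
    """Sum over modes of the nuclear norm of each unfolding (assumed definition)."""
    return sum(
        np.linalg.norm(unfold(tensor, k), ord="nuc") for k in range(tensor.ndim)
    )


def scaled_mode_trace_norms(tensor):
    """Per-mode nuclear norms divided by sqrt(n_k), where n_k is the mode dimension.

    This is only the scaling ingredient assumed for the scaled latent trace norm,
    not the norm itself, which requires optimising over decompositions.
    """
    return [
        np.linalg.norm(unfold(tensor, k), ord="nuc") / np.sqrt(tensor.shape[k])
        for k in range(tensor.ndim)
    ]


# Example: a tensor that is low rank in its first mode only.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2))
B = rng.standard_normal((2, 4 * 5))
W = (A @ B).reshape(30, 4, 5)

print(multilinear_rank(W))
print(overlapped_trace_norm(W))
print(scaled_mode_trace_norms(W))
```

In the example, the first mode has dimension 30 but rank at most 2, so its rank is small relative to its dimension, which is the situation the abstract says the unscaled norms fail to account for.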
