Supervised Machine Learning for Tensor Structured Models with Scaled Latent Trace Norm Regularization
Wimalawarne, Kishan
In machine learning, the structure of the data and the structure of relationships among learning problems can play an important role. As the popularity of machine learning increases, more and more challenging, complex data structures are becoming available and need to be analysed. In this thesis we study the importance of learning while preserving the structure of the data, and better ways of modelling relationships among related learning tasks.

We focus on higher dimensional arrays, or tensors, which are frequently found in many application domains. As with the rank of a matrix, one of the important features of a tensor is its multilinear rank. Estimating and exploiting the multilinear rank of tensors allows us to build good learning models for tensors, especially if the tensor is low rank. Yet, compared to matrices, exploiting the low rankness of a tensor is difficult due to the high dimensional structure of tensors. To understand their limitations, we look into existing low rank inducing tensor norms that have previously been used to regularize learning models, such as the overlapped trace norm and the latent trace norm. We find that both of these norms share a limitation: they do not consider the rank relative to the mode dimensions. We propose a new norm, called the scaled latent trace norm, which explicitly takes the rank relative to the mode dimensions into account during regularization.

The first problem investigated in this thesis is the fundamental question of identifying the optimal way to learn with tensor data. We challenge the common approach of converting data into vectors in order to use ordinary vector based learning models. Using simple regression and classification models, we demonstrate that by learning directly with tensor data, without converting them to vectors, and by applying low rank regularization methods, we can outperform existing vector based learning models. To do this we extend regression and classification models with different tensor norms: the overlapped trace norm, the latent trace norm and the scaled latent trace norm. Our theoretical analysis, based on excess risk bounds for each of these tensor norms, allows us to infer how the excess risk for each tensor norm is related to the multilinear rank of the weight tensor. We propose to solve these regularized tensor learning problems using the state-of-the-art optimisation method known as the alternating direction method of multipliers. Through toy experiments and real world experiments we demonstrate that our theoretical results match our experimental results and that learning directly with tensors is better than vectorised learning.

The second topic studied in this thesis is multilinear multitask learning. Here we investigate how to arrange multiple related tasks together in tensor format such that information sharing among the related tasks leads to better performance on the individual tasks. In order to study multilinear multitask learning in depth, we extend it with the latent trace norm and the scaled latent trace norm regularizations. We derive excess risk bounds to show how the multilinear rank of the task weight tensor is related to the excess risk for each of the tensor norm regularizations. We propose to solve multilinear multitask learning problems using the alternating direction method of multipliers. Through experiments on toy and real world problems, we show that the scaled latent trace norm is capable of giving better performance.

We believe that the research described in this thesis can lead to further interesting research directions. After stating our conclusions, we outline possible future investigations that may be of interest to other researchers.
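
For readers unfamiliar with the quantities named above, the following is a minimal Python sketch, not code from the thesis, of how the multilinear rank and the overlapped trace norm of a tensor can be computed with NumPy. The function names are hypothetical. The last function assumes the definition commonly used in the literature, in which the scaled latent trace norm is an infimum over decompositions of the tensor and weights the mode-k term by 1/sqrt(n_k); it only evaluates that objective at the trivial decompositions that place the whole tensor in one mode, so it returns an upper bound rather than the norm itself.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-k unfolding: a matrix whose rows are indexed by the chosen mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def multilinear_rank(tensor, rel_tol=1e-10):
    """Tuple of the matrix ranks of every mode-k unfolding."""
    ranks = []
    for k in range(tensor.ndim):
        s = np.linalg.svd(unfold(tensor, k), compute_uv=False)
        # Count singular values that are non-negligible relative to the largest.
        ranks.append(int(np.sum(s > rel_tol * s[0])))
    return tuple(ranks)


def overlapped_trace_norm(tensor):
    """Sum of the trace (nuclear) norms of all mode-k unfoldings."""
    return float(sum(np.linalg.norm(unfold(tensor, k), ord="nuc")
                     for k in range(tensor.ndim)))


def scaled_latent_upper_bound(tensor):
    """Upper bound on the scaled latent trace norm (assumed 1/sqrt(n_k) weights).

    Evaluates only the trivial decompositions that assign the whole tensor
    to a single mode, so the true infimum can be smaller.
    """
    per_mode = [np.linalg.norm(unfold(tensor, k), ord="nuc") / np.sqrt(tensor.shape[k])
                for k in range(tensor.ndim)]
    return float(min(per_mode)), [float(v) for v in per_mode]


# Build a 4 x 5 x 6 tensor from a random 2 x 3 x 3 core and random factor
# matrices, so that its multilinear rank is (2, 3, 3) for generic draws.
rng = np.random.default_rng(0)
core = rng.standard_normal((2, 3, 3))
U1, U2, U3 = (rng.standard_normal((n, r)) for n, r in [(4, 2), (5, 3), (6, 3)])
W = np.einsum("abc,ia,jb,kc->ijk", core, U1, U2, U3)

print("multilinear rank:", multilinear_rank(W))
print("overlapped trace norm:", overlapped_trace_norm(W))
print("scaled latent upper bound:", scaled_latent_upper_bound(W)[0])
```

Dividing each mode's trace norm by sqrt(n_k) is what lets a mode with low rank relative to its dimension stand out, which is the property the abstract attributes to the scaled latent trace norm.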