Doctor of Philosophy (PhD)


Computer Science

Document Type



The performance of deep learning methods depends heavily on the quality of data representations. A simple model that exploits a better data representation can outperform a complicated one. However, obtaining good data representations is not straightforward and depends on the application area. In some scenarios, such as multiple instance learning (MIL), objects have multiple representations available but lack a proper way to utilize them. In other problems, such as few-shot learning (FSL), it is inherently difficult to find the most representative features to facilitate model learning. In certain cases, such as tumor progression prediction, multiple complementary inputs with different characteristics and dimensions should be integrated into a better representation. Three novel methods for learning better representations are proposed for these problems. For MIL, the problem of predicting Twitter users’ demographics from their tweets was considered. Each user was a labeled object, and that user's tweets were the instances. A deep neural network with a neural attention mechanism was introduced. The model can learn the relevance of each individual tweet and make a prediction based on the selected relevant information. Experimental results showed that the proposed model outperformed the baseline models. To facilitate FSL, multi-level contrastive learning was proposed, which utilized the lower-level representations from a CNN. To this end, an ensemble method was used, which created an ensemble of models, each taking a representation from a different layer of a CNN as input. Experiments showed that the ensemble achieved new state-of-the-art results. Lastly, tumor progression was predicted using magnetic resonance imaging (MRI), hyperpolarized magnetic resonance imaging (HPMRI), and nuclear magnetic resonance (NMR) techniques, which produced three complementary types of data with different features. To integrate them, separate encoders were built for the different data types. A 3D encoder and an attention module were adopted for the 3D MR images. The raw HPMRI data are time series of 2D signals, which were processed by a 2D encoder and an RNN. A deep neural network consisting of these encoders was constructed to generate representations, and the final tumor progression prediction was made based on them. The experimental results showed that the model predicted tumor progression earlier than other approaches.
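
The attention-based treatment of MIL described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's model: the scoring vector `w`, the toy two-dimensional "tweet" embeddings, and the function names are all hypothetical, and a trained network would learn the scoring parameters rather than fix them by hand. The sketch only shows the general idea of scoring each instance, normalizing the scores with a softmax, and forming one bag-level (user-level) representation as the weighted sum of instances.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(instances, w):
    """Pool a bag of instance vectors into one vector.

    instances: list of equal-length feature vectors (one per tweet).
    w: hypothetical scoring vector; in a real model this is learned.
    Returns (bag_vector, attention_weights).
    """
    # Relevance score of each instance: dot product with w.
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in instances]
    weights = softmax(scores)
    dim = len(instances[0])
    # Bag representation: attention-weighted sum of the instances.
    bag = [sum(a * x[j] for a, x in zip(weights, instances)) for j in range(dim)]
    return bag, weights

# Toy bag: three made-up 2-d "tweet" embeddings for one user.
tweets = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w = [2.0, -1.0]  # assumed scoring vector, for illustration only
bag_vec, attn = attention_pool(tweets, w)
print(attn, bag_vec)
```

In this toy setting the first instance receives the largest weight because it aligns best with `w`; a classifier would then operate on `bag_vec` rather than on the individual instances.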

Committee Chair

Zhang, Jian

Available for download on Thursday, May 10, 2029