Abstract
Feature extraction algorithms are an important class of unsupervised methods used to reduce data dimensionality. They have been applied extensively to time-of-flight secondary ion mass spectrometry (ToF-SIMS) imaging, most commonly in the form of matrix factorization (MF) techniques such as principal component analysis. A limitation of MF is its assumption of linearity, which is generally not accurate for ToF-SIMS data. Recently, nonlinear autoencoders have been shown to outperform MF techniques for ToF-SIMS image feature extraction. However, most feature extraction methods (including autoencoders) share another limitation that is particularly important for hyperspectral data: they do not consider spatial information. To address this limitation, we describe the application of the convolutional autoencoder (CNNAE) to hyperspectral ToF-SIMS imaging data. The CNNAE is an artificial neural network developed specifically for hyperspectral data that uses convolutional layers for image encoding, thereby explicitly incorporating pixel neighborhood information. We compared the performance of the CNNAE with that of other common feature extraction algorithms on two biological ToF-SIMS imaging data sets. We investigated the extracted features and used the dimensionality-reduced data to train additional machine learning (ML) algorithms. By converting the two-dimensional convolutional layers to three-dimensional (3D) ones, we also showed how the CNNAE can be extended to 3D ToF-SIMS images. In general, the CNNAE produced features with significantly higher contrast and autocorrelation than the other techniques. Furthermore, histologically recognizable features in the data were represented more accurately. The extension of the CNNAE to 3D data also provided an important proof of principle for the analysis of more complex 3D data sets.
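The difference between per-pixel encoding and convolutional encoding can be illustrated with a minimal sketch. The code below is not the CNNAE described here: it is a single linear filter with zero padding, written in plain Python, and the function name `encode`, the kernel weights, and the toy 3 x 3 image with two mass channels are all assumptions chosen for illustration. A real convolutional autoencoder would stack many such filters with nonlinear activations and pair the encoder with a decoder.

```python
def encode(image, kernel):
    """Apply one linear filter across a hyperspectral image (zero padding).

    image:  H x W x C nested lists (C = number of mass channels per pixel)
    kernel: k x k x C nested lists (k odd); k = 1 gives a per-pixel encoder
    returns an H x W map holding one latent feature per pixel
    """
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            # Sum over the spatial neighborhood and over all channels.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        spectrum = image[yy][xx]
                        weights = kernel[dy + radius][dx + radius]
                        total += sum(w * s for w, s in zip(weights, spectrum))
            out[y][x] = total
    return out


def make_image(neighbor_spectrum):
    """Toy 3 x 3 image: fixed center spectrum, configurable surroundings."""
    img = [[list(neighbor_spectrum) for _ in range(3)] for _ in range(3)]
    img[1][1] = [4.0, 2.0]
    return img


# Hypothetical weights: a 1x1 (per-pixel) kernel and a 3x3 averaging kernel.
pixel_kernel = [[[1.0, 0.5]]]
conv_kernel = [[[1.0 / 9, 0.5 / 9] for _ in range(3)] for _ in range(3)]

image_a = make_image([0.0, 0.0])  # center pixel surrounded by empty pixels
image_b = make_image([4.0, 2.0])  # center pixel surrounded by similar pixels

# The per-pixel encoder sees only the center spectrum.
pixel_a = encode(image_a, pixel_kernel)[1][1]
pixel_b = encode(image_b, pixel_kernel)[1][1]

# The convolutional encoder also sees the neighborhood.
conv_a = encode(image_a, conv_kernel)[1][1]
conv_b = encode(image_b, conv_kernel)[1][1]

print(pixel_a, pixel_b)
print(conv_a, conv_b)
```

In this toy case the per-pixel encoder assigns the same value to the center pixel of both images, because the center spectra are identical, whereas the 3 x 3 filter assigns different values because the surrounding pixels differ. Extending the same idea to a 3D image would amount to adding a depth axis to both the image and the kernel.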