(The associated paper can be found here)
Neural models achieve state-of-the-art accuracies on various Machine Learning, Computer Vision and Natural Language Processing tasks. At the core of these approaches lie rich, dense and expressive representations. While these representations are able to capture the underlying complexity, they are far from interpretable. It is hard to understand what the raw values in a dense representation signify. Interpretability in a neural network pipeline would not just help us reason about the outcomes the models predict, but would also provide cues on how to design them better. Hence, we tackle the following problem:
Given any learnt representation, transform it so that it is interpretable while maintaining performance on downstream tasks.
There has been a recent surge of interest in making models interpretable and explainable, and hence there have been various attempts to make representations interpretable. It has been observed that sparse and positive representations are often more interpretable. We exploit these findings and devise a novel extension of the k-sparse autoencoder that is able to enforce stricter sparsity constraints. The neural autoencoder is highly expressive and facilitates non-linear transformations, in contrast to existing linear matrix-factorization-based approaches. Furthermore, our formulation allows for seamless integration into a general neural network pipeline.
<img data-action="zoom" src="/projects/autoencoder.png" style="width:50%;" />
<figcaption> A k-sparse autoencoder. For an input X, an autoencoder attempts to construct an output X' at its output layer that is close to X. In a k-sparse autoencoder, only a few hidden units are active for any given input (denoted by the colored units in the figure).</figcaption>
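To make the mechanism in the figure concrete, here is a minimal NumPy sketch of one forward pass of a k-sparse autoencoder. This is an illustrative toy, not the paper's implementation: the weight matrices, dimensions, and the ReLU choice are assumptions, and the training loss and the paper's stricter sparsity penalties are omitted. The key step is zeroing all but the k largest hidden activations per example before reconstruction.

```python
import numpy as np

def k_sparse_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """One forward pass of a k-sparse autoencoder (illustrative sketch).

    Only the k largest hidden activations are kept for each input row;
    the rest are zeroed out before the decoder reconstructs the input.
    """
    h = np.maximum(0.0, x @ W_enc + b_enc)   # ReLU keeps activations non-negative
    # indices of the (hidden_dim - k) smallest activations in each row
    drop_idx = np.argsort(h, axis=1)[:, :-k]
    h_sparse = h.copy()
    np.put_along_axis(h_sparse, drop_idx, 0.0, axis=1)  # zero all but top-k
    x_hat = h_sparse @ W_dec + b_dec          # linear reconstruction of x
    return h_sparse, x_hat

# Toy example with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
d, hidden_dim, k = 8, 16, 3
x = rng.normal(size=(4, d))
W_enc = 0.1 * rng.normal(size=(d, hidden_dim))
W_dec = 0.1 * rng.normal(size=(hidden_dim, d))
h, x_hat = k_sparse_forward(x, W_enc, np.zeros(hidden_dim),
                            W_dec, np.zeros(d), k)
```

After the pass, each row of `h` has at most `k` non-zero (and non-negative) entries, which is exactly the sparsity and positivity that make the hidden dimensions easier to interpret.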
Before diving into the details, here is a preview of some key results. In short, we find that our formulation results in embeddings that are highly interpretable (see Table 1) and also perform competitively on a suite of downstream tasks.
Through large-scale human evaluation, we report that our resulting word embeddings are much more interpretable than the original GloVe and word2vec embeddings, as well as prior work [1]. Moreover, our embeddings outperform existing popular word embeddings on a diverse suite of benchmark downstream tasks. Curious about the details? Check out our paper, accepted at AAAI 2018, here.
[1]: Faruqui, Manaal, et al. “Sparse overcomplete word vector representations.” arXiv preprint arXiv:1506.02004 (2015).
