A Joint Learning Approach for Semi-supervised Neural Topic Modeling
Jeffrey Chiu*, Rajat Mittal*, Neehal Tumma*, Abhishek Sharma, and Finale Doshi-Velez
In Proceedings of the Sixth Workshop on Structured Prediction for NLP, 2022
Oral Presentation
Topic models are some of the most popular ways to represent textual data in an interpretable manner. Recently, advances in deep generative models, specifically auto-encoding variational Bayes (AEVB), have led to the introduction of unsupervised neural topic models, which leverage deep generative models as opposed to traditional statistics-based topic models. We extend these neural topic models by introducing the Label-Indexed Neural Topic Model (LI-NTM), which is, to the best of our knowledge, the first effective upstream semi-supervised neural topic model. We find that LI-NTM outperforms existing neural topic models in document reconstruction benchmarks, with the most notable results in low-labeled-data regimes and for datasets with informative labels; furthermore, our jointly learned classifier outperforms baseline classifiers in ablation studies.