Topic-Transformer for Document-Level Language Understanding
Oumaima Hourrane and El Habib Benlahmar
- Hassan II University, Morocco
Abstract
Most natural language processing applications are framed as prediction problems over a limited context, typically a single sentence or paragraph, which does not reflect how humans perceive natural language. When reading a text, humans draw on much more context, such as the rest of the document or other relevant documents. This study focuses on capturing syntax and global semantics from a text simultaneously, thereby acquiring document-level understanding. Accordingly, we introduce a Topic-Transformer that combines the benefits of a neural topic model, which captures global semantic information, and a transformer-based language model, which captures the local structure of texts both semantically and syntactically. Experiments on various datasets confirm that our model achieves lower perplexity than a standard transformer architecture and recent topic-guided language models, and generates topics that are more coherent than those of a regular Latent Dirichlet Allocation (LDA) topic model.
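The abstract describes the architecture only at a high level; the following is a minimal PyTorch sketch of the general idea, assuming a VAE-style neural topic model whose inferred document-topic vector is added to the transformer's hidden states before the vocabulary projection. All class and parameter names here (`NeuralTopicModel`, `TopicTransformerLM`, `topic_proj`, the joint loss) are hypothetical illustrations of the technique, not the authors' implementation.

```python
# Hypothetical sketch: a topic-guided transformer language model.
# A VAE-style topic model infers per-document topic proportions, and the
# transformer LM is conditioned on that global vector at every position.
import torch
import torch.nn as nn

class NeuralTopicModel(nn.Module):
    """Encodes a document's bag-of-words into K topic proportions (VAE-style)."""
    def __init__(self, vocab_size, num_topics, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, num_topics)
        self.logvar = nn.Linear(hidden, num_topics)

    def forward(self, bow):
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        theta = torch.softmax(z, dim=-1)  # document-topic proportions
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return theta, kl

class TopicTransformerLM(nn.Module):
    """Causal transformer LM whose token logits are biased by the topic vector."""
    def __init__(self, vocab_size, num_topics, d_model=256, nhead=4, nlayers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(512, d_model)  # learned positional embeddings
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, nlayers)
        self.topic_proj = nn.Linear(num_topics, d_model)  # injects global semantics
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, theta):
        seq_len = tokens.size(1)
        positions = torch.arange(seq_len, device=tokens.device)
        # Strictly upper-triangular -inf mask enforces left-to-right attention.
        causal = torch.triu(torch.full((seq_len, seq_len), float('-inf')), diagonal=1)
        h = self.transformer(self.embed(tokens) + self.pos(positions), mask=causal)
        h = h + self.topic_proj(theta).unsqueeze(1)  # add topic vector everywhere
        return self.lm_head(h)

# Joint training objective: LM cross-entropy plus the topic model's KL term.
vocab, K = 10000, 50
ntm, lm = NeuralTopicModel(vocab, K), TopicTransformerLM(vocab, K)
bow = torch.rand(2, vocab)                 # bag-of-words for two documents
tokens = torch.randint(0, vocab, (2, 32))  # token ids for the same documents
theta, kl = ntm(bow)
logits = lm(tokens[:, :-1], theta)         # predict each next token
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1)) + kl
```

Injecting the topic vector once at the output layer is only one of several plausible conditioning points; adding it to the input embeddings or to every layer's hidden states are equally reasonable variants under these assumptions.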
DOI: https://doi.org/10.3844/jcssp.2022.18.25
Copyright: © 2022 Oumaima Hourrane and El Habib Benlahmar. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Neural Topic Model
- Neural Language Model
- Topic-Guided Language Model
- Document-Level Understanding
- Long-Range Semantic Dependencies