Abstract
Augmenting language models (LMs) with structured knowledge graphs (KGs) aims to leverage structured world knowledge to improve the ability of LMs to complete knowledge-intensive tasks. However, existing methods cannot effectively use the structured knowledge in a KG because they fail to capture the rich relational semantics of knowledge triplets. Moreover, the modality gap between natural language text and KGs is a major obstacle to aligning and fusing cross-modal information. To address these challenges, we propose a novel knowledge-augmented question answering (QA) model, namely, Graph Reasoning Transformers (GRT). Unlike conventional node-level methods, the GRT treats knowledge triplets as atomic units of knowledge and uses a triplet-level graph encoder to capture triplet-level graph features. Furthermore, to alleviate the negative effect of the modality gap on joint reasoning, we propose representation alignment pretraining to align the cross-modal representations, and we introduce a cross-modal information fusion module with attention bias to fuse information across the two modalities. Extensive experiments on three knowledge-intensive QA benchmarks show that the GRT outperforms state-of-the-art KG-augmented QA systems, demonstrating the effectiveness and adaptability of the proposed model. Copyright © 2024 Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Original language | English |
---|---|
Title of host publication | Proceedings of The Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24) |
Place of Publication | USA |
Publisher | AAAI Press |
Pages | 19652-19660 |
ISBN (Print) | 9781577358879 |
Publication status | Published - 2024 |