Abstract
A critical and general problem of large language models (LLMs) is that they may hallucinate, generating specious answers, especially when they interpret concepts in natural language queries using incorrect (sub)domain knowledge. Nevertheless, LLMs are widely used. To embrace LLMs in education, students can use them to obtain quick feedback on their questions and strengthen their inquisitive approaches to self-learning, provided there is a guard against hallucination. This paper introduces DomainProbe, the first approach to domain-level hallucination detection that leverages metamorphic testing to address the test oracle problem and improve the trustworthiness of LLM-generated feedback. Given a question posed by a student, DomainProbe prompts the LLM to extract key topical terms from the question and to explain each of them. The student then checks whether any term-explanation pair is inconsistent. If such an inconsistency is identified, the LLM's answer to the question is flagged as untrustworthy. We show that DomainProbe achieves promising results on MMLU, a widely used question-answering benchmark. We further discuss our vision for the approach in promoting students' learning objectives and outline future work on the metamorphic relations formulated in DomainProbe. Copyright © 2025 IEEE.
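The probe-then-trust workflow the abstract describes can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: `query_llm`, the prompt strings, and the canned responses are all stand-ins for a real LLM API call, and the student's judgment is modeled as a set of terms they marked inconsistent.

```python
from dataclasses import dataclass


def query_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns canned responses so the
    control flow can be exercised end to end."""
    canned = {
        "extract": "entropy; thermodynamics",
        "explain:entropy": "A measure of disorder in a system.",
        "explain:thermodynamics": "The study of heat and energy transfer.",
    }
    return canned.get(prompt, "")


@dataclass
class Probe:
    term: str
    explanation: str


def probe_question(question: str) -> list[Probe]:
    """Ask the LLM for key topical terms, then an explanation of each."""
    terms = [t.strip() for t in query_llm("extract").split(";")]
    return [Probe(t, query_llm(f"explain:{t}")) for t in terms]


def is_trustworthy(probes: list[Probe], inconsistent_terms: set[str]) -> bool:
    """Any term-explanation pair the student flags as inconsistent
    renders the LLM's answer to the question untrustworthy."""
    return not any(p.term in inconsistent_terms for p in probes)
```

For example, `is_trustworthy(probe_question("Why does entropy increase?"), set())` returns `True`, while marking `{"entropy"}` as inconsistent flips the verdict to `False`.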
| Original language | English |
|---|---|
| Title of host publication | Proceedings of 2025 International Symposium on Educational Technology, ISET 2025 |
| Editors | Kwok Tai CHUI, Chaiporn JAIKAEO, Jitti NIRAMITRANON, Wattana KAEWMANEE, Kwan-Keung NG, Pornthipa ONGKUNARUK |
| Place of Publication | Danvers, MA |
| Publisher | IEEE |
| Pages | 190-195 |
| ISBN (Electronic) | 9798331595500 |
| DOIs | https://doi.org/10.1109/ISET65607.2025.00046 |
| Publication status | Published - 2025 |
Citation
Wei, Z., Lee, V. C. S., & Chan, W. K. (2025). A pilot study of probing before trusting large language models in self-learning. In K. T. Chui, C. Jaikaeo, J. Niramitranon, W. Kaewmanee, K.-K. Ng, & P. Ongkunaruk (Eds.), Proceedings of 2025 International Symposium on Educational Technology, ISET 2025 (pp. 190-195). IEEE. https://doi.org/10.1109/ISET65607.2025.00046
Keywords
- AI hallucination detection
- Metamorphic testing
- Test oracle
- Educational technology
- Generative AI