FCCA: Hybrid code representation for functional clone detection using attention networks

Wei HUA, Yulei SUI, Yao WAN, Guangzhong LIU, Guandong XU

Research output: Contribution to journal › Article › peer-review

71 Citations (Scopus)

Abstract

Code cloning, which reuses a fragment of source code via copy-and-paste with or without modifications, is a common way for code reuse and software prototyping. However, the duplicated code fragments often affect software quality, resulting in high maintenance cost. The existing clone detectors using shallow textual or syntactical features to identify code similarity are still ineffective in accurately finding sophisticated functional code clones in real-world code bases. This article proposes functional code clone detector using attention (FCCA), a deep-learning-based code clone detection approach on top of a hybrid code representation by preserving multiple code features, including unstructured (code in the form of sequential tokens) and structured (code in the form of abstract syntax trees and control-flow graphs) information. Multiple code features are fused into a hybrid representation, which is equipped with an attention mechanism that pays attention to important code parts and features that contribute to the final detection accuracy. We have implemented and evaluated FCCA using 275 777 real-world code clone pairs written in Java. The experimental results show that FCCA outperforms several state-of-the-art approaches for detecting functional code clones in terms of accuracy, recall, and F1 score. Copyright © 2020 IEEE. 
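The attention-based fusion summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `attention_fuse` and `cosine`, the fixed `query` vector, and all embedding values are hypothetical, and in a trained model both the embeddings and the attention parameters would be learned. The sketch assumes each code fragment already has three equal-length feature vectors (tokens, abstract syntax tree, control-flow graph), weights them with a softmax over dot-product scores, and compares the two fused vectors by cosine similarity.

```python
import math

def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_fuse(features, query):
    """Fuse equal-length feature vectors into one vector.

    Each feature vector is scored against `query` (a stand-in for a
    learned attention parameter); the softmax of the scores gives the
    weights of the weighted sum.
    """
    weights = softmax([dot(f, query) for f in features])
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical values: one vector each for tokens, AST, and CFG per fragment.
query = [0.5, -0.2, 0.1]
frag_a = [[0.9, 0.1, 0.3], [0.8, 0.2, 0.4], [0.7, 0.0, 0.5]]
frag_b = [[0.1, 0.9, 0.3], [0.7, 0.3, 0.4], [0.6, 0.1, 0.5]]

fused_a, weights_a = attention_fuse(frag_a, query)
fused_b, weights_b = attention_fuse(frag_b, query)
similarity = cosine(fused_a, fused_b)
print(weights_a, similarity)
```

A clone detector built on this idea would compare `similarity` against a threshold, or pass the fused vectors to a classifier; which of these the article uses is not stated in the abstract.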

Original language: English
Pages (from-to): 304-318
Journal: IEEE Transactions on Reliability
Volume: 70
Issue number: 1
Early online date: Jul 2020
DOIs: https://doi.org/10.1109/TR.2020.3001918
Publication status: Published - Mar 2021

Citation

Hua, W., Sui, Y., Wan, Y., Liu, G., & Xu, G. (2021). FCCA: Hybrid code representation for functional clone detection using attention networks. IEEE Transactions on Reliability, 70(1), 304-318. https://doi.org/10.1109/TR.2020.3001918

Keywords

  • Attention mechanism
  • Code clone detection
  • Code representation
  • Deep neural network (DNN)
