NaturalCC: An open-source toolkit for code intelligence

Yao WAN, Yang HE, Zhangqian BI, Jianguo ZHANG, Yulei SUI, Hongyu ZHANG, Kazuma HASHIMOTO, Hai JIN, Guandong XU, Caiming XIONG, Philip S. YU

Research output: Chapter in Book/Report/Conference proceeding › Chapters

6 Citations (Scopus)

Abstract

We present NaturalCC, an efficient and extensible open-source toolkit for machine-learning-based source code analysis (i.e., code intelligence). Using NaturalCC, researchers can conduct rapid prototyping, reproduce state-of-the-art models, and/or exercise their own algorithms. NaturalCC is built upon Fairseq and PyTorch, providing (1) a collection of code corpus with preprocessing scripts, (2) a modular and extensible framework that makes it easy to reproduce and implement a code intelligence model, and (3) a benchmark of state-of-the-art models. Furthermore, we demonstrate the usability of our toolkit over a variety of tasks (e.g., code summarization, code retrieval, and code completion) through a graphical user interface. The website of this project is http://xcodemind.github.io, where the source code and demonstration video can be found.

Original language: English
Title of host publication: Proceedings of 2022 ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, ICSE-Companion 2022
Publisher: IEEE
Pages: 149-153
ISBN (Electronic): 9781665495981
DOIs: https://doi.org/10.1145/3510454.3516863
Publication status: Published - 2022

Citation

Wan, Y., He, Y., Bi, Z., Zhang, J., Sui, Y., Zhang, H., Hashimoto, K., Jin, H., Xu, G., Xiong, C., & Yu, P. S. (2022). NaturalCC: An open-source toolkit for code intelligence. In Proceedings of 2022 ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, ICSE-Companion 2022 (pp. 149-153). IEEE. https://doi.org/10.1145/3510454.3516863

Keywords

  • Code intelligence
  • Deep learning
  • Code representation
  • Code embedding
  • Open source
  • Toolkit
  • Benchmark
