Code Summarization: Generating Summary of Code Snippets

Authors

Shreya. R. Mehta
Department of Computer Engineering, College of Engineering, Pune, India.
Sneha. S. Patil
Department of Computer Engineering, College of Engineering, Pune, India.
Nikita. S. Shirguppi
Department of Computer Engineering, College of Engineering, Pune, India.
Vahida Attar
Head of Computer and IT Department, College of Engineering, Pune, India.

Synopsis

Source code summarization is the task of generating understandable natural language summaries from a given code snippet. Many companies need precise, good-quality source code summaries for a plethora of reasons: training newly joined employees, understanding in brief what a newly imported project does, maintaining precise summaries of how the source code evolves (using git history), categorizing code, retrieving code, automatically generating documentation, and so on. Source code differs considerably from natural language because it is structured: it has loops, conditions, structures, classes, and so on. Most existing models follow an encoder-decoder structure. We propose an alternative approach that uses the UAST (Universal Abstract Syntax Tree) of the source code to generate tokens and then applies the Transformer model, whose self-attention mechanism, unlike RNN-based methods, helps capture long-range dependencies. We consider Java code snippets for generating code summaries.
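
The claim about long-range dependencies can be illustrated with a minimal sketch of scaled dot-product self-attention. This is not the authors' model: the token names and two-dimensional embeddings are hypothetical, and the learned query, key, and value projections of a real Transformer are replaced by the identity for brevity. The point it shows is that every token scores every other token directly, so the distance between two tokens in the sequence does not affect how strongly they can attend to each other.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Assumption for illustration: Q = K = V = the input vectors.
    A real Transformer learns separate projection matrices.
    """
    d = len(vectors[0])
    outputs, weights = [], []
    for q in vectors:
        # Score this query against every key, regardless of position.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted average of all value vectors.
        outputs.append([sum(wj * v[i] for wj, v in zip(w, vectors)) for i in range(d)])
    return outputs, weights

# Hypothetical tokens from a syntax tree, with made-up 2-d embeddings.
tokens = ["MethodDeclaration", "Identifier", "ReturnStatement", "Identifier"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 1.0]]

outputs, weights = self_attention(embeddings)
for tok, row in zip(tokens, weights):
    print(tok, [round(w, 3) for w in row])
```

In this toy input the two `Identifier` tokens share an embedding, so the first one gives its distant twin the same attention weight it gives itself. An RNN, by contrast, would have to carry that information through every intermediate step.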

WREC21
Published
September 22, 2021
Online ISSN
2582-3922