This comparative case study set out to determine which of two model families, Gated Graph Neural Networks (GGNN) or Relational Graph Convolutional Networks (RGCN), performs better on the Variable Misuse task: predicting which variable belongs at a given location in the code, chosen from all variables of the same type in the enclosing scope. The two models were compared by test accuracy in three experiments, using source code downloaded from the top 25 trending C# repositories on GitHub. The experiments trained and tested each model on all of the repositories together, on an esoteric repository, and on a popular repository, to determine which model performed better across different kinds of source code.
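To make the task and the accuracy metric concrete, the following is a minimal Python sketch. It is not the project's pipeline: the `Variable` and `Slot` types, the example scope, and the `first_candidate` baseline are all hypothetical names introduced for illustration, and the baseline merely stands in for a trained GGNN or RGCN model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Variable:
    name: str
    type_name: str


@dataclass(frozen=True)
class Slot:
    """A masked variable use; the model must pick the variable that belongs here."""
    expected_type: str
    in_scope: Tuple[Variable, ...]
    correct: str  # ground-truth variable name


def candidates(slot: Slot) -> List[str]:
    # Only in-scope variables whose type matches the slot are eligible.
    return [v.name for v in slot.in_scope if v.type_name == slot.expected_type]


def accuracy(slots: List[Slot], predict: Callable[[Slot], str]) -> float:
    # Fraction of slots for which the predicted variable is the correct one.
    hits = sum(predict(s) == s.correct for s in slots)
    return hits / len(slots)


def first_candidate(slot: Slot) -> str:
    # Hypothetical baseline standing in for a trained model (assumption).
    return candidates(slot)[0]


# Hypothetical scope: two ints and a string are visible at both slots.
scope = (
    Variable("start", "int"),
    Variable("end", "int"),
    Variable("label", "string"),
)
slots = [
    Slot("int", scope, correct="start"),
    Slot("int", scope, correct="end"),
]

print(candidates(slots[0]))
print(accuracy(slots, first_candidate))
```

Under this framing, comparing two models amounts to calling `accuracy` with each model's prediction function on the same held-out slots.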
The overarching goal of the project was to discover which model generalizes and performs better in static analysis tooling, a space that is typically rule-based, by applying the representational power of deep learning to more advanced problems. The data science concepts related to this capstone are machine learning (specifically deep learning), hyperparameter optimization, visualization techniques, graph theory, and natural language processing. Putting together the final report also drew on writing techniques for presenting the data in a simple yet compelling manner.
The results showed that the Relational Graph Convolutional Network outperformed the Gated Graph Neural Network in all three experiments, although the margin was within 5%. Training on more data resulted in higher test accuracy for both models and a smaller difference between the two.