Abstract

Software quality is strongly correlated with the quantity and severity of bugs. While a variety of tools, techniques, and practices exist to aid the production of robust and resilient code, low-quality software drains trillions of dollars from organizations annually, and debugging and fixing coding errors consumes upwards of half of developer labor. To say this situation is untenable is an understatement. Fortunately, automated software repair offers a possible solution. The literature on automated code fixing has been expanding, with implementations ranging from genetic programming and code translation to various machine learning algorithms. All report positive results; however, there has not yet been a dedicated effort to measure the extent to which these implementations generalize. Accordingly, we sought to replicate a prominent study in the field in two parts. The first part consisted of replicating the training of the machine learning model using the source study's materials. Training initially proved impossible due to package dependencies and missing package files; however, we were able to replicate the self-repair evaluation, and its results were identical to those of the source study. Later, using a Docker Compose file obtained from the original authors, we were able to replicate BIFI model training and again match the reported outcomes. Based on these replication outcomes, we offer forward-looking recommendations and ideas for future work.
