Productivity estimation for development teams based on Git metadata
Supervisors: Moreno Colombo, Edy Portmann
Student: Stefan Stojkovski
Project status: Finished
Year: 2022
Developer productivity is crucial to getting a product to market on time. Many software companies try to improve the productivity of their developers by setting goals and tracking metrics over a specific timeframe. However, to date there is no established way to set such standards across the industry, and goals are usually defined by standards internal to each company. This paper specifies productivity benchmark metrics that could be used as a standard of comparison across the industry. The productivity metrics are based on software commits, which play an important role in collaborative software development teams. In this work, we conducted an empirical analysis of commits to examine different committing behaviors among authors. We compare the results and derive productivity benchmark metrics using statistics from across the software development industry. The benchmark metrics are based on a dataset that contains 3.5M commits, 47’318 unique authors with 828’990 unique author-days, 9’609 development days, and 5’751 repositories from 25 systems. The commits in the dataset span from early 1990 to early 2022 and cover 10 different programming languages. Using commit analysis, we are able to distinguish good from bad commit behavior. We benchmarked commit efficiency and the commit/merge ratio across the industry. The final result is presented in a web application that contains the benchmark metrics, together with different charts that visualize the commit history. A survey and a hands-on evaluation were conducted with architects from the analyzed systems, and the results showed that the system is easy to understand and correct. We show that commit benchmarks help the industry draw the boundary between good and bad commit behavior.
Keywords: Developer productivity, Benchmark, Software metrics, Data mining, Git
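As an illustration of the kind of counts the abstract reports (unique authors, author-days, development days, and a commit/merge ratio), the following is a minimal sketch and not the method used in the project. The `Commit` record, the `summarize` function, and the sample data are hypothetical. The sketch also assumes that an author-day is a distinct (author, calendar day) pair, that a development day is any day with at least one commit, and that the commit/merge ratio is the number of non-merge commits divided by the number of merge commits; the abstract does not define these terms. It relies only on the fact that a Git merge commit has more than one parent.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Commit:
    """Hypothetical record holding the Git metadata used below."""
    author: str
    day: date
    parents: int  # number of parent commits; more than one means a merge


def summarize(commits):
    """Count authors, author-days, development days, and a commit/merge ratio.

    The definitions here are assumptions made for illustration only.
    """
    authors = {c.author for c in commits}
    author_days = {(c.author, c.day) for c in commits}
    development_days = {c.day for c in commits}
    merges = sum(1 for c in commits if c.parents > 1)
    regular = len(commits) - merges
    return {
        "commits": len(commits),
        "unique_authors": len(authors),
        "author_days": len(author_days),
        "development_days": len(development_days),
        # Undefined when there are no merge commits at all.
        "commit_merge_ratio": regular / merges if merges else None,
    }


# Made-up sample data, not taken from the dataset described above.
sample = [
    Commit("alice", date(2022, 1, 3), 1),
    Commit("alice", date(2022, 1, 3), 1),
    Commit("bob", date(2022, 1, 3), 1),
    Commit("bob", date(2022, 1, 4), 2),  # merge commit
    Commit("alice", date(2022, 1, 5), 1),
]

summary = summarize(sample)
print(summary)
```

In practice, records like these could be extracted from a repository with something along the lines of `git log --pretty=format:'%an|%ad|%P' --date=short`, where `%P` lists the parent hashes of each commit; the parsing step is left out of the sketch.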