Work Practices and Challenges in Pull-based Development – Gousios et al. 2015
In the recent years, we are witnessing that collaborative, lightweight code review is increasingly becoming the default mechanism for integrating changes, in both collocated and distributed development. Effectively, the pull request (in various forms) is becoming the atomic unit of software change.
How do the maintainers of large projects deal with a flood of incoming pull requests? And what should you do if you want to maximise your chances of getting a pull request accepted? Gousios et al. share the results of surveying 749 integrators on projects with active pull requests and analysing the results.
The role of the integrator is crucial. The integrator must act as a guardian for the project’s quality while at the same time keeping several (often, more than ten) contributions “in- flight” through communicating modification requirements to the original contributors. Being a part of a development team, the integrator must facilitate consensus-reaching discussions and timely evaluation of the contributions. In Open Source Software (OSS) projects, the integrator is additionally taxed with enforcing an online discussion etiquette and ensuring the project’s longevity by on-boarding new contributors.
The study looks at five key questions:
- How do integrators use pull-based development in their projects?
- How do integrators decide whether to accept a contribution?
- How do integrators assess the quality of a contribution?
- How do integrators prioritize the application of contributions?
- What key challenges do integrators face when working with the pull development model?
How is pull-based development being used?
80% of integrators use pull requests for doing code reviews, and for resolving issues. Over 50% use them to discuss new features. Two additional use cases come to light through the open comments:
Respondent R635 mentions that they use pull requests in “every commit we make. We have a policy of having every commit, even bumping up version number for next release, coming in on a PR.”. The project has effectively turned pull requests into a meta-version control system, one that only allows reviewed code to be merged. This merging behaviour is also in place within Microsoft  and in the Android project . Another integrator is using pull requests as a time machine mechanism: R521: “Ideally, any change, because using PRs makes it easier to rollback a change if needed”.
Bug fixes are the most common type of pull request (73% of projects surveyed receiving at least one bug fix pull request per week), followed by new features (50%) and refactorings (33%). Large projects receive bug fixes through pull requests almost daily…
75% of projects do explicit code reviews on all incoming contributions. Most integrators use inline code comments in the pull request for this.
… at least 20 integrators reported that multiple developers are required to review a pull request to ensure high quality. Typically this is 2 reviewers, e.g. R481: “We have a rule that at least 2 of the core developers must review the code on all pull requests.”. Rigby and Bird also report a similar finding in Gerrit-based industrial projects.
The vast majority of integrators are using the GitHub web interface to perform the merge. Integrators prefer commit metadata preserving merges.
Factors determining whether a contribution will be accepted
The most important factor leading to acceptance of a contribution is its quality. Quality has many manifestations in our response set; integrators examine the source code quality and code style of incoming code, along with its documentation and granularity: “Code style and whether or not it matches project style. Overall programming practice, lack of hacks and workarounds.” (R32). At a higher level, they also examine the quality of the commit set and whether it adheres to the project conventions for submitting pull requests.
Integrators also look at project fit – is it in line with the goals of the project – and technical fit. There is also an element of timeliness to this: does it fit with the current priorities?
Tested code is another strong signal:
Apart from assessing the quality of contributions using higher level signals, integrators also need to assess whether the contributed code actually works. Initially, integrators treat the existence of testing code in the pull request as a positive signal. Success of test runs by a continuous integration system also reinforces trust in the code: “All tests must pass integration testing on all supported platforms. . . ”(R94).
Contributions that are ultimately rejected are rejected based on technical quality (85%), test failures (55%) and failures to follow project conventions (48%).
Reviewer availability and contributor responsiveness both contribute to the overall time taken to resolve a pull request.
How is the quality of a contribution assessed?
Integrators often relate contribution quality to the quality of the source code it contains. To evaluate source code quality, they mostly examine non-functional characteristics of the changes. Source code that is understandable and elegant , has good documentation and provides clear added value to the project with minimal impact is preferred.
Beyond the source code, clear pull request documentation and well organised commits, “well written commit messages; one commit about a single subsystem — each commit compiles separately,” as well as overall size all influence quality perception.
Testing plays an important role in evaluating submissions. Initially, the very existence of tests in the pull request is perceived as a positive signal…. Moreover, in performance-critical code, performance degradation is frowned upon and in some cases, integrators require proof that performance is not affected by the proposed change, e.g. in R72: “Performance related changes require test data or a test case”.
Most projects (75%) are using CI, but relatively few use any other code quality or metric tools (15%), or code coverage reports (18%). Nearly all of those that do, run them through continuous integration.
How are contributions prioritized?
The first thing that integrators examine is the contribution’s urgency. In case of bug-fixing contributions, the criticality of the fix is the most important feature to prioritize by. Integrators examine at least the following factors to assess criticality: i) the contribution fixes a security issue, ii) the contribution fixes a serious new bug, iii) the contribution fixes a bug that other projects depend upon, and iv) number of issues blocked by the unsolved bug.
For new features, integrators assess whether the feature is user facing / requested or an internal feature.
Several integrators also mentioned that they just examine the type of the contribution before its criticality; it is usually project policy to handle bug fixing contributions before enhancements, as is the case with R446: “Bug fixes first, then new features. Only if all bug fix pull requests are treated.”
Easy to integrate contributions (smaller size, lower complexity) also tend to handled more quickly. In the absence of any other strong signals, FIFO ordering is common:
The pull request age plays an important role in prioritization for integrators. It is interesting to note that many integrators prefer a first-in, first-out treatment of the pull requests before applying other prioritization criteria.
What key challenges do integrators face?
Integrators report challenges maintaining quality and velocity:
…maintaining quality is what most integrators perceive as a serious challenge. As incoming code contributions mostly originate from non-trusted sources, adequate reviewing may be required by integrators familiar with the project area affected by it. Reviewer availability is not guaranteed, especially in projects with no funded developers. Often, integrators have to deal with solutions tuned to a particular contributor requirement or an edge case; asking the contributor to generalize them to fit the project goals is not straightforward. A related issue is feature isolation; contributors submit pull requests that contain multiple features and affect multiple areas of the project.
The volume of incoming requests can be hard to keep up with, and aging pull requests break flow which makes it even harder…
Integrators of popular projects mentioned that the volume of incoming contributions is just too big (e.g. Ruby on Rails receives on average 7 new pull requests per day) consequently, they see triaging and work prioritization as challenges. As requests are kept on the project queue, they age: the project moves ahead in terms of functionality or architecture and then it is difficult to merge them without (real or logical) conflicts. Moreover, it is not straightforward to assess the impact of stale pull requests on the current state of the project or on each other.
Lack of responsiveness on behalf of the contributor also hurts the code review process and consequently flow. The constant context switching between tasks also places a burden on integrators.
Finally, integrators struggle with the social aspects of managing contributions:
Integrators often have to make decisions that affect the social dynamics of the project. Integrators reported that explaining the reasons for rejection is one of the most challenging parts of their job as hurting the contributor’s feelings is something they seek to avoid. As R255 explains: “Telling people that something is wrong without hurting their feelings or giving them an incorrect idea of my intentions.”. Similarly, integrators find that asking for more work from the contributors (e.g. as a result of a code review) can be difficult at times, as they “. . .worry about alienating our valued contributors” (R635). Motivating contributors to keep working on the project, even in the face of rejected contributions, is not easy for integrators either.
A Final word on Quality
Throughout our analysis, the issue of quality evaluation was recurring. The respondents directly linked quality with acceptance while also described maintaining quality as a big challenge. According to integrators, quality emerges from attention to detail; code style, documentation, commit formatting and adherence to project conventions all help to build confidence in the contribution… An open question is how to efficiently automate the quality evaluation for pull requests. While tools that automate the evaluation of many tasks that the developers do to determine quality (e.g. code style analyzers, test coverage, metrics for software quality, impact analysis etc.) do exist, we have seen that developers go little beyond testing and continuous integration.