Code reviews
Code reviews are an effective method for improving software quality. McConnell
(2004) suggests that unit testing finds approximately 25% of defects, function
testing 35%, integration testing 45%, and code review 55-60%. While that
means that none of these methods are good enough on their own, and that they
should be combined, clearly code review is an essential tool here.
Code review also improves the development
process. By reviewing new additions for quality, less technical debt is
accumulated, which helps long-term maintainability of the code. Reviewing lets
developers learn from each other, and spreads knowledge of the code around
the team. It is also a good means of getting new developers up to speed.
The main downside of code reviews is that they take time and effort. In
particular, if someone from outside the project does the reviewing, they’ll
have to learn the code, which is a significant investment. Once up to speed,
the burden is reduced significantly however, and the returns include a much
smaller amount of time spent debugging later.
Approach
It’s important to distinguish between semi-formal code reviews and formal
code inspections. The latter involve “up to six participants and hours of
meetings paging through detailed code printouts” (SMARTBEAR 2016). As this
extra formality does not seem to yield better results, we limit ourselves to
light-weight, informal code reviews.
We haven’t yet decided on how to integrate code reviews into our working
process. While that gets hashed out, here is some general advice from various
sources and experience.
Review everything, nothing is too short or simple
Try to have something else to do, and spread the load throughout your
working day. Don’t review full-time.Don’t review for more than an hour at a time, after that the success rate
drops quite quicklyDon’t review more than 400 lines of code (LOC) at a time, less than 200
LOC is betterTake the time, read carefully, don’t review more than 500 LOC / hour
Prerequisites
Before handing over a change or a set of code for review, the following items
should be there for the reviewer to work with:
- Documentation on what was changed and why (feature, bug, issue #, etc.)
- Comments / annotations by the author on the code itself
- Test cases
Also, before doing a code review, make sure any tools have run that check
the code automatically, e.g. checkers for coding conventions and static
analysis tools, and the test suite. Ideally, these are run as part of the
continuous integration infrastructure.
In all cases, the
goal is to use your brain and your programming experience to figure out how
to make the code better. The lists are intended to be a source of inspiration
and a description of what should be best practices in most circumstances.
Some items on this list may not apply to your project
or programming language, in which case they should be disregarded.
The following items are part of a software quality check, but are better done
by an automated tool than by a human. As such, they’ve been excluded from this
checklist. If tools are not available, they should be checked manually.
- Coding conventions (e.g. PEP 8)
- Test coverage
All code should be level 3 or 4.
- level 2 implies that the features in level 1 are not present, level 4 implies that the features in level 3 are also present
This rubric is based on:
Here is a list of things to consider when looking at the program as a whole,
rather than when looking at an individual file or change.
Documentation
Documentation is a prerequisite for using, developing and reviewing the
program. Here are some things to check for.
- Is there a description of the purpose of the program or library?
- Are detailed requirements listed?
- Are requirements ranked according to MoSCoW?
- Is the use and function of third-party libraries documented?
- Is the structure/architecture of the program documented? (see below)
- Is there an installation manual?
- Is there a user manual?
- Is there documentation on how to contribute?
- Including how to submit changes
- Including how to document your changes
Architecture
These items are mainly important for larger programs, but may still be good
to consider for small ones as well.
- Is the program split up into clearly separated modules?
- Are these modules as small as they can be?
- Is there a clear, hierarchical or layered, dependency structure between
these modules?- If not, functionality should be rearranged, or perhaps heavily
interdependent modules should be combined
- If not, functionality should be rearranged, or perhaps heavily
- Can the design be simplified?
Security
If you’re making software that is accessible to the outside world (e.g. a web
application), then security becomes important. Security issues are defects,
but not all defects are security issues. A security-conscious design can help
mitigate the security impact of defects.
- Which modules deal with user input?
- Which modules generate output?
- Are input and output compartmentalised?
- If not, consider making separate modules that manage all input
and output, so validation can happen in one place
- If not, consider making separate modules that manage all input
- In which modules is untrusted data present?
- The fewer the better
- Is untrusted data compartmentalised?
- Ideally, validate in the input module and pass only
validated data to other parts
- Ideally, validate in the input module and pass only
Legal
“I’m an engineer, not a lawyer!” is an oft-overheard phrase, but being an
engineer doesn’t give you permission to ignore the legal rights of the
creators of the code you’re using. Here are some things to check. When in
doubt, ask your licensing person for advice.
- Are the licenses of all modules/libraries that are used documented?
- Are the requirements set by those licenses fulfilled?
- Are the licenses included where needed?
- Are copyright statements included in the code where needed?
- Are copyright statements included in the documentation where needed?
- Are the licenses of all the parts compatible with each other?
- Is the project license compatible with all libraries?
When you’re checking individual changes (e.g. pull requests) or files, the
code itself becomes the subject of scrutiny. Depending on the language, files
may contain interfaces, classes or other type definitions, and functions. All
these should be checked, as well as the file overall:
- Does this file contain a logical grouping of functionality?
- Is it easy to understand?
- Can any of the code be replaced by library functions?
Interfaces
- Is the interface documented?
- Does the concept it models make sense?
- Can it be split up further? (Interfaces should be as small as possible)
Classes and types
- Is the class documented?
- Does it have a single responsibility? Can it be split?
- If it’s designed to be extended, can it be?
- If it’s not designed to be extended, is it protected against that? (e.g. final declarations)
- If it’s derived from another class, can you substitute an object of this class for one of its parent class(es)?
- Is the class testable?
- Are the dependencies clear and explicit?
- Does it have a small number of dependencies?
- Does it depend on interfaces, rather than on classes?
Function/Method declarations
- Are there comments that describe the intent of the function or method?
- Are input and output documented? Including units?
- Are pre- and postconditions documented?
- Are edge cases and unusual things commented?
Function/Method definitions
- Are edge cases and unusual things commented?
- Is there incomplete code?
- Could this function be split up (is it not too long)?
- Does it work? Perform intended function, logic correct, …
- Is it easy to understand?
- Is there redundant or duplicate code? (DRY)
- Do loops have a set length and do they terminate correctly?
- Can debugging or logging code be removed?
- Can any of the code be replaced by library functions?
Security
- If you’re using a library, do you check errors it returns?
- Are all data inputs checked?
- Are output values checked and encoded properly?
- Are invalid parameters handled correctly?
Tests
- Do unit tests actually test what they are supposed to?
- Is a test framework and/or library used?
Providing feedback
The main purpose of a code review is to find issues or defects in a piece of
code. These issues then need to be communicated back to the developer who
proposed the change, so that they can be fixed. Doing this badly can quickly
spoil everyone’s fun.
Perhaps the most important point in this guide therefore is that the goal of a
code review is not to provide criticism of a piece of code, or even worse,
the person who wrote it. The goal is to help create an improved version.
So, when providing feedback, stay positive and constructive. Suggest a better
way if possible, rather than just commenting that the current solution is bad.
Ideally, submit a patch rather than an issue ticket. And always keep in mind
that you’re not required to find anything, if the code is fine, it’s fine. If
it’s more than fine, file a compliment!
Most of our projects are hosted on GitHub, so most results will be
communicated through pull requests and issues there. However, if you find
something particularly bad or weird, consider talking in person, where a
lengthy, complicated, or politically sensitive explanation is easier to do.
If you are reviewing a pull request on Github, comments should be added in the
section, so they can be attached to a particular line of code.
Make many small comments this way, rather than a big ball of text with
everything in it, so that different issues can be kept separate. Where
relevant, refer to existing Issues and documentation.
If you’re reviewing existing code rather than changes, it is still handy to
use pull requests. If you find an issue that has an obvious fix, you can
submit a pull request with a patch in the usual way.
If you don’t have a fix, you can add an empty comment to the relevant line,
and create a pull request from that as a patch. The relevant line(s) will then
light up in the pull request’s Files changed
overview, and you can add your
comments there. In this case, either the pull request is never merged (but the
comments processed some other way, or not at all), or the extra comments are
reverted and replaced by an agreed-upon fix.
In all cases, file many small pull requests, not one big one, as GitHub’s
support for code reviews is rather limited. Putting too many issues into a
single pull request quickly becomes unwieldy.
Atwood, Jeff (2006) Code Reviews: Just Do It
Burke, Kevin (2011)
McConnell, Steve (2004) Code Complete: A Practical Handbook of Software Construction, Second Edition. Microsoft Press. ISBN-13: 978-0735619678
SMARTBEAR (2016) Best practices for code review.