When talking about writing good code, the first words which come to mind are “data structures and algorithms”. Next comes a strong foundation in object-oriented programming, as taught in any introductory software engineering course.
But what actually constitutes good code? How can you identify it, quantify it, and compare it to another snippet of code which solves the same problem? With multiple, equally efficient solutions to a problem, how do you choose the best one?
These are questions which stuck in the back of my mind as I read through 99 Bottles of OOP: A Practical Guide to Object-Oriented Design by Sandi Metz and Katrina Owen, a book recommended to me by my lead at Shopify.
This book opens by asking the reader to write some code which prints the lyrics to the song “99 Bottles of Beer”. It then carries this example throughout the book, walking through various implementations of this code, highlighting good and bad software design decisions.
I’ve extracted the main lessons I took away from this book into this article, though the book discusses each of these concepts in detail with a concrete example. Fair warning: these concepts are better understood in the context of the worked example in Metz and Owen’s book. The lessons that stuck with me most did so because I tried implementing my own solution first, then realized how bad it was compared to the examples presented.
The most pronounced theme throughout this book is rediscovering simplicity. This refers to the idea that sometimes it’s okay to repeat yourself in favour of code that is easy to read. Code will be read many more times than it is written, so it should be easy to understand. Ideally, code should be easy to write, easy to understand, and cheap to change.
After learning the basics of object-oriented programming, it’s easy for a young developer to jump to abstracting code wherever possible. Sometimes… concrete code is actually better!
note: abstraction here generally refers to extracting parts of code into a new method or variable.
Abstractions sacrifice simplicity to extract duplication. When you define and call a method instead of directly implementing that behaviour, you add a level of indirection, making the code harder to understand. Your code should be as simple as it can be for what it needs to do right now, as this will make it the easiest to read. Your code should make it obvious when functionality stays the same and why it would change. Simple code which follows the Don’t Repeat Yourself (DRY) principle only extracts key concepts into methods or variables — good design is knowing when to stop abstracting.
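To make this concrete, here is a sketch of my own (in Python rather than the book’s Ruby, with hypothetical function names) contrasting an over-abstracted version of a lyric line with a plainly written one:

```python
# A hypothetical over-abstracted version: each tiny method adds a level
# of indirection between the reader and the actual string.
def container(n):
    return "bottle" if n == 1 else "bottles"

def quantity(n):
    return "no more" if n == 0 else str(n)

def line(n):
    return f"{quantity(n)} {container(n)} of beer"

# A simpler version tolerates a little repetition and reads at a glance:
def line_simple(n):
    if n == 0:
        return "no more bottles of beer"
    if n == 1:
        return "1 bottle of beer"
    return f"{n} bottles of beer"
```

Both produce the same output; the second makes every case visible in one place, which matters more than removing every last repeated word.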
Don’t make early abstractions. I’m guilty of making abstractions before they’re necessary, over-anticipating changes to be made later on. You don’t know what the requirements of any future changes will be, so you don’t know if your abstractions now will make a positive difference later. Resist abstractions until you must make them! Don’t put all your focus on trying to reduce future costs.
Name methods after what they’re for, not how they work. A function should conceal its implementation from a user, only revealing its purpose. Method names should be one level of abstraction higher than their implementation — this isolates the code from implementation details. For example, a method whose purpose is to find an item might be called find_item, as opposed to binary_search, which reveals one of the ways this method might be implemented.
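As a small sketch of this idea (my own example, using Python’s standard library):

```python
from bisect import bisect_left

def find_item(items, target):
    """Return the index of target in a sorted list, or -1 if absent.

    The name states the purpose; the binary search inside (via bisect)
    is an implementation detail callers never need to know about.
    """
    i = bisect_left(items, target)
    if i < len(items) and items[i] == target:
        return i
    return -1
```

If find_item later switched to a hash lookup or a linear scan, no caller would need to change, because nothing about the name or signature promised a binary search.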
Limit the number of dependencies. Knowledge that one object has about another creates a dependency. This can be, for example, passing a parameter into a method, or conditionally formatting the output of a function. The more dependencies there are, the more one object knows about how another works. A function should only reveal its purpose!
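A hypothetical illustration of that last point (names and units are my own invention, not from the book): if callers must format a function’s output, they depend on how it works internally.

```python
# Dependency-heavy: the caller has to know totals come back in cents
# and must do the formatting itself.
def order_total_cents(items):
    return sum(price for _, price in items)

# caller: f"${order_total_cents(items) / 100:.2f}"  <- knows about units

# Fewer dependencies: the function reveals only its purpose, a
# display-ready total, and the unit choice stays private.
def order_total(items):
    cents = sum(price for _, price in items)
    return f"${cents / 100:.2f}"
```

Now a switch to storing prices in another unit touches one function instead of every caller.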
code should be straightforward and intention-revealing. it doesn’t concern itself with changeability or future maintenance.
slow down and test first
Perhaps the most actionable strategy analyzed in this book is the Red, Green, Refactor method for test-driven development. The concept behind test-driven development (TDD) is straightforward: write tests first, then write the code which makes these tests pass.
When writing tests, focus on the overall solution you’re trying to achieve. When writing code, pretend to know nothing other than the requirements specified by the tests. The general strategy is as follows:
- Write one test which tests for the simplest thing which will show your code is doing something right.
- Write just enough code to get this one test passing — for the sake of the strategy, this might mean hardcoding the output.
- Write a second test which tests the simplest but most useful thing to show your existing code is incorrect.
- Write some more code, and repeat…
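The steps above might look like this for the bottle-counting example (a sketch of my own in Python’s unittest, rather than the book’s Ruby):

```python
import unittest

# Step 2 would be just enough code to pass the first test --
# hardcoding is fine at that point:
#   def verse(n):
#       return "99 bottles of beer on the wall"

# After step 3's second test proves the hardcoded version wrong,
# we generalize just enough to pass both:
def verse(n):
    return f"{n} bottles of beer on the wall"

class TestVerse(unittest.TestCase):
    # Step 1: the simplest test showing the code does something right.
    def test_first_verse(self):
        self.assertEqual(verse(99), "99 bottles of beer on the wall")

    # Step 3: the simplest, most useful test that the hardcoded
    # version cannot pass.
    def test_another_verse(self):
        self.assertEqual(verse(3), "3 bottles of beer on the wall")
```

Each cycle forces exactly one small generalization, so the code never gets ahead of what the tests demand.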
as tests get more specific, the code should stay equally specific. there should never be a time when the code could do something that’s not being tested for.
Tests should test what your code does without any knowledge of how it does it. You know your tests are too tightly coupled to your code if you change an existing implementation detail and your tests fail. It should be difficult to get your tests passing with an incorrect implementation; on the flip side, having tests which pass doesn’t guarantee the best expression of code.
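A quick sketch of the difference (my own example, not from the book):

```python
def unique_sorted(items):
    # Implementation detail: dedupe via a set. Tests shouldn't care.
    return sorted(set(items))

# A behaviour-focused test checks only what the function does:
assert unique_sorted([3, 1, 3, 2]) == [1, 2, 3]

# A too-tightly-coupled test would, say, patch the built-in set() to
# assert it was called -- that test breaks the moment the dedupe
# strategy changes, even though the behaviour is still correct.
```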
When in a hurry to get a task done, I used to hack away at what looked like a functioning solution, then wrote tests as a form of documentation for the next developer. Although TDD feels like a slow way to develop, when followed precisely, it takes less thought and time, and produces fewer errors, on the way to better code.
quantifying good code
So far it seems like the answer to good code is whatever is easiest to read and write. For the sake of quantification, there are two algorithms mentioned which attempt to measure the complexity of code.
cyclomatic complexity — counts the number of unique execution paths and gives the minimum number of tests needed to cover all the logic in the code. This helps identify code that is difficult to test or maintain.
assignments, branches, conditions (ABC) metric — counts the number of assignment variables (A), control flow branches (B), and conditional logic paths (C). This can be used as an indicator for code complexity, where the higher the score, the greater the complexity. Solutions with more lines of code tend to have greater scores.
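As a hand-worked illustration of both metrics (my own counts on a toy function, not an example from the book):

```python
def classify(n):
    if n < 0:            # condition 1 -> path 1
        label = "negative"
    elif n == 0:         # condition 2 -> path 2
        label = "zero"
    else:                # -> path 3
        label = "positive"
    return label

# Cyclomatic complexity here is 3: there are three independent
# execution paths, so a minimum of three tests covers every branch.
assert classify(-1) == "negative"
assert classify(0) == "zero"
assert classify(5) == "positive"

# A rough ABC count: A = 3 assignments, B = 0 branches (method calls),
# C = 2 conditions, for a magnitude of sqrt(3**2 + 0**2 + 2**2) ~= 3.6.
```

Tools typically report the ABC score as that vector magnitude; a longer function with more assignments and conditions would score correspondingly higher.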
It’s important to remember that these algorithms only assess code structure, not qualities like variable naming, so their metrics may overstate the quality of code.
As interesting as these algorithms are, I’m curious to see what it would look like to continually run these checks in a development environment. I’d imagine the algorithms would only run through methods and classes affected by new changes, making it relatively efficient to run each time a change is made. However, if that’s the case, how come I haven’t seen these in use or mentioned more frequently? Perhaps the measurements are too unreliable or inaccurate in practice. 🤔
The second half of the book discusses refactoring techniques; however, I found that content much less tangible, so I didn’t recap those chapters. Overall, I found the lessons in the book eye-opening and pragmatic — I can already see a difference in the quality of my code just by using test-driven development strategies and keeping simplicity front of mind.
I highly recommend reading 99 Bottles of OOP: A Practical Guide to Object-Oriented Design by Sandi Metz and Katrina Owen for more specific examples of how to apply these concepts!