Documenting your code might seem like a time-wasting distraction. It’s very easy to convince yourself in the moment “I’ll totally remember how this works” or “it should be obvious what this line does.” I’ve done it myself, and then been totally confused by my own code when I go back to it a few months later. Comments within the code itself are an important form of documentation, but this post will discuss a different form: the README.
What is a README?
You’ve definitely already seen a README: it’s the text file that introduces a project and briefly explains how it works. Whenever you go to a git repo on GitHub or GitLab (like uni10 or Firefox) the README serves as the landing page, but you’ll also see README files almost any time you download or install software. The README is the first stop for anyone who will use your code. If you design it well, it might be the last stop too:
Your documentation is complete when someone can use your module without ever having to look at its code. … Remember: the documentation, not the code, defines what a module does. –Ken Williams (source)
Why do I need to write a README?
Code is only as useful as its documentation. As scientists, we have an extra responsibility to document our code because it is effectively part of the experiment. It is crucial that your code is an accurate record of what you did, even if you think you are the only person who will ever use your program. In our group the primary concern is ensuring that your code is usable by someone else after you leave. Perhaps another student will need it, or we might need to recheck some calculation.
Of course, other people will often use your code, and a well-written README can help them understand how it works, what its limitations are, and how to use it. The selfish reason to write one is for yourself in the future. Perhaps you know there is some issue with your program, like the parameter J cannot be set to a number greater than 2. That might be fine right now, but will you remember that when you have to rerun your code a year from now? Dependencies are another good example, especially if you had to install anything special. If you write those down right away it will be easy to get your code running on a different machine or after an operating system upgrade.
How do I write a README?
There is no single best answer here. The most important thing is that you do write one. You might think that it’s obvious how to compile and run your code, but someone new might be totally lost. Little things like compiler flags can really trip up a new user.
What should it include?
In the following subsections I provide some examples of sections you may want to include in your README file.
There is no shortage of README templates out there. These are a good place to start, but computational physics codes have unique needs, so I posted a template on GitLab that you can use as a starting point if you like.
At the very top you should have a brief introduction that includes:
- Program name
- Name of author(s)
- Contact information (email, website)
- A brief description of what your program does
- What technique it uses
- The model it studies
- What it’s for (why would someone want to use this program?)
- Copyright notice. I’m not an expert on licenses, so I don’t have much advice here. To keep it simple you can just say “Copyright YOUR NAME YEAR”
- Links to more extensive documentation elsewhere (papers, examples, etc).
Instructions to quickly get started with your program, e.g. how to compile and install it with default settings and how to run it. It’s also useful to include an easy way for users to test whether they compiled and installed it correctly.
What does your program assume about the system? You don’t necessarily need to test rigorously, but you can at least say what system you developed your code on and what compiler you used. Write these down as you write your code so you don’t have to remember them later. Other examples of dependencies:
- Libraries (especially anything you had to install, like MPI)
- Anything platform-dependent
- A specific compiler required
- Other external programs
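One low-effort way to keep the compiler information from going stale is to have the program report it itself, so you can copy the output into the README. The following is only a minimal sketch: `build_info` is a hypothetical helper name, and it relies on the `__VERSION__` macro, which GCC and Clang define but other compilers may not.

```cpp
#include <string>

// Hypothetical helper: report the C++ standard and, where available,
// the compiler version this program was built with.
inline std::string build_info() {
    // __cplusplus is a standard macro, e.g. 201703 for C++17.
    std::string info = "C++ standard: " + std::to_string(__cplusplus);
#ifdef __VERSION__
    // __VERSION__ is provided by GCC and Clang; it may be missing elsewhere.
    info += "\ncompiler: ";
    info += __VERSION__;
#else
    info += "\ncompiler: unknown";
#endif
    return info;
}
```

You could print `build_info()` at the start of every run, so the log of each calculation also records how the binary was built.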
List your source code files and any auxiliary files, where they are located and what they are for.
What inputs does your program require? Do you specify the parameters of the simulation as command line arguments or are they in a file? What is the format of the file? Which parameters are optional? What are the default values?
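As an illustration of what this section of the README has to pin down, here is a rough sketch of a program that takes its parameters as `key=value` command-line arguments. Everything in it is an assumption made for the example: the `Params` struct, the parameter names `J`, `L` and `sweeps`, their defaults, and the program name `sim`. The point is that the `usage()` text and the README should state the same names, types and defaults.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical simulation parameters; names and defaults are illustrative.
struct Params {
    double J = 1.0;      // coupling strength
    int L = 16;          // linear system size
    long sweeps = 1000;  // number of Monte Carlo sweeps
};

// Help text that mirrors what the README should say about the inputs.
inline std::string usage() {
    return "usage: sim [J=<double>] [L=<int>] [sweeps=<int>]\n"
           "  J       coupling strength    (optional, default 1.0)\n"
           "  L       linear system size   (optional, default 16)\n"
           "  sweeps  number of MC sweeps  (optional, default 1000)\n";
}

// Parse key=value arguments; anything not given keeps its default.
inline Params parse_args(const std::vector<std::string>& args) {
    Params p;
    for (const std::string& arg : args) {
        const std::size_t eq = arg.find('=');
        if (eq == std::string::npos) {
            throw std::invalid_argument("expected key=value, got: " + arg);
        }
        const std::string key = arg.substr(0, eq);
        const std::string value = arg.substr(eq + 1);
        if (key == "J") {
            p.J = std::stod(value);
        } else if (key == "L") {
            p.L = std::stoi(value);
        } else if (key == "sweeps") {
            p.sweeps = std::stol(value);
        } else {
            throw std::invalid_argument("unknown parameter: " + key);
        }
    }
    return p;
}
```

In `main` you would build the vector from `argv + 1` up to `argv + argc`, call `parse_args`, and print `usage()` if it throws. Rejecting unknown keys is a deliberate choice here: a misspelled parameter that silently falls back to its default is exactly the kind of surprise the README is meant to prevent.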
If it runs properly your program probably produces data. Is that data displayed on screen? Written to disk? Returned by a function? What format is the data in? How is it normalized?
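One way to make the output partly self-documenting is to write the answers to these questions into the data file as comment lines. The sketch below is just one possible layout, under assumptions I made up for the example: a hypothetical `format_output` function, a square lattice with `L*L` sites, and a two-column format with the energy normalized per site.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Build the text of an output file. Lines starting with '#' describe the run,
// the columns, and the normalization; each data row is "sweep energy_per_site".
// Assumes a square lattice with L*L sites (illustrative only).
inline std::string format_output(const std::vector<double>& energies,
                                 double J, int L) {
    std::ostringstream out;
    out << "# J = " << J << ", L = " << L << "\n";
    out << "# columns: sweep  energy_per_site\n";
    out << "# normalization: total energy divided by L*L sites\n";
    const double sites = static_cast<double>(L) * L;
    for (std::size_t i = 0; i < energies.size(); ++i) {
        out << i << " " << energies[i] / sites << "\n";
    }
    return out.str();
}
```

The README should still describe the format, but a header like this means a data file that gets separated from the code remains interpretable.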
This section definitely distinguishes computational physics code from ordinary programming. You don’t need to provide a full bibliography, but some links to papers that describe the method you’re using in detail, or papers you wrote using this program, would be great.
Is there anything about your code that doesn’t work? Maybe J cannot be set to zero, maybe there’s a small memory leak, or a segfault that occurs under specific conditions. Depending on the bug, you might not really need to fix it, but it’s definitely important to tell the user so they aren’t caught by surprise (in some cases the user may be you in the future).
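A known limitation is easier to live with if the code also refuses to run into it quietly. As a minimal sketch, taking the "J cannot be set to zero" example above and assuming a hypothetical `check_coupling` function that you call before starting the calculation:

```cpp
#include <stdexcept>

// Hypothetical guard for a documented limitation: this example assumes
// the method breaks down at J = 0, so we stop with a clear message
// instead of producing wrong results or crashing later.
inline void check_coupling(double J) {
    if (J == 0.0) {
        throw std::invalid_argument(
            "J = 0 is not supported (see 'Known issues' in the README)");
    }
}
```

The error message points back to the README, so the limitation is documented in one place and enforced in another.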
Formatting with Markdown
You will probably want to write your README in Markdown. Markdown is a simple way to format a text document (as opposed to markUP, get it? 🤔). The idea of Markdown is to allow minimal formatting while remaining readable as plain text. You’re probably already familiar with Markdown since it is commonly used for simple text formatting on platforms like Slack. Both GitHub and GitLab support Markdown. If you use a Mac you can even get “Quick Look” (using the spacebar to view a preview of a file) to display correctly formatted Markdown text using my guide here.
I’m not going to do a Markdown tutorial here (instead see the links in the resource section), but as a very quick introduction:
Enclose *italic* text within asterisks and **bold** text within double asterisks. You can include inline code using single backticks `like this` or sections of code with triple backticks like so:
```
cout << x << endl;
```
A Markdown-formatted file will typically end in ‘.md’. A quick Google search will return countless examples of Markdown editors for any platform.