Principles#
Here, I outline what the files included in the project are for, along with some design philosophies I’ve learned along the way. I’ve found a good project should have several things: a good workflow, good code, good documentation, and good testing. I hope that this page serves as a guide for future modifications to the project.
Workflow#
Before making any changes, you should create a fork of this repository and then create a branch to work on the feature. The branch should be self-contained to one particular topic or feature addition. When you develop code, you should exclusively push commits to this branch (i.e. don’t commit directly to the “master” branch). When you’re ready to merge your changes into the “master” branch, you should submit a pull request. This will allow Github to run some automated workflows (more on this below). If everything looks good, you can then merge the branch into the “master” branch.
By doing this, the latest commits on the “master” branch will always be stable. In addition, it will allow you to conceivably work on several issues at once, by just switching which branch you’re working on. You’ll also allow Github to avoid merging code that will cause errors throughout the project (again, more on this below).
Git commits#
When making commits, I start the message in a way that will complete this sentence: “If I accept this commit, I will ___”. That way, the commits are stylistically consistent and therefore a bit easier to read.
Code#
All the code for the project lives in the pyrt directory. I believe that the
most user friendly way to structure it is to make sure the user does not have
to know about the module structure; therefore I chose to import all the classes and
functions in __init__.py. I also find that sometimes it’s desirable to call
functions with either ints, floats, lists, or ndarrays. Thus, I chose to
type hint all functions as ArrayLike, and I frequently convert all inputs
into ndarrays. I believe this makes the code user friendly, while still letting
the maintainer rely on a common data type in the core algorithms.
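As a sketch of this pattern (the function name and behavior here are made up for illustration, and are not part of the real pyrt API):

```python
import numpy as np
from numpy.typing import ArrayLike


def scale_altitude(altitude: ArrayLike, factor: float = 2.0) -> np.ndarray:
    """Scale altitude(s) by a constant factor.

    This is a hypothetical function; the point is the pattern of accepting
    ArrayLike and immediately converting the input to an ndarray.
    """
    altitude = np.asarray(altitude)  # int, float, list, or ndarray all become ndarrays
    return altitude * factor
```

With this pattern, scale_altitude(10), scale_altitude(10.0), and scale_altitude([10, 20]) all work, and the body of the function only ever deals with ndarrays.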
Anytime you write or update a function/class, please put the associated tests
in the tests folder on the same level as the module. That folder should
have a module called test_<name_of_module> (for example, there’s a module
called angles.py and, on the same level, there’s a tests directory
containing test_angles.py, which holds the unit tests for the code
in angles.py).
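For instance, a test module in that layout might look like the following (the function under test, make_azimuth, is a stand-in I invented for illustration, not the real contents of angles.py):

```python
# Sketch of pyrt/tests/test_angles.py; `make_azimuth` stands in for a real
# function that would be imported from pyrt/angles.py.
import numpy as np


def make_azimuth(phi, phi0):
    """Stand-in function: the absolute azimuth difference, wrapped to [0, 360)."""
    return np.abs(np.asarray(phi) - np.asarray(phi0)) % 360


class TestMakeAzimuth:
    """pytest collects Test* classes and runs their test_* methods."""

    def test_identical_angles_give_zero(self):
        assert make_azimuth(30.0, 30.0) == 0

    def test_difference_wraps_at_360_degrees(self):
        assert make_azimuth(370.0, 0.0) == 10
```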
Documentation#
No one will use the code unless it’s documented, so you should document every function/class that a user might realistically want to use. I chose to document everything with numpy-style docstrings for consistency. I believe that a realistic example is worth 1024 words, so I chose to put examples in the docstrings throughout. These have an additional bonus when it comes to testing (more on this below).
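Here’s a sketch of the style I mean (solar_zenith_to_mu0 is a hypothetical function for illustration, not necessarily part of pyrt):

```python
import numpy as np
from numpy.typing import ArrayLike


def solar_zenith_to_mu0(solar_zenith_angle: ArrayLike) -> np.ndarray:
    """Compute the cosine of the solar zenith angle.

    Parameters
    ----------
    solar_zenith_angle : ArrayLike
        Solar zenith angle(s) [degrees].

    Returns
    -------
    np.ndarray
        Cosine of the input angle(s).

    Examples
    --------
    >>> solar_zenith_to_mu0(0.0)
    array(1.)
    """
    return np.cos(np.radians(np.asarray(solar_zenith_angle)))
```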
Sphinx#
Sphinx is a great tool to automatically generate documentation as files you
can view in a browser. Sadly, I found its getting-started documentation
confusing, so I give some pointers here. First, I’ve configured the project such
that the files that tell Sphinx how to generate the documentation are in the
docs_source directory. These are mostly .rst files. When Sphinx generates
documentation, it should put them in the docs directory. I made this choice
because Github will host documentation for projects like this one, but they
insist that all the html/css files are specifically in a directory named
docs. To generate the documentation, you’ll need the Sphinx and
pydata-sphinx-theme packages (you can just install the project with the
“dev” option to automatically download these).
Sphinx is designed to be a CLI, but I dislike this choice because, as far as I
can tell, Sphinx needs to be installed under the same Python interpreter that it
compiles docs for. So, if you have projects built with different Python versions,
or even update your Python interpreter without reinstalling Sphinx, you’ll get errors.
Consequently, I recommend running Sphinx with this syntax (all one command):
~/repos/pyRT_DISORT/venv/bin/python -m sphinx.cmd.build -b html ~/repos/pyRT_DISORT/docs_source ~/repos/pyRT_DISORT/docs -E
I briefly discuss this on the notes page of the documentation.
Note that I never figured out a good way to have Sphinx automatically regenerate documentation when I push to Github (I hear it’s possible, though). Consequently, you have to be diligent about regenerating the documentation every time you push to “master”; otherwise, the docs can get out of sync with the source code.
Testing#
I think it’s very important for all the code to be tested. When possible,
put examples of functions into the docstrings. When it comes time to test the
code (which is usually before submitting a pull request), you can have pytest
run a test suite on your code. If you’re in the pyRT_DISORT directory, this
is simply python -m pytest pyrt. I included a pytest.ini file on the main
level of the directory, which will tell pytest that it should run unit tests
on everything under the “Examples” header of docstrings, in addition to running
tests on all functions within the tests directories (the latter is pytest’s
default behavior). This will let you know which functions aren’t working before
you attempt to merge a branch into the “master” branch.
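For reference, the relevant part of such a pytest.ini can be as short as this (a sketch; the actual file in the repository is the source of truth):

```ini
[pytest]
# --doctest-modules makes pytest also run the interactive examples
# (>>> blocks) in docstrings, alongside the regular tests in the
# tests directories.
addopts = --doctest-modules
```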
To provide further assurances that the code works, I added a file named
ci.yaml within the .github/workflows directory. This file defines
what actions Github should take when certain events happen. In this case,
I’ve told it that any time someone tries to merge a branch into “master” or
directly pushes to “master”, it will run a test suite. This involves installing
pyRT_DISORT and running pytest on the docstrings and unit tests. Github runs
this on multiple Python versions and operating systems, so ideally the user will
know that this code is installable and works as advertised on MacOS 11, 12, 13,
etc. and Ubuntu 20.04, 22.04, etc. on Python 3.10, 3.11, 3.12, etc. In my
opinion, this is one of the nicest features that Github offers. If all tests
pass, the project will have a green check mark on the home page, and the README
file will note that the CI is passing; otherwise, there will be a red X.
Linter#
I also have the CI run a linter and save the results. To see the automated linting results, first go to the “Actions” tab on the project homepage. Then, click on the workflow run. Click one of the jobs that ran and then look for the command “Lint with pylint”. There, you can see the output of pylint on the code.
Miscellaneous#
Oddball files#
There are a number of files in the main level of the project that I haven’t discussed.
The .gitignore file is simply a list of files that should not be tracked by git.
The .pylintrc file tells pylint how to behave when it runs. Practically speaking, this defines what pylint outputs when it runs the CI.
The CITATION.cff file is a way to describe how to cite this repository. It’s not clear to me whether it should be removed if you publish each release with Zenodo.
The CONTRIBUTING.md file tells potential contributors how they should contribute to the project.
LICENSE.txt is the license.
pyproject.toml is the current best way to define an installation. It’s in this file where I put the project metadata and tell pip how I want it to install the project.
README.rst is the file that lives on the Github homepage.
Github files#
I defined a number of files for Github integration, all within the .github
directory.
PULL_REQUEST_TEMPLATE.md is the template that I ask users to fill out when making a pull request. It just makes it easier for me to have some consistency when looking at pull requests.
I also have bug_report.yaml, documentation.yaml, and feature_request.yaml, which all define templates I ask users to fill out when requesting changes to the project. These live in the
ISSUE_TEMPLATE directory. I thought these were the 3 types of issues I was most likely to encounter, but you can of course add more if it becomes useful.