Principles#

Here, I outline what the files included in the project are for, along with some design philosophies I’ve learned along the way. I’ve found a good project should have several things: a good workflow, good code, good documentation, and good testing. I hope that this page serves as a guide for future modifications to the project.

Workflow#

Before making any changes, you should create a fork of this repository and then create a branch to work on the feature. The branch should be self-contained to one particular topic or feature addition. When you develop code, you should push commits exclusively to this branch (i.e. don’t commit directly to the “master” branch). When you’re ready to merge your changes into the “master” branch, submit a pull request. This will allow GitHub to run some automated workflows (more on this below). If everything looks good, you can then merge the branch into the “master” branch.
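On the command line, the workflow described above might look something like this (the branch and file names here are purely illustrative, and it assumes a remote named `upstream` pointing at the main repository):

```shell
# Sync your fork's master branch with the main repository.
git checkout master
git pull upstream master

# Create a self-contained branch for one topic.
git checkout -b fix-angle-docs

# Commit work to this branch only, then push it to your fork.
git add pyrt/angles.py
git commit -m "Fix the angles module docstrings"
git push origin fix-angle-docs
# Finally, open a pull request on GitHub from fix-angle-docs into master.
```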

By doing this, the latest commits on the “master” branch will always be stable. It also lets you work on several issues at once by simply switching branches, and it lets GitHub stop you from merging code that would cause errors elsewhere in the project (again, more on this below).

Git commits#

When making commits, I start the message in a way that completes the sentence “If I accept this commit, I will ___” (for example, “Add unit tests for the angles module”). That way, the commits are stylistically consistent and therefore a bit easier to read.

Code#

All the code for the project lives in the pyrt directory. I believe the most user-friendly way to structure it is to ensure the user does not need to know about the module structure; therefore, I chose to import all the classes and functions in __init__.py. I also find that it’s sometimes desirable to call functions with ints, floats, lists, or ndarrays. Thus, I chose to type hint all functions as ArrayLike, and I frequently convert all inputs into ndarrays. I believe this makes the code user-friendly, while still allowing the maintainer to expect a common data type in the core algorithms.
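A sketch of this pattern is below. The function and its parameters are hypothetical, not part of pyRT_DISORT’s actual API; the point is the ArrayLike hint on the way in and the ndarray conversion before the core logic runs.

```python
import numpy as np
from numpy.typing import ArrayLike


def scale_optical_depth(optical_depth: ArrayLike, factor: ArrayLike) -> np.ndarray:
    """Scale an optical depth by a factor.

    The ArrayLike hint advertises that ints, floats, lists, and ndarrays
    are all acceptable; the conversions below mean the core computation
    only ever sees ndarrays.
    """
    optical_depth = np.asarray(optical_depth)
    factor = np.asarray(factor)
    return optical_depth * factor
```

With this structure, `scale_optical_depth(1, 2.5)` and `scale_optical_depth([1, 2], 2)` both work, and both return ndarrays.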

Anytime you write or update a function/class, please put the associated tests in the tests folder at the same level as the module. That folder should contain a module called test_<name_of_module>. For example, there’s a module called angles.py and, at the same level, a tests directory containing test_angles.py, which holds the unit tests for the code in angles.py.
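A minimal sketch of what such a test module might contain is below. The function under test is a made-up stand-in defined inline so the sketch is self-contained; in the real project the test module would instead import the code it tests from the corresponding pyrt module.

```python
import numpy as np


# Stand-in for a function that would normally be imported from the module
# under test (e.g. something in pyrt/angles.py).
def angular_separation(a, b):
    return np.abs(np.asarray(a) - np.asarray(b))


class TestAngularSeparation:
    def test_separation_is_symmetric(self):
        assert angular_separation(10, 30) == angular_separation(30, 10)

    def test_separation_of_identical_angles_is_zero(self):
        assert angular_separation(45, 45) == 0
```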

Documentation#

No one will use the code unless it’s documented, so you should document every function/class that a user might realistically want to use. I chose to document everything with numpy-style docstrings for consistency. I believe a realistic example is worth 1024 words, so I put examples in the docstrings throughout. These have an additional bonus when it comes to testing (more on this below).
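A sketch of a numpy-style docstring with an Examples section is below (the function itself is hypothetical):

```python
import numpy as np


def degrees_to_radians(angle):
    """Convert an angle from degrees to radians.

    Parameters
    ----------
    angle : ArrayLike
        The angle(s) in degrees.

    Returns
    -------
    np.ndarray
        The angle(s) in radians.

    Examples
    --------
    >>> import numpy as np
    >>> degrees_to_radians([0, 90])
    array([0.        , 1.57079633])
    """
    return np.radians(np.asarray(angle))
```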

Sphinx#

Sphinx is a great tool for automatically generating documentation as files you can view in a browser. Sadly, I found its getting-started documentation confusing, so I give some pointers here. First, I’ve configured the project so that the files telling Sphinx how to generate the documentation live in the docs_source directory. These are mostly .rst files. When Sphinx generates the documentation, it should put the output in the docs directory. I made this choice because GitHub will host documentation for projects like this one, but it insists that all the HTML/CSS files live in a directory named docs. To generate the documentation, you’ll need the Sphinx and pydata-sphinx-theme packages (you can simply install the project with the “dev” option to download these automatically).

Sphinx is designed to be used as a CLI, but I dislike this choice because, as far as I can tell, Sphinx needs to be installed under the same Python interpreter that it compiles docs for. So if you have projects built with different Python versions, or you update your Python interpreter without updating Sphinx, you’ll get errors. Consequently, I recommend running Sphinx with this syntax: ~/repos/pyRT_DISORT/venv/bin/python -m sphinx.cmd.build -b html ~/repos/pyRT_DISORT/docs_source ~/repos/pyRT_DISORT/docs -E. I briefly discuss this on the notes page of the documentation.

Note that I never figured out a good way to have Sphinx automatically generate documentation when I push to GitHub (I hear it’s possible, though). Consequently, you have to be diligent about regenerating the documentation every time you push to “master”; otherwise, the docs can get out of sync with the source code.

Testing#

I think it’s very important for all the code to be tested. When possible, put examples of functions into the docstrings. When it comes time to test the code (usually before submitting a pull request), you can have pytest run a test suite on your code. If you’re in the pyRT_DISORT directory, this is simply python -m pytest pyrt. I included a pytest.ini file at the top level of the repository, which tells pytest to run unit tests on everything under the “Examples” header of docstrings, in addition to running all the tests within the tests directories (the latter is pytest’s default behavior). This will let you know which functions aren’t working before you attempt to merge a branch into the “master” branch.
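The pytest.ini described above might look roughly like this (a sketch of the idea, not necessarily the file’s exact contents):

```ini
[pytest]
# Run the >>> examples in docstrings as tests, in addition to the unit
# tests that pytest collects from the tests directories by default.
addopts = --doctest-modules
```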

To provide further assurance that the code works, I added a file named ci.yaml within the .github/workflows directory. This file defines the actions GitHub should take when certain events occur. In this case, I’ve told it that any time someone tries to merge a branch into “master” or pushes directly to “master”, it should run a test suite. This involves installing pyRT_DISORT and running pytest on the docstrings and unit tests. GitHub runs this on multiple Python versions and operating systems, so ideally the user will know that this code is installable and works as advertised on macOS 11, 12, 13, etc. and Ubuntu 20.04, 22.04, etc. on Python 3.10, 3.11, 3.12, etc. In my opinion, this is one of the nicest features that GitHub offers. If all tests pass, the project will have a green check mark on the home page and the README file will note that the CI is passing; otherwise, there will be a red X.
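A skeleton of what such a workflow file can look like is below. This is a generic sketch rather than the project’s actual ci.yaml, and the operating system and Python version lists are illustrative:

```yaml
name: CI
on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest]
        python-version: ["3.10", "3.11", "3.12"]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      # Install the project, then run the unit tests and docstring examples.
      - run: pip install .
      - run: python -m pytest pyrt
```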

Linter#

I also have the CI run a linter and save the results. To see the automated linting results, first go to the “Actions” tab on the project homepage. Then, click on the workflow run, click one of the jobs that ran, and look for the step named “Lint with pylint”. There, you can see the output of pylint on the code.

Miscellaneous#

Oddball files#

There are a number of files at the top level of the project that I haven’t discussed.

  • The .gitignore file is simply a list of files that should not be tracked by git.

  • The .pylintrc file tells pylint how to behave when it runs. Practically speaking, this defines what pylint outputs when it runs the CI.

  • The CITATION.cff file describes how to cite this repository. It’s not clear to me whether it should be removed if you publish each release with Zenodo.

  • The CONTRIBUTING.md file tells potential contributors how they should contribute to the project.

  • LICENSE.txt is the license.

  • pyproject.toml is the current best way to define an installation. This file is where I put the project metadata and tell pip how I want it to install the project.

  • README.rst is the file that lives on the GitHub homepage.
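For orientation, a minimal sketch of the kind of thing pyproject.toml contains is below. This is illustrative only; the actual file’s metadata and dependency lists differ. The “dev” extra shown here is what the documentation section above refers to.

```toml
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "pyRT_DISORT"
version = "1.0.0"          # placeholder version
dependencies = ["numpy"]

# Installing with pip install .[dev] additionally pulls in the
# documentation tooling.
[project.optional-dependencies]
dev = ["Sphinx", "pydata-sphinx-theme"]
```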

GitHub files#

I defined a number of files for GitHub integration, all within the .github directory.

  • PULL_REQUEST_TEMPLATE.md is the template that I ask users to fill out when making a pull request. It makes it easy for me to have some consistency when reviewing pull requests.

  • I also have bug_report.yaml, documentation.yaml, and feature_request.yaml, which define the templates I ask users to fill out when requesting changes to the project. These live in the ISSUE_TEMPLATE directory. I thought these were the three types of issues I was most likely to encounter, but you can of course add more if it becomes useful.