How to make a Python project

The resources and documentation available for scientific/academic software development can be overwhelming. This lecture gives a gentle introduction to (or review of) the general concepts of building, dependency management, and CI/CD, but more up-to-date, Python-specific resources can probably be found elsewhere.

To get a rough idea of when to do what in the whole process, this blog post is a concise place to start. For a second round of reading, this blog post is a more detailed walkthrough. Before doing anything, it also helps to learn about each specific step and the popular tools for it.

There are tools such as Cookiecutter that can create fairly comprehensive project templates. For learning purposes, I would like to begin with what I consider minimal/essential.

Example architecture of a package:

packaging_tutorial/
├── LICENSE
├── configuration_file
├── README.md
├── src/
│   └── example_package/
│       ├── __init__.py
│       └── example.py
└── tests/
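
To make the layout concrete, here is a minimal sketch that creates the same skeleton with empty placeholder files. It assumes pyproject.toml as the concrete name for the generic configuration_file entry, and the scaffold helper is a hypothetical name chosen for illustration.

```python
from pathlib import Path
import tempfile

# Files in the layout above, relative to the project root.
# pyproject.toml stands in for the generic "configuration_file" entry (an assumption).
LAYOUT = [
    "LICENSE",
    "pyproject.toml",
    "README.md",
    "src/example_package/__init__.py",
    "src/example_package/example.py",
]


def scaffold(root: Path) -> None:
    """Create empty placeholder files and the tests/ directory under root."""
    for relative in LAYOUT:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    (root / "tests").mkdir(exist_ok=True)


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "packaging_tutorial"
        scaffold(project)
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```

Putting the package under src/ means the tests run against the installed package rather than whatever happens to be in the working directory.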

Breakdown of the components

Project configuration file

Example file formats:

  • pyproject.toml
  • setup.cfg

This file mainly specifies which packages (and which versions) are needed to develop your own package, but it also records other metadata such as the build and test tools used. A build backend such as setuptools performs the actual package building using the configuration file and is called by frontend tools. Build frontends include pip and build.
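
As a rough illustration of what such a file contains, the sketch below embeds a hypothetical minimal setup.cfg for the src/ layout and reads it with the standard-library configparser, purely to show its structure. The package name, version, and the numpy requirement are placeholders, not taken from any real project.

```python
import configparser

# A hypothetical minimal setup.cfg for a src/ layout (all values are placeholders).
SETUP_CFG = """\
[metadata]
name = example_package
version = 0.1.0

[options]
package_dir =
    =src
packages = find:
python_requires = >=3.8
install_requires =
    numpy>=1.20

[options.packages.find]
where = src
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

name = parser["metadata"]["name"]
# A multi-line value comes back as one string; keep one requirement per non-empty line.
requirements = [
    line for line in parser["options"]["install_requires"].splitlines() if line
]

print(name, requirements)
```

In practice you would not parse the file yourself: a frontend such as pip or build hands it to the backend, which reads the metadata and dependency list during the build.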

Dev workflow tools

My understanding is that these are mostly needed to manage different virtual environments and resolve package dependency issues, but they often come with powerful general support for other tasks such as testing and documentation. Popular workflow tools include:

  • poetry
  • tox

While tox is particularly well known for testing against several environments (e.g. different versions of Python), it seems a bit daunting to me, but this tutorial and this tutorial seem helpful if you are interested. For my first attempt I went with poetry, but I have seen quite a few friction points too. Perhaps plain old setuptools is still the way to go?
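
To see what these tools automate, the following sketch creates a bare virtual environment with the standard-library venv module. It is not how poetry or tox work internally, only an illustration of the isolated environment they manage for you; make_env is a hypothetical helper name.

```python
import tempfile
import venv
from pathlib import Path


def make_env(env_dir: Path) -> Path:
    """Create a bare virtual environment and return the path to its config file."""
    # with_pip=False keeps the sketch fast; workflow tools would go on to
    # install pip and the project's dependencies into the environment.
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = make_env(Path(tmp) / ".venv")
        # pyvenv.cfg records which base interpreter the environment points to.
        print(cfg.read_text())
```

Each environment has its own site-packages directory, which is what lets a tool test the same package against different dependency sets or Python versions.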

Tests

Popular tools include:

  • pytest
  • nose

Tests can be referenced by the project configuration file. For pytest, test files follow the naming convention test_*.py or *_test.py. Note that a project set up with poetry may already include pytest as a development dependency (older versions of poetry new added it by default). Once you start writing tests, it helps to learn about fixtures and how to collect them in a conftest.py file.
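
A minimal sketch of a pytest fixture and a test that uses it is shown here. Everything is kept in one block so it can be read in one go; the add function and the sample_pair fixture are hypothetical names, and the comments mark where each piece would normally live.

```python
import pytest


def add(a, b):
    """Hypothetical function standing in for src/example_package/example.py."""
    return a + b


# In a real project this fixture could live in tests/conftest.py,
# where pytest discovers it automatically for every test file.
@pytest.fixture
def sample_pair():
    return (2, 3)


# In a real project this would live in tests/test_example.py.
# pytest passes in the fixture's return value by matching the argument name.
def test_add(sample_pair):
    a, b = sample_pair
    assert add(a, b) == 5
```

Running pytest from the project root collects any function named test_* in files matching the naming convention, so no explicit registration is needed.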

GitHub Actions workflow

Set up a testing workflow that is triggered by version-control events such as a push or a PR. In this example from Careless, once an event triggers the workflow, it first sets up the Python environment, installs the dependencies, runs pytest, and uploads the code coverage of the tests.

Documentation

Popular tools include:

  • sphinx
  • mkdocs

These tools can automatically generate documentation from the Python docstrings in your code (for mkdocs, through a plugin such as mkdocstrings).
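
For reference, this is the kind of docstring such tools pick up. The sketch uses the NumPy docstring style, which is one common convention among several; with sphinx it is typically rendered through the autodoc and napoleon extensions. The scale function is a hypothetical example.

```python
def scale(values, factor=2.0):
    """Multiply every value by a constant factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float, optional
        Multiplier applied to each value. Defaults to 2.0.

    Returns
    -------
    list of float
        The scaled values, in the original order.
    """
    return [v * factor for v in values]


# Documentation generators read this same text when building the docs.
print(scale.__doc__)
```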

Additional notes

Codecov

Reports the code coverage of your tests.

  • Especially for a private repo, you need to install and configure it, and generate a repository secret for the token
  • Needs a .coveragerc file to specify which directory to measure coverage on

Renovate

Manages dependency updates.

  • Also needs to be installed and configured; go through the tutorial in the meantime
  • Change renovate.json as needed, for example to adjust the update frequency
  • In the future, it might be useful to use the automerge feature for small/minor updates