Guides and Explanations#
This section contains guides that have proven useful when starting a project from scratch or porting an existing one.
If you are unsure about the purpose or usefulness of Pixi Environments and Pre-Commit Hooks, you will find concise explanations below.
Starting a new project from scratch#
Your general strategy should be one of divide and conquer. If you are not used to thinking in computer science / software engineering terms, it will be hard to wrap your head around all of the things that are going on. So write one bit of code at a time, understand what is happening and why, and move on.
Assuming you have installed the template for the language(s) of your choice as described in Customising the template for your needs, my recommendation would be as follows.
Leave the examples in place.
Now add your own data and code bit by bit. Extend the task_xxx files as necessary or create new ones.
Remove the build directory regularly to make sure you do not rely on outputs from tasks that do not exist any more. This is a frequent source of confusion.
Once you feel secure enough that you do not need the template files any more, delete all files carrying _template in their names. You will also need to adjust the documents so they do not refer to figures and tables created by the template any more. Delete the build directory to make sure you do not rely on outputs from tasks that you removed.
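Deleting the build directory can be scripted. The following is a minimal sketch, assuming the build directory is called bld and sits in the project root; both that name and the helper remove_build_dir are illustrative assumptions, not part of the template.

```python
import shutil
from pathlib import Path


def remove_build_dir(project_root: Path, name: str = "bld") -> bool:
    """Delete the build directory below project_root, if it exists.

    The directory name "bld" is an assumption; pass the name your project uses.
    Returns True if a directory was removed and False otherwise.
    """
    build_dir = project_root / name
    if build_dir.is_dir():
        # Removes the directory together with everything inside it.
        shutil.rmtree(build_dir)
        return True
    return False
```

Calling remove_build_dir(Path.cwd()) from the project root before re-running the tasks forces every output to be rebuilt, so stale files from removed tasks cannot mask a missing step.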
Porting an existing project#
Your general strategy should be one of divide and conquer. If you are not used to thinking in computer science / software engineering terms, it will be hard to wrap your head around all of the things that are going on. So move one bit of code at a time to the template, understand what is happening and why, and move on.
Assuming that you use Git, first move all the code in the existing project to a subdirectory called old_code. Commit.
Now set up the templates.
Start with the data management code and move your data files to the spot where they belong under the new structure.
Move (the first steps of) your data management code to the folder under the templates. Create new task_... files.
Run pytask, adjusting the code for the errors you’ll likely see.
Move on step-by-step like this.
Once you feel secure enough that you do not need the template files any more, delete all files carrying _template in their names. You will also need to adjust the documents so they do not refer to figures and tables created by the template any more. Delete the build directory to make sure you do not rely on outputs from tasks that you removed.
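A newly created task file from the steps above might look like the sketch below. It assumes a recent pytask version, in which a function whose name starts with task_ is collected as a task, a function argument with a path default is treated as a dependency, and the argument named produces marks the product. The file locations, the function name and the cleaning step are hypothetical and only stand in for your own data management code.

```python
import csv
from pathlib import Path

# Hypothetical locations; adjust them to the folders of your project.
RAW = Path("src") / "data" / "survey.csv"
CLEAN = Path("bld") / "data" / "survey_clean.csv"


def task_clean_survey(raw: Path = RAW, produces: Path = CLEAN) -> None:
    """Drop rows with a missing value in any column and store the result."""
    with raw.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        # Keep only rows in which every column has a non-empty value.
        rows = [
            row
            for row in reader
            if all(value not in ("", None) for value in row.values())
        ]

    # Create the output folder if it does not exist yet.
    produces.parent.mkdir(parents=True, exist_ok=True)
    with produces.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Because the task is an ordinary Python function, it can also be called directly with explicit paths while porting, which makes it easier to find errors in the old code before running pytask on the whole project.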
Pixi Environments#
Programmes change. Few things are as frustrating as coming back to a project after a long time and spending the first {hours, days} updating your code to work with a new version of your favourite data analysis library. The same holds for debugging errors that occur only because your coauthor uses a slightly different setup.
The solution is to have isolated environments on a per-project basis. Pixi environments allow you to do precisely this. This section briefly describes them and explains their use.
The following commands can be executed either in a terminal or in PowerShell (Windows).
Using the environment#
The templates ship with a pre-configured environment. You can inspect its contents in the [tool.pixi.xxx] sections of the pyproject.toml file. When you type pixi run ... or pixi install, the packages are downloaded to the .pixi folder in the project root.
Updating packages#
Precise versions of packages are pinned down in the file pixi.lock, ensuring reproducibility. If you want to update a package, make sure that you are in the project root and run
$ pixi update
to update all packages, or run
$ pixi update [package]
to update a specific [package].
Installing additional packages#
To list installed packages, type
$ pixi list
If you want to add a package to your environment, you can add the package to the [tool.pixi.dependencies] section in the pyproject.toml file. Alternatively, you can run
$ pixi add [package]
You will notice that the pixi section in the pyproject.toml file is then also updated with the added package.
Choosing between conda-forge and PyPI
If you add a package under [tool.pixi.dependencies] in the pyproject.toml file, pixi will try to install the package via conda-forge. If you add a package under [tool.pixi.pypi-dependencies], pixi will try to install the package from PyPI.
Generally it is recommended to use conda-forge whenever possible. It is a necessity for many scientific packages, which often are not pure-Python code, whereas pip is built mainly for pure-Python packages. For some pure-Python packages, nobody has set up a conda-forge package; in that case we install the package from PyPI.
Pre-Commit Hooks#
Pre-commit hooks are checks and syntax formatters that run upon every commit. If one of the hooks fails, the commit is aborted and you have to commit again after you have resolved the issues raised by the hooks. Pre-commit hooks are defined in the file .pre-commit-config.yaml. template_project contains most of the hooks you will need. Below we present three common hooks. Note that some hooks are language-agnostic while others work only on a specific programming language. You can find a list of most hooks in the pre-commit documentation under Supported hooks.
ruff: Formats your Python code and checks for errors. Ruff-formatted code looks the same regardless of the project you’re reading. Having ruff as a hook allows you to focus on the content while writing code; formatting and the catching of many errors happen automatically before each commit.
check-yaml: Checks whether all .yaml and .yml files within your project are valid YAML files. Similarly, having check-yaml as a hook allows you to focus on the content while writing YAML files. If you accidentally use invalid syntax, this hook will tell you before you commit.
codespell: Fixes common misspellings in text files. It’s designed primarily for checking misspelled words in source code, but it can be used with other files as well.
If you want to skip the pre-commit hooks for a particular commit, you can run:
$ git commit -am "<your commit message>" --no-verify