z - Advanced topic: Building an environment for literate programming
Introduction
This vignette will walk you through setting up a development environment with rix that can be used to compile Quarto documents into PDFs. We are going to use the Quarto template for the JSS to illustrate the process. The first section will show a simple way of achieving this, which will also be ideal for interactive development (writing the doc). The second section will discuss a way to build the document in a completely reproducible manner once it’s done.
Starting with the basics (simple but not entirely reproducible)
This approach is not optimal, but it is the simplest. We will start by building a development environment with all our dependencies, and we can then use it to compile our document interactively. But this approach is not quite reproducible and requires manual actions. In the next section we will show you how to build a 100% reproducible document in a single command.
Since we need both the quarto R package and the quarto engine, we add them to the r_pkgs and system_pkgs arguments of rix, respectively. Because we want to compile a PDF, we also need to have texlive installed, as well as some LaTeX packages. For this, we use the tex_pkgs argument:
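library(rix)

# Folder in which default.nix gets written (a temporary directory here)
path_default_nix <- tempdir()

rix(
  r_ver = "4.3.1",
  r_pkgs = c("quarto"),      # the quarto R package
  system_pkgs = "quarto",    # the quarto engine
  tex_pkgs = c("amsmath"),   # LaTeX packages added to texlive
  ide = "other",
  shell_hook = "",
  project_path = path_default_nix,
  overwrite = TRUE,
  print = TRUE
)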
(Save these lines into a script called build_env.R, for instance, and run the script in a new folder made for this project.)
By default, rix will install the “small” version of the texlive distribution available on Nix. To see which texlive packages get installed with this small version, you can click here.
We start by adding the amsmath package, then build the environment using:
nix-build
from a terminal, or nix_build() from an interactive R session.
Then, drop into the Nix shell with nix-shell, and run quarto add quarto-journals/jss. This will install the template linked above. Then, in the folder that contains build_env.R, the generated default.nix and result, download the following files from here:
- article-visualization.pdf
- bibliography.bib
- template.qmd
and try to compile template.qmd by running:
quarto render template.qmd --to jss-pdf
You should get the following error message:
Quitting from lines 99-101 [unnamed-chunk-1] (template.qmd)
Error in `find.package()`:
! there is no package called 'MASS'
Backtrace:
1. utils::data("quine", package = "MASS")
2. base::find.package(package, lib.loc, verbose = verbose)
Execution halted
So there’s an R chunk in template.qmd that uses the MASS package. Change build_env.R to generate a new default.nix file that will now add MASS to the environment when built:
rix(
  r_ver = "4.3.1",
  r_pkgs = c("quarto", "MASS"),
  system_pkgs = "quarto",
  tex_pkgs = c("amsmath"),
  ide = "other",
  shell_hook = "",
  project_path = path_default_nix,
  overwrite = TRUE,
  print = TRUE
)
Trying to compile the document now results in another error message:
compilation failed- no matching packages
LaTeX Error: File `orcidlink.sty' not found
This means that the LaTeX orcidlink package is missing, and we can solve the problem by adding "orcidlink" to the list of tex_pkgs. Rebuild the environment and try again to compile the template. Trying again yields a new error:
compilation failed- no matching packages
LaTeX Error: File `tcolorbox.sty' not found.
Just as before, add the tcolorbox package to the list of tex_pkgs. You will need to do this several times for some other packages. There is unfortunately no easier way to list the dependencies and requirements of a LaTeX document.
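To make each round of this loop a little quicker, the name of the missing style file can be pulled out of the error message. The following is only a minimal sketch, assuming error lines shaped like the ones shown above; missing_sty is a hypothetical helper, not part of rix or Quarto. The style-file name is only a hint, since the texlive package that provides it may be named differently.

```shell
# Hypothetical helper: print the base name of every .sty file that LaTeX
# reports as missing in the text read from standard input.
missing_sty() {
  sed -n "s/.*File \`\([^']*\)\.sty' not found.*/\1/p" | sort -u
}

# Example with one of the error messages shown above
printf '%s\n' "LaTeX Error: File \`orcidlink.sty' not found" | missing_sty
```

The printed name can then be tried as a new entry in tex_pkgs before rebuilding the environment.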
This is what the final script to build the environment looks like:
rix(
  r_ver = "4.3.1",
  r_pkgs = c("quarto", "MASS"),
  system_pkgs = "quarto",
  tex_pkgs = c(
    "amsmath",
    "environ",
    "fontawesome5",
    "orcidlink",
    "pdfcol",
    "tcolorbox",
    "tikzfill"
  ),
  ide = "other",
  shell_hook = "",
  project_path = path_default_nix,
  overwrite = TRUE,
  print = TRUE
)
The template will now compile with this environment. To look for a LaTeX package, you can use the search engine on CTAN.
As stated at the beginning of this section, this approach is not optimal, but it has its merits, especially if you’re still working on the document. Once the environment is set up, you can simply work on the doc and compile it as needed using quarto render. In the next section, we will explain how to build a 100% reproducible document.
100% reproducible literate programming
Let’s not forget that Nix is not just a package manager, but also a programming language. The default.nix files that rix generates are written in this language, which was made entirely for the purpose of building software. If you are not a developer, you may not realise it, but the process of compiling a Quarto or LaTeX document is very similar to the process of building any piece of software. So we can use Nix to compile a document in a completely reproducible environment.
First, let’s fork the repo that contains the Quarto template we need. We will fork this repo. This repo contains the template.qmd file that we can change (which is why we fork it; in practice we would replace this template.qmd with our own, finished, source .qmd file). Now we need to change our default.nix from:
let
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
  rpkgs = builtins.attrValues {
    inherit (pkgs.rPackages) quarto MASS;
  };
  tex = (pkgs.texlive.combine {
    inherit (pkgs.texlive) scheme-small amsmath environ fontawesome5 orcidlink pdfcol tcolorbox tikzfill;
  });
  system_packages = builtins.attrValues {
    inherit (pkgs) R quarto;
  };
in
pkgs.mkShell {
  buildInputs = [ rpkgs tex system_packages ];
}
to the following:
let
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
  rpkgs = builtins.attrValues {
    inherit (pkgs.rPackages) quarto MASS;
  };
  tex = (pkgs.texlive.combine {
    inherit (pkgs.texlive) scheme-small amsmath environ fontawesome5 orcidlink pdfcol tcolorbox tikzfill;
  });
  system_packages = builtins.attrValues {
    inherit (pkgs) R quarto;
  };
in
pkgs.stdenv.mkDerivation {
  name = "my-paper";
  src = pkgs.fetchgit {
    url = "https://github.com/ropensci/my_paper/";
    rev = "715e9f007d104c23763cebaf03782b8e80cb5445";
    sha256 = "sha256-e8Xg7nJookKoIfiJVTGoJkvCuFNTT83YZ6SK3GT2T8g=";
  };
  buildInputs = [ rpkgs tex system_packages ];
  buildPhase =
    ''
      # Deno needs to add stuff to $HOME/.cache
      # so we give it a home to do this
      mkdir home
      export HOME=$PWD/home
      quarto add --no-prompt $src
      quarto render $PWD/template.qmd --to jss-pdf
    '';
  installPhase =
    ''
      mkdir -p $out
      cp template.pdf $out/
    '';
}
So we changed the second part of the file: we’re no longer building a shell using mkShell, but a derivation. Derivation is Nix jargon for package, or software. So what is our derivation? First, we clone the repo we forked just before (I forked the repository and called it my_paper):
pkgs.stdenv.mkDerivation {
  name = "my-paper";
  src = pkgs.fetchgit {
    url = "https://github.com/ropensci/my_paper/";
    rev = "715e9f007d104c23763cebaf03782b8e80cb5445";
    sha256 = "sha256-e8Xg7nJookKoIfiJVTGoJkvCuFNTT83YZ6SK3GT2T8g=";
  };
This repo contains our Quarto template, and because we’re using a specific commit, we will always use exactly this version of the template for our document. This is in contrast to before, where we used quarto add quarto-journals/jss to install the template. Doing this interactively makes our project not reproducible: if we compile our Quarto doc today, we use the template as it is today, but if we compile the document in 6 months, we would be using the template as it would be in 6 months. (It is possible to install specific releases of Quarto templates using the following notation: quarto add quarto-journals/jss@v0.9.2, so this problem can be mitigated.)
The next part of the file contains the following lines:
buildInputs = [ rpkgs tex system_packages ];
buildPhase =
  ''
    # Deno needs to add stuff to $HOME/.cache
    # so we give it a home to do this
    mkdir home
    export HOME=$PWD/home
    quarto add --no-prompt $src
    quarto render $PWD/template.qmd --to jss-pdf
  '';
The buildInputs are the same as before. What’s new is the buildPhase. This is the part in which the document actually gets compiled. The first step is to create a home directory. This is because Quarto needs to save the template we want to use in $HOME/.cache/deno. If you’re using quarto interactively, that’s not an issue, since your home directory will be used. But inside a Nix build, no writable home directory is available, so we need to create an empty directory and specify this as the home. This is what these two lines do:
mkdir home
export HOME=$PWD/home
($PWD, short for Print Working Directory, is a shell variable referring to the current working directory.)
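To see the effect of these two lines outside of a Nix build, here is a small sketch that runs them in a throwaway directory. The with_local_home function, the mktemp -d directory and the subshell are assumptions made for this illustration only; none of them appear in the derivation.

```shell
# Hypothetical helper: redirect HOME to a fresh "home" folder inside a
# temporary directory and print the new value. Everything runs in a
# subshell, so the HOME of the calling session is left untouched.
with_local_home() {
  (
    cd "$(mktemp -d)" || exit 1   # stand-in for the build directory
    mkdir home
    export HOME="$PWD/home"
    # A tool that writes to $HOME/.cache now writes below the build directory
    mkdir -p "$HOME/.cache"
    printf '%s\n' "$HOME"
  )
}

with_local_home
```

The printed path ends in /home and sits below the temporary directory, which mirrors what happens to Quarto's cache during the build.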
Now, we need to install the template that we cloned from GitHub. For this we can use quarto add just as before, but instead of installing it directly from GitHub, we install it from the repository that we cloned. We also add the --no-prompt flag so that the template gets installed without asking us for confirmation. This is similar to building a Docker image, where we don’t want any interactive prompt to show up, or else the process will get stuck. $src refers to the path of our downloaded GitHub repository. Finally we can compile the document:
quarto render $PWD/template.qmd --to jss-pdf
This will compile template.qmd (our finished paper).
Finally, there’s the installPhase:
installPhase =
  ''
    mkdir -p $out
    cp template.pdf $out/
  '';
$out is a shell variable defined inside the build environment that refers to the output path of the derivation in the Nix store, so we can use it to create a directory that will contain our output (the compiled PDF file). We use mkdir -p to create the whole directory structure, and then copy the compiled document to $out/. We can now build our document by running nix_build(). Now, you may be confused by the fact that you won’t see the PDF in your working directory. But remember that software built by Nix will always be stored in the Nix store, so our PDF is also in the store, since this is what we built. To find it, run:
readlink result
which will show the path of the directory in the Nix store that contains the PDF. You could use this to open the PDF in your PDF viewer application (on Linux at least):
xdg-open $(readlink result)/template.pdf
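If the build failed, or was run from another folder, the result symlink will not be there and the command above fails with a less than obvious message. As a hedged sketch, the lookup can be wrapped in a small function that checks each step first; built_pdf is a hypothetical helper name, and the file name template.pdf is taken from the installPhase above.

```shell
# Hypothetical helper: print the path of the built PDF, or fail with a
# message if the symlink or the PDF is missing.
# Usage (after a successful build): xdg-open "$(built_pdf)"
built_pdf() {
  link="${1:-result}"   # symlink created by the build, "result" by default
  [ -L "$link" ] || { echo "no symlink: $link" >&2; return 1; }
  pdf="$(readlink "$link")/template.pdf"
  [ -f "$pdf" ] || { echo "not found: $pdf" >&2; return 1; }
  printf '%s\n' "$pdf"
}
```

The function assumes that the symlink target is an absolute path, which is the case for the result links that Nix creates.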
Conclusion
This vignette showed two approaches, and both have their merits. The first, more interactive approach is useful while writing the document: you get access to a shell and can work on the document and compile it quickly. The second approach is more useful once the document is ready and you want a way of quickly rebuilding it for reproducibility purposes.