Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

The goal of tinkr is to convert (R)Markdown files to XML and back, so that they can be edited with xml2 (XPath!) instead of numerous complicated regular expressions. If you are new to XPath, refer to this great intro. Possible applications are R scripts using this and XPath in xml2 to:

Only the body of the (R)Markdown file is cast to XML, using the CommonMark specification via the commonmark package. YAML metadata can be edited with the yaml package; doing so is outside the scope of tinkr.
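To give a feel for the XML that the commonmark package produces, here is a minimal sketch that bypasses tinkr entirely. It uses only commonmark::markdown_xml and xml2; the Markdown string is made up for illustration.

```r
library(commonmark)
library(xml2)

md <- "# Title\n\nSome *emphasised* text.\n"

# commonmark returns the XML as a single character string ...
xml_string <- markdown_xml(md)

# ... which xml2 parses into a document that supports XPath queries.
doc <- read_xml(xml_string)
ns <- xml_ns(doc)  # the default namespace is exposed under the prefix "d1"

# Names of the top-level block nodes, in document order.
block_names <- xml_name(xml_children(doc))

# Text of every emphasised span.
emph_text <- xml_text(xml_find_all(doc, ".//d1:emph", ns))

print(block_names)
print(emph_text)
```

The "d1" prefix is how xml2 labels a default namespace, which is why the XPath expressions in the examples of this README all start with d1:.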

The current workflow I have in mind is:

  1. use to_xml to obtain XML from (R) Markdown (based on commonmark::markdown_xml and blogdown:::split_yaml_body).

  2. edit the XML using xml2.

  3. use to_md to save back the resulting (R) Markdown (this uses an XSLT stylesheet, and the xslt package).
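Step 3 relies on an XSLT transformation from XML to text. The sketch below shows the general idea with a toy stylesheet that only handles headings. It is not the stylesheet shipped with tinkr, and the md prefix and the sample Markdown are assumptions made for this illustration.

```r
library(commonmark)
library(xml2)
library(xslt)

doc <- read_xml(markdown_xml("# Title\n\n## Section\n"))

# Reuse the document's own default namespace so the stylesheet can match it.
ns_uri <- xml_ns(doc)[["d1"]]

# Toy stylesheet, NOT the one shipped with tinkr: it only knows headings.
stylesheet <- read_xml(paste0(
  '<xsl:stylesheet version="1.0"',
  ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform"',
  ' xmlns:md="', ns_uri, '">',
  '<xsl:output method="text"/>',
  # Silence the built-in rule that would copy stray text nodes.
  '<xsl:template match="text()"/>',
  '<xsl:template match="md:heading">',
  # One "#" per heading level, then the heading text and a newline.
  '<xsl:value-of select="substring(\'######\', 1, @level)"/>',
  '<xsl:text> </xsl:text>',
  '<xsl:value-of select="md:text"/>',
  '<xsl:text>&#10;</xsl:text>',
  '</xsl:template>',
  '</xsl:stylesheet>'
))

# With output method "text", the result is Markdown text rather than XML.
markdown <- xml_xslt(doc, stylesheet)
cat(markdown)
```

A real stylesheet needs one template per node type (paragraphs, lists, links, code blocks and so on), which is what to_md takes care of.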

Maybe there could be shortcut functions for some operations in step 2, maybe not.


Wanna try the package and tell me what doesn’t work?




This is a basic example. We read “”, change all level 3 headers to level 1 headers, and save the result back to Markdown.

library("magrittr")
library("tinkr")

# From Markdown to XML
path <- system.file("extdata", "", package = "tinkr")
yaml_xml_list <- to_xml(path)

# Transform level 3 headers into level 1 headers
body <- yaml_xml_list$body
headers3 <- body %>%
  xml2::xml_find_all(xpath = './/d1:heading',
                     xml2::xml_ns(.)) %>%
  .[xml2::xml_attr(., "level") == "3"]

# xml2 modifies nodes in place, so `body` is updated as well
xml2::xml_set_attr(headers3, "level", 1)

yaml_xml_list$body <- body

# Back to Markdown
to_md(yaml_xml_list, "")
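The filtering step can also be pushed into the XPath expression itself. The following sketch does that on a small made-up Markdown string, parsed directly with commonmark so that it runs without any file on disk; with tinkr you would apply the same XPath to the body returned by to_xml.

```r
library(commonmark)
library(xml2)

md <- "# One\n\n### Three\n\nText.\n\n### Also three\n"
body <- read_xml(markdown_xml(md))
ns <- xml_ns(body)

# Select level 3 headings directly with an attribute predicate.
headers3 <- xml_find_all(body, ".//d1:heading[@level='3']", ns)

# xml2 nodes are modified in place, so `body` reflects the change.
xml_set_attr(headers3, "level", "1")

# Collect the level of every heading after the edit.
levels_after <- xml_attr(xml_find_all(body, ".//d1:heading", ns), "level")
print(levels_after)
```

Using a predicate avoids the pipe and the subsetting step, at the cost of a slightly denser XPath expression.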

R Markdown

For R Markdown files, to ease editing of chunk labels and options, to_xml munges the chunk info into separate attributes. For example, below you can see that code_block nodes can have language, name, and echo attributes.

path <- system.file("extdata", "example2.Rmd", package = "tinkr")
yaml_xml_list <- tinkr::to_xml(path)
# Print the XML body to inspect the chunk attributes
yaml_xml_list$body
#> {xml_document}
#> <document xmlns="">
#>  [1] <code_block language="r" name="setup" include="FALSE" eval="TRUE">k ...
#>  [2] <heading level="2">\n  <text>R Markdown</text>\n</heading>
#>  [3] <paragraph>\n  <text>This is an R Markdown document. Markdown is a  ...
#>  [4] <paragraph>\n  <text>When you click the </text>\n  <strong>\n    <t ...
#>  [5] <code_block language="r" name="" eval="TRUE" echo="TRUE">summary(ca ...
#>  [6] <heading level="2">\n  <text>Including Plots</text>\n</heading>
#>  [7] <paragraph>\n  <text>You can also embed plots, for example:</text>\ ...
#>  [8] <code_block language="python" name="" echo="FALSE" eval="TRUE">plot ...
#>  [9] <code_block language="python" name="">plot(pressure)\n</code_block>
#> [10] <paragraph>\n  <text>Note that the </text>\n  <code>echo = FALSE</c ...
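Because chunk options become plain XML attributes, they can be changed with ordinary xml2 calls. The sketch below is an illustration, not tinkr output: the small document is written by hand to mimic the structure printed above, the namespace URI is the CommonMark one (an assumption about what to_xml emits), and the label "cars-summary" is made up.

```r
library(xml2)

# Hand-written stand-in for the body returned by to_xml(); not real tinkr output.
body <- read_xml(paste0(
  '<document xmlns="http://commonmark.org/xml/1.0">',
  '<code_block language="r" name="setup" include="FALSE" eval="TRUE">',
  'library(knitr)\n</code_block>',
  '<code_block language="r" name="" eval="TRUE" echo="TRUE">',
  'summary(cars)\n</code_block>',
  '<code_block language="python" name="" echo="FALSE" eval="TRUE">',
  'plot(pressure)\n</code_block>',
  '</document>'
))
ns <- xml_ns(body)

# Select only the R chunks and hide their source in the rendered output.
r_chunks <- xml_find_all(body, ".//d1:code_block[@language='r']", ns)
xml_set_attr(r_chunks, "echo", "FALSE")

# Give the unnamed R chunk a label (hypothetical name).
unnamed <- xml_find_all(body, ".//d1:code_block[@language='r'][@name='']", ns)
xml_set_attr(unnamed, "name", "cars-summary")

chunk_names <- xml_attr(xml_find_all(body, ".//d1:code_block", ns), "name")
r_echo <- xml_attr(r_chunks, "echo")
print(chunk_names)
```

The Python chunk is left untouched because the XPath predicate restricts the selection to chunks whose language attribute is "r".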

Inserting new elements

You can insert new elements into the document via {xml2}, but you should make sure that the namespace matches that of your xml document. For example, let’s say we wanted to add a new R code chunk after the setup chunk:

NOTE: The text of an inserted code chunk MUST end with a newline character, or else its last line will be lost.

path <- system.file("extdata", "example2.Rmd", package = "tinkr")
yaml_xml_list <- tinkr::to_xml(path)
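# Add chunk into document after the setup chunk. The start of this call is
# missing from the source; the lines marked "assumed" are a reconstruction.
xml2::xml_add_child(yaml_xml_list$body,                                # assumed
                    "code_block",                                      # assumed
                    language = "r",                                    # assumed
                    name = "xml-block",                                # assumed
                    xmlns = xml2::xml_ns(yaml_xml_list$body)[["d1"]],  # assumed
                    "message(\"this is a new chunk from {tinkr}\")\n",
                    .where = 1L)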
out <- tempfile(fileext = ".Rmd")
tinkr::to_md(yaml_xml_list, out)

Loss of Markdown style

General principles and solution

The (R)md to XML to (R)md loop on which tinkr is based is slightly lossy, because Markdown offers several equivalent ways to write the same structure and the XML does not record which one was used. For instance

  • Lists can be created with either “+”, “-” or “*”. When using tinkr, the (R)md after editing will only use “-” for lists.

  • Reference-style links, written as [word][smallref] with a [smallref]: URL definition at the bottom of the document, become inline links [word](URL).

  • Special characters are escaped (e.g. “[” when it is not part of a link).

  • Every line of a block quote gets a leading “>”, whereas in the input only the first line might have had one.

  • For tables see the next subsection.
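The first two losses can be observed at the XML level without tinkr. This small sketch, using only commonmark::markdown_xml on made-up snippets, checks that different spellings of the same structure produce identical XML, so the original spelling cannot be recovered when converting back.

```r
library(commonmark)

# The three bullet markers parse to the same XML ...
star <- markdown_xml("* item\n")
dash <- markdown_xml("- item\n")
plus <- markdown_xml("+ item\n")
same_lists <- identical(star, dash) && identical(dash, plus)

# ... and so do reference-style and inline links.
reference <- markdown_xml("[word][smallref]\n\n[smallref]: https://example.com\n")
inline <- markdown_xml("[word](https://example.com)\n")
same_links <- identical(reference, inline)

print(c(lists = same_lists, links = same_links))
```

Since the information is already gone once the XML exists, the choice of output style has to be made by the stylesheet used in to_md.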

Such losses make your (R)md different, and the git diff a bit harder to parse, but should not change the documents your (R)md is rendered to. If it does, report a bug in the issue tracker!

A solution to avoid losing your Markdown style, e.g. your preference for “*” over “-” for lists, is to tweak our XSL stylesheet and provide its file path as the stylesheet_path argument to to_md.

The special case of tables

  • Tables are supposed to remain/become pretty after a full loop to_xml + to_md. If you notice something amiss, e.g. too much space compared to what you were expecting, please open an issue.


Please note that the ‘tinkr’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.