Spec Compliance

This chapter gives a brief overview of how closely Laika adheres to the various formats it parses.

Markdown

Laika sticks to the original syntax description and incorporates the test suite from the PHP Markdown project, which is a slightly expanded version of John Gruber's original suite.

Where both the spec and the test suite are silent on how to handle an edge case, the Babelmark tool has been consulted, and Laika usually follows the approach chosen by the majority of available parsers.

Test Suite

The testing approach in Laika is adapted to cater for the library's design goals, one of which is decoupling input and output formats. Most existing official test suites provide input in text markup and expected output in HTML. Where that expected output differs from what Laika's built-in HTML renderer produces, the renderer is adjusted with overrides just for the test. This is acceptable because these subtle differences do not represent a semantic difference. The only way to avoid these kinds of "little cheats" would be to have a separate HTML renderer for each supported text markup format. But this would be undesirable, in particular in cases where users mix documents with different markup formats in the same input directory.

Verbatim HTML

One major difference from standard Markdown is that the parsing of verbatim HTML elements is not enabled by default, as Laika discourages the coupling of input and output formats. It can be switched on if required.

See Raw Content for examples of how to enable verbatim HTML in the sbt plugin or the library API.

When this support is switched on, it follows the original spec, including support for interspersing text markup and HTML syntax in the input.
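As a minimal sketch of switching this on through the library API, the following assumes a recent Laika version in which Transformer lives in laika.api and the format objects in laika.format. The object name VerbatimHtmlSketch and the sample input are made up for illustration, and the exact rendered output is not shown because it may differ between versions.

```scala
import laika.api._
import laika.format._

object VerbatimHtmlSketch {

  // withRawContent opts in to verbatim HTML parsing, which is off by default.
  val transformer = Transformer
    .from(Markdown)
    .to(HTML)
    .withRawContent
    .build

  // Hypothetical input mixing text markup and an inline HTML element.
  val input = """Some *emphasized* text and a <span class="x">raw</span> element."""

  // transform returns an Either: Left for an error, Right for the rendered HTML.
  val result = transformer.transform(input)

  def main(args: Array[String]): Unit = result match {
    case Right(html) => println(html)
    case Left(error) => println(s"Transformation failed: $error")
  }
}
```

Without the withRawContent call the same transformer would treat the span tag as ordinary text rather than as an HTML element.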

GitHub Flavored Markdown

Laika supports the syntax of GitHub Flavored Markdown through an ExtensionBundle that must be enabled explicitly. These are the parsers this extension adds to standard Markdown:

Subtle Differences to the GitHub Specification

CommonMark

Laika does not yet integrate the official CommonMark test suite.

In practice the differences should be minor, as CommonMark is a specification that builds on top of the original Markdown spec plus some aspects of GitHub Flavor, both of which Laika supports. It mostly removes some ambiguity and adds detail to some of the under-specified features of classic Markdown.

Given that the effort would be quite significant (the test suite covers more than 600 tests) and the current level of participation in Laika development is not very high, it is unlikely that the suite will be integrated in the near future. The idea could be revived if new contributors chime in.

reStructuredText

reStructuredText is part of Python's Docutils project. It is more strictly defined than Markdown, with a detailed specification and clearly defined markup recognition rules.

There appears to be no official test suite for reStructuredText. To add a realistic test nonetheless, Laika's test suite includes a full transformation of the reStructuredText specification itself.

Supported Standard Directives

Directives are an extension mechanism of reStructuredText and the reference implementation supports a set of standard directives out of the box.

Out of this set Laika supports the following:

The following limitations apply to these directives:

Supported Standard Text Roles

Text roles are a second extension mechanism for applying functionality to spans of text and the reference implementation supports a set of standard text roles out of the box.

Out of this set Laika supports the following:

Unsupported Extensions

The following extensions are not supported:

There are various reasons for excluding these extensions, some of them rather technical. For example, the target-notes and class directives would require processing beyond the directive itself and would therefore require new API. Others, like the pep-reference text role, seemed too exotic to warrant inclusion in Laika.

Raw Content Extensions

Two of the supported standard extensions, the raw directive and the raw text role, embed content in the target format in text markup. As with verbatim HTML for Markdown, these extensions are disabled by default, as Laika discourages the coupling of input and output formats, but they can be switched on if required.

See Raw Content for examples of how to enable these extensions in the sbt plugin or the library API.
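For reStructuredText the opt-in might look like the sketch below, again assuming a recent Laika version with Transformer in laika.api and ReStructuredText in laika.format. The object name RawDirectiveSketch and the sample markup are hypothetical, and the exact rendered output is not shown.

```scala
import laika.api._
import laika.format._

object RawDirectiveSketch {

  // withRawContent enables the raw directive and raw text role,
  // both of which are disabled by default.
  val transformer = Transformer
    .from(ReStructuredText)
    .to(HTML)
    .withRawContent
    .build

  // Hypothetical input: a paragraph followed by a raw directive
  // that embeds HTML directly in the text markup.
  val input =
    """A regular paragraph.
      |
      |.. raw:: html
      |
      |   <span class="note">inserted as is</span>
      |""".stripMargin

  // Left carries an error, Right carries the rendered HTML.
  val result = transformer.transform(input)

  def main(args: Array[String]): Unit = result match {
    case Right(html) => println(html)
    case Left(error) => println(s"Transformation failed: $error")
  }
}
```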

Implementing a Custom Directive

Laika comes with a concise and typesafe DSL to declare custom directives and text roles. It is fully documented in the scaladoc for Directives and TextRoles.

HOCON

Laika fully supports the official HOCON specification, based on its own parsers (it does not wrap the Typesafe Config library).

HOCON in Laika is supported in various places:

For more information on how HOCON is used in the library see Laika's HOCON API.

The only, fairly minor, exception in spec compliance is the deliberate decision not to support circular references. Direct self references are supported (e.g. path = ${path} [img, css]) as they cover a very common use case, but an indirect chain, where a refers to b and b in turn refers to an earlier definition of a, is not. The additional implementation complexity this would require seems disproportionate to the limited benefit of supporting this edge case. Convoluted circular references are also harder to read and error-prone when definitions move.
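The distinction can be illustrated with two small HOCON fragments. The keys and values are made up for illustration, and the second fragment is only one reading of the indirect case described above, not an example taken from Laika's test suite.

```hocon
# Direct self reference: supported.
# The second definition looks back at the earlier value of the same key.
path = [bin]
path = ${path} [img, css]   # resolves to [bin, img, css]

# Indirect reference back to an earlier definition: not supported.
# The later definition of a refers to b, which refers to a.
a = [x]
b = ${a}
a = ${b} [y]
```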

If you think this should be supported, please open a ticket and explain your reasoning and your use case.