Diffstat (limited to 'wasm/examples/markdown-to-rst/stdin')
-rw-r--r-- wasm/examples/markdown-to-rst/stdin 250
1 files changed, 250 insertions, 0 deletions
diff --git a/wasm/examples/markdown-to-rst/stdin b/wasm/examples/markdown-to-rst/stdin
new file mode 100644
index 000000000..7c7cd7c8a
--- /dev/null
+++ b/wasm/examples/markdown-to-rst/stdin
@@ -0,0 +1,250 @@
+---
+author:
+- Albert Krewinkel
+- John MacFarlane
+date: 'January 10, 2020'
+title: Pandoc Lua Filters
+---
+
+# Introduction
+
+Pandoc has long supported filters, which allow the pandoc
+abstract syntax tree (AST) to be manipulated between the parsing
+and the writing phase. [Traditional pandoc
+filters](https://pandoc.org/filters.html) accept a JSON
+representation of the pandoc AST and produce an altered JSON
+representation of the AST. They may be written in any
+programming language, and invoked from pandoc using the
+`--filter` option.
+
+Although traditional filters are very flexible, they have a
+couple of disadvantages. First, there is some overhead in
+writing JSON to stdout and reading it from stdin (twice, once on
+each side of the filter). Second, whether a filter will work
+will depend on details of the user's environment. A filter may
+require an interpreter for a certain programming language to be
+available, as well as a library for manipulating the pandoc AST
+in JSON form. One cannot simply provide a filter that can be
+used by anyone who has a certain version of the pandoc
+executable.
+
+Starting with version 2.0, pandoc makes it possible to write
+filters in Lua without any external dependencies at all. A Lua
+interpreter (version 5.3) and a Lua library for creating pandoc
+filters are built into the pandoc executable. Pandoc data types
+are marshaled to Lua directly, avoiding the overhead of writing
+JSON to stdout and reading it from stdin.
+
+Here is an example of a Lua filter that converts strong emphasis
+to small caps:
+
+``` lua
+return {
+  {
+    Strong = function (elem)
+      return pandoc.SmallCaps(elem.content)
+    end,
+  }
+}
+```
+
+or equivalently,
+
+``` lua
+function Strong(elem)
+  return pandoc.SmallCaps(elem.content)
+end
+```
+
+This says: walk the AST, and when you find a Strong element,
+replace it with a SmallCaps element with the same content.
+
+To run it, save it in a file, say `smallcaps.lua`, and invoke
+pandoc with `--lua-filter=smallcaps.lua`.
+
+Here's a quick performance comparison, converting the pandoc
+manual (MANUAL.txt) to HTML, with versions of the same JSON
+filter written in compiled Haskell (`smallcaps`) and interpreted
+Python (`smallcaps.py`):
+
+  Command                                    Time
+  --------------------------------------- -------
+  `pandoc`                                  1.01s
+  `pandoc --filter ./smallcaps`             1.36s
+  `pandoc --filter ./smallcaps.py`          1.40s
+  `pandoc --lua-filter ./smallcaps.lua`     1.03s
+
+As you can see, the Lua filter adds only about 0.02s to the
+unfiltered run, against 0.35s or more for the JSON filters: it avoids
+the overhead associated with marshaling to and from JSON over a pipe.
+
+# Lua filter structure
+
+Lua filters are tables with element names as keys and values
+consisting of functions acting on those elements.
+
+Filters are expected to be put into separate files and are
+passed via the `--lua-filter` command-line argument. For
+example, if a filter is defined in a file `current-date.lua`,
+then it would be applied like this:
+
+ pandoc --lua-filter=current-date.lua -f markdown MANUAL.txt
+
+The `--lua-filter` option may be supplied multiple times. Pandoc
+applies all filters (including JSON filters specified via
+`--filter` and Lua filters specified via `--lua-filter`) in the
+order they appear on the command line.
+
+Pandoc expects each Lua file to return a list of filters. The
+filters in that list are called sequentially, each on the result
+of the previous filter. If there is no value returned by the
+filter script, then pandoc will try to generate a single filter
+by collecting all top-level functions whose names correspond to
+those of pandoc elements (e.g., `Str`, `Para`, `Meta`, or
+`Pandoc`). (That is why the two examples above are equivalent.)
+
+For each filter, the document is traversed and each element
+subjected to the filter. Elements for which the filter contains
+an entry (i.e. a function of the same name) are passed to the Lua
+element filtering function. In other words, filter entries will
+be called for each corresponding element in the document,
+getting the respective element as input.
+
+The return value of a filter function must be one of the
+following:
+
+- nil: this means that the object should remain unchanged.
+- a pandoc object: this must be of the same type as the input
+ and will replace the original object.
+- a list of pandoc objects: these will replace the original
+ object; the list is merged with the neighbors of the
+ original object (spliced into the list the original object
+ belongs to); returning an empty list deletes the object.
+
+The function's output must result in an element of the same type
+as the input. This means a filter function acting on an inline
+element must return either nil, an inline, or a list of inlines,
+and a function filtering a block element must return one of nil,
+a block, or a list of block elements. Pandoc will throw an error
+if this condition is violated.
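+
+The three kinds of return value can be sketched with a single
+filter function on `Str` elements. The marker strings `{{drop}}`
+and `{{br}}` are invented for this illustration and have no
+special meaning in pandoc.
+
+``` lua
+-- Hypothetical markers, chosen only for this sketch.
+function Str (elem)
+  if elem.text == '{{drop}}' then
+    -- An empty list deletes the element.
+    return {}
+  elseif elem.text == '{{br}}' then
+    -- A single inline replaces the original inline.
+    return pandoc.LineBreak()
+  elseif elem.text == 'TODO' then
+    -- A list of inlines is spliced into the surrounding list.
+    return {pandoc.Strong{pandoc.Str 'TODO'}, pandoc.Str ':'}
+  end
+  -- Falling through returns nil, so any other Str stays unchanged.
+end
+```
+
+Every branch returns inlines only, which keeps the function within
+the rule stated above for filters acting on inline elements.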
+
+If there is no function matching the element's node type, then
+the filtering system will look for a more general fallback
+function. Two fallback functions are supported, `Inline` and
+`Block`. Each matches elements of the respective type.
+
+Elements without matching functions are left untouched.
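+
+A minimal sketch of how a specific function and a fallback might
+be combined. The particular replacements (emphasis to strong
+emphasis, soft line breaks to spaces) are arbitrary choices made
+for illustration.
+
+``` lua
+-- Specific function: used for Emph elements.
+function Emph (elem)
+  return pandoc.Strong(elem.content)
+end
+
+-- Fallback: used for every inline element that has no function of
+-- its own in this filter, i.e. everything except Emph here.
+function Inline (elem)
+  if elem.tag == 'SoftBreak' then
+    return pandoc.Space()
+  end
+  -- nil for all other inlines: they are left unchanged.
+end
+```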
+
+See [module documentation](#module-pandoc) for a list of pandoc
+elements.
+
+## Filters on element sequences
+
+For some filtering tasks, it is necessary to know the order
+in which elements occur in the document. It is not enough then to
+inspect a single element at a time.
+
+There are two special function names, which can be used to define
+filters on lists of blocks or lists of inlines.
+
+[`Inlines (inlines)`]{#inlines-filter}
+: If present in a filter, this function will be called on all
+ lists of inline elements, like the content of a [Para]
+ (paragraph) block, or the description of an [Image]. The
+ `inlines` argument passed to the function will be a [List] of
+ [Inline] elements for each call.
+
+[`Blocks (blocks)`]{#blocks-filter}
+: If present in a filter, this function will be called on all
+ lists of block elements, like the content of a [MetaBlocks]
+ meta element, each item of a list, and the main
+ content of the [Pandoc] document. The `blocks` argument
+ passed to the function will be a [List] of [Block] elements
+ for each call.
+
+These filter functions are special in that the result must either
+be nil, in which case the list is left unchanged, or must be a
+list of the correct type, i.e., the same type as the input
+argument. Single elements are **not** allowed as return values,
+as a single element in this context usually hints at a bug.
+
+See ["Remove spaces before normal citations"][Inlines filter
+example] for an example.
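+
+As a shorter sketch of the same mechanism, the function below
+removes any `Space` that directly precedes a footnote. Targeting
+`Note` elements is only an illustrative choice; the point is that
+the decision depends on an element's neighbor, which a
+single-element filter cannot see.
+
+``` lua
+function Inlines (inlines)
+  -- Walk backwards so that removals do not shift unvisited indices.
+  for i = #inlines - 1, 1, -1 do
+    if inlines[i].tag == 'Space' and inlines[i + 1].tag == 'Note' then
+      table.remove(inlines, i)
+    end
+  end
+  -- Return the list itself, never a single element.
+  return inlines
+end
+```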
+
+This functionality was added in pandoc 2.9.2.
+
+[Inlines filter example]: #remove-spaces-before-citations
+
+## Traversal order
+
+The traversal order of filters can be selected by setting the key
+`traverse` to either `'topdown'` or `'typewise'`; the default is
+`'typewise'`.
+
+Example:
+
+``` lua
+local filter = {
+ traverse = 'topdown',
+ -- ... filter functions ...
+}
+return {filter}
+```
+
+Support for this was added in pandoc 2.17; previous versions
+ignore the `traverse` setting.
+
+### Typewise traversal
+
+Element filter functions within a filter set are called in a
+fixed order, skipping any which are not present:
+
+ 1. functions for [*Inline* elements](#type-inline),
+ 2. the [`Inlines`](#inlines-filter) filter function,
+ 3. functions for [*Block* elements](#type-block),
+ 4. the [`Blocks`](#blocks-filter) filter function,
+ 5. the [`Meta`](#type-meta) filter function, and last
+ 6. the [`Pandoc`](#type-pandoc) filter function.
+
+It is still possible to force a different order by explicitly
+returning multiple filter sets. For example, if the filter for
+*Meta* is to be run before that for *Str*, one can write
+
+``` lua
+-- ... filter definitions ...
+
+return {
+ { Meta = Meta }, -- (1)
+ { Str = Str } -- (2)
+}
+```
+
+Filter sets are applied in the order in which they are returned.
+All functions in set (1) are thus run before those in (2),
+causing the filter function for *Meta* to be run before the
+filtering of *Str* elements is started.
+
+### Topdown traversal
+
+It is sometimes more natural to traverse the document tree
+depth-first from the root towards the leaves, and all in a single
+run.
+
+For example, a block list `[Plain [Str "a"], Para [Str
+"b"]]`{.haskell} will try the following filter functions, in
+order: `Blocks`, `Plain`, `Inlines`, `Str`, `Para`, `Inlines`,
+`Str`.
+
+Topdown traversals can be cut short by returning `false` as a
+second value from the filter function. No child element of
+the returned element is processed in that case.
+
+For example, to exclude the contents of a footnote from being
+processed, one might write
+
+``` lua
+traverse = 'topdown'
+function Note (n)
+ return n, false
+end
+```
+