| Age | Commit message | Author |
|
|
Note: we don't have an API change, but we choose to increment the
third digit because of some significant behavior changes.
|
|
This reverts commit 12bce32bff7e74e0c405e0a979ef5d3e3528c4ad.
We don't need this until we have a solution to #10525.
|
|
...when `styles` extension is enabled. Closes #9603.
Also improve manual's coverage of custom styles.
|
|
Pod ("Plain old documentation") is a markup language used principally
to document Perl modules and programs. Since it was originally meant to
be translated pretty directly to man, the semantics are fairly simple.
This Pod reader was developed with reference to the canonical user and
implementer documentation of Pod: https://perldoc.perl.org/perlpod and
https://perldoc.perl.org/perlpodspec.
There are 1490 .pod, .pl, and .pm files in the Perl 5.34 distribution
found in /System/Library/Perl on my Mac. Of those, this reader dies with
a parse error on 7. All of them seem to be cases where pod commands are
found within a non-colon-prefixed =begin/=end. perlpodspec says I may
treat this as an error.
[API change] adds readPod
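To make the failure case above concrete, here is a minimal sketch (not the reader's actual logic) that flags Pod commands inside a non-colon-prefixed =begin/=end region. The function name, the sample input, and the set of commands tolerated inside such a region are assumptions made for illustration; consult perlpodspec for the precise rules.

```python
import re

# A command line starts with "=" followed by a letter at the start of a line.
COMMAND = re.compile(r"^=([a-zA-Z]\w*)(?:\s+(\S+))?")

# Assumption for this sketch: only these commands are tolerated in a data region.
ALLOWED_IN_DATA = {"begin", "end", "for"}


def commands_in_data_regions(pod_text):
    """Return (line_number, command) pairs for commands found inside a
    non-colon-prefixed =begin/=end region."""
    stack = []      # format names of the currently open =begin regions
    problems = []
    for number, line in enumerate(pod_text.splitlines(), start=1):
        match = COMMAND.match(line)
        if not match:
            continue
        name, arg = match.group(1), match.group(2) or ""
        # A region is a data region if any enclosing format lacks the colon.
        in_data = any(not fmt.startswith(":") for fmt in stack)
        if in_data and name not in ALLOWED_IN_DATA:
            problems.append((number, name))
        if name == "begin":
            stack.append(arg)
        elif name == "end" and stack:
            stack.pop()
    return problems


sample = """=head1 NAME

=begin html

=head2 Oops

=end html

=begin :text

=head2 Fine

=end :text
"""

print(commands_in_data_regions(sample))
```

Only the =head2 inside the `html` region is reported; the one inside `:text` is ordinary Pod because of the colon prefix.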
|
|
This change introduces a reader for mdoc, a roff-derived semantic markup
language for manual pages. The two relevant contemporary implementations
of mdoc for manual pages are mandoc (https://mandoc.bsd.lv/), which
implements the language from scratch in C, and groff
(https://www.gnu.org/software/groff/), which implements it as roff macros.
mdoc has a lot of semantics specific to technical manuals that aren't
representable in Pandoc's AST. I've taken a cue from the mandoc HTML
output and many mdoc elements are encoded as Codes or Spans with classes
named for the mdoc macro that produced them.
Much like web browsers with HTML, mandoc attempts to produce best-effort
output given all kinds of weird and crappy mdoc input. Part of the
reason it's able to do this is it uses a very accommodating parse tree
and stateful output routines specialized to the output mode, and when it
encounters some macro it wasn't expecting, it can easily give up on
whatever it was outputting and output something else. I've encoded as
much flexibility as I reasonably could into the mdoc reader here, but I
don't know how to be as flexible as mandoc.
This branch has been developed almost exclusively against mandoc's
documentation and implementation of mdoc as a reference, and the
real-world manual pages tested against are those from the OpenBSD base
system. Of ~3500 manuals in mdoc format shipped with a fresh OpenBSD
install, 17 cause the mdoc reader to exit with a parse error. Any
further chasing of edge cases is deferred to future work.
Many of the tests in test/Tests/Readers/Mdoc.hs are derived directly
from mandoc's extensive regression tests.
[API change] Adds readMdoc to the public API
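As a rough illustration of the encoding described above, this sketch builds pandoc-JSON inlines whose class is the mdoc macro name. The `encode_macro` helper and the choice of which macros become Code rather than Span are hypothetical; the commit message does not say how the reader divides them.

```python
import json

# Hypothetical split for this sketch: which macros the real reader turns into
# Code versus Span is not stated in the commit message.
CODE_MACROS = {"Fl", "Cm"}


def encode_macro(macro, text):
    """Build a pandoc-JSON inline whose class is the mdoc macro name."""
    attr = ["", [macro], []]          # [identifier, classes, key-value pairs]
    if macro in CODE_MACROS:
        return {"t": "Code", "c": [attr, text]}
    # Str holds a single word, so this sketch assumes text has no spaces.
    return {"t": "Span", "c": [attr, [{"t": "Str", "c": text}]]}


print(json.dumps(encode_macro("Nm", "ls")))
print(json.dumps(encode_macro("Fl", "-l")))
```

Keeping the macro name as a class means a filter or stylesheet can still tell, say, a command name from a flag even though the AST has no dedicated node for either.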
|
|
The existing lexRoff does some stuff I don't want to deal with in mdoc
just yet, like lexing tbl, and some stuff I won't do at all, like
handling macro and text string definitions and switching between modes.
Uses a typeclass with associated type families to reuse most of the
escaping code between Roff (i.e. man) and Mdoc.
Future work could improve on this so that more lexing code could be
shared between Man and Mdoc. Mdoc inherits Roff's surface syntax so
hypothetically it makes sense to lex it into tokens that make sense for
roff. But it happens that the Mdoc parser is much easier to build with
an Mdoc specific token stream. Some discussion in jgm/pandoc#10225 about
the rationale.
Adds a test for the roff \A escape, which I accidentally dropped support
for in an earlier iteration without anything complaining.
|
|
Closes #10379.
|
|
Remove unnecessary definition of `endnote`.
Incorporate the one remaining definition into `default.typst`.
|
|
(Tested.)
|
|
Text.Pandoc.Logging: add YamlWarning constructor to LogMessage
[API change].
Closes #10312.
|
|
We incorporate this into fonts.tex, and move the beamer theme-setting
commands before both of them.
|
|
+ Split out common parts of latex template into partials: common.latex,
fonts.latex, font-settings.latex, passoptions.latex, hypersetup.latex,
after-header-includes.latex.
+ Split out old latex template into default.latex and default.beamer.
+ Make default.beamer the default template for beamer.
|
|
Pandoc already depends on `crypton-connection`, and thus transitively
on `crypton`. The latter provides a vast variety of hashing algorithms
and makes the dependency on SHA unnecessary.
|
|
The ANSI writer (-t ansi) outputs a document formatted with ANSI control
sequences for reading on the console.
Most Pandoc elements are supported and printed in a reasonable way, if
not always ideally. This version does no detection of terminal
capabilities nor does it fall back to different output styles for
less-capable terminals.
Some gory details:
- Title blocks are formatted with modest extravagance in --standalone
mode.
- Strong, Emph, Underline, and Strikeout spans are all formatted
accordingly using SGR codes (which will be silently ignored by
terminals that don't support them).
- Headings have somewhat arbitrary styles applied to them that
probably need immediate improvement.
- Blockquotes and all flavors of list look pretty good.
- Code spans are colored magenta-on-white, which on the author's
terminal looks kind of like the pinkish treatment of code spans used
by many stylesheets. This probably isn't a good final decision.
- Code blocks are formatted by Skylighting's formatANSI using standard
writer options and included directly in the output. This has some
issues; see code comments.
- Links are printed with OSC 8 to create hyperlinks and colored cyan.
The author's terminal automatically adds a dotted-underline to OSC 8
hyperlinks, but only colors them differently on command-mouseover.
Setting an underlined style on links may be more broadly accessible.
OSC 8 support is not checked for, so on terminals not supporting it or
with support disabled, the link text will be colored but inert, and
the link URLs will not be printed.
- Images are displayed as their alt text. Support for the Kitty and
iTerm 2 inline image protocols is planned. Supporting other terminals
by using Chafa (https://hpjansson.org/chafa/) to print sixels etc. would
be cool too but the author would have to do some FFI stuff and it would
add a dependency to Pandoc.
- Tables are replaced with a useless placeholder. Table output using
box-drawing characters is desired.
- Subscripts and Superscripts are just parenthesized when accurate Unicode
representations aren't available. Because these span types could have
all kinds of semantics, there's not an obvious thing to do with them.
- Simple math is translated to Pandoc inlines using existing
functionality. An ambitious person could look into emulating the
console-mode math output of a computer algebra system, or rendering each
display math element as an image with TeX or Typst and including it, or
some other thing.
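For readers unfamiliar with the escape sequences mentioned in the list above, here is a small sketch of SGR styling and an OSC 8 hyperlink. It is not the writer's code: the helper names are made up, and the exact SGR parameters the writer emits are assumed to be the standard ones for bold, italic, underline, and strikethrough.

```python
ESC = "\x1b"
RESET = f"{ESC}[0m"

# Standard SGR parameters, assumed here for the four inline styles.
SGR = {"strong": 1, "emph": 3, "underline": 4, "strikeout": 9}


def styled(kind, text):
    """Wrap text in an SGR sequence and reset all attributes afterwards."""
    return f"{ESC}[{SGR[kind]}m{text}{RESET}"


def hyperlink(url, text):
    """OSC 8 hyperlink whose visible text is colored cyan (SGR 36)."""
    start = f"{ESC}]8;;{url}{ESC}\\"   # open the link: OSC 8 ; params ; URL ST
    end = f"{ESC}]8;;{ESC}\\"          # close the link: empty URL
    return f"{start}{ESC}[36m{text}{RESET}{end}"


print(styled("strong", "bold"), styled("emph", "italic"))
print(hyperlink("https://pandoc.org", "pandoc"))
```

A terminal without OSC 8 support typically ignores the link wrapper and shows only the cyan text, which matches the behavior described in the links item.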
|
|