| Age | Commit message | Author |
|
Closes #10758.
When the alt differs from the caption, but only as regards
formatting, we still use an implicit figure.
|
|
Also, when falling back to a Div with class `figure` for a figure
that can't be represented any other way, include a Div with class
`caption` containing the caption.
Closes #10755.
|
|
Closes #10747.
|
|
Convert newlines to spaces as we do in other formats.
Closes #10730.
|
|
This affects T.P.RoffChar, T.P.Writers.Roff,
and the Man and Ms writers.
That is, `\(xy` instead of `\[xy]`. This was the original AT&T troff
form and is the most widely supported. The bracketed form causes
problems for some tools, e.g. `makewhatis` on macOS.
Closes #10716.
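As a rough illustration of the difference between the two forms (not the code in T.P.RoffChar), the following Python sketch rewrites bracketed escapes with a two-character glyph name into the parenthesized form. The function name `to_paren_form` and the regular expression are hypothetical, chosen only for this example; longer glyph names are left alone because `\(` only takes two-character names.

```python
import re

# Matches the bracketed form \[xy] only when the glyph name has exactly
# two characters; longer names such as \[u2014] do not match.
BRACKETED_TWO_CHAR = re.compile(r"\\\[([^\]\s]{2})\]")


def to_paren_form(text):
    """Rewrite \\[xy] as \\(xy, leaving all other text untouched."""
    return BRACKETED_TWO_CHAR.sub(lambda m: "\\(" + m.group(1), text)
```

For example, `to_paren_form(r"a \[em] b")` yields `a \(em b`.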
|
|
In GFM, you need to use `\\{` rather than `\{` for a literal brace.
Closes #10631.
|
|
Closes #10708.
|
|
In LaTeX's tabular environment, the tabular newline takes an optional
argument that we skip. But it only takes a single optional argument, and
any further square-bracketed text that follows shouldn't be skipped.
Fixes #7512, and also adds a test for the original problem raised in
that issue, which had already been fixed at some point.
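The rule can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the LaTeX reader's actual parser: the name `split_rows` is hypothetical, and the sketch ignores `\\*` and whitespace between `\\` and its argument.

```python
import re

# A tabular newline: "\\" followed by at most ONE optional [..] argument.
# Any further bracketed text is ordinary cell content and must be kept.
TABULAR_NEWLINE = re.compile(r"\\\\(?:\[[^\]]*\])?")


def split_rows(body):
    """Split a tabular body into rows, dropping empty trailing rows."""
    return [row.strip() for row in TABULAR_NEWLINE.split(body) if row.strip()]
```

With this rule, `split_rows(r"a \\[2pt][x] & y")` keeps `[x]` as the start of the second row instead of skipping it.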
|
|
|
|
Previously we tried to handle things like commented out list
items:

    - one
    <!--
    - two
    -->
    - three

and also things like:

    - one `and
    - two` and

But the code we added to handle these cases caused problems with
other, more straightforward things, like:

    - one
    - ```
      code
      ```
    - three

So we are rolling back all the fanciness, so that the markdown
parser now behaves more like the commonmark parser, in which
indicators of block-level structure always take priority over
indicators of inline structure.
Closes #9865. Closes #7778. See also #5628.
|
|
It should not accept escaped newlines.
See #10672.
|
|
Otherwise we get undesirable results, as the format's native
citation mechanism is used instead of (or in addition to) the
citeproc-generated citations. Closes #10662.
|
|
Closes #10659.
|
|
Closes #10650.
|
|
This reverts commit cbe67b9602a736976ef6921aefbbc60d51c6755a.
Word sets `w:firstColumn="1"` by default for tables. You have to find
the Table Design tab and explicitly uncheck "First Column" to make this
go away. In most cases, I don't think writers intend to designate
the first column as a row head, so the reverted commit is going to produce
unexpected results. In addition, because of the table normalization
done by pandoc-types' `tableWith`, any table containing a colspanned
cell in the left-hand column will get broken if the first column is
designated a row head. For these reasons it seems best to revert this
change, which was made in response to #9495.
Closes #10627.
|
|
Closes #10621.
|
|
See #10610.
|
|
Previously we tried to match curly quotes as well as straight
quotes, producing Quoted inlines. But it seems better just to
assume that those who use curly quotes want them passed through
verbatim.
This also fixes a bug whereby curly single left
quotes would sometimes be changed to single right quotes.
Closes #10610.
|
|
We used to insert extra spaces to ensure that the content respected
the four-space rule. That is not really necessary now, since pandoc's
markdown and most other markdown variants don't follow the four-space rule.
Those who want the old behavior can obtain it by using
`-t markdown+four_space_rule`.
Closes #7172.
|
|
This has been fixed now in Babel for some time. So we can now
get rid of the ugly code that disabled language-specific shorthands
(see e26d31d).
Closes #6817.
|
|
We now specify the language as a global option again, so we
no longer need to specify it when invoking selnolig.
See #9863.
|
|
Previously we used the `.ini` files for every language, but
for European languages these tend to provide inferior results
to the `.ldf` files used by classic Babel. Currently Babel
documentation recommends using the classic system for European
languages written in Latin and Cyrillic scripts and Vietnamese.
So the LaTeX writer and template now follow this guidance.
Main languages in the list of languages with good "classic" support
are added to global documentclass options and will be automatically
handled by Babel using the `.ldf` files.
If the main language is not in this list, the `babeloptions` variable
will be set to `provide=*`, which will cause support to be loaded from
the `.ini` file rather than an `.ldf`. So, for example, setting
`-V babeloptions=''` with a polytonic Greek document will cause the
`.ldf` support to be used instead of the `.ini`.
The default setting of this variable can be overridden, but in most
cases the default should give good results.
Closes #8283.
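The selection logic described above might be sketched like this in Python. It is only an illustration: `CLASSIC_LANGUAGES` is a hypothetical, abbreviated stand-in for the real list, and `babel_settings` and its return shape are invented for the example rather than taken from the LaTeX writer.

```python
# Hypothetical, abbreviated stand-in for the list of languages with good
# "classic" (.ldf) support; the real list is maintained in the writer.
CLASSIC_LANGUAGES = {"english", "french", "german", "russian", "vietnamese"}


def babel_settings(main_language, babeloptions=None):
    """Decide how the main language is passed to Babel.

    babeloptions=None means "use the default"; any other value
    (including the empty string) overrides the default.
    """
    classic = main_language in CLASSIC_LANGUAGES
    if babeloptions is None:
        # Default: classic languages need no extra options; others load
        # their support from the .ini file via provide=*.
        babeloptions = "" if classic else "provide=*"
    return {
        "global_class_option": classic,  # added to documentclass options
        "babeloptions": babeloptions,
    }
```

Passing `babeloptions=""` for a language outside the list mirrors the `-V babeloptions=''` override mentioned above.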
|
|
Once upon a time the only metadata element for links in Pandoc's AST was
a title, and it was hijacked to track certain links as having originated
in the wikilink syntax. Now we have Attrs and we can use a class to
handle wikilinks instead.
Requires coordinated changes to commonmark-hs.
|
|
Enable annotating author roles using the Contribution Role Taxonomy
(CRediT) and export this information in conformant JATS.
Closes #10152.
Co-Authored-By: Jez Cope <[email protected]>
|
|
Also some other elements that allow title: blockquote,
calloutlist, etc.
Closes #10594.
|
|
Include identifiers and titles in each case.
The code should be credited to @tombolano.
Closes #8666.
|
|
Closes #5793
|
|
Closes #4470
|
|
The combination of #9648 Typst property output and #9778
`typst:no-figure` can cause fonts to spill out of tables.
This is because setting Typst text properties across a table
requires `set text(...)` outside the table, and previously we
were relying on the figure to provide a scope.
This adds an extra `#{...}` when the table has class `typst:no-figure`
and also has `typst:text:*` attributes.
|
|
This previously worked with regular citations, but not author-in-text
citations. Now it works with both.
|
|
The reader did not properly consume empty lines after =encoding commands,
which produced various incorrect parses depending on the content between
there and the next command.
Fixes #10537
|
|
|
|
Reader: When `w:tblLook` has `w:firstColumn` set (or an equivalent bit
mask), we set row heads = 1 in the AST.
Writer: set `w:firstColumn` in `w:tblLook` when there are row
heads. (Word only allows one, so this is triggered by any number
of row heads > 0.)
Closes #9495.
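A minimal Python sketch of the reader-side check, for illustration only. It assumes the attributes of `w:tblLook` are available as a plain dict, and that the first-column flag in the hexadecimal `w:val` bit mask is `0x0080`; both the helper name `first_column_set` and that bit value are assumptions based on the OOXML conventions, not taken from the docx reader.

```python
# Assumed bit for "first column" in the w:tblLook w:val bit mask.
FIRST_COLUMN_BIT = 0x0080


def first_column_set(attrs):
    """Return True if w:tblLook designates a special first column.

    attrs is a dict of attribute name -> string value, e.g.
    {"w:val": "04A0", "w:firstColumn": "1"}.
    """
    explicit = attrs.get("w:firstColumn")
    if explicit is not None:
        # The explicit attribute takes priority over the bit mask.
        return explicit in ("1", "true", "on")
    val = attrs.get("w:val")
    if val is None:
        return False
    return bool(int(val, 16) & FIRST_COLUMN_BIT)


def row_head_columns(attrs):
    """Number of row-head columns to record in the AST (0 or 1)."""
    return 1 if first_column_set(attrs) else 0
```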
|
|
...when `styles` extension is enabled. Closes #9603.
Also improve manual's coverage of custom styles.
|
|
These are now converted to inches as in the LaTeX writer.
Closes #9945.
|
|
Closes #10385.
Closes #2337.
Closes #6424.
|
|
Fixes a regression in 3.6 that caused problems parsing
text with underscores.
Closes #10497.
|
|
Closes #10490.
|
|
Closes #10491.
|
|
This case was missed when changing the reference link strategy for RST
to allow a single pass.
Closes #10484.
|
|
...that would otherwise be interpreted as list starts.
Closes #9700.
|
|
...in the same way it works for other formats (with the top-level
heading being promoted to metadata title). This needed special
treatment because of the way djot surrounds sections with Divs.
Closes #10459.
|
|
Closes #10472.
|
|
The span needs to be separated from its surroundings by spaces.
Also, a span can have attributes, which we now attach.
Closes #9878.
|
|
...is preceded by whitespace.
Closes #10414.
|
|
This change introduces a reader for mdoc, a roff-derived semantic markup
language for manual pages. The two relevant contemporary implementations
of mdoc for manual pages are mandoc (https://mandoc.bsd.lv/), which
implements the language from scratch in C, and groff
(https://www.gnu.org/software/groff/), which implements it as roff macros.
mdoc has a lot of semantics specific to technical manuals that aren't
representable in Pandoc's AST. I've taken a cue from the mandoc HTML
output and many mdoc elements are encoded as Codes or Spans with classes
named for the mdoc macro that produced them.
Much like web browsers with HTML, mandoc attempts to produce best-effort
output given all kinds of weird and crappy mdoc input. Part of the
reason it's able to do this is that it uses a very accommodating parse tree
and stateful output routines specialized to the output mode, and when it
encounters some macro it wasn't expecting, it can easily give up on
whatever it was outputting and output something else. I've encoded as
much flexibility as I reasonably could into the mdoc reader here, but I
don't know how to be as flexible as mandoc.
This branch has been developed almost exclusively against mandoc's
documentation and implementation of mdoc as a reference, and the
real-world manual pages tested against are those from the OpenBSD base
system. Of ~3500 manuals in mdoc format shipped with a fresh OpenBSD
install, 17 cause the mdoc reader to exit with a parse error. Any
further chasing of edge cases is deferred to future work.
Many of the tests in test/Tests/Readers/Mdoc.hs are derived directly
from mandoc's extensive regression tests.
[API change] Adds readMdoc to the public API
|
|
The existing lexRoff does some stuff I don't want to deal with in mdoc
just yet, like lexing tbl, and some stuff I won't do at all, like
handling macro and text string definitions and switching between modes.
Uses a typeclass with associated type families to reuse most of the
escaping code between Roff (i.e. man) and Mdoc.
Future work could improve on this so that more lexing code could be
shared between Man and Mdoc. Mdoc inherits Roff's surface syntax so
hypothetically it makes sense to lex it into tokens that make sense for
roff. But it happens that the Mdoc parser is much easier to build with
an Mdoc-specific token stream. See jgm/pandoc#10225 for some discussion of
the rationale.
Adds a test for the roff \A escape, which I accidentally dropped support
for in an earlier iteration without anything complaining.
|
|
Closes #10390.
|
|
The plain writer behaved as a markdown variant with Ext_line_blocks
turned off, and so empty lines in a line block would get eliminated.
This is surprising, since if there's anything where the intent can be
preserved in plain text output it's empty lines.
It's still a bit surprising to have nbsps in plain text output, as in
the test, where the distinction doesn't really matter, but that'd be an
orthogonal change.
|
|
|