path: root/test/command
Age | Commit message | Author
2025-04-05 | Markdown writer: improve use of implicit figures when possible. | John MacFarlane
Closes #10758. When the alt differs from the caption, but only as regards formatting, we still use an implicit figure.
2025-04-04 | Markdown writer: render a figure with Para caption as implicit figure. | John MacFarlane
Also, when falling back to a Div with class `figure` for a figure that can't be represented any other way, include a Div with class `caption` containing the caption. Closes #10755.
2025-04-01 | Typst writer: support `mark` class on spans. | John MacFarlane
Closes #10747.
2025-03-29 | Org reader: don't include newlines in inline code/verbatim. | John MacFarlane
Convert newlines to spaces as we do in other formats. Closes #10730.
2025-03-23 | Use the most compatible form for roff escapes. | John MacFarlane
This affects T.P.RoffChar, T.P.Writers.Roff, and the Man and Ms writers. We now emit `\(xy` instead of `\[xy]`. This was the original AT&T troff form and is the most widely supported. The bracketed form causes problems for some tools, e.g. `makewhatis` on macOS. Closes #10716.
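The rewrite described above can be sketched as a text transformation. This is a minimal illustration in Python, not pandoc's Haskell implementation: the function name `to_compatible_escape` is hypothetical, and the sketch assumes that only two-character names can be written in the `\(xy` form, so longer bracketed names are left untouched.

```python
import re

# Matches a bracketed special-character escape whose name is exactly two
# characters long, e.g. \[co]. Longer names such as \[u2014] do not match.
_BRACKETED_TWO_CHAR = re.compile(r"\\\[([^\]\s]{2})\]")


def to_compatible_escape(text: str) -> str:
    """Rewrite \\[xy] as \\(xy (hypothetical helper, for illustration only)."""
    return _BRACKETED_TWO_CHAR.sub(lambda m: "\\(" + m.group(1), text)
```

The sketch does not treat an escaped backslash before the bracket specially; a real implementation works on structured escapes rather than on raw text.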
2025-03-22 | Commonmark Reader: handle GFM math irregularity with braces. | John MacFarlane
In GFM, you need to use `\\{` rather than `\{` for a literal brace. Closes #10631.
2025-03-21 | MediaWiki reader/writer: allow definition on same line as term. | John MacFarlane
Closes #10708.
2025-03-19 | Skip at most one argument to LaTeX tabular newline (#10707) | silby
In LaTeX's tabular environment, the tabular newline takes an optional argument that we skip. But it only takes a single optional argument, and any further square-bracketed text that follows shouldn't be skipped. Fixes #7512, and also adds a test for the original problem raised in that issue which was already fixed at some point.
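The rule can be illustrated with a small row splitter. This is a hedged Python sketch rather than the LaTeX reader's actual parser: `split_rows` is a hypothetical name, and the regular expression assumes the optional argument follows the `\\` (or `\\*`) immediately and contains no nested brackets.

```python
import re

# A tabular row terminator: two backslashes, an optional "*", and then
# AT MOST ONE optional [..] argument. A second bracketed group is cell text.
_ROW_END = re.compile(r"\\\\\*?(?:\[[^\]]*\])?")


def split_rows(body: str) -> list[str]:
    """Split a tabular body into rows (hypothetical helper, for illustration)."""
    return [row.strip() for row in _ROW_END.split(body) if row.strip()]
```

With the input `a & b \\[2pt][c] & d \\ e & f`, the `[2pt]` is skipped as the newline's argument, while `[c]` survives as the start of the second row.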
2025-03-15 | Update tests for previous commit (protecting phantomsection). | John MacFarlane
2025-03-14 | Markdown reader: remove some misguided list fanciness. | John MacFarlane
Previously we tried to handle things like commented out list items:

    - one
    <!--
    - two
    -->
    - three

and also things like:

    - one `and
    - two` and

But the code we added to handle these cases caused problems with other, more straightforward things, like:

    - one
    - ```
      code
      ```
    - three

So we are rolling back all the fanciness, so that the markdown parser now behaves more like the commonmark parser, in which indicators of block-level structure always take priority over indicators of inline structure. Closes #9865. Closes #7778. See also #5628.
2025-03-07 | Markdown reader: fixed `escapedChar'` parser. | John MacFarlane
It should not accept escaped newlines. See #10672.
2025-03-05 | Disable citations extension in writers if `--citeproc` is used. | John MacFarlane
Otherwise we get undesirable results, as the format's native citation mechanism is used instead of (or in addition to) the citeproc-generated citations. Closes #10662.
2025-03-04 | LaTeX reader: better handle comments/whitespace in option lists and includes. | John MacFarlane
Closes #10659.
2025-02-27 | Typst writer: better heuristics for escaping potential list markers. | John MacFarlane
Closes #10650.
2025-02-19 | Revert "Docx reader and writer: support row heads." | John MacFarlane
This reverts commit cbe67b9602a736976ef6921aefbbc60d51c6755a. Word sets `w:firstColumn="1"` by default for tables. You have to find the Table Design tab and explicitly uncheck "First Column" to make this go away. In most cases, I don't think writers intend to designate the first column as a row head, so this commit is going to produce unexpected results. In addition, because of the table normalization done by pandoc-types' `tableWith`, any table containing a colspanned cell in the left-hand column will get broken if the first column is designated a row head. For these reasons it seems best to revert this change, which was made in response to #9495. Closes #10627.
2025-02-14 | Markdown reader: allow line break between URL and title of link. | John MacFarlane
Closes #10621.
2025-02-13 | Update pandoc-citeproc-320a test. | John MacFarlane
See #10610.
2025-02-13 | Smart quote parsing: ignore curly quotes. | John MacFarlane
Previously we tried to match curly quotes as well as straight quotes, producing Quoted inlines. But it seems better just to assume that those who use curly quotes want them passed through verbatim. This also fixes an (unintended) bug whereby curly single left quotes would sometimes be changed to single right quotes. Closes #10610.
2025-02-12 | Markdown writer: omit extra space after bullets. | John MacFarlane
We used to insert extra spaces to ensure that the content respected the four-space rule. That is not really necessary now, since pandoc's markdown and most markdowns don't follow the four-space rule. Those who want the old behavior can obtain it by using `-t markdown+four_space_rule`. Closes #7172.
2025-02-10 | Use babel options `shorthands=off`. | John MacFarlane
This has been fixed now in Babel for some time. So we can now get rid of the ugly code that disabled language-specific shorthands (see e26d31d). Closes #6817.
2025-02-10 | Remove selnolig-langs. | John MacFarlane
We now specify the language as a global option again, so we no longer need to specify it when invoking selnolig. See #9863.
2025-02-08 | LaTeX writer/template: Improve babel support. | John MacFarlane
Previously we used the `.ini` files for every language, but for European languages these tend to provide inferior results to the `.ldf` files used by classic Babel. Currently Babel documentation recommends using the classic system for European languages written in Latin and Cyrillic scripts and Vietnamese. So the LaTeX writer and template now follow this guidance. Main languages in the list of languages with good "classic" support are added to global documentclass options and will be automatically handled by Babel using the `.ldf` files. If the main language is not in this list, the `babeloptions` variable will be set to `provide=*`, which will cause support to be loaded from the `.ini` file rather than an `.ldf`. So, for example, setting `-V babeloptions=''` with a polytonic Greek document will cause the `.ldf` support to be used instead of the `.ini`. The default setting of this variable can be overwritten, but in most cases the default should give good results. Closes #8283.
2025-02-07 | Track wikilinks with a class instead of a title | Evan Silberman
Once upon a time the only metadata element for links in Pandoc's AST was a title, and it was hijacked to track certain links as having originated in the wikilink syntax. Now we have Attrs and we can use a class to handle wikilinks instead. Requires coordinated changes to commonmark-hs.
2025-02-05 | Add CRediT roles to JATS | Charles Tapley Hoyt
Enable annotating author roles using the Contribution Role Taxonomy (CRediT) and export this information in conformant JATS. Closes #10152. Co-Authored-By: Jez Cope
2025-02-03 | DocBook reader: Handle title inside orderedlist. | John MacFarlane
Also some other elements that allow title: blockquote, calloutlist, etc. Closes #10594.
2025-02-01 | DocBook reader: better handle formalpara, example, and sidebar. | John MacFarlane
Include identifiers and titles in each case. The code should be credited to @tombolano. Closes #8666.
2025-01-29 | Handle <abbr> as a span-like inline | Evan Silberman
Closes #5793
2025-01-29 | Test \{,re}newcommand arguments (#10573) | silby
Closes #4470
2025-01-24 | brace tables with typst:no-figure and typst:text attributes (#10563) | Gordon Woodhull
The combination of #9648 Typst property output and #9778 `typst:no-figure` can cause fonts to spill out of tables. This is because setting Typst text properties across a table requires `set text(...)` outside the table, and previously we were relying on the figure to provide a scope. This adds an extra `#{...}` when the table has class `typst:no-figure` and also has `typst:text:*` attributes.
2025-01-16 | Citeproc: fix moving punctuation before citation notes. | John MacFarlane
This previously worked with regular citations, but not author-in-text citations. Now it works with both.
2025-01-15 | Consume blanks after =encoding in pod reader (#10544) | silby
The reader did not properly consume empty lines after =encoding commands, which produced various incorrect parses depending on the content between there and the next command. Fixes #10537
2025-01-10 | Fix 9495 command test for windows. | John MacFarlane
2025-01-10 | Docx reader and writer: support row heads. | John MacFarlane
Reader: When `w:tblLook` has `w:firstColumn` set (or an equivalent bit mask), we set row heads = 1 in the AST. Writer: set `w:firstColumn` in `w:tblLook` when there are row heads. (Word only allows one, so this is triggered by any number of row heads > 0.) Closes #9495.
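The reader-side check can be sketched as follows. This is an illustrative Python fragment, not the Docx reader's code: `has_first_column` is a hypothetical helper, the attributes are assumed to arrive as a plain dictionary of strings, and the bit value `0x0080` for "first column" in the legacy `w:val` mask is an assumption taken from the usual reading of the OOXML `tblLook` definition.

```python
# Assumed bit for "first column" in the legacy hexadecimal w:val mask.
FIRST_COLUMN_BIT = 0x0080


def has_first_column(tbl_look: dict) -> bool:
    """Report whether a w:tblLook element marks the first column as special.

    `tbl_look` maps attribute names to string values (hypothetical input shape).
    """
    explicit = tbl_look.get("w:firstColumn")
    if explicit is not None:
        # Explicit boolean attribute takes priority over the bit mask.
        return explicit in ("1", "true", "on")
    # Fall back to the equivalent bit in the hexadecimal mask.
    return bool(int(tbl_look.get("w:val", "0"), 16) & FIRST_COLUMN_BIT)
```

Under this assumption a mask of `04A0` has the first-column bit set, while `0420` does not.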
2025-01-10 | Docx reader: read table styles as custom styles... | John MacFarlane
...when `styles` extension is enabled. Closes #9603. Also improve manual's coverage of custom styles.
2025-01-01 | Typst writer: fix handling of pixel image dimensions. | John MacFarlane
These are now converted to inches as in the LaTeX writer. Closes #9945.
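The conversion is simple arithmetic: inches = pixels / DPI. A minimal Python sketch follows, in which `pixels_to_inches` is a hypothetical name and the default of 96 dots per inch is an assumption chosen to mirror pandoc's documented `--dpi` default; the writer's actual rounding and formatting may differ.

```python
def pixels_to_inches(pixels: float, dpi: int = 96) -> str:
    """Format a pixel dimension as an inch length (illustrative only).

    The 96 dpi default is an assumption, not read from the Typst writer.
    """
    inches = pixels / dpi
    # Up to 5 significant digits, without trailing zeros.
    return f"{inches:.5g}in"
```

For example, a 300-pixel width at 96 dpi corresponds to 300 / 96 = 3.125 inches.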
2024-12-28 | AsciiDoc writer: improve escaping. | John MacFarlane
Closes #10385. Closes #2337. Closes #6424.
2024-12-27 | RST reader: fix handling of underscores. | John MacFarlane
Fixes a regression in 3.6 that caused problems parsing text with underscores. Closes #10497.
2024-12-23 | MediaWiki reader: allow empty quoted attributes. | John MacFarlane
Closes #10490.
2024-12-23 | MediaWiki reader: allow cells starting with `+`. | John MacFarlane
Closes #10491.
2024-12-22 | RST reader: handle explicit reference links (#10485) | silby
This case was missed when changing the reference link strategy for RST to allow a single pass. Closes #10484.
2024-12-20 | Mediawiki writer: escape line-initial characters... | John MacFarlane
...that would otherwise be interpreted as list starts. Closes #9700.
2024-12-19 | Allow `--shift-heading-level-by=-1` to work in djot... | John MacFarlane
...in the same way it works for other formats (with the top-level heading being promoted to metadata title). This needed special treatment because of the way djot surrounds sections with Divs. Closes #10459.
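The behaviour for other formats can be sketched on a toy block list. This is a hedged Python illustration based on the rule documented for `--shift-heading-level-by`, not pandoc's implementation: the tuple-based block representation and the name `shift_by_minus_one` are hypothetical, and the Div wrapping that djot adds around sections is deliberately not modelled.

```python
def shift_by_minus_one(blocks):
    """Shift headings up one level, promoting a leading level-1 heading to the title.

    `blocks` is a list of ("heading", level, text) or ("para", text) tuples
    (a hypothetical stand-in for the real AST). Returns (title, new_blocks).
    """
    title = None
    rest = blocks
    # A level-1 heading at the very start becomes the metadata title.
    if blocks and blocks[0][0] == "heading" and blocks[0][1] == 1:
        title = blocks[0][2]
        rest = blocks[1:]
    shifted = []
    for block in rest:
        if block[0] == "heading":
            _, level, text = block
            if level > 1:
                shifted.append(("heading", level - 1, text))
            else:
                # A heading that would drop below level 1 becomes a paragraph.
                shifted.append(("para", text))
        else:
            shifted.append(block)
    return title, shifted
```

In djot the leading heading sits inside a section Div rather than at the top of the block list, which is why the real change needed special treatment.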
2024-12-18 | LaTeX reader: handle `figure*` environment as a figure. | John MacFarlane
Closes #10472.
2024-12-17 | Textile reader: improve parsing of spans. | John MacFarlane
The span needs to be separated from its surroundings by spaces. Also, a span can have attributes, which we now attach. Closes #9878.
2024-12-17 | Textile reader: inline constructors don't trigger if closer... | John MacFarlane
...is preceded by whitespace. Closes #10414.
2024-12-05 | Add mdoc reader | Evan Silberman
This change introduces a reader for mdoc, a roff-derived semantic markup language for manual pages.

The two relevant contemporary implementations of mdoc for manual pages are mandoc (https://mandoc.bsd.lv/), which implements the language from scratch in C, and groff (https://www.gnu.org/software/groff/), which implements it as roff macros.

mdoc has a lot of semantics specific to technical manuals that aren't representable in Pandoc's AST. I've taken a cue from the mandoc HTML output and many mdoc elements are encoded as Codes or Spans with classes named for the mdoc macro that produced them.

Much like web browsers with HTML, mandoc attempts to produce best-effort output given all kinds of weird and crappy mdoc input. Part of the reason it's able to do this is it uses a very accommodating parse tree and stateful output routines specialized to the output mode, and when it encounters some macro it wasn't expecting, it can easily give up on whatever it was outputting and output something else. I've encoded as much flexibility as I reasonably could into the mdoc reader here, but I don't know how to be as flexible as mandoc.

This branch has been developed almost exclusively against mandoc's documentation and implementation of mdoc as a reference, and the real-world manual pages tested against are those from the OpenBSD base system. Of ~3500 manuals in mdoc format shipped with a fresh OpenBSD install, 17 cause the mdoc reader to exit with a parse error. Any further chasing of edge cases is deferred to future work.

Many of the tests in test/Tests/Readers/Mdoc.hs are derived directly from mandoc's extensive regression tests.

[API change] Adds readMdoc to the public API
2024-12-05 | Parameterize Roff escaping | Evan Silberman
The existing lexRoff does some stuff I don't want to deal with in mdoc just yet, like lexing tbl, and some stuff I won't do at all, like handling macro and text string definitions and switching between modes.

Uses a typeclass with associated type families to reuse most of the escaping code between Roff (i.e. man) and Mdoc.

Future work could improve on this so that more lexing code could be shared between Man and Mdoc. Mdoc inherits Roff's surface syntax so hypothetically it makes sense to lex it into tokens that make sense for roff. But it happens that the Mdoc parser is much easier to build with an Mdoc specific token stream. Some discussion in jgm/pandoc#10225 about the rationale.

Adds a test for the roff \A escape, which I accidentally dropped support for in an earlier iteration without anything complaining.
2024-11-19 | MediaWiki reader: fix indented tables with caption. | John MacFarlane
Closes #10390.
2024-11-11 | Respect empty LineBlock lines in plain writer | Evan Silberman
The plain writer behaved as a markdown variant with Ext_line_blocks turned off, and so empty lines in a line block would get eliminated. This is surprising, since if there's anything where the intent can be preserved in plain text output it's empty lines. It's still a bit surprising to have nbsps in plain text output, as in the test, where the distinction doesn't really matter, but that'd be an orthogonal change.
2024-11-04 | JATS writer: correct spelling of suppress attribute (#10350) | Andreas Deininger