path: root/test/Tests
Age | Commit message | Author
13 days | PPTX writer: support notes field in metadata for title slide (#11396) | Chris Callison-Burch
This adds support for a `notes` field in the YAML metadata block that will be used as speaker notes for the title slide in PowerPoint output. Previously, there was no way to add speaker notes to the title slide since it is generated from metadata rather than from content blocks. The `::: notes` syntax only works for content slides. Example usage:
    ---
    title: My Presentation
    notes: |
      Welcome everyone to this presentation.
      Remember to introduce yourself.
    ---
Closes #5844 (for PPTX output). Co-authored-by: Chris Callison-Burch
2026-01-07 | Fix docx writer: skip directory entries when building media overrides (#11379) | You Jiangbin
Pandoc's docx writer was previously adding an `<Override>` for `/word/media/` in `[Content_Types].xml` when the reference doc contains media, which violates OPC rules and causes Word to report corruption.
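The filtering idea can be sketched in Python as follows. This is a minimal illustration, not the writer's actual Haskell code: the function name `media_overrides` and the small extension-to-MIME table are hypothetical.

```python
import posixpath

# Assumed, deliberately tiny table; the real writer has its own MIME lookup.
MIME_BY_EXT = {".png": "image/png", ".jpg": "image/jpeg", ".jpeg": "image/jpeg"}

def media_overrides(entry_paths):
    """Return (part_name, content_type) pairs for media files only."""
    overrides = []
    for path in entry_paths:
        if not path.startswith("word/media/"):
            continue
        if path.endswith("/"):
            # A directory entry such as "word/media/" is not a part,
            # so it must not get an <Override>.
            continue
        mime = MIME_BY_EXT.get(posixpath.splitext(path)[1].lower())
        if mime is not None:
            overrides.append(("/" + path, mime))
    return overrides
```

Given the entries `word/media/` and `word/media/image1.png`, only the second yields an override.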
2025-12-28 | ODT reader: Add table row and column spans (#11366) | Tuong Nguyen Manh
Parse the number-rows-spanned and number-columns-spanned attributes to create Cells for the Table.
2025-12-10 | Org: don't include 'example' class when parsing org example blocks. | John MacFarlane
These are just unmarked code blocks. Closes #11339.
2025-11-30 | pptx writer: Handle reference doc without slides (#11310) | Tuong Nguyen Manh
An empty `sldIdLst` is now added if the reference doc is missing one so that `modifySldIdLst` can replace it. To ensure PowerPoint doesn't say that the file will need fixing, the `sldIdLst` has to be placed after the `sldMasterIdLst`. I also added a test to ensure that if there are notes, they will be placed between the `sldMasterIdLst` and `sldIdLst`. Otherwise PowerPoint wouldn't show the slide of a note when viewing Notes Pages. Closes #7536.
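The ordering constraint can be sketched with Python's `xml.etree.ElementTree`. This is an illustration under assumptions, not the Haskell implementation: the helper name `ensure_sld_id_lst` is hypothetical, and only the two master lists named above are considered.

```python
import xml.etree.ElementTree as ET

# PresentationML main namespace.
P = "http://schemas.openxmlformats.org/presentationml/2006/main"

def ensure_sld_id_lst(presentation):
    """Add an empty sldIdLst after the master lists if none is present."""
    if presentation.find(f"{{{P}}}sldIdLst") is not None:
        return presentation
    before = (f"{{{P}}}sldMasterIdLst", f"{{{P}}}notesMasterIdLst")
    index = 0
    for i, child in enumerate(list(presentation)):
        if child.tag in before:
            index = i + 1  # insert after the last master list seen
    presentation.insert(index, ET.Element(f"{{{P}}}sldIdLst"))
    return presentation
```

With a notes master present, the new element lands after `notesMasterIdLst`, so the notes list sits between `sldMasterIdLst` and `sldIdLst`.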
2025-11-29 | Add asciidoc as an input format. | John MacFarlane
New exported module Text.Pandoc.Readers.AsciiDoc, exporting readAsciiDoc [API change]. The bulk of parsing is handled by the asciidoc library. Closes #1456.
2025-11-24 | Fix warning in Docx reader test. | John MacFarlane
2025-11-24 | Add `xlsx` (Microsoft Excel) as an input format. | Anton Antich
Each worksheet turns into a section containing a table. The common helper `nativeDiff` has been extracted from the Docx and Pptx test files and put in Tests.Helpers.
2025-11-24 | Support pptx (PowerPoint) as an input format. | Anton Antich
New module `Text.Pandoc.Readers.Pptx`, exporting `readPptx`. [API change] Factored out some common OOXML functions from Text.Pandoc.Readers.Docx.Util into a non-exported module Text.Pandoc.Readers.OOXML.Shared.
2025-11-05 | Add BBCode writer (#11242) | reptee
`bbcode` is now supported as an output format, as well as variants `bbcode_fluxbb` (FluxBB), `bbcode_phpbb` (phpBB), `bbcode_steam` (Steam), `bbcode_hubzilla` (Hubzilla), and `bbcode_xenforo` (xenForo). [API change] Adds a new module Text.Pandoc.Writers.BBCode, exporting a number of functions. Also exports `writeBBCode`, `writeBBCodeSteam`, `writeBBCodeFluxBB`, `writeBBCodePhpBB`, `writeBBCodeHubzilla`, `writeBBCodeXenforo` from Text.Pandoc.Writers.
2025-10-18 | Update to use latest dev citeproc. | John MacFarlane
Fixed golden test regeneration in Docx reader test.
2025-09-17 | Use Tasty.Golden for Docx reader tests. | John MacFarlane
This way we can update them with `--accept`.
2025-09-15 | Vimdoc writer (#11132) | reptee
Support for vimdoc, the documentation format used by Vim in its help pages. Relies heavily on definition lists and precise text alignment to generate tags.
2025-09-08 | pptx writer: Handle single column | Tuong Nguyen Manh
Add an additional guard for a single column to be able to process it.
2025-09-02 | Refactor highlighting options [API Change] | Albert Krewinkel
A new command line option `--syntax-highlighting` is provided; it takes the values `none`, `default`, `idiomatic`, a style name, or a path to a theme file. It replaces the `--no-highlighting`, `--highlighting-style`, and `--listings` options. The `writerListings` and `writerHighlightStyle` fields of the `WriterOptions` type are replaced with `writerHighlightMethod`. Closes: #10525
2025-09-02 | Change `latex-pos` to `latex-placement`. | John MacFarlane
2025-09-01 | LaTeX writer: control figure placement with attribute (#11094) | Sean Soon
If a `latex-pos` attribute is present on a figure, it will be used as the optional positioning hint in LaTeX (e.g. `ht`). With implicit figures, `latex-pos` will be added to the figure (and removed from the image) if it is present on the image. Closes #10369.
2025-08-27 | Org reader: improve sub- and superscript parsing. | Albert Krewinkel
Sub- and superscript must be preceded by a string in Org mode. Some text preceded by space or at the start of a paragraph was previously parsed incorrectly as sub- or superscript.
2025-08-26 | HTML reader: don't drop the initial newline in a pre element. | John MacFarlane
Closes #11064.
2025-08-10 | ODT Reader: Add table-header-rows | Tuong Nguyen Manh
2025-08-06 | Add `smart_quotes` and `special_strings` extensions for Org | Albert Krewinkel
Org mode makes a distinction between smart parsing of quotes and smart parsing of special strings like `...`. The finer-grained control over these features is necessary to faithfully reproduce Emacs Org mode behavior. Special strings are enabled by default, while smart quotes are disabled. The behavior of `special_strings` is brought closer to the reference implementation in that `\-` is now treated as a soft hyphen.
2025-08-03 | Fix named entity lookup in POD reader | Evan Silberman
Translating entities by name ultimately relies on Commonmark.Entity.lookupEntity, which de facto requires the entity name to be followed by a semicolon. Paste a semicolon onto the end of the entity name read from POD to look it up. Fixes #11015
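A rough Python analogue of the fix, assuming the standard library's `html.entities.html5` table as a stand-in for `Commonmark.Entity.lookupEntity` (most of its keys also end in a semicolon). The function name `lookup_pod_entity` is hypothetical.

```python
from html.entities import html5  # keys look like "hearts;"

def lookup_pod_entity(name):
    """Look up an entity name as read from POD, which carries no ';'."""
    # Append the semicolon the lookup table expects before searching.
    return html5.get(name + ";")
```

For example, `lookup_pod_entity("hearts")` finds the entry stored under `hearts;`, and an unknown name yields `None`.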
2025-07-26 | New `xml` format exactly representing a Pandoc AST. | massifrg
This adds a reader and writer for an XML format equivalent to `native` and `json`. XML schemas for validation can be found in `tools/pandoc-xml.*`. The format is documented in `doc/xml.md`. API changes: - Add module Text.Pandoc.Readers.XML, exporting `readXML`. - Add module Text.Pandoc.Writers.XML, exporting `writeXML`. A new unexported module Text.Pandoc.XMLFormat is also added.
2025-07-24 | Org reader: Recognize "fast access" characters in TODO state definitions (#10990) | Ryan Gibb
2025-06-02 | Markdown reader: make definition lists behave like other lists. | John MacFarlane
If the `four_space_rule` extension is not enabled, figure out the indentation needed for child blocks dynamically, by looking at the first nonspace content after the `:` marker. Previously the four-space rule was always obeyed. Remove the old `compact_definition_lists` extension. This was needed to preserve backwards compatibility after pandoc 1.12 was released, but at this point we can get rid of it. T.P.Extensions: remove `Ext_compact_definition_lists` constructor for `Extension` [API change]. Fix tight/loose detection for definition lists, to conform to the documentation. Closes #10889.
2025-05-28 | Fix whitespace bugs. | John MacFarlane
2025-05-28 | Adding support for sidebars to Asciidoc writer | Greg
2025-05-26 | LaTeX writer: include alt option in `\includegraphics`. | John MacFarlane
Closes #6095.
2025-05-16 | Fix problems with gridTable and add tests. | John MacFarlane
Closes #10848.
2025-05-11 | Remove some redundant code in test. | John MacFarlane
2025-05-11 | Org reader: change handling of inline TeX. | John MacFarlane
Previously inline TeX was handled in a way that was different from org's own export, and that could lead to information loss. This was particularly noticeable for inline math environments such as `equation`. Previously, an `equation` environment starting at the beginning of a line would create a raw block, splitting up the paragraph containing it (see #10836). On the other hand, an `equation` environment not at the beginning of a line would be turned into regular inline elements representing the math. (This would cause the equation number to go missing and in some cases degrade the math formatting.) Now, we parse all of these as raw "latex" inlines, which will be omitted when converting to formats other than LaTeX (and other formats like pandoc's Markdown that allow raw LaTeX). Closes #10836.
2025-03-29 | Use `pdf-engine` variable instead of extensions... | John MacFarlane
...to determine what to do about `.pdfhref` macros in `ms` output. When no PDF engine is specified, we don't use the `.pdfhref` macros at all. This gives better results for links in formats other than PDF, since the link text would simply disappear if it exists only in a `.pdfhref` macro. When a PDF engine is specified, escape the argument of `.pdfhref O` in a way that is appropriate. Remove `groff` extension. Text.Pandoc.Extensions: remove `Ext_groff` constructor. See #10738. This revises the earlier commit 3adcb4bd8089cdb8408da5f17780cd49513b7cec.
2025-03-17 | Markdown writer: avoid spaces after/before open/close delimiters. | John MacFarlane
E.g. instead of rendering `x<em> space </em>y` as `x* space *y` we render it as `x *space* y`. Closes #10696.
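The whitespace handling can be sketched in Python. This is a minimal illustration of the idea, not the writer's Haskell code, and the function name `emph` is hypothetical.

```python
def emph(inner, delim="*"):
    """Wrap text in emphasis delimiters, moving edge whitespace outside."""
    core = inner.strip()
    if not core:
        return inner  # nothing to emphasize; leave the whitespace alone
    lead = inner[: len(inner) - len(inner.lstrip())]
    trail = inner[len(inner.rstrip()):]
    # Delimiters hug the content; surrounding spaces stay outside them.
    return f"{lead}{delim}{core}{delim}{trail}"

print("x" + emph(" space ") + "y")  # prints "x *space* y"
```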
2025-03-14 | Markdown reader: remove some misguided list fanciness. | John MacFarlane
Previously we tried to handle things like commented out list items: - one <!-- - two --> - three and also things like: - one `and - two` and But the code we added to handle these cases caused problems with other, more straightforward things, like: - one - ``` code ``` - three So we are rolling back all the fanciness, so that the markdown parser now behaves more like the commonmark parser, in which indicators of block-level structure always take priority over indicators of inline structure. Closes #9865. Closes #7778. See also #5628.
2025-02-12 | Markdown writer: omit extra space after bullets. | John MacFarlane
We used to insert extra spaces to ensure that the content respected the four-space rule. That is not really necessary now, since pandoc's markdown and most markdowns don't follow the four-space rule. Those who want the old behavior can obtain it by using `-t markdown+four_space_rule`. Closes #7172.
2025-02-07 | Track wikilinks with a class instead of a title | Evan Silberman
Once upon a time the only metadata element for links in Pandoc's AST was a title, and it was hijacked to track certain links as having originated in the wikilink syntax. Now we have Attrs and we can use a class to handle wikilinks instead. Requires coordinated changes to commonmark-hs.
2025-01-30 | DOCX reader: do not issue warning for comments with `+styles` (#10572) | Stephen Reindl
Closes #10571. Co-authored-by: Stephen Reindl
2025-01-21 | Prefer MIME type when determining extensions for MediaBag items (#10557) | Max Heller
Currently, remote images added to the MediaBag are stored at paths with extensions determined based on the external URI. For instance, an image from https://example.com/image.png is stored as <hash>.png. If the URI does not contain an extension (e.g., https://example.com/image), then the content-type of the downloaded image is used to determine the extension. This change switches the precedence such that content-type is preferred over extensions contained in the URI. This is necessary because some images are located at URIs with misleading extensions -- shields.io, for instance, serves SVGs from URIs with .yml extensions. With this change, the image/svg+xml content-type is now preferred over the .yml URI extension. This fixes a bug in the PDF writer in which such an image would be mishandled due to not being identified as an SVG.
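The new precedence can be sketched in Python. This is an illustration under assumptions rather than pandoc's implementation: the function name `media_extension`, the small MIME table, and the example URIs are all hypothetical.

```python
import posixpath
from urllib.parse import urlparse

# Assumed, tiny MIME-to-extension table for illustration only.
EXT_BY_MIME = {"image/svg+xml": ".svg", "image/png": ".png"}

def media_extension(uri, content_type=None):
    """Prefer the extension implied by the content type; fall back to the URI."""
    if content_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        mime = content_type.split(";")[0].strip().lower()
        if mime in EXT_BY_MIME:
            return EXT_BY_MIME[mime]
    return posixpath.splitext(urlparse(uri).path)[1]
```

A URI ending in `.yml` served as `image/svg+xml` now maps to `.svg`, while a URI with no usable content type still falls back to its own extension.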
2024-12-28 | AsciiDoc writer: improve escaping. | John MacFarlane
Closes #10385. Closes #2337. Closes #6424.
2024-12-27 | Add Pod reader | Evan Silberman
Pod ("Plain old documentation") is a markup language used principally to document Perl modules and programs. Since it was originally meant to be translated pretty directly to man, the semantics are fairly simple. This Pod reader was developed with reference to the canonical user and implementer documentation of Pod: https://perldoc.perl.org/perlpod and https://perldoc.perl.org/perlpodspec. There are 1490 .pod, .pl, and .pm files in the Perl 5.34 distribution found in /System/Library/Perl on my mac. Of those, this reader dies with a parse error on 7 of them. All of them seem to be cases where pod commands are found within a non-colon-prefixed =begin/=end. perlpodspec says I may treat this as an error. [API change] adds readPod
2024-12-05 | Add mdoc reader | Evan Silberman
This change introduces a reader for mdoc, a roff-derived semantic markup language for manual pages. The two relevant contemporary implementations of mdoc for manual pages are mandoc (https://mandoc.bsd.lv/), which implements the language from scratch in C, and groff (https://www.gnu.org/software/groff/), which implements it as roff macros. mdoc has a lot of semantics specific to technical manuals that aren't representable in Pandoc's AST. I've taken a cue from the mandoc HTML output and many mdoc elements are encoded as Codes or Spans with classes named for the mdoc macro that produced them. Much like web browsers with HTML, mandoc attempts to produce best-effort output given all kinds of weird and crappy mdoc input. Part of the reason it's able to do this is it uses a very accommodating parse tree and stateful output routines specialized to the output mode, and when it encounters some macro it wasn't expecting, it can easily give up on whatever it was outputting and output something else. I've encoded as much flexibility as I reasonably could into the mdoc reader here, but I don't know how to be as flexible as mandoc. This branch has been developed almost exclusively against mandoc's documentation and implementation of mdoc as a reference, and the real-world manual pages tested against are those from the OpenBSD base system. Of ~3500 manuals in mdoc format shipped with a fresh OpenBSD install, 17 cause the mdoc reader to exit with a parse error. Any further chasing of edge cases is deferred to future work. Many of the tests in test/Tests/Readers/Mdoc.hs are derived directly from mandoc's extensive regression tests. [API change] Adds readMdoc to the public API
2024-10-15 | RST reader: avoid putting metadata in Para. | John MacFarlane
Create MetaInlines when possible, just as with markdown input. MetaBlocks is still used when there are multiple paragraphs or non-paragraph content. This change also affects field lists. Closes #7766.
2024-10-01 | RST writer: change bullet list hang from 3 to 2. | John MacFarlane
This accords with the style in the reference docs.
2024-09-21 | DokuWiki reader: fix block quote behavior. | John MacFarlane
Closes #6461. Blockquotes are not really block containers in DokuWiki; the lines are interpreted literally (so, e.g., you can't start a list), and line breaks are added at the ends.
2024-09-09 | Tests.Readers.Markdown: avoid use of 'head'. | John MacFarlane
2024-09-09 | Tests: use 'drop 1' instead of partial function 'tail'. | John MacFarlane
2024-09-09 | Avoid use of 'head' in Tests.Shared. | John MacFarlane
2024-09-03 | Add ansi writer tests. | John MacFarlane
2024-09-03 | HTML reader: only parse main element's contents (if present). | John MacFarlane
If main has an id or class, we include a div with that id or class; otherwise just the contents. Closes #10140.
2024-07-27 | Docx writer: fix regression with nested lists. | John MacFarlane
Closes #9994. The bug affects e.g. ordered lists with bullet sublists; after the sublist the top-level list reverts to bullets instead of being properly numbered. This regression was introduced in version 3.2.1 and was caused by commit f5531f1.