github.com/jgm/pandoc - Pandoc — The universal markup converter

Age	Commit message	Author
2024-12-17	Docx writer: use styleIds not styleNames for Title, Subtitle, etc.	John MacFarlane
	This change affects the default openxml template as well as the OpenXML writer. Closes #10282 (regression introduced in pandoc 3.5).
2024-12-14	Use lastMay instead of reverse	Joseph C. Sible
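
The log does not show the code that changed, so the following is only a sketch of the general pattern, assuming the original reversed a list to reach its last element. `lastMay'` is a local stand-in for `lastMay` from the `safe` package, and `lastViaReverse` is a hypothetical name.

```haskell
-- Hypothetical before/after; the actual pandoc code is not shown in the log.

-- Before: reverse the whole list just to look at its last element.
lastViaReverse :: [a] -> Maybe a
lastViaReverse xs = case reverse xs of
  (y:_) -> Just y
  []    -> Nothing

-- After: a local stand-in for lastMay from the safe package.
lastMay' :: [a] -> Maybe a
lastMay' [] = Nothing
lastMay' xs = Just (last xs)
```

Both return `Nothing` for an empty list; the second states the intent directly and does not build a reversed copy.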

2024-12-14	Store a function instead of a Boolean	Joseph C. Sible
	Instead of storing isDisplay and then always choosing displayMath or math based on that, just store displayMath or math directly.
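
A minimal sketch of this refactoring, under stated assumptions: `displayMath'` and `math'` are simplified string-based stand-ins for pandoc's `displayMath` and `math` builders, and the record types are hypothetical.

```haskell
-- Hypothetical stand-ins; pandoc's real displayMath and math build Inlines.
displayMath' :: String -> String
displayMath' s = "$$" ++ s ++ "$$"

math' :: String -> String
math' s = "$" ++ s ++ "$"

-- Before: keep a flag and branch on it at the use site.
data MathCtxBool = MathCtxBool { isDisplay :: Bool }

renderBool :: MathCtxBool -> String -> String
renderBool ctx = if isDisplay ctx then displayMath' else math'

-- After: keep the chosen builder function itself.
newtype MathCtxFn = MathCtxFn { mkMath :: String -> String }

renderFn :: MathCtxFn -> String -> String
renderFn = mkMath
```

With the function stored directly, the use site no longer needs the conditional, and the flag cannot drift out of sync with the builder it was meant to select.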
2024-12-14	Use <$> instead of >>= and return	Joseph C. Sible
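
As an illustration of the pattern named in the commit title (not the code that was actually changed), binding a value only to wrap a pure transformation in `return` is equivalent to mapping over it. The function names below are hypothetical.

```haskell
-- Before: bind, then immediately wrap the transformed value again.
lengthBind :: Maybe String -> Maybe Int
lengthBind m = m >>= \s -> return (length s)

-- After: the same computation expressed as a functor map.
lengthFmap :: Maybe String -> Maybe Int
lengthFmap m = length <$> m
```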

2024-12-14	Put the length in the range expression instead of calling take later	Joseph C. Sible
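
A sketch of what this kind of change can look like, assuming the original truncated an open-ended range with `take`; the function names are hypothetical and the real code is not shown in the log.

```haskell
-- Before: build an open-ended range, then cut it down afterwards.
indicesTake :: Int -> [Int]
indicesTake n = take n [1 ..]

-- After: state the upper bound in the range expression itself.
indicesRange :: Int -> [Int]
indicesRange n = [1 .. n]
```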

2024-12-14	Remove redundant null check	Joseph C. Sible
	"all f []" is always true, so "null xs || all f xs" can be simplified to just "all f xs".
2024-12-14	Use the definition of unsnoc from base	Joseph C. Sible
	This is more efficient than the existing one.
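
For context, a sketch of the difference, with assumptions labeled: `unsnoc'` below is written in the style of the `foldr`-based `unsnoc` in base's `Data.List`, while `unsnocNaive` is a hypothetical example of a less efficient version and is not necessarily what pandoc used before.

```haskell
-- Hypothetical naive version: init and last each walk the list.
unsnocNaive :: [a] -> Maybe ([a], a)
unsnocNaive [] = Nothing
unsnocNaive xs = Just (init xs, last xs)

-- Single-pass version in the style of Data.List.unsnoc from base.
-- The lazy pattern (~) avoids forcing the rest of the result early.
unsnoc' :: [a] -> Maybe ([a], a)
unsnoc' = foldr step Nothing
  where
    step x = Just . maybe ([], x) (\(~(as, b)) -> (x : as, b))
```

Both split a non-empty list into its initial segment and its last element; the fold does so in one traversal.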
2024-12-14	Use catMaybes instead of building with maybe and (:) one element at a time	Joseph C. Sible
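
An illustrative sketch of the pattern in the commit title, with hypothetical function names; the code actually changed in pandoc is not shown in the log.

```haskell
import Data.Maybe (catMaybes)

-- Before: prepend each present value by hand, one element at a time.
collectManual :: Maybe a -> Maybe a -> Maybe a -> [a]
collectManual a b c = maybe id (:) a (maybe id (:) b (maybe id (:) c []))

-- After: list the optional values and let catMaybes drop the Nothings.
collectCatMaybes :: Maybe a -> Maybe a -> Maybe a -> [a]
collectCatMaybes a b c = catMaybes [a, b, c]
```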

2024-12-14	Remove several unnecessary layers of indirection from refs	Joseph C. Sible

2024-12-11	Allow YAML bibliographies to be arrays of references.	John MacFarlane
	Previously, they had to be YAML objects with a `references` key. Closes #10452.
2024-12-11	Cosmetic code improvement.	John MacFarlane

2024-12-07	Add copyright info to two modules missing it.	John MacFarlane

2024-12-07	Stylistic tweak.	John MacFarlane

2024-12-07	Ensure that `--sandbox` affects `--embed-resources`.	John MacFarlane
	Previously it did not (contrary to what was implied by the manual), which meant that an image with URL `/etc/passwd` would leak an encoded version of that file to HTML output with `--self-contained` or `--embed-resources`, even if `--sandbox` was used. Thanks to Samuel Mortenson for pointing out the issue.
2024-12-07	T.P.App.OutputSettings: add `sandbox'` function.	John MacFarlane
	This computes the sandboxed files from Opt and avoids some code repetition in T.P.App and T.P.App.OutputSettings.
2024-12-07	Docx reader: handle `\b`, `\i`, `\y` modifiers in `XE` index entries.	John MacFarlane
	See #10171.
2024-12-07	HTML reader: parse footnotes defined by dpub-aria roles.	John MacFarlane
	Closes #5294.
2024-12-05	Add mdoc reader	Evan Silberman
	This change introduces a reader for mdoc, a roff-derived semantic markup language for manual pages. The two relevant contemporary implementations of mdoc for manual pages are mandoc (https://mandoc.bsd.lv/), which implements the language from scratch in C, and groff (https://www.gnu.org/software/groff/), which implements it as roff macros. mdoc has a lot of semantics specific to technical manuals that aren't representable in Pandoc's AST. I've taken a cue from the mandoc HTML output and many mdoc elements are encoded as Codes or Spans with classes named for the mdoc macro that produced them. Much like web browsers with HTML, mandoc attempts to produce best-effort output given all kinds of weird and crappy mdoc input. Part of the reason it's able to do this is it uses a very accommodating parse tree and stateful output routines specialized to the output mode, and when it encounters some macro it wasn't expecting, it can easily give up on whatever it was outputting and output something else. I've encoded as much flexibility as I reasonably could into the mdoc reader here, but I don't know how to be as flexible as mandoc. This branch has been developed almost exclusively against mandoc's documentation and implementation of mdoc as a reference, and the real-world manual pages tested against are those from the OpenBSD base system. Of ~3500 manuals in mdoc format shipped with a fresh OpenBSD install, 17 cause the mdoc reader to exit with a parse error. Any further chasing of edge cases is deferred to future work. Many of the tests in test/Tests/Readers/Mdoc.hs are derived directly from mandoc's extensive regression tests. [API change] Adds readMdoc to the public API
2024-12-05	Parameterize Roff escaping	Evan Silberman
	The existing lexRoff does some stuff I don't want to deal with in mdoc just yet, like lexing tbl, and some stuff I won't do at all, like handling macro and text string definitions and switching between modes. Uses a typeclass with associated type families to reuse most of the escaping code between Roff (i.e. man) and Mdoc. Future work could improve on this so that more lexing code could be shared between Man and Mdoc. Mdoc inherits Roff's surface syntax so hypothetically it makes sense to lex it into tokens that make sense for roff. But it happens that the Mdoc parser is much easier to build with an Mdoc specific token stream. Some discussion in jgm/pandoc#10225 about the rationale. Adds a test for the roff \A escape, which I accidentally dropped support for in an earlier iteration without anything complaining.
2024-12-05	Docx reader: improve index reference support.	John MacFarlane
	Support crossrefs. Clean up and unify switch parsing for fields.
2024-12-05	Docx reader: parse index references as empty Spans.	John MacFarlane
	See #10171.
2024-12-01	Fix comments in TEI writer referring to DocBook (#10430)	silby

2024-11-30	Commonmark `implicit_figures` should check for empty caption...	John MacFarlane
	...and not produce an implicit figure in this case. Closes #10429.
2024-11-24	EPUB writer: use standardized filename for cover image...	John MacFarlane
	...instead of the original name. This avoids problems with e.g. filenames containing spaces. Closes #10404.
2024-11-23	Markdown writer: issue INFO warning when not rendering table...	John MacFarlane
	...e.g., when `raw_html` is disabled and the table can't be fit into a supported markdown table format. Closes #10407.
2024-11-22	Add WebP support to ImageSize (#10397)	silby
	Text.Pandoc.ImageSize: Add `Webp` constructor on ImageType. [API change]
2024-11-19	MediaWiki reader: fix indented tables with caption.	John MacFarlane
	Closes #10390.
2024-11-15	Text.Pandoc.Format: remove duplicate typst entry (#10388)	Caleb Maclennan

2024-11-11	Respect empty LineBlock lines in ANSI writer	Evan Silberman

2024-11-11	Respect empty LineBlock lines in plain writer	Evan Silberman
	The plain writer behaved as a markdown variant with Ext_line_blocks turned off, and so empty lines in a line block would get eliminated. This is surprising, since if there's anything where the intent can be preserved in plain text output it's empty lines. It's still a bit surprising to have nbsps in plain text output, as in the test, where the distinction doesn't really matter, but that'd be an orthogonal change.
2024-11-09	Typst writer: make template sensitive to a `page-numbering` variable.	John MacFarlane
	This can be set to an empty string (or, in metadata, to false) for no page numbers. Addresses #10370.
2024-11-08	Docx reader: handle case where Zotero itemData has different id...	John MacFarlane
	from the citationItem id. In this case we use the citationItemId in the bibliography as well, overriding the referenceId in the itemData. Closes #10366.
2024-11-06	Fix typos (#10349)	Andreas Deininger

2024-11-04	JATS writer: correct spelling of suppress attribute (#10350)	Andreas Deininger

2024-10-28	LaTeX writer: ensure that beamer footnotes go on frame, not column.	John MacFarlane
	Closes #5769.
2024-10-25	LaTeX reader: put minipage in specially marked Div.	John MacFarlane
	Closes #10266.
2024-10-23	Typst reader: support underparen, overparen.	John MacFarlane

2024-10-23	RST reader: support :file: on raw directive.	John MacFarlane
	Closes #8584.
2024-10-23	RST reader: implement option lists.	John MacFarlane
	Closes #10318.
2024-10-23	HTML writer: unwrap empty incremental divs	Albert Krewinkel
	Divs are unwrapped if the only purpose of the div seems to be to control whether lists are presented incrementally on slides. Closes: #10328
2024-10-22	PDF via LaTeX: always do max runs if toc is present.	John MacFarlane
	Closes #10308. The old method (checking to see if toc hash had changed) is not completely reliable; there are cases where an additional run is needed anyway to get the correct "logical" page numbers (especially when different pagination is used for front matter). See #10308 for an example.
2024-10-21	PDF: use .source extension, not .html, in `toPdfViaTempFile`.	John MacFarlane
	See #10314.
2024-10-21	Typst reader: avoid generating empty paragraphs.	John MacFarlane

2024-10-21	Typst reader: fix `#quote` attribution.	John MacFarlane
	If attribution is not present, don't print the `--`. See #10320.
2024-10-21	Typst reader: Fix typo in unicode code point for em dash.	John MacFarlane
	This affects attributions in quote blocks. See #10320.
2024-10-21	Issue warnings for duplicate YAML metadata keys.	John MacFarlane
	Text.Pandoc.Logging: add YamlWarning constructor to LogMessage [API change]. Closes #10312.
2024-10-16	RST reader: handle block level substitutions.	John MacFarlane

2024-10-15	RST reader: avoid putting metadata in Para.	John MacFarlane
	Create MetaInlines when possible, just as with markdown input. MetaBlocks is still used when there are multiple paragraphs or non-paragraph content. This change also affects field lists. Closes #7766.
2024-10-15	RST reader: fix linked substitutions.	John MacFarlane
	E.g. `|Python|_`. Closes #6588.
2024-10-15	RST reader: support inline anchors.	John MacFarlane
	Closes #9196.