path: root/src/Text
Age | Commit message | Author
2025-03-15 | LaTeX writer: protect `\phantomsection` (#10688) | etclub
`\phantomsection` is used for anchors. When these occur inside a caption, a LaTeX error is raised unless the `\phantomsection` is protected using `\protect`. So we now `\protect` every `\phantomsection` (even outside of captions -- this seems to be harmless).
2025-03-14 | LaTeX writer: use `*` for multirow width when no colwidth specified. | John MacFarlane
Otherwise the multirow will be excessively wide. Closes #10685.
2025-03-14 | Markdown reader: remove some misguided list fanciness. | John MacFarlane
Previously we tried to handle things like commented out list items:

    - one
    <!--
    - two
    -->
    - three

and also things like:

    - one `and
    - two` and

But the code we added to handle these cases caused problems with other, more straightforward things, like:

    - one
    - ```
      code
      ```
    - three

So we are rolling back all the fanciness, so that the markdown parser now behaves more like the commonmark parser, in which indicators of block-level structure always take priority over indicators of inline structure. Closes #9865. Closes #7778. See also #5628.
2025-03-10 | Markdown writer: treat `Emph [Emph ils]` as `ils`. | John MacFarlane
Otherwise we get `**content**` which means strong emphasis. This is a more robust solution than using `_`, which won't work for intraword emphasis. Closes #10642.
2025-03-10 | HTML reader: ignore style tags in the body. | John MacFarlane
They are invalid but do occur in the wild. Closes #10643.
2025-03-08 | HTML reader: skip MathJax-introduced cruft. | John MacFarlane
See #10673.
2025-03-07 | Markdown reader: fixed `escapedChar'` parser. | John MacFarlane
It should not accept escaped newlines. See #10672.
2025-03-07 | T.P.Logging: Change NoTitleElement from WARNING to INFO. | John MacFarlane
Users commonly complain about the warning when producing HTML documents without an explicit title. It seems that an info message is more appropriate, since pandoc's default here (using the input's base name) ensures compliance with the standard and many users are happy with that default. Those who want to make sure the message is seen can use `--verbose`. Closes #10671.
2025-03-05 | Disable citations extension in writers if `--citeproc` is used. | John MacFarlane
Otherwise we get undesirable results, as the format's native citation mechanism is used instead of (or in addition to) the citeproc-generated citations. Closes #10662.
2025-03-04 | Typst writer: ensure that `citation-style` works as well as `csl`. | John MacFarlane
Closes #10661.
2025-03-04 | LaTeX reader: support `\newline`, `\linebreak`. | John MacFarlane
2025-03-04 | LaTeX reader: better handle comments/whitespace in option lists and includes. | John MacFarlane
Closes #10659.
2025-02-27 | Typst writer: better heuristics for escaping potential list markers. | John MacFarlane
Closes #10650.
2025-02-19 | Revert "Docx reader and writer: support row heads." | John MacFarlane
This reverts commit cbe67b9602a736976ef6921aefbbc60d51c6755a. Word sets `w:firstColumn="1"` by default for tables. You have to find the Table Design tab and explicitly uncheck "First Column" to make this go away. In most cases, I don't think writers intend to designate the first column as a row head, so this commit is going to produce unexpected results. In addition, because of the table normalization done by pandoc-types' `tableWith`, any table containing a colspanned cell in the left-hand column will get broken if the first column is designated a row head. For these reasons it seems best to revert this change, which was made in response to #9495. Closes #10627.
2025-02-18 | Give better position information when YAML metadata parsing fails... | John MacFarlane
with a YAML exception. See #10231.
2025-02-16 | EPUB writer: use a nonbreaking space after section number in nav.xhtml. | John MacFarlane
This seems to be required for iOS books app to display the space.
2025-02-16 | T.P.Shared, makeSections: put some attributes on section element only. | John MacFarlane
Certain `role` and `epub:type` attributes should only be on the section (and indeed, many `role`s give a validation error if left on the heading element).
2025-02-14 | Markdown reader: allow line break between URL and title of link. | John MacFarlane
Closes #10621.
2025-02-13 | Powerpoint writer: avoid extra blank lines before author. | John MacFarlane
(In the case where there is no subtitle.) Closes #10619.
2025-02-13 | Smart quote parsing: ignore curly quotes. | John MacFarlane
Previously we tried to match curly quotes as well as straight quotes, producing Quoted inlines. But it seems better just to assume that those who use curly quotes want them passed through verbatim. This also fixes an (unintended) bug whereby curly single left quotes would sometimes be changed to single right quotes. Closes #10610.
2025-02-12 | Markdown writer: omit extra space after bullets. | John MacFarlane
We used to insert extra spaces to ensure that the content respected the four-space rule. That is not really necessary now, since pandoc's markdown and most markdowns don't follow the four-space rule. Those who want the old behavior can obtain it by using `-t markdown+four_space_rule`. Closes #7172.
2025-02-10 | Remove selnolig-langs. | John MacFarlane
We now specify the language as a global option again, so we no longer need to specify it when invoking selnolig. See #9863.
2025-02-09 | TWiki reader: use "wikilink" class, instead of title. | John MacFarlane
2025-02-08 | LaTeX writer/template: Improve babel support. | John MacFarlane
Previously we used the `.ini` files for every language, but for European languages these tend to provide inferior results to the `.ldf` files used by classic Babel. Currently Babel documentation recommends using the classic system for European languages written in Latin and Cyrillic scripts and Vietnamese. So the LaTeX writer and template now follow this guidance. Main languages in the list of languages with good "classic" support are added to global documentclass options and will be automatically handled by Babel using the `.ldf` files. If the main language is not in this list, the `babeloptions` variable will be set to `provide=*`, which will cause support to be loaded from the `.ini` file rather than an `.ldf`. So, for example, setting `-V babeloptions=''` with a polytonic Greek document will cause the `.ldf` support to be used instead of the `.ini`. The default setting of this variable can be overwritten, but in most cases the default should give good results. Closes #8283.
2025-02-07 | Track wikilinks with a class instead of a title | Evan Silberman
Once upon a time the only metadata element for links in Pandoc's AST was a title, and it was hijacked to track certain links as having originated in the wikilink syntax. Now we have Attrs and we can use a class to handle wikilinks instead. Requires coordinated changes to commonmark-hs.
2025-02-05 | Add CRediT roles to JATS | Charles Tapley Hoyt
Enable annotating author roles using the Contribution Role Taxonomy (CRediT) and export this information in conformant JATS. Closes #10152. Co-Authored-By: Jez Cope
2025-02-03 | DocBook reader: Handle title inside orderedlist. | John MacFarlane
Also some other elements that allow title: blockquote, calloutlist, etc. Closes #10594.
2025-02-02 | DocBook reader: better handle informalequation (#10592) | Sen-wen DENG
Include id attribute. The code should be credited to @tombolano.
2025-02-01 | DocBook reader: better handle formalpara, example, and sidebar. | John MacFarlane
Include identifiers and titles in each case. The code should be credited to @tombolano. Closes #8666.
2025-02-01 | ODT reader: create Figure elements for images that are figures. | John MacFarlane
Closes #10567.
2025-01-31 | Markdown reader: Simplify and fix normal citation parsing. | John MacFarlane
Closes #10584. This fixes a bug that causes some normal citations to be parsed as bracketed regular citations.
2025-01-31 | Docx writer: repeat reference doc's sectPr for each new section. | John MacFarlane
Previously we were only carrying over the reference doc's sectPr at the end of the document, so it wouldn't affect the intermediate sections that are now added if `--top-level-division` is `chapter` or `part`. This could lead to bad results (e.g. page numbering starting only on the last chapter). Closes #10577.
2025-01-31 | Docx writer: create section divisions with `--top-level-division=part`. | John MacFarlane
Closes #10576.
2025-01-30 | ODT reader: avoid producing spurious blockquotes in list items. | John MacFarlane
Closes #9505.
2025-01-30 | ODT reader: fix unwanted block quotes. | John MacFarlane
Previously the reader created block quotes whenever a paragraph was marked indented (even though this just affects the first line). With this change we still generate block quotes for content that has an altered left margin, but not for indented paragraphs. See #10575. This patch does NOT address the related #9505 which concerns lists.
2025-01-30 | DOCX reader: do not issue warning for comments with `+styles` (#10572) | Stephen Reindl
Closes #10571. Co-authored-by: Stephen Reindl
2025-01-29 | Handle <abbr> as a span-like inline | Evan Silberman
Closes #5793
2025-01-24 | brace tables with typst:no-figure and typst:text attributes (#10563) | Gordon Woodhull
The combination of #9648 Typst property output and #9778 `typst:no-figure` can cause fonts to spill out of tables. This is because setting Typst text properties across a table requires `set text(...)` outside the table, and previously we were relying on the figure to provide a scope. This adds an extra `#{...}` when the table has class `typst:no-figure` and also has `typst:text:*` attributes.
2025-01-21 | Prefer MIME type when determining extensions for MediaBag items (#10557) | Max Heller
Currently, remote images added to the MediaBag are stored at paths with extensions determined based on the external URI. For instance, an image from https://example.com/image.png is stored as <hash>.png. If the URI does not contain an extension (e.g., https://example.com/image), then the content-type of the downloaded image is used to determine the extension. This change switches the precedence such that content-type is preferred over extensions contained in the URI. This is necessary because some images are located at URIs with misleading extensions -- shields.io, for instance, serves SVGs from URIs with .yml extensions. With this change, the image/svg+xml content-type is now preferred over the .yml URI extension. This fixes a bug in the PDF writer in which such an image would be mishandled due to not being identified as an SVG.
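The precedence change described above can be sketched roughly as follows. This is an illustrative Python sketch, not pandoc's Haskell implementation: the function name `choose_extension` and the small `MIME_EXTENSIONS` table are assumptions made for the example.

```python
import os
from urllib.parse import urlparse

# Illustrative subset of MIME types (an assumption for this sketch,
# not the table pandoc actually uses).
MIME_EXTENSIONS = {
    "image/svg+xml": ".svg",
    "image/png": ".png",
    "image/jpeg": ".jpg",
}


def choose_extension(uri, content_type=None):
    """Pick a file extension, preferring the content-type over the URI."""
    if content_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        mime = content_type.split(";")[0].strip().lower()
        if mime in MIME_EXTENSIONS:
            return MIME_EXTENSIONS[mime]
    # Fall back to whatever extension the URI path carries (may be empty).
    return os.path.splitext(urlparse(uri).path)[1]
```

With this ordering, a URI ending in `.yml` that is served as `image/svg+xml` is stored with a `.svg` extension, while a URI with no usable content-type still falls back to its own extension.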
2025-01-16 | Citeproc: fix moving punctuation before citation notes. | John MacFarlane
This previously worked with regular citations, but not author-in-text citations. Now it works with both.
2025-01-15 | Consume blanks after =encoding in pod reader (#10544) | silby
The reader did not properly consume empty lines after =encoding commands, which produced various incorrect parses depending on the content between there and the next command. Fixes #10537
2025-01-14 | Fix escaping of `-` in ms writer. | John MacFarlane
In 5132f1ef330d3eb2a0bf87037035beaeaf19d3f3 we added `-` to the list of characters needing backslash escaping, to accommodate a change in groff man's behavior, described here: https://lwn.net/Articles/947941/ This change also led `-` to be escaped in ms output, but that is wrong; `\-` in ms is a unicode minus sign. To fix this, we add a Boolean parameter to `escapeString` in Text.Pandoc.Writers.Roff that determines whether `-` is to be escaped. (NB: This is not an exported function in the API.) The list `standardEscapes` in Text.Pandoc.RoffChar no longer contains `-`. Closes #10536.
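The idea of a Boolean flag that controls hyphen escaping can be sketched as below. This is a Python illustration only; the real `escapeString` is a Haskell function in Text.Pandoc.Writers.Roff, and the one-entry `STANDARD_ESCAPES` table here is a hypothetical stand-in for the list in Text.Pandoc.RoffChar.

```python
# Hypothetical stand-in for the escape table: only the backslash is
# listed, mapped to roff's printable-backslash escape `\e`.
STANDARD_ESCAPES = {"\\": "\\e"}


def escape_string(text, escape_hyphen):
    """Escape roff-special characters; `-` is escaped only on request."""
    out = []
    for ch in text:
        if ch == "-" and escape_hyphen:
            # man output wants `\-`; ms output must leave `-` alone.
            out.append("\\-")
        else:
            out.append(STANDARD_ESCAPES.get(ch, ch))
    return "".join(out)
```

Under this sketch a man-style caller would pass `True` and an ms-style caller `False`, so the same escaping routine serves both writers.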
2025-01-10 | Docx reader and writer: support row heads. | John MacFarlane
Reader: When `w:tblLook` has `w:firstColumn` set (or an equivalent bit mask), we set row heads = 1 in the AST. Writer: set `w:firstColumn` in `w:tblLook` when there are row heads. (Word only allows one, so this is triggered by any number of row heads > 0.) Closes #9495.
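A rough Python sketch of the reader-side check follows. It is not the pandoc code: the function name is hypothetical, the attributes are assumed to arrive as a plain dict without the `w:` prefix, and the bit value `0x0080` for the first-column flag in the `w:val` bitmask is an assumption taken from a reading of the OOXML `tblLook` definition.

```python
# Assumed bit for "first column" in the legacy w:val hex bitmask.
FIRST_COLUMN_BIT = 0x0080


def row_head_columns(tbl_look):
    """Return 1 if the w:tblLook attributes mark the first column, else 0."""
    if "firstColumn" in tbl_look:
        # An explicit attribute takes precedence over the bitmask.
        return 1 if tbl_look["firstColumn"] in ("1", "true", "on") else 0
    val = tbl_look.get("val")
    if val is not None and int(val, 16) & FIRST_COLUMN_BIT:
        return 1
    return 0
```

Note that a bitmask such as `04A0` has this bit set, so any table carrying it would be read as having a row head under this sketch.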
2025-01-10 | Docx reader: read table styles as custom styles... | John MacFarlane
...when `styles` extension is enabled. Closes #9603. Also improve manual's coverage of custom styles.
2025-01-07 | HTML reader: add size information for fa svg icons. | John MacFarlane
If the icon has class fa-fw or fa-w16 or fa-w14, we add a width attribute to prevent the icon from appearing full-width in PDF or docx output. Closes #10134.
2025-01-06 | Djot reader/writer highlighted text fixes: | John MacFarlane
- The reader now uses a Span with class "mark" rather than "highlighted", for consistency with the other pandoc readers and writers.
- The writer renders a Span with sole class "mark" as highlighted text.
2025-01-06 | Asciidoc writer: don't emit class in span if it's just "mark". | John MacFarlane
"mark" class is used for highlighting, and Asciidoc treats bare `#...#` with no attributes as highlighted text. Closes #10511.
2025-01-06 | EPUB v2 writer: fix cover image. | John MacFarlane
Closes #10505. Regression from 3.6 caused by #10404.
2025-01-03 | Add mdoc St for C23 | Evan Silberman
Following mandoc: https://cvsweb.bsd.lv/mandoc/st.c?rev=1.19&content-type=text/x-cvsweb-markup
2025-01-01 | Typst writer: fix handling of pixel image dimensions. | John MacFarlane
These are now converted to inches as in the LaTeX writer. Closes #9945.
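The pixel-to-inch conversion amounts to dividing by the resolution. The Python sketch below illustrates the arithmetic only; the function names are hypothetical, and the 96 dpi default is an assumption based on pandoc's documented default for `--dpi`.

```python
def pixels_to_inches(pixels, dpi=96):
    """Convert a pixel count to inches at the given resolution."""
    return pixels / dpi


def typst_length(pixels, dpi=96):
    """Format a pixel dimension as a Typst length in inches."""
    inches = pixels_to_inches(pixels, dpi)
    # `:g` drops trailing zeros, so 2.0 is written as "2".
    return f"{inches:g}in"
```

For example, an image declared as 192 pixels wide would be emitted with a width of `2in` at the assumed default resolution.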