path: root/test/jats-reader.native
Age | Commit message | Author
2024-11-06 | Fix typos (#10349) | Andreas Deininger
2023-10-17 | JATS reader: fix handling of alt-text (#9134) | Julia Diaz
Previously we were looking for an attribute that doesn't exist in JATS; alt-text is provided by a child element. Closes #9130.
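As a rough illustration of the distinction described in this commit, the Python sketch below reads alt text from a JATS `<graphic>`. It is not pandoc's Haskell implementation: the sample XML, the file name `plot.png`, and the helper `alt_text` are assumptions made for the example. It only shows that an attribute lookup finds nothing, while the `<alt-text>` child element carries the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical JATS fragment: alt text is a child element of <graphic>.
JATS_SNIPPET = """
<fig id="f1">
  <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="plot.png">
    <alt-text>Scatter plot of yield against temperature</alt-text>
  </graphic>
</fig>
""".strip()


def alt_text(graphic):
    """Return the text of the <alt-text> child, or '' if there is none."""
    child = graphic.find("alt-text")
    if child is None:
        return ""
    return "".join(child.itertext()).strip()


graphic = ET.fromstring(JATS_SNIPPET).find("graphic")

# Looking for an attribute (illustrative only) yields nothing ...
attr_value = graphic.get("alt-text")
# ... whereas the child element holds the description.
child_value = alt_text(graphic)

print(attr_value)
print(child_value)
```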
2023-08-30 | JATS reader: Multilevel support for `<permissions>` metadata (#9037) | Julia Diaz
This revises the earlier support for `<permissions>`: now metadata objects with multiple fields are created, matching the structure in JATS.
2023-08-10 | Fix display of block elements in JATS reader (PR #8971) | Julia Diaz
A number of block elements, like disp-quote, list, and disp-formula, were always treated as inlines if they appeared inside paragraphs, even if their usage warranted a separate block. The function isElementBlock has been refined to prevent this, and a number of specific parse cases have been added to parseBlock. Also includes some minimal cleanup of the test file so that it passes XML validation against the JATS DTD 1.3 (it was not compliant with the current or any previous version of JATS). Closes #8889.
2023-06-08 | Add footer and multiple body parsing to JATS table reader (#8795) | Noah Malmed
Closes #8765.
2023-06-06 | Improve title and label parsing in the JATS reader (#8840) | Noah Malmed
Closes #8718.
2023-04-05 | Add rowspan, colspan and alignment to cells in jats table reader (#8726) | Noah Malmed
Partially addresses #8408
2021-09-28 | Switch from pretty-simple to pretty-show for native output. | John MacFarlane
Update tests. Reason: it turns out that the native output generated by pretty-simple isn't always readable by the native reader. According to https://github.com/cdepillabout/pretty-simple/issues/99 it is not a design goal of the library that the rendered values be readable using 'read'. This makes it unsuitable for our purposes. pretty-show is a bit slower and it uses 4-space indents (non-configurable), but it doesn't have this serious drawback.
2021-09-21 | Use pretty-simple to format native output. | John MacFarlane
Previously we used our own homespun formatting. But this produces over-long lines that aren't ideal for diffs in tests. Easier to use something off-the-shelf and standard. Closes #7580. Performance is slower by about a factor of 10, but this isn't really a problem because native isn't suitable as a serialization format. (For serialization you should use json, because the reader is so much faster than native.)
2021-02-10 | Add new unexported module T.P.XMLParser. | John MacFarlane
This exports functions that use xml-conduit's parser to produce an xml-light Element or [Content]. This allows existing pandoc code to use a better parser without much modification. The new parser is used in all places where xml-light's parser was previously used. Benchmarks show a significant performance improvement in parsing XML-based formats (especially ODT and FB2). Note that the xml-light types use String, so the conversion from xml-conduit types involves a lot of extra allocation. It would be desirable to avoid that in the future by gradually switching to using xml-conduit directly. This can be done module by module. The new parser also reports errors, which we report when possible. A new constructor PandocXMLError has been added to PandocError in T.P.Error [API change]. Closes #7091, which was the main stimulus. These changes revealed the need for some changes in the tests. The docbook-reader.docbook test lacked definitions for the entities it used; these have been added. And the docx golden tests have been updated, because the new parser does not preserve the order of attributes. Add entity defs to docbook-reader.docbook. Update golden tests for docx.
2020-04-15 | Use the new builders, modify readers to preserve empty headers | despresc
The Builder.simpleTable now only adds a row to the TableHead when the given header row is not null. This uncovered an inconsistency in the readers: some would unconditionally emit a header filled with empty cells, even if the header was not present. Now every reader has the conditional behaviour. Only the XWiki writer depended on the header row being always present; it now pads its head as necessary.
2020-04-15 | Adapt to the removal of the RowSpan, ColSpan, RowHeadColumns accessors | despresc
2020-04-15 | Adapt to the newest Table type, fix some previous adaptation issues | despresc
- Writers.Native is now adapted to the new Table type.
- Inline captions should now be conditionally wrapped in a Plain, not a Para block.
- The toLegacyTable function now lives in Writers.Shared.
2020-04-15 | Implement the new Table type | despresc
2018-03-05 | Remove extraneous, significant whitespace in JATS writer output (#4335) | Nokome Bentley
This patch fixes some cases where the JATS writer was introducing semantically significant whitespace by indenting and wrapping tags. Note that the JATS spec has a content model for `<p>` tags of `(#PCDATA | ...`. Any tag where `#PCDATA` children are possible should not have any indentation. The same is true for `<th>`, `<td>`, `<term>`, `<label>`.
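To see why indentation matters in elements that allow `#PCDATA`, the Python sketch below compares the text content of a compact `<p>` with an indented one. It is only an illustration and not the JATS writer's code: the sample markup and the helpers `text_content` and `normalized` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical paragraph with mixed content, written without indentation.
compact = "<p>H<sub>2</sub>O is water.</p>"
# The same paragraph after a pretty-printer has indented and wrapped its tags.
indented = "<p>\n  H\n  <sub>2</sub>\n  O is water.\n</p>"


def text_content(xml: str) -> str:
    """Concatenate all character data inside the element, in document order."""
    return "".join(ET.fromstring(xml).itertext())


def normalized(text: str) -> str:
    """Collapse runs of whitespace, roughly as a renderer might."""
    return " ".join(text.split())


compact_text = normalized(text_content(compact))
indented_text = normalized(text_content(indented))

# Even after whitespace normalization the indented form gains spaces
# that were never part of the original text.
print(repr(compact_text))
print(repr(indented_text))
```

The whitespace added for readability becomes part of the character data, which is why elements with `#PCDATA` children are better emitted without indentation.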
2017-12-23 | JATS reader: process author metadata. | John MacFarlane
2017-12-20 | Add Basic JATS reader based on DocBook reader | Hamish Mackenzie