| Age | Commit message | Author |
|
Closes #11380.
|
|
|
|
This PR aims to handle a common run field instruction (fieldInstr)
in the docx format: REF, specifically with the "link" switch \h.
In Word, you can create a REF field instruction with the
Cross-reference button. A cross-reference can point to
many things, such as an equation, a table, or a title.
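To make the shape of such a field instruction concrete, here is a minimal
sketch that splits a REF instruction into its bookmark and switches and checks
for `\h`. The `RefField` type and `parseRef` are hypothetical names used only
for illustration; this is not the reader's actual field parser, and quoted
arguments are ignored.

```haskell
-- Hypothetical sketch; not pandoc's actual field parser.
data RefField = RefField
  { refBookmark :: String  -- bookmark name the field points at
  , refIsLink   :: Bool    -- True when the \h switch is present
  } deriving (Show, Eq)

-- Parse instructions such as " REF _Ref123456 \h ".
-- Quoted arguments are not handled in this simplified version.
parseRef :: String -> Maybe RefField
parseRef instr =
  case words instr of
    ("REF" : bookmark : switches) ->
      Just (RefField bookmark ("\\h" `elem` switches))
    _ -> Nothing
```

A field without `\h` still parses, but with `refIsLink` set to `False`, so a
caller could decide whether to emit a link or plain text.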
|
|
New module `Text.Pandoc.Readers.Pptx`,
exporting `readPptx`. [API change]
Factored out some common OOXML functions from
Text.Pandoc.Readers.Docx.Util into a non-exported module
Text.Pandoc.Readers.OOXML.Shared.
|
|
The docx reader uses caption styles to identify figures and captioned
tables. It now checks for known caption styles in the full styles
hierarchy of a paragraph instead of just checking the style directly.
This makes it possible to recognize caption styles built on top of the
basic *caption* style, as is sometimes the case in sophisticated styles.
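The hierarchy check can be pictured roughly as follows. This is a sketch
under assumptions: styles are modeled as a plain map from a style id to the id
of the style it is based on, and `StyleMap`, `styleChain` and `isCaptionStyle`
are hypothetical names, not the reader's real data structures.

```haskell
import qualified Data.Map as M

-- Hypothetical model: each style id maps to the id of the style it is based on.
type StyleMap = M.Map String String

-- Chain of style ids from the given style up to its root ancestor.
-- The step limit guards against cyclic basedOn references.
styleChain :: StyleMap -> String -> [String]
styleChain styles = go (M.size styles + 1)
  where
    go :: Int -> String -> [String]
    go 0 _   = []
    go n sid = sid : maybe [] (go (n - 1)) (M.lookup sid styles)

-- True if the style or any of its ancestors is a known caption style.
isCaptionStyle :: [String] -> StyleMap -> String -> Bool
isCaptionStyle known styles sid = any (`elem` known) (styleChain styles sid)
```

With a style `FigureCaption` based on `Caption`, checking only the direct style
misses it, while walking the chain finds `Caption` one level up.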
|
|
This should only be used if sectPr is not found.
|
|
Previously we assumed that every table took up the full text
width. Now we read the text width from the document's
sectPr.
Closes #9837.
Closes #11147.
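As a rough sketch of the arithmetic involved, assuming the text width is the
page width minus the left and right margins (all in twips, as stored in
`w:pgSz` and `w:pgMar`). The `PageLayout` record and both functions are
hypothetical and only illustrate the idea; the reader's actual computation may
differ.

```haskell
-- Hypothetical record for the sectPr values used here, all in twips
-- (twentieths of a point).
data PageLayout = PageLayout
  { pageWidth   :: Int  -- w:pgSz/@w:w
  , marginLeft  :: Int  -- w:pgMar/@w:left
  , marginRight :: Int  -- w:pgMar/@w:right
  }

-- Assumed definition: width of the text area between the margins.
textWidth :: PageLayout -> Int
textWidth l = pageWidth l - marginLeft l - marginRight l

-- Column widths expressed as fractions of the text width.
relativeWidths :: PageLayout -> [Int] -> [Double]
relativeWidths l = map (\w -> fromIntegral w / fromIntegral (textWidth l))
```

For a US Letter page (12240 twips wide) with one-inch margins (1440 twips
each), the text width is 9360 twips, so a 4680-twip column takes up half of it
rather than being scaled as if the table filled the page.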
|
|
This revises the solution to #9214 in commit 2e8ecb3 in order to
handle a standard Word way of inserting emojis.
Closes #11113.
|
|
This was too heavy-handed a fix, and it interferes with processing
Word emojis (#11109).
|
|
It previously converted things like `11ccc` to an integer;
now it requires that the whole string be parsable as an integer.
Closes #9184.
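The difference between the two behaviours can be sketched like this. The
function names are hypothetical, and the lenient version is only one plausible
way the old behaviour could have arisen, not the code that was replaced.

```haskell
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- Lenient: take the leading digits and ignore any trailing junk.
lenientInt :: String -> Maybe Int
lenientInt s = case span isDigit s of
  ("", _)     -> Nothing
  (digits, _) -> Just (read digits)

-- Strict: succeed only if the entire string parses as an integer.
strictInt :: String -> Maybe Int
strictInt = readMaybe
```

Here `lenientInt "11ccc"` yields `Just 11`, whereas `strictInt "11ccc"` yields
`Nothing` and `strictInt "11"` yields `Just 11`.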
|
|
Closes #7691.
|
|
...instead of defining it again.
|
|
This reverts commit cbe67b9602a736976ef6921aefbbc60d51c6755a.
Word sets `w:firstColumn="1"` by default for tables. You have to find
the Table Design tab and explicitly uncheck "First Column" to make this
go away. In most cases, I don't think writers intend to designate
the first column as a row head, so this commit is going to produce
unexpected results. In addition, because of the table normalization
done by pandoc-types' `tableWith`, any table containing a colspanned
cell in the left-hand column will get broken if the first column is
designated a row head. For these reasons it seems best to revert this
change, which was made in response to #9495.
Closes #10627.
|
|
Reader: When `w:tblLook` has `w:firstColumn` set (or an equivalent bit
mask), we set row heads = 1 in the AST.
Writer: set `w:firstColumn` in `w:tblLook` when there are row
heads. (Word only allows one, so this is triggered by any number
of row heads > 0.)
Closes #9495.
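For the bit-mask form, a minimal sketch of the test might look like the
following. It assumes the legacy hex value in `w:tblLook/@w:val` uses bit
0x0080 for "first column", which is my reading of the OOXML transitional
schema; `firstColumnBit` and `firstColumnFromMask` are hypothetical names.

```haskell
import Data.Bits ((.&.))
import Numeric (readHex)

-- Assumed bit for "first column" in the legacy w:tblLook/@w:val hex mask.
firstColumnBit :: Int
firstColumnBit = 0x0080

-- True if the hex bit mask (e.g. "04A0") has the first-column bit set.
-- Anything that is not a complete hex number counts as "not set".
firstColumnFromMask :: String -> Bool
firstColumnFromMask hex =
  case readHex hex of
    [(n, "")] -> (n :: Int) .&. firstColumnBit /= 0
    _         -> False
```

Under that assumption, the mask `04A0` has the bit set and `0420` does not.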
|
|
...when `styles` extension is enabled. Closes #9603.
Also improve manual's coverage of custom styles.
|
|
See #10171.
|
|
Support crossrefs.
Clean up and unify switch parsing for fields.
|
|
See #10171.
|
|
Headings in docx, even ones that do not have a visible number,
can have a numId, and in odd cases can even share a numId with
a list that continues after the header. In this case the list
numbering should be reset by the header.
To accomplish this, we add a Heading constructor to BodyPart and
include on it all the information list items have.
Closes #10258.
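The reset behaviour can be illustrated with a pared-down model. The `BodyPart`
type below is a hypothetical stand-in carrying only a numId, and the sketch
assumes that a heading resets just the counter of the numId it carries.

```haskell
-- Hypothetical, pared-down stand-in for the reader's BodyPart type.
data BodyPart
  = ListItem String  -- numId of the list the item belongs to
  | Heading String   -- numId carried by the heading
  | Para
  deriving (Show, Eq)

-- Pair each part with its list number (Nothing for non-list parts).
-- A Heading drops the running count for its numId, so a later item
-- sharing that numId starts again at 1.
numberItems :: [BodyPart] -> [(BodyPart, Maybe Int)]
numberItems = go []
  where
    go _ [] = []
    go counts (p : ps) = case p of
      ListItem nid ->
        let n = maybe 1 (+ 1) (lookup nid counts)
        in (p, Just n) : go ((nid, n) : filter ((/= nid) . fst) counts) ps
      Heading nid -> (p, Nothing) : go (filter ((/= nid) . fst) counts) ps
      Para        -> (p, Nothing) : go counts ps
```

For two items, a heading, and a third item that all share numId `"1"`, the
items are numbered 1, 2 and then 1 again.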
|
|
|
|
- Turn captioned images into Figure elements. Closes #9391.
- Improve the logic for associating elements with captions.
Closes #9358.
- Ensure that captions that can't be associated with an
element aren't just silently dropped. Closes #9610.
|
|
We'll use this for image captions as well. Word does not really
distinguish these.
|
|
This also fixes a small bug in parsing delimiters in numbered lists,
which led to the default delimiter being used wrongly in some cases.
Closes #8211.
|
|
using `bottomUp` with a faster one using `walk`.
|
|
Also fix tests.
|
|
OpenXML doesn't have a way of indicating column alignments, but
we guess them by looking at the justification property on the
first paragraph of a cell, if there is one.
We take the column alignments from the first body row.
Closes #8551.
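A sketch of the guessing rule, under assumptions: a cell is modeled as the
list of `w:jc` values of its paragraphs, and `start`/`end` are treated as
left/right (which only holds for left-to-right text). The types and function
names are hypothetical and stand in for the reader's own.

```haskell
import Data.Maybe (listToMaybe)

data Alignment = AlignLeft | AlignCenter | AlignRight | AlignDefault
  deriving (Show, Eq)

-- A cell is modeled as the list of its paragraphs' w:jc values
-- (Nothing when a paragraph has no justification property).
type Cell = [Maybe String]

-- Only the first paragraph of the cell is consulted.
cellAlignment :: Cell -> Alignment
cellAlignment cell =
  case listToMaybe cell of
    Just (Just jc) -> fromJc jc
    _              -> AlignDefault
  where
    fromJc "left"   = AlignLeft
    fromJc "start"  = AlignLeft   -- assumes left-to-right text
    fromJc "center" = AlignCenter
    fromJc "right"  = AlignRight
    fromJc "end"    = AlignRight  -- assumes left-to-right text
    fromJc _        = AlignDefault

-- Column alignments are taken from the first body row.
columnAlignments :: [[Cell]] -> [Alignment]
columnAlignments []        = []
columnAlignments (row : _) = map cellAlignment row
```

A cell whose first paragraph has no justification falls back to
`AlignDefault`, even if a later paragraph in the same cell is justified.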
|
|
...and not just Runs. This fixes a problem wherein comments
inside insertions or deletions would be ignored. Closes #9833.
|
|
We support both pandoc-style and the style described at
https://support.microsoft.com/en-us/office/insert-a-horizontal-line-9bf172f6-5908-4791-9bb9-2c952197b1a9
Closes #6285.
|
|
This paves the way to supporting horizontal rules in the reader.
We still need to adjust the parser to create HRule appropriately;
so far, this change has no effect, but it's a step on the way
to #6285.
|
|
|
|
Normally these occur outside the table element itself, but they
should still be parsed as captions in this case.
Closes #9518.
|
|
The styleId can change depending on the localization.
Partially resolves #9518.
|
|
Header and footer references may be absolute in the reference.docx.
E.g. editing it with dotnet's Open-XML-SDK causes this error:
```
+ pandoc test.md -t docx --reference-doc referenceh.docx -o test.docx
word//word/header1.xml missing in reference docx
```
There was already code in pandoc to handle relative vs absolute paths in
references, so use it.
Signed-off-by: Edwin Török <[email protected]>
|
|
The argument can apparently be omitted, and then we just have
a fragment URL. Closes #9246.
|
|
|
|
* #9214 text in shape format test document
* #9214 support Text in Shape Format
* #9214 remove irrelevant code
|
|
Add T.P.Readers.Docx.Symbols. This gives us a table to use to
resolve characters included in docx via w:sym element.
Use this table to resolve characters when symbol fonts are specified.
Closes #9220.
|
|
|
|
Closes #9002.
|
|
Previously the backup PNG was exported even if an SVG was
present, but the SVG should be preferred.
Closes #7244.
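The preference order amounts to a simple fallback. In this sketch the
`ImageRefs` record and `preferredImage` are hypothetical; it only assumes that
a picture can carry a reference to an SVG, to a PNG, or to both.

```haskell
import Control.Applicative ((<|>))

-- Hypothetical: references to the two image variants attached to a picture.
data ImageRefs = ImageRefs
  { svgRef :: Maybe String  -- vector version, if present
  , pngRef :: Maybe String  -- PNG fallback
  }

-- Use the SVG when there is one; otherwise fall back to the PNG.
preferredImage :: ImageRefs -> Maybe String
preferredImage refs = svgRef refs <|> pngRef refs
```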
|
|
LibreOffice tags images slightly differently than Word; this change lets
the parser take that difference into account when looking for an image
description (alt text).
|
|
|
|
Closes #8483.
The problem is that oMathPara can either occur at the block-level
(child of w:body) or at the inline level (child of w:p, potentially
with other content). We need to handle both cases.
Previously the code just assumed that if we had a w:p with an oMathPara,
the math would be the sole content.
This patch removes OMathPara as a constructor of BodyPart
and adds it as a constructor of ParPart.
|
|
|
|
This will no doubt produce a bunch of warnings and hence CI
failures, which we'll need to work around with explicit imports.
|
|
We were exporting Parser, ParserT as synonyms of Parsec, ParsecT.
There is no good reason for this and it can cause confusion.
Also, when possible, we replace imports of Text.Parsec with
T.P.Parsing. The idea is to make it easier, at some point,
to switch to megaparsec or another parsing engine if we want to.
T.P.Parsing new exports: Stream(..), updatePosString, SourceName,
Parsec, ParsecT [API change].
Removed exports: Parser, ParserT [API change].
|
|
|
|
|
|
If a document uses numbered headings, then headings without numbers are
marked with class `unnumbered`, the default class used by pandoc to
convey this kind of information. The class is not added if none of
the headings in a document are numbered. This change ensures good conversion
results when converting with `--number-sections`.
Closes: #8148
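The rule can be sketched as follows, using a hypothetical `Heading` record
that tracks only whether Word numbers the heading and which classes it has;
the reader's real types differ.

```haskell
-- Hypothetical heading model: whether Word numbers it, and its classes.
data Heading = Heading
  { isNumbered :: Bool
  , classes    :: [String]
  } deriving (Show, Eq)

-- Add the class "unnumbered" to headings without numbers, but only
-- when at least one heading in the document is numbered.
markUnnumbered :: [Heading] -> [Heading]
markUnnumbered hs
  | any isNumbered hs = map mark hs
  | otherwise         = hs
  where
    mark h
      | isNumbered h = h
      | otherwise    = h { classes = "unnumbered" : classes h }
```

A document with no numbered headings passes through unchanged, so converting
it with `--number-sections` numbers every heading as usual.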
|
|
|