#601 closed defect (fixed)
xml2rfc --v2v3 --add-xinclude strips CDATA, converts < > to &lt; &gt;
Reported by: | mahoney@nostrum.com | Owned by: | krathnayake@ietf.org |
---|---|---|---|
Priority: | medium | Milestone: | |
Component: | v3 vocabulary | Version: | |
Keywords: | | Cc: | rfc-editor@rfc-editor.org |
Description
The command:
xml2rfc --v2v3 --add-xinclude <xmlfile>
can be used to convert "long-way" references (<reference anchor="RFC2119" ...) to xi:include references (<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>).
However, if the document has artwork that contains CDATA, the CDATA is stripped, and any angle brackets (< >) in the artwork are converted to their XML entities (&lt; &gt;).
See:
https://www.rfc-editor.org/v3test/draft-ietf-quic-tls-34.original.xml
https://www.rfc-editor.org/v3test/draft-ietf-quic-tls-34.original.v2v3.xml
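A quick way to confirm the regression on a pair of files like the two above is to count CDATA sections and escaped angle brackets before and after conversion. The sketch below is only an illustration: the helper names `cdata_stats` and `cdata_was_stripped` are hypothetical and not part of xml2rfc, and the short sample strings stand in for the contents of the real files.

```python
import re

# Matches one complete CDATA section, including multi-line content.
CDATA_RE = re.compile(r"<!\[CDATA\[.*?\]\]>", re.DOTALL)


def cdata_stats(xml_text: str) -> dict:
    """Count CDATA sections and the &lt;/&gt; entities found outside them."""
    sections = CDATA_RE.findall(xml_text)
    outside = CDATA_RE.sub("", xml_text)
    return {
        "cdata_sections": len(sections),
        "escaped_brackets": outside.count("&lt;") + outside.count("&gt;"),
    }


def cdata_was_stripped(before: str, after: str) -> bool:
    """True if the converted text has fewer CDATA sections than the input."""
    return cdata_stats(after)["cdata_sections"] < cdata_stats(before)["cdata_sections"]


# Hypothetical stand-ins for the original and converted documents.
before = "<artwork><![CDATA[a < b > c]]></artwork>"
after = "<artwork>a &lt; b &gt; c</artwork>"

print(cdata_stats(before))
print(cdata_stats(after))
print(cdata_was_stripped(before, after))
```

To apply this to real documents, read each file into a string and pass the two strings to `cdata_was_stripped`.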
Change History (7)
comment:1 Changed 15 months ago by mahoney@nostrum.com
comment:2 Changed 15 months ago by henrik@levkowetz.com
This is a side effect of the way the XML library (lxml) deals with CDATA. Certain transformations of the XML cause lxml to re-evaluate the whole XML content, with the result that CDATA blocks are converted to lxml's preferred rendering, which doesn't use CDATA. It might be possible to convince lxml to re-introduce CDATA blocks, but when I looked for such an option a couple of years ago, I didn't find anything of that kind.
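The effect described in this comment can be reproduced in a few lines. The sketch below is a stand-in that uses only the Python standard library, not lxml or xml2rfc code: `xml.etree.ElementTree` (whose API lxml mirrors) keeps only the text of a CDATA section and escapes it on output, while `xml.dom.minidom` keeps a dedicated CDATA node. The function names and the sample element are assumptions made for illustration. For lxml itself, the `XMLParser` class documents a `strip_cdata` option; this ticket does not say whether that option is what the eventual fix used.

```python
# Stand-in demo using only the standard library; xml2rfc itself uses lxml.
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical artwork element containing a CDATA section.
SOURCE = "<artwork><![CDATA[a < b > c]]></artwork>"


def roundtrip_elementtree(xml_text: str) -> str:
    """Parse and re-serialize with ElementTree, which keeps only the text."""
    return ET.tostring(ET.fromstring(xml_text), encoding="unicode")


def roundtrip_minidom(xml_text: str) -> str:
    """Parse and re-serialize with minidom, which keeps CDATASection nodes."""
    return minidom.parseString(xml_text).documentElement.toxml()


# CDATA is gone and the angle brackets come back as entities.
print(roundtrip_elementtree(SOURCE))

# CDATA survives the round trip unchanged.
print(roundtrip_minidom(SOURCE))
```

The two outputs have the same XML meaning. The difference is purely in serialization, which is why the converted file is still valid even though the artwork becomes harder to read and edit by hand.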
comment:3 Changed 15 months ago by mahoney@nostrum.com
We are only seeing this when "xml2rfc --v2v3 --add-xinclude <xmlfile>" is run on a v3 file.
comment:4 Changed 14 months ago by rjsparks@nostrum.com
- Status changed from new to accepted
comment:5 Changed 12 months ago by krathnayake@ietf.org
- Owner set to krathnayake@ietf.org
- Status changed from accepted to assigned
comment:6 Changed 12 months ago by krathnayake@ietf.org
- Resolution set to fixed
- Status changed from assigned to closed
comment:7 Changed 11 months ago by rjsparks@nostrum.com
Fixed in [3989]:
Merged in [3981] from krathnayake@ietf.org:
Stop stripping CDATA with v2v3 option. Fixes #601.
I am also seeing this behavior when I run an RPC spellchecker, newspell, which invokes rfclint:
When you tell newspell to correct a typo within <sourcecode>, newspell removes CDATA and replaces < > with &lt; &gt; in the output file. Other <sourcecode> blocks that do not have typos corrected are not impacted.
Perhaps the problem is in rfclint?