Posted by : Jebastin
Thursday, 7 August 2014
Normally, the parser parses all the text in an XML document.
Text inside a CDATA section, however, is not parsed as markup.
XML parsers normally parse all the text in an XML document.
When an XML element is parsed, the text between the XML tags is also parsed:
<message>This text is also parsed</message>
The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (first and last):
<name><first>Bill</first><last>Gates</last></name>
and the parser will break it up into sub-elements like this:
<name>
  <first>Bill</first>
  <last>Gates</last>
</name>
Parsed Character Data (PCDATA) is a term used about text data that will be parsed by the XML parser.
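The way a parser breaks an element into sub-elements can be sketched in C# with LINQ to XML. This is a minimal illustration, not the author's code; the class and method names (`PcdataDemo`, `ChildNames`) are made up for the example.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class PcdataDemo
{
    // Parses the XML and returns the names of the direct child elements
    // that the parser found inside the root element.
    public static string[] ChildNames(string xml)
    {
        XElement root = XElement.Parse(xml);
        return root.Elements().Select(e => e.Name.LocalName).ToArray();
    }
}
```

Calling `PcdataDemo.ChildNames("<name><first>Bill</first><last>Gates</last></name>")` should yield the two names `first` and `last`.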
CDATA - (Unparsed) Character Data
The term CDATA is used about text data that should not be parsed by the XML parser.
Characters like "<" and "&" are illegal in XML elements.
"<" will generate an error because the parser interprets it as the start of a new element.
"&" will generate an error because the parser interprets it as the start of a character entity.
Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors, script code can be defined as CDATA.
The parser does not interpret anything inside a CDATA section as markup; it treats the content as literal text.
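To see the errors described above, one option is to hand the text to a parser and catch the exception. The sketch below assumes LINQ to XML (`XElement.Parse`) as the parser; `WellFormedCheck` and `IsWellFormed` are hypothetical names used only for this example.

```csharp
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class WellFormedCheck
{
    // Returns true if the string parses as a single well-formed XML element,
    // false if the parser rejects it with an XmlException.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XElement.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With this helper, `<message>a < b</message>` and `<message>a & b</message>` are rejected, while the escaped form `<message>a &lt; b</message>` and the CDATA form `<message><![CDATA[a < b && c]]></message>` are accepted.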
A CDATA section starts with "<![CDATA[" and ends with "]]>":
<script>
<![CDATA[
function matchwo(a,b)
{
  if (a < b && a < 0)
  {
    return 1;
  }
  else
  {
    return 0;
  }
}
]]>
</script>
In the example above, the parser treats everything inside the CDATA section as literal text.
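Reading such an element back from C# might look like the sketch below. It assumes LINQ to XML; `CDataReader`, `ReadText`, and `HasCData` are hypothetical names, and the point is only that the "<" and "&" characters come back unchanged.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class CDataReader
{
    // Returns the text content of the element, including CDATA content,
    // exactly as it was written (no entity decoding is needed).
    public static string ReadText(string xml)
    {
        return XElement.Parse(xml).Value;
    }

    // Reports whether the element has at least one CDATA child node.
    public static bool HasCData(string xml)
    {
        return XElement.Parse(xml).Nodes().OfType<XCData>().Any();
    }
}
```

For the input `<script><![CDATA[if (a < b && a < 0) { return 1; }]]></script>`, `ReadText` should return the script text with its "<" and "&&" intact.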
Code:
string CDATAStart = "<![CDATA[", CDATAEnd = "]]>";
// Wrap the trimmed description so the parser treats it as literal text.
LongDescription = CDATAStart + mcLongDescription.Trim() + CDATAEnd;
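As an alternative to concatenating the delimiters by hand, LINQ to XML offers the `XCData` node type, which writes the CDATA delimiters when the element is serialized. The sketch below is one possible approach rather than the method used above; `CDataBuilder` and `BuildElement` are hypothetical names.

```csharp
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class CDataBuilder
{
    // Builds <name><![CDATA[description]]></name> with the description trimmed,
    // letting XCData emit the CDATA delimiters.
    public static string BuildElement(string name, string description)
    {
        var element = new XElement(name, new XCData(description.Trim()));
        // DisableFormatting keeps the output on one line with no added whitespace.
        return element.ToString(SaveOptions.DisableFormatting);
    }
}
```

For example, `CDataBuilder.BuildElement("LongDescription", "  a < b  ")` should produce `<LongDescription><![CDATA[a < b]]></LongDescription>`.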
A CDATA section cannot contain the string "]]>". Nested CDATA sections are not allowed.
The "]]>" that marks the end of the CDATA section cannot contain spaces or line breaks.
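Because a CDATA section cannot contain "]]>", text that includes that sequence needs special handling before it is wrapped. A common workaround is to split the text across two adjacent CDATA sections so that "]]" ends one section and ">" starts the next. The sketch below shows that idea; `SafeCData` and `Wrap` are hypothetical names, not part of any library.

```csharp
// Hypothetical helper for illustration only.
public static class SafeCData
{
    // Wraps text in CDATA delimiters. Every "]]>" in the text is split so that
    // "]]" closes one CDATA section and ">" opens the next one.
    public static string Wrap(string text)
    {
        return "<![CDATA[" + text.Replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }
}
```

For example, `SafeCData.Wrap("a]]>b")` returns `<![CDATA[a]]]]><![CDATA[>b]]>`, which a parser reads back as the original text `a]]>b`.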
Reference: W3Schools, StackOverflow