Escaping new-line characters with XmlDocument

Developer https://www.devze.com 2023-02-14 17:55 Source: Internet
My application generates XML using XmlDocument. Some of the data contains newline and carriage return characters.

When text is assigned to an XmlElement like this:

   e.InnerText = "Hello\nThere";

The resulting XML looks like this:

<e>Hello
There</e>

The receiver of the XML (which I have no control over) treats the new-line as white space and sees the above text as:

 "Hello There"

For the receiver to retain the new-line it requires the encoding to be:

<e>Hello&#xA;There</e>

If the data is applied to an XmlAttribute, the new-line is properly encoded.

I've tried applying text to XmlElement using InnerText and InnerXml but the output is the same for both.

Is there a way to get XmlElement text nodes to output new-lines and carriage-returns in their encoded forms?

Here is some sample code to demonstrate the problem:

string s = "return[\r] newline[\n] special[&<>\"']";
XmlDocument d = new XmlDocument();
d.AppendChild( d.CreateXmlDeclaration( "1.0", null, null ) );
XmlElement  r = d.CreateElement( "root" );
d.AppendChild( r );
XmlElement  e = d.CreateElement( "normal" );
r.AppendChild( e );
XmlAttribute a = d.CreateAttribute( "attribute" );
e.Attributes.Append( a );
a.Value = s;
e.InnerText = s;
s = s
    .Replace( "&" , "&amp;"  )
    .Replace( "<" , "&lt;"   )
    .Replace( ">" , "&gt;"   )
    .Replace( "\"", "&quot;" )
    .Replace( "'" , "&apos;" )
    .Replace( "\r", "&#xD;"  )
    .Replace( "\n", "&#xA;"  )
;
e = d.CreateElement( "encoded" );
r.AppendChild( e );
a = d.CreateAttribute( "attribute" );
e.Attributes.Append( a );
a.InnerXml = s;
e.InnerXml = s;
d.Save( @"C:\Temp\XmlNewLineHandling.xml" );

The output of this program is:

<?xml version="1.0"?>
<root>
  <normal attribute="return[&#xD;] newline[&#xA;] special[&amp;&lt;&gt;&quot;']">return[
] newline[
] special[&amp;&lt;&gt;"']</normal>
  <encoded attribute="return[&#xD;] newline[&#xA;] special[&amp;&lt;&gt;&quot;']">return[
] newline[
] special[&amp;&lt;&gt;"']</encoded>
</root>

Thanks in advance. Chris.


How about using HttpUtility.HtmlEncode()?
http://msdn.microsoft.com/en-us/library/73z22y6h.aspx

OK, sorry about the wrong lead there. HttpUtility.HtmlEncode() will not handle the newline issue you're facing.

This blog post should help you out, though:
http://weblogs.asp.net/mschwarz/archive/2004/02/16/73675.aspx

Basically, the newline handling is controlled by the xml:space="preserve" attribute.

Sample working code:

XmlDocument doc = new XmlDocument();
doc.LoadXml("<ROOT/>");
doc.DocumentElement.InnerText = "1234\r\n5678";

XmlAttribute e = doc.CreateAttribute(
    "xml", 
    "space", 
    "http://www.w3.org/XML/1998/namespace");
e.Value = "preserve";
doc.DocumentElement.Attributes.Append(e);

var child = doc.CreateElement("CHILD");
child.InnerText = "1234\r\n5678";
doc.DocumentElement.AppendChild(child);

Console.WriteLine(doc.InnerXml);
Console.ReadLine();

The output will read:

<ROOT xml:space="preserve">1234
5678<CHILD>1234
5678</CHILD></ROOT>


Encoding is probably your best bet using the methods described here. Or perhaps you could look at using a CData section for your content instead.
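To make both suggestions concrete, here is a minimal sketch, not a definitive implementation. It assumes the receiver accepts either numeric character references or CDATA. NewLineEntitizingWriter is a hypothetical helper that is not part of the framework or of this thread: it subclasses XmlTextWriter and writes every CR or LF as &#xD; or &#xA; while leaving all other escaping to the base class. The intent is for the text node to come out as Hello&#xA;There. The CDATA variant only uses XmlDocument.CreateCDataSection, and whether the receiver keeps the new-line inside CDATA is something you would have to check.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper (an assumption, not from the thread): an XmlTextWriter
// that writes CR and LF as numeric character references.
class NewLineEntitizingWriter : XmlTextWriter
{
    public NewLineEntitizingWriter(TextWriter output) : base(output) { }

    public override void WriteString(string text)
    {
        if (string.IsNullOrEmpty(text)) { base.WriteString(text); return; }

        int start = 0;
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c != '\r' && c != '\n') continue;

            // Write the text before the line break with the normal escaping.
            if (i > start) base.WriteString(text.Substring(start, i - start));
            // Write the reference verbatim so the '&' is not escaped again.
            WriteRaw(c == '\r' ? "&#xD;" : "&#xA;");
            start = i + 1;
        }
        // Write whatever follows the last line break.
        if (start < text.Length) base.WriteString(text.Substring(start));
    }
}

static class Program
{
    static void Main()
    {
        XmlDocument d = new XmlDocument();
        XmlElement root = d.CreateElement("root");
        d.AppendChild(root);

        // Option 1: an ordinary text node, entitized when it is written.
        XmlElement encoded = d.CreateElement("encoded");
        encoded.InnerText = "Hello\nThere";
        root.AppendChild(encoded);

        // Option 2: the same text wrapped in a CDATA section.
        XmlElement cdata = d.CreateElement("cdata");
        cdata.AppendChild(d.CreateCDataSection("Hello\nThere"));
        root.AppendChild(cdata);

        StringWriter sw = new StringWriter();
        using (XmlWriter w = new NewLineEntitizingWriter(sw))
        {
            root.WriteTo(w);   // serialize through the custom writer
        }
        Console.WriteLine(sw.ToString());
    }
}
```

The override sits on the writer rather than on the document, so the DOM still holds the real characters and only the serialized form changes. To write a file instead, pass a StreamWriter to the constructor.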


In .NET 2.0, use the XmlDocument.PreserveWhitespace property:

XmlDocument d = new XmlDocument();
d.PreserveWhitespace = true;
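For reference, a small sketch of what the switch changes when loading, assuming the usual behaviour where whitespace-only nodes are dropped unless PreserveWhitespace is set before LoadXml. The class name and the sample XML are made up for illustration. Note that a new-line inside non-blank text is kept either way, so this alone may not produce &#xA; in the output.

```csharp
using System;
using System.Xml;

static class PreserveWhitespaceDemo
{
    static void Main()
    {
        // Illustrative input: indentation between elements plus a new-line inside text.
        string xml = "<root>\n  <e>Hello\nThere</e>\n</root>";

        XmlDocument plain = new XmlDocument();
        plain.LoadXml(xml);                    // whitespace-only nodes are dropped

        XmlDocument preserving = new XmlDocument();
        preserving.PreserveWhitespace = true;  // must be set before loading
        preserving.LoadXml(xml);               // whitespace-only nodes are kept

        // Child nodes of <root>: just <e>, versus whitespace + <e> + whitespace.
        Console.WriteLine(plain.DocumentElement.ChildNodes.Count);
        Console.WriteLine(preserving.DocumentElement.ChildNodes.Count);

        // The new-line inside the text content is present in both documents.
        Console.WriteLine(plain.DocumentElement.InnerText == "Hello\nThere");
    }
}
```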


I had the same problem: preserving carriage returns when writing to and reading from an XML file using ASP.NET.

My solution was to turn the escaped line-break tag back into a real HTML one after the HTML has been generated. I added this

        strHtml = strHtml.Replace("&lt;br/&gt;", "<br/>");

at the end of the method, before closing the stream reader.
