XML Encoding

From http://www.w3schools.com (Copyright Refsnes Data)


XML documents can contain non-ASCII characters, such as Norwegian æ ø å or French ê è é.

To avoid errors, specify the XML encoding, or save XML files as Unicode.
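One way to keep the declaration and the actual bytes in step is to let a library write both. The sketch below is a minimal illustration using Python's standard `xml.etree.ElementTree` (Python 3.8 or later for the `xml_declaration` argument). The helper names `build_note` and `serialize` are hypothetical and not part of the original tutorial.

```python
import xml.etree.ElementTree as ET

TEXT = "Norwegian: æøå. French: êèé"


def build_note() -> ET.Element:
    """Build the small <note> document used throughout this page."""
    note = ET.Element("note")
    ET.SubElement(note, "from").text = "Jani"
    ET.SubElement(note, "to").text = "Tove"
    ET.SubElement(note, "message").text = TEXT
    return note


def serialize(root: ET.Element, encoding: str) -> bytes:
    # tostring() encodes the text and writes the same encoding name into
    # the XML declaration, so the two cannot drift apart.
    return ET.tostring(root, encoding=encoding, xml_declaration=True)


if __name__ == "__main__":
    for enc in ("UTF-8", "ISO-8859-1", "UTF-16"):
        data = serialize(build_note(), enc)
        # Parse the bytes back and confirm the message survived intact.
        roundtrip = ET.fromstring(data).findtext("message")
        print(enc, len(data), roundtrip == TEXT)
```

Writing the returned bytes to disk in binary mode (`open(path, "wb")`) avoids a second, accidental re-encoding by the file layer.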


XML Encoding Errors

If you load an XML document, you can get two different errors indicating encoding problems:

An invalid character was found in text content.

You get this error if your XML contains non-ASCII characters and the file was saved as single-byte ANSI (or ASCII) with no encoding specified.

Single byte XML file with encoding attribute.

Same single byte XML file with no encoding attribute.
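The two files described above can be reproduced in a few lines. This is a sketch using Python's bundled parser, which is not the parser that produced the message quoted above, so it words the error differently. It shows the same pattern: the single-byte file fails without a declaration and parses with one. The helper name `parses` is hypothetical.

```python
import xml.etree.ElementTree as ET

BODY = "<note><message>Norwegian: æøå</message></note>"


def parses(data: bytes) -> bool:
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Same single-byte (ISO-8859-1) bytes, once without and once with a declaration.
without_decl = ('<?xml version="1.0"?>' + BODY).encode("iso-8859-1")
with_decl = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY).encode("iso-8859-1")

if __name__ == "__main__":
    print(parses(without_decl))  # False: the parser assumes UTF-8 and rejects the bytes
    print(parses(with_decl))     # True: the declaration tells the parser how to decode
```

Without a declaration or a byte order mark, an XML parser assumes UTF-8, and the single bytes for æ ø å are not valid UTF-8 sequences.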

Switch from current encoding to specified encoding not supported.

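You get this error if your XML file was saved as double-byte Unicode (UTF-16) but the declaration specifies an encoding built on single-byte units (Windows-1252, ISO-8859-1, or UTF-8). Strictly speaking, UTF-8 is a variable-length encoding, but it conflicts with UTF-16 in the same way.

You also get this error in the reverse case: the XML file was saved as single-byte ANSI (or ASCII), but the declaration specifies a double-byte encoding (UTF-16).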

Double byte XML file without encoding.

Same double byte XML file with single byte encoding.
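Both directions of this mismatch can be checked with a short script. The sketch below again uses Python's bundled parser, so the error text differs from the message quoted above. The helper name `parse_error` is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_error(data: bytes):
    """Return the parser's error message, or None if the document parses."""
    try:
        ET.fromstring(data)
        return None
    except ET.ParseError as exc:
        return str(exc)


DOC_UTF8 = '<?xml version="1.0" encoding="UTF-8"?><note>hi</note>'
DOC_UTF16 = '<?xml version="1.0" encoding="UTF-16"?><note>hi</note>'

# Saved as UTF-16 (with byte order mark) but declared as UTF-8.
utf16_declared_utf8 = DOC_UTF8.encode("utf-16")
# Saved as single-byte ASCII but declared as UTF-16.
ascii_declared_utf16 = DOC_UTF16.encode("ascii")
# Saved as UTF-16 and declared as UTF-16: consistent.
utf16_declared_utf16 = DOC_UTF16.encode("utf-16")

if __name__ == "__main__":
    for name, data in [
        ("UTF-16 bytes, UTF-8 declared", utf16_declared_utf8),
        ("ASCII bytes, UTF-16 declared", ascii_declared_utf16),
        ("UTF-16 bytes, UTF-16 declared", utf16_declared_utf16),
    ]:
        print(name, "->", parse_error(data) or "parsed")
```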


Windows Notepad

Older versions of Windows Notepad save files as single-byte ANSI by default (recent versions default to UTF-8).

If you select "Save as...", you can specify double-byte Unicode (UTF-16).

Save the XML file below as Unicode (note that the document does not contain any encoding attribute):

<?xml version="1.0"?>
<note>
  <from>Jani</from>
  <to>Tove</to>
  <message>Norwegian: æøå. French: êèé</message>
</note>

The file above, note_encode_none_u.xml, will NOT generate an error. But if you specify a single-byte encoding, it will.

The following encoding declaration will give an error message:

<?xml version="1.0" encoding="windows-1252"?>

The following encoding declaration will give an error message:

<?xml version="1.0" encoding="ISO-8859-1"?>

The following encoding declaration will give an error message:

<?xml version="1.0" encoding="UTF-8"?>

The following encoding declaration will NOT give an error:

<?xml version="1.0" encoding="UTF-16"?>
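The experiment above can be repeated without Notepad. The sketch below assumes that encoding the note with Python's `"utf-16"` codec, which writes a byte order mark, is a fair stand-in for a file saved as "Unicode". It tries each declaration in turn against the same UTF-16 bytes. The helper name `try_declaration` is hypothetical.

```python
import xml.etree.ElementTree as ET

BODY = (
    "<note><from>Jani</from><to>Tove</to>"
    "<message>Norwegian: æøå. French: êèé</message></note>"
)


def try_declaration(declared):
    """Save the note as UTF-16 with the given declared encoding and parse it."""
    if declared is None:
        decl = '<?xml version="1.0"?>'
    else:
        decl = f'<?xml version="1.0" encoding="{declared}"?>'
    data = (decl + BODY).encode("utf-16")  # byte order mark + UTF-16 text
    try:
        ET.fromstring(data)
        return "ok"
    except ET.ParseError:
        return "error"


# None stands for the file with no encoding attribute at all.
DECLARATIONS = (None, "windows-1252", "ISO-8859-1", "UTF-8", "UTF-16")
results = {d: try_declaration(d) for d in DECLARATIONS}

if __name__ == "__main__":
    for declared, outcome in results.items():
        print(declared, "->", outcome)
```

Only the undeclared file and the one declared as UTF-16 are expected to parse, matching the outcomes listed above.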



Conclusion

  • Always use the encoding attribute
  • Use an editor that supports encoding
  • Make sure you know what encoding the editor uses
  • Use the same encoding in your encoding attribute
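To find out what encoding an editor actually used, one option is to inspect the first bytes of the saved file. The sketch below is a rough check, not a full detector. It recognises only UTF-8 and UTF-16 byte order marks, ignores UTF-32, and cannot tell single-byte encodings apart. The names `sniff` and `declaration_matches_bom` are hypothetical.

```python
import codecs
import re

# (byte order mark, encoding family, codec used to decode the rest of the file)
_BOMS = (
    (codecs.BOM_UTF8, "utf-8", "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16", "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16", "utf-16-be"),
)
# Matches the encoding name inside a leading <?xml ... ?> declaration.
_DECL = re.compile(
    r'<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff(data: bytes):
    """Return (family implied by the byte order mark, declared encoding).

    Either element is None when it cannot be determined.
    """
    family, codec, body = None, "latin-1", data
    for bom, fam, cod in _BOMS:
        if data.startswith(bom):
            family, codec, body = fam, cod, data[len(bom):]
            break
    # The declaration is ASCII-only, so a lossy decode of the first bytes is enough.
    head = body[:200].decode(codec, errors="replace")
    match = _DECL.match(head)
    declared = match.group(1).lower() if match else None
    return family, declared


def declaration_matches_bom(data: bytes) -> bool:
    """False only when a byte order mark and the declaration clearly disagree."""
    family, declared = sniff(data)
    if family is None or declared is None:
        return True  # nothing to compare against
    return declared == family
```

For example, `declaration_matches_bom(open("note.xml", "rb").read())` (with `note.xml` as a placeholder file name) would flag a file saved as Unicode but declared as UTF-8.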
