violations of well-formedness constraints (when detectable);
declaration of an encoding which the processor cannot process;
invalid references to entities.
I also find the following 20 kinds of non-fatal errors:
violations of validity constraints;
unescaped & or < in character data;
unescaped > in "]]>" that is not the end of a CDATA section;
"--" within comment;
DOCTYPE declaration that is not before the first element;
DTD that contains text other than markup declarations;
"standalone='no'" in a non-standalone document;
xml:space declared as other than type "(default|preserve)";
user-defined language id not beginning with "x-";
2-letter language id not from ISO 639;
text declaration provided by reference to a parsed entity;
text declaration not at the beginning of an entity;
UTF-16 encoded entity doesn't begin with BOM;
parsed entity not encoded in UTF-8 or UTF-16 lacks encoding declaration;
encoding declaration doesn't agree with actual encoding;
built-in entities declared incorrectly;
system identifier contains fragment identifier (optional error);
non-deterministic content model;
non-supported XML version (optional error);
version number is "1.0" but document conforms to another version.
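
To make a few of the items above concrete, here is a minimal sketch of how three of them ("--" within comment, "]]>" outside a CDATA section, and a UTF-16 entity without a BOM) might be detected on raw input. The function names are hypothetical and belong to no XML library, and the regular expressions are deliberately naive (for instance, they do not distinguish comments from CDATA content), so treat this as an illustration of the conditions rather than as a conforming processor.

```python
import codecs
import re

# Hypothetical helpers for illustration only; not part of any XML library.

COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
CDATA_RE = re.compile(r"<!\[CDATA\[.*?\]\]>", re.DOTALL)


def comments_with_double_hyphen(text):
    """Return the bodies of comments that contain '--'."""
    return [body for body in COMMENT_RE.findall(text) if "--" in body]


def stray_cdata_close(text):
    """Return True if ']]>' appears outside any CDATA section.

    Naive: CDATA sections are removed first, then whatever remains
    is searched for the closing delimiter.
    """
    return "]]>" in CDATA_RE.sub("", text)


def utf16_missing_bom(data, declared_encoding):
    """Return True if an entity declared as UTF-16 lacks a byte order mark."""
    if declared_encoding.lower() != "utf-16":
        return False
    return not data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


if __name__ == "__main__":
    doc = "<a><!-- ok --><!-- bad -- here --><![CDATA[x]]>y]]></a>"
    print(comments_with_double_hyphen(doc))
    print(stray_cdata_close(doc))
    print(utf16_missing_bom("<a/>".encode("utf-16-le"), "UTF-16"))
```

Each check returns a plain value rather than raising, which matches the idea that these conditions are reportable but need not stop processing.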
This list was compiled by looking at occurrences of "error" and "must".
Not all "must"s appear here: "must"s that constrain the processor
rather than the document are omitted, as are definitional "must"s
like "If an element is empty, it must be represented either by a
start-tag immediately followed by an end-tag or by an empty-element
tag", where "must" really means "is".
-- 
John Cowan      http://www.ccil.org/~cowan      cowan@ccil.org
You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)