Re: Mixed Content Models

Lars Marius Garshol (larsga@ifi.uio.no)
21 Sep 1998 21:06:35 +0200


* Jerome McDonough
|
| I've inherited a DTD for development that was originally intended to
| be an SGML DTD, and has been converted to XML. Contained within it
| is the following:
|
| <!ELEMENT qstn (#PCDATA | (preQTxt?, qstnLit?, postQTxt?, forward?,
| backward?, ivuInstr*))*
|
| Is this a legitimate content model under XML section 3.2.2?

No, it is not. XML mixed content models must be of the form

<!ELEMENT qstn (#PCDATA | child1 | child2 | child3 ...)*>
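
As a rough illustration of that rule, here is a minimal sketch in Python
that tests a content model string against the Mixed production from
section 3.2.2. The names NAME, MIXED and is_valid_mixed are made up for
this example, and the name pattern is simplified to ASCII (real XML names
allow more characters), so treat it as a sanity check rather than a
conforming parser.

```python
import re

# Simplified, ASCII-only approximation of an XML Name (an assumption made
# for this sketch; the real Name production allows many more characters).
NAME = r"[A-Za-z_:][A-Za-z0-9._:-]*"

# Production [51] Mixed, XML 1.0 section 3.2.2:
#   '(' S? '#PCDATA' (S? '|' S? Name)* S? ')*'  |  '(' S? '#PCDATA' S? ')'
MIXED = re.compile(
    rf"\(\s*#PCDATA(?:\s*\|\s*{NAME})*\s*\)\*"
    rf"|\(\s*#PCDATA\s*\)"
)


def is_valid_mixed(model: str) -> bool:
    """Return True if `model` matches the (simplified) Mixed production."""
    return MIXED.fullmatch(model.strip()) is not None


# The content model from the question: a nested group after #PCDATA.
original = ("(#PCDATA | (preQTxt?, qstnLit?, postQTxt?, forward?, "
            "backward?, ivuInstr*))*")

# One possible flattening: only plain element names after #PCDATA.
flattened = ("(#PCDATA | preQTxt | qstnLit | postQTxt | forward | "
             "backward | ivuInstr)*")

print(is_valid_mixed(original))
print(is_valid_mixed(flattened))
```

Note that the flattened form gives up the order and occurrence
constraints of the original group; mixed content in XML cannot express
them.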

| Msxml doesn't have a problem with it,

MSXML has not been updated to the latest specification.

| and nsgmls using the -wxml flag also happily parses the DTD.

Hmmm. This deviation is not documented in the SP documentation.

| IBM's xml4j, however, complains:
| "Codebook.dtd: 1256, 33: This content model is not matched with the
| mixed model '(#PCDATA|FOO|BAR|. . .|BAZ)*': '(#PCDATA|(preQTxt?, qstnLit?,
| postQTxt?,forward?,backward?,ivuInstr*))*".

This is correct behaviour. (Note that it also gives an example of a
correct mixed content model.)

| I suppose this boils down to, should the parser ignore what's within
| a content group when evaluating whether someone is trying to
| constrain the order or number of occurrences of 'child elements.'
| If the only 'child elements' to be considered in the above case are:
|
| A. #PCDATA, and
| B. (preQTxt?, qstnLit?, postQTxt?, forward?, backward?, ivuInstr*)
|
| then the above content model simplifies to (A | B)* and doesn't appear
| to conflict with section 3.2.2.

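Well, it does, you see, because it conflicts with the grammar itself:
the production for mixed content allows only element names after
#PCDATA, not nested groups. So this is actually a well-formedness error.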

| But if 'child elements' means any element, even those within a
| group, then my content model is probably bogus. Can someone tell me
| which is the correct interpretation?

Your content model is bogus. :)

--Lars M.