All bytes that are not markup are data, and are passed to the application.
Yes, this will be surprising to people who are used to HTML. Too bad -
HTML's behavior is unacceptable for many classes of applications. It
would also be surprising to those who understand the ISO 8879 (SGML) rules, but
experience shows that this group includes only about a dozen people,
and they disagree. The rule given above has the virtue that it is
short, simple, and easily understood by everyone. We spent a lot
of time on this, and it's the only sane way to go. -Tim
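
A minimal sketch of what the rule means in practice, using Python's
standard xml.etree.ElementTree parser (my choice for illustration, not
something the note above prescribes). The document string and variable
names are made up. The point is that the line breaks and indentation
between tags are not markup, so the parser hands them to the application
as data instead of discarding them the way an HTML user might expect.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the newlines and indentation are not markup.
doc = "<list>\n  <item>one</item>\n  <item>two</item>\n</list>"
root = ET.fromstring(doc)

# White space before the first child is reported as the parent's text.
leading = root.text

# White space after each child is reported as that child's tail.
tails = [item.tail for item in root]

# The element content itself is passed through unchanged as well.
values = [item.text for item in root]

print(repr(leading))  # '\n  '
print(tails)          # ['\n  ', '\n']
print(values)         # ['one', 'two']
```

An application that does not care about this white space has to strip it
itself; the parser does not guess on its behalf.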