please consider whether

Steven R. Newcomb (srn@techno.com)
Sun, 28 Sep 1997 22:33:01 -0400


[Patrick Gannon:]

> Since this topic has spilled over from the original meeting posting
> and generated significant interest, I will request a listserv be
> established for xml-catalog. This will allow for application
> oriented discussions of XML that are not related to development
> (XML-DEV) or EDI (XML-EDI) issues that have their own listserv.

Patrick -- Here is a note to post on the listserv. -- Steve

**********************************************************************

This note asks those in the online product catalog business to
consider whether they need XML to support SGML Architectures --
multiple architectural inheritance. (Others may also find it
interesting.)

The designers of XML want to know why multiple architectural
inheritance is a feature that should remain unsupported, at least
temporarily.

If you want to use and benefit from the "SGML Architectures" notion
outlined in my earlier note (attached below), I believe you should now
consider (while you still have an option in the matter) whether you
want to be able to use XML for your company-internal "information
source code" for all the information that is the essence of your
company's value. An ISO standard alternative, SGML/HyTime, is also
available for that purpose.

On the one hand, SGML/HyTime is one helluva strong set of paradigms,
of which XML and all the things currently present in or planned for
XML (linking, addressing, metadata) are a proper subset. Together,
these paradigms put the information manager and owner in maximum
control of the cost of creating and maintaining information about
information.

On the other hand, XML will have a wider audience. XML data will flow
across the internet to an awful lot of users (or so we think, anyway)
who won't have full SGML/HyTime capabilities in their systems any time
soon.

If, because your internal databases are limited in functionality to
the representational power of XML, your internal applications cannot
deliver the cost-cutting power of SGML/HyTime for creating and
maintaining massive amounts of n-dimensional (and n-dimensionally
interrelated) information, maybe that's ok because the potential for
higher code maintenance costs is worth the convenience of being able
to dump copies of sections of your metadata source code directly out
to the internet. (Somehow the latter doesn't seem to me a very good
business idea, but that's for you to decide.)

You might be able to avoid having to make this decision early by
letting the w3c-xml-sig group know that your business applications
expect to benefit from multiple architectural inheritance a la SGML
Architectures, so you'd like to have XML support SGML Architectures
sooner, rather than later.

I'm not particular about whatever reason you may have for expressing
to the w3c-xml-sig group your interest (if any) in SGML Architectures;
I just think the online product catalog industry should consider doing
so, and very soon indeed.

I've already made clear my own reasons for bringing this issue up in
my earlier note. For your convenience, I'm attaching it below (sans
some stuff I shouldn't have put in in the first place because it was
from an unpublished W3C discussion about XML).

-Steve

--
             Steven R. Newcomb   President
         voice +1 716 271 0796   TechnoTeacher, Inc.
           fax +1 716 271 0129   (courier: 23-2 Clover Park,
      Internet: srn@techno.com    Rochester NY 14618)
           FTP: ftp.techno.com   P.O. Box 23795
    WWW: http://www.techno.com   Rochester, NY 14692-3795 USA

********************************************************************************

*** Not as originally posted. Unpublished W3C material has been deleted. ***

Date: Thu, 25 Sep 1997 16:08:44 -0400
Message-Id: <199709252008.QAA01199@bruno.techno.com>
From: "Steven R. Newcomb" <srn@techno.com>
To: Jon.Bosak@eng.Sun.COM
CC: ark@DB.Stanford.EDU, gannon@commerce.net, brucek@agentsoft.com,
    btait@mercantec.com, caallen@webmethods.com,
    claire_celeste_carnes@ccm.jf.intel.com, dmarquis@kinetoscope.com,
    f.deschamps@bull.com, harvey@eccnet.eccnet.com, jmt@commerce.net,
    Jon.Bosak@eng.Sun.COM, jonathan@poet.com, jonlewis@cngroup.com,
    marthao@icat.com, Michael.Leventhal@grif.fr, paul@arbortext.com,
    pjordan@microstar.com, ptrevithick@bitstream.com, rcw@commerce.net,
    smith@adobe.com, tbadger@kodak.com, trung@ondisplay.com,
    weld@cs.washington.edu, xml-dev@ic.ac.uk, andrewl@microsoft.com,
    higginsc@lanepowell.com
In-reply-to: <199709251550.IAA13057@boethius.eng.sun.com>
    (Jon.Bosak@eng.Sun.COM)
Subject: Re: XML iMarket Project Planning Meeting

[Jon Bosak:]

> What I as a consumer want to be able to do is quite simple. I want to
> be able to say, "Hey, I need a new jacket," sit down at my computer,
> call up my find-a-product robot, enter my jacket parameters, and then
> come back a while later to find all the jackets that fit those
> parameters offered by all the vendors whose products I'm interested in
> considering. If the catalog scheme isn't standardized enough to
> support this, then I as a consumer am not interested in using it. If
> one of the vendors differentiates itself by adopting a scheme of data
> representation that doesn't allow this kind of transparent direct
> comparison, then it differentiates itself right out of the class of
> vendors I'm interested in, because if all it's giving me is the
> ability to cruise its catalog in isolation, I can get the same
> functionality from the printed version; it no longer participates in a
> way that allows the net to add value to me as a consumer.
>
> I'm not denying that vendors will want to differentiate their
> offerings, but if they can't do it in a way that supports detailed
> direct comparisons based on the differentia that I am interested in
> *as a consumer* then they are simply not in the game at all.

There is a very serious problem here that bears strikingly on an ongoing discussion in XML-land: the discussion of so-called "namespaces". The idea that there will be consortia of vendors, or any other sort of authority who will determine some list of names of characteristics of each sort of product, so that characteristics can be directly and automatically compared, is dangerous to innovation, competition, and commerce, and it is totally unnecessary, too. It will open the door for existing businesses to use such architectures as weapons against upstarts in niche markets and in unusual or new market combinations. Moreover, the use of information architectures as weapons will always seem like a perfectly reasonable business practice, so it will be nobody's fault when new concepts fail to be accepted in the marketplace because the internet failed to live up to its promise of helping people find what they are looking for and make informed purchasing decisions. The macroeconomy will be damaged.

*** Mr. (or Ms.) X *** (whom I do not know, but would like to) has laid out a list of requirements for the implementation of namespaces which, if used as guidance in the development of XML's namespace features, will create a need for authorities who give "standard" names to such things as product characteristics. The concentration of power in such authorities will hinder innovation, by making it difficult to compare products regarded as "out of category" for some authority's set of defined names.

*** [To say that there is no industrial requirement for XML to support multiple architectural inheritance is to place the design of XML in conflict] *** with the evolutionary process of defining and marketing new products. How will the catalog of everything that is for sale handle a case where the same product characteristic, or even the same entire product, arises from multiple industries simultaneously, and each of those industries already uses its own authoritative schema? Will the contents of documents have to be duplicated and translated so as to conform with multiple schemas, so that different comparisons can be made? If so, that will cause much of the value of making the comparisons in the first place to be lost; features regarded by authorities as "out of category" will simply disappear. Imagine a single device that is a fax machine, a telephone, a copier, a computer, and a stereo sound system. Should it appear in a list of telephones? Maybe. Should the output wattage of its amplifier be listable in a comparison with the output wattage of other telephones? Maybe. Should the people who figure out what are the interesting characteristics of telephones anticipate that output wattage may be an important characteristic of telephones? It's completely unrealistic to expect those people to anticipate that. And, yet, it's an interesting and relevant statistic and it may be important to some consumers.

The ugly truth is that we can't predict whether information that is now thought to be irrelevant to other information (or, maybe we don't even know about the existence of the other information yet) will turn out to be semantically identical or semantically mappable. In my own mind, anyway, the real justification for the existence of businesses that provide "yellow pages on steroids" in support of internet commerce is to provide the added value of mapping semantics to each other in such a way that they can be directly compared, just as Jon says. That mapping can be expressed in some proprietary fashion, or it can be done using SGML documents that inherit from multiple SGML architectures, or, if XML supports it, it can be done with XML documents that inherit from multiple XML architectures, with no limit on the number of XML architectures that can be inherited, and no limits on the number of architectures that can usefully be fielded by old and new industries. *** [Without multiple architectural inheritance, XML documents that represent such semantic mappings will be more costly to create and maintain. (I guess you'd have to do it all with hyperlinks. Anything can be done with hyperlinks, but that doesn't mean that everything *should* be done with hyperlinks. In general, hyperlinks are best regarded by information managers as a last resort because they cost more to maintain and their structure is arbitrary and external. It's better if the information, in effect, maps itself. Inheritable SGML architectures allow information to map itself in complex ways. Why shouldn't it be possible to accomplish the same end in XML, without requiring the use of hyperlinks?)

So, I continue to harp on the importance of allowing a single element to inherit multiple semantics (and/or the _same_ semantic differently named or named within different namespaces). *** [Other opinions notwithstanding,] *** in my own mind, anyway, this really *is* a requirement for cataloging companies to extract maximum value from their listings at minimum information management cost in a dynamic, non-authoritarian market environment. It would allow internet catalog providers to map each new DTD into their existing DTDs simply by tweaking their existing DTDs. For example, in the DTD for their catalog of telephone products, when the output wattage issue first arises (i.e., when a telephone appears on the market that lists an output wattage), a declaration is added that allows the characteristics listed in the DTD for the manufacturer's product description document to be inherited. In the same declaration, the features of the product, such as its "colour", can be mapped to the equivalent features already in the DTD (such as "color"). The new feature, "outputWattage", can be made to appear with a default value of "not applicable", so now all the existing telephone product listings have this feature, and they can all respond meaningfully (if uninterestingly) to queries about it. No need to create and maintain (!) any hyperlinks. No need to write or maintain any extra documents. One change in one place updates all telephone products listed in the catalog, regardless of how many there are. The amount of information stored hardly increases at all, but the value of the information increases quite a lot. Essentially the same change can be applied to the DTDs for stereo systems (now they can have a redial feature, yes or no), the DTD for copiers, etc. Cheap and very powerful, no? The catalog provider gets to add a terrific amount of value at very little cost. New products can be found by consumers even if they didn't know the hybrid category existed. ("I want a very loud telephone. Hmmm.") New products for untried niches can be usefully listed in multiple catalogs. Innovation is not penalized for being unanticipated by the authorities who created DTDs for product listings in various categories, or by the failure to recognize a viable category. Indeed, there is no need for such authorities at all. There is only a need for catalogers who can read and understand incoming DTDs and perform these cheap semantic mapping tricks.
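
To make the telephone example concrete, here is a minimal sketch in Python rather than SGML. It is only an illustration of the idea, not an architectural-forms engine: the class CatalogDTD, its "renames" and "defaults" fields, the function louder_than, and the sample listings are all hypothetical names invented here. It shows one change to the telephone catalog's definition mapping "colour" onto "color" and giving every existing listing an "outputWattage" of "not applicable".

```python
from dataclasses import dataclass, field

NOT_APPLICABLE = "not applicable"


@dataclass
class CatalogDTD:
    """Hypothetical stand-in for one catalog's DTD (not real SGML)."""
    name: str
    # Feature name -> default value reported when a listing is silent.
    defaults: dict = field(default_factory=dict)
    # Manufacturer's name for a feature -> this catalog's name for it.
    renames: dict = field(default_factory=dict)

    def view(self, listing: dict) -> dict:
        """Return a product listing as seen through this catalog's DTD."""
        seen = dict(self.defaults)
        for vendor_name, value in listing.items():
            feature = self.renames.get(vendor_name, vendor_name)
            if feature in self.defaults:  # ignore features this catalog lacks
                seen[feature] = value
        return seen


def louder_than(dtd: CatalogDTD, listings: dict, watts: float) -> list:
    """Names of listings whose output wattage exceeds `watts`."""
    hits = []
    for product, listing in listings.items():
        wattage = dtd.view(listing).get("outputWattage", NOT_APPLICABLE)
        if wattage != NOT_APPLICABLE and float(wattage) > watts:
            hits.append(product)
    return hits


telephones = CatalogDTD("telephones", defaults={"color": "unspecified", "redial": "no"})
stereos = CatalogDTD("stereos", defaults={"outputWattage": NOT_APPLICABLE})

# Listings are stored once, in whatever vocabulary the manufacturer used.
listings = {
    "plain phone": {"color": "black", "redial": "yes"},
    "loud hybrid": {"colour": "grey", "redial": "yes", "outputWattage": "40"},
}

# The "one change in one place" for the telephone catalog.
telephones.renames["colour"] = "color"
telephones.defaults["outputWattage"] = NOT_APPLICABLE
# Essentially the same change for the stereo catalog.
stereos.defaults["redial"] = "no"

print(telephones.view(listings["plain phone"]))
print(louder_than(telephones, listings, 10))
```

Nothing in the stored listings is touched. The plain phone now answers a wattage query with "not applicable", the hybrid turns up when someone asks the telephone catalog for a loud telephone, and the same hybrid listing can be viewed through the stereo catalog as well.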

You can do all this now with SGML (as of August 1, 1997; see http://www.ornl.gov/sgml/wg8/document/1920.htm). The only question is whether XML will be able to do it. Maybe it doesn't matter; providers of internet shopping directories can always maintain their source information in SGML and simply deliver it in XML form, if they like. (Or in HTML form, for that matter.)

-Steve

