Orcmid's Lair

2008-02-06

OOXML-ODF: The Harmonization Hope Chest

[Update 2008-03-06 (via Brian Jones): The DIN NIA 34 working group has raised its head above the waters.  It is officially NA 043-01-34-01 VT Working Group Translation 29500-26300.  They have published a schedule and the first working-paper draft is available for download.  There is a terrific little graphic about the cases they want to understand.  The copyright notice on this document is what strikes me as the fatal flaw of ISO, ISO/IEC, and their business model.  Why would I voluntarily lock up the fruits of my labors in such a prison?  Would you?  I'm sorry.  This sets off my elitism-paternalism-bullshit detector.  I'm sure it is simply thoughtless, but it's a level of ritual that does not invite my cooperation.  I'll calm down later.  Now I'm venting.]

[Update 2008-03-07T20:13Z: I have calmed down and I am now embarrassed by my intemperate outburst.  This is uncalled for.  It demeans the worthy efforts of the DIN working group.  I apologize.  I will also apologize directly to the working group contact, and I will make a separate post on the lesson for me.]

[Update 2008-02-16: Microsoft has delivered on its promise to ECMA TC45, originally reported by Brian Jones on January 16 [3].  The specifications for seven binary formats have been placed under the Open Specification Promise and are freely available for download [14].  The SourceForge project, b2xTranslator, has been established and its initial program of work has been posted [15].  I am hopeful that this work and the conclusion of the DIS 29500 Ballot Resolution meeting will energize the DIN NIA 34 working group on OOXML-ODF translatability [6].]

This post has four points:

  1. Translatability Informs Harmonization
    Difficulty of translation is a rough indicator of the degree to which reconciliation of the formats is difficult.
     
  2. Models Limit Translation/Harmonization
    There is no assurance that either format is universal enough, in the sense that it can faithfully express all documents of the other format. 
       
  3. We Must Know the Differences 
    We need to know the severity of the model disparities and individual-feature differences before concluding how much useful harmonization is possible.
         
  4. The Reality in the Punchbowl
    People do not deal with the file formats when creating office documents.  The stored formats are below the level of office-worker attention and detail.  Most people deal with their office-document software in an ad hoc way, using trial and error to obtain the appearance that is desired.  Interchange among those using different (versions of) office-document software is dominated by these user practices.  Common formats are not enough by themselves to ensure interoperability and faithful interchange.

1. Translatability Informs Harmonization

Brian Jones has reported that there is a great deal of interest in harmonization of ODF and OOXML in some way [1].    Subsequently, Brian reported on the ECMA response to ISO/IEC JTC1 DIS 29500 (OOXML) ballot comments requesting harmonization [2], quoting from the ECMA response:

"Ecma believes that the work of the DIN (NIA-01-34) committee is essential to any harmonization effort. The work ... will enable the industry at large to understand the detailed differences between the formats. Based on this detailed understanding, the ODF and Open XML formats could be extended in the future in order to enable better sharing of information and allow future translations tools to provide even better translation and interoperability between the formats."

The announcement of the DIN initiative indicates that it will include the experience of the SourceForge OpenXML/ODF Translator Add-In project and other efforts in its findings [5, 6, 7].

I agree: this is the best way we know to learn where the stumbling points are and what the prospects are for reconciling the document models.  I also appreciate that the ECMA response recognizes that current translation capabilities could be better with improved understanding of the formats, and that it makes no claim to perfect translatability.

2. Models Limit Translation/Harmonization

Cautious statements about the prospects for translation are also welcome.  With regard to present capabilities, Jones's quote of the ECMA response contains this earlier passage [2]:

"There are many translation tools already in existence that enable interoperability between different formats by providing useful translation capabilities between ODF, Open XML and UOF."

where the emphasis on "useful" is mine.

I maintain that getting from "useful" through "even better" to "assured" is unlikely [10]. 

I think it is reasonably well understood that the architectures of OOXML and ODF documents are extremely different.  The differences strongly affect how the formats are intended to be processed and how they map to tangible renderings of the document content.  These details might be found to have common conceptual abstractions within which translation is a reasonably direct rewriting from one digital format to the other.  There is reason for skepticism.  Experience with the harmonization of terminology and metadata provides some important considerations [9]:

"Existing solutions to the metadata harmonization issue are few—systems are either limited to a single specification, or implement ad-hoc solutions that only work in that particular environment. There are many examples of 'mappings' between specifications that provide partial solutions to the problem, but generally fail due to low-fidelity translations and lack of generality (i.e. the mapping only works for limited parts of specifications). Another solution is to create a top-level data model that encompasses the common aspects of all the specifications. This has proven to be feasible in relatively well-constrained domains . . . In the field of general metadata, where there is no such common ground, such an approach is substantially less likely to be successful."

I am skeptical that office-document formats provide a relatively well-constrained domain.  Our optimism must be tempered by caution.

3. We Must Know the Differences

I do agree that we need some concrete, specific undertaking to calibrate what kind of harmonization is possible as a practical matter:

  • What are the features of ODF and OOXML that correspond in ways that can be preserved via useful translation and that could be a practical basis for harmonization?
       
  • What are the places where ODF and OOXML are incommensurate in ways that can be reconciled by additions to one or the other?
       
  • What is left as unreasonably difficult and how does that impact the users of software that relies on one format or the other?

It will take serious effort to delineate these cases, and at the moment attention is on the Translation Working Group under DIN NIA-34 [5, 6].  Update: I am heartened to learn that there has been very serious analysis by public agencies of the Danish government [12].  Further update: I have also unblocked my own thinking about an approach that might provide more concrete visibility on behalf of harmonization [13].
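
One way to picture the three questions in this section is as a partition of feature sets.  What follows is only a minimal sketch in Python.  The feature names are placeholders, not actual ODF or OOXML features, and the set of "reconcilable" features is simply assumed here; finding out what really belongs in it is exactly the work that remains to be done.

```python
# Hypothetical sketch: the feature names are placeholders, not real ODF or
# OOXML features, and the "reconcilable" set is assumed for illustration.

def classify(odf, ooxml, reconcilable):
    """Partition features into the three cases asked about in this section."""
    corresponding = odf & ooxml               # present in both formats
    one_sided = (odf | ooxml) - corresponding  # present in only one format
    extendable = one_sided & reconcilable     # could be reconciled by additions
    difficult = one_sided - reconcilable      # left as unreasonably difficult
    return corresponding, extendable, difficult


odf_features = {"feature_a", "feature_b", "feature_c"}
ooxml_features = {"feature_a", "feature_d", "feature_e"}
assumed_reconcilable = {"feature_b", "feature_d"}

corresponding, extendable, difficult = classify(
    odf_features, ooxml_features, assumed_reconcilable
)
print("corresponding:", sorted(corresponding))
print("extendable:   ", sorted(extendable))
print("difficult:    ", sorted(difficult))
```

The sketch treats a feature as a simple name that is either present or absent.  Real features differ in degree and in the document model behind them, which is why the third bucket is the one to worry about.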

4. The Reality in the Punchbowl

Meanwhile, Sam Hiser offers a different impression of the DIN effort [4]:

"The ODF-to-OOXML harmonization effort being hosted by the German standards group, DIN, is Europe's best effort to resolve our Mexican Standoff between Microsoft, Sun and IBM. Even though harmonization is laughably complex and will not work unless the applications are harmonized too, the best and brightest of Germany are left to hope for success."  [emphasis mine: dh]

Although the mission of the German effort is translation (Übersetzung), not harmonization, I find there is a very important point that is not made often enough: 

  • People write, read, and edit office documents with little, if any, understanding of the particular format that makes them persistent in digital form.  The XML-based open formats do not change that.
      
  • People adapt to the software/device they are using by trial and error.  We train ourselves to obtain the visible results that we want.  Different people obtain superficially similar results by quite different means.
      
  • Even when someone has gone to the trouble to create style sheets, forms, macros, templates and other format-impacting aids, it is very loosey-goosey in practice.  And it still does not require paying attention to the file format.  

I think that is a big deal. It is what people deal with. The formats are below everyone's direct attention and, while standardization and assurance of interchange are essential, they are insufficient, I think, with regard to what works for people who use products that "support" the formats.  If interchange is the game, something else is required.

4.1 Living in a World of Appearances

Apart from the tab, space, enter, and back-tab keys, my personal control of layout is confined to an amazingly-prevalent set of toolbar icons:

[Toolbar screenshots appeared here, one for each of the following:]

  • Windows Live Writer 2008
  • Microsoft Office FrontPage 2003 (SP3)
  • OpenOffice.org 2.3.0 Writer on Windows XP
  • Blogger.com Post Editor (on 2008-02-06)
  • Microsoft Office Word 2003 (SP3)
  • Microsoft Office Word 2007 (SP1)
  • MediaWiki entry editing toolbar

For each of the HTML editors, I often edit the file format for desired effect.  Usually the editors leave my markup alone enough for the result to be achieved.  I am reminded of performing similar contortions with Borland Sprint, a document editor that employed a plaintext markup scheme. 

MediaWiki editing is also a form of direct markup, although my control over the HTML that is served up is indirect and mastered by trial-and-error post-and-revise exercises. 

The important observation is that I use these features for the appearance that is produced and the tools and the symbols encourage me to do exactly that.  I need not make any semantic distinctions while using the tools to achieve the formatting that serves my purposes.   (I am disgruntled that some products replace indent and outdent with a “” toggle.  I don't care that the element involved is tagged <blockquote> and don't get fancy with me.)

4.2 For Users, It is the Computer Programs and What Happens

People ask questions about software products and getting their job done, not about formats.  On the Massachusetts Information Technology Division site, all of the practical questions are about working between Microsoft Office and OpenOffice.org programs, not about how to use the formats [11]:

"There are many features or ways your old office suite worked which are not accommodated in OpenOffice. Likewise, OpenOffice has many features not implemented in your old office suite. All users share different habits and rely upon a different subset of features to perform their daily tasks. While you may miss a few features that are significant to you, there is often a similar way in the new software to accomplish the same task; and users report that the new suite is fast to learn."

That is a great testament to the adaptability and resourcefulness of people, something with which we are unable to endow translators of any kind at this time.  It is also not clear how this interactive dance fits into an interchange regimen, even among users of the same software product (release).

This passage shows how far the concerns in Massachusetts are from matters that might be governed by the respective formats:

"Yet there remain differences in the new [improved StarOffice and OpenOffice.org] tools which fail to offer improvement, and we would like users to feel empowered to report such discoveries to the mailing list."

Then there is this wonderful alibi about conversion failures:

"Even if the number of files likely to convert with trouble is low, anyone encountering file compatibility issues will experience frustration and inconvenience. In most cases still, corrupted files or those with characters slightly out of place can be fixed in the few minutes that it takes to curse one's luck.
Users tend to blame the new office suite for the problems. Please keep in mind that engineers would be able to eliminate 99% of such conversion problems if the technical specifications of legacy file formats were to be published."

We'll see, won't we, especially since the technical specifications of the legacy file formats have been available for some time and will soon be easier to obtain [3].

Update: I think there may be ways to offer products that are specifically geared to only using harmonious features of open-standard document formats [13].  This would not require users to know or care about the formats and the ways that their documents comply, and the usual ad hoc behavior would work (so long as one does not have frustrating expectations of feature parity with all products).


[1] Brian Jones: Harmonization: Finding the Differences.  Brian Jones: Open XML Formats (web log), msdn.com, 2008-01-31.
This post reminds us that the important work on understanding the differences between ODF and OOXML was announced in May 2007.  The expert working group would be conducted under the auspices of DIN, the German national body for promulgation of standards [6].
    
[2] Brian Jones: Is It Jetlag?  Brian Jones: Open XML Formats (web log), msdn.com, 2008-02-01.
 
[3] Brian Jones: Mapping Documents in the Binary Format (.doc; .xls; .ppt) to the Open XML Format.  Brian Jones: Open XML Formats (web log), msdn.com, 2008-01-16.
I'm reluctantly using this sole public (but not exactly official) statement of the Microsoft offer to provide direct access to documentation on the Office Word, Excel, and PowerPoint binary formats.  Microsoft also proposes to sponsor a new SourceForge project that will deliver tools and guidance for translations from the binary formats to OOXML.  According to Brian Jones, visible steps will be taken by February 15, 2008.  Also according to Jones, this commitment is being conveyed in the ECMA proposed dispositions for the DIS 29500 Ballot Resolution Meeting in Geneva, February 25-29.
   So far, the ODF Translator SourceForge project [7] provides the only readily-visible public effort to convert between ODF and OOXML.  The new binary format to OOXML project is not directly related to OOXML-ODF translation/harmonization.  Still, there will be value for those looking into harmonization, just as there is value in examining the binary format to ODF implementation in OpenOffice.org, the OOXML to ODF implementations in the OpenOffice.org Novell edition, and the Sun ODF Plugin for Microsoft Office.
  
[4] Sam Hiser: The Disappointing Formats.  PlexNex (web log), 2008-02-01.
  
[5] Fraunhofer Institute for Open Communication Systems:  Übersetzung ODF-OOXML (English page), Document Exchange Format project, eGovernment Business Unit, FOKUS Projects.
This is the on-line presence of the project from the Fraunhofer Institute.  According to the undated material on this page and additional information found from its tabs, the project is still in its formation stage.  There is no evidence of work product or indication of work activity as of February 1, 2008.  The May 2007 press release (English version, PDF format) suggests results will appear by mid-2008.  It appears that the start-up process is moving slowly, especially with regard to enlisting partners in the formation of an expert group under the FOKUS research-and-development service.  The only promised output is a white paper.  I speculate that everyone is awaiting the outcome of the DIS 29500 Ballot Resolution Meeting and subsequent vote updates.
  
[6] DIN: Deutsches Institut für Normung (English home page), accessed 2008-02-01.
If you are searching for a public face of the Translation of Document Formats working group, you can now find it at NA 043-01-34-01 VT Working Group Translation 29500-26300.
   NIA, the short name for DIN standards committee NA 043, is responsible for standards in the area of "Information technology and selected IT applications."  NIA has over 300 projects.  NIA provides the DIN delegation to the Ballot Resolution Meeting on OOXML.  NIA subcommittee NA 043-01-34 AA (NIA-34) is the DIN counterpart for ISO/IEC JTC 1/SC 34.  It has current responsibility for the DIN alignment with 54 ISO/IEC standards, including ISO/IEC 26300 (ODF) and DIS 29500 (OOXML), the latter being one of the 15 active projects.  NIA-34 is the home of the Translation of Document Formats Working Group, but there was no account of that on the DIN NIA web site as of February 1, 2008.  [update 2008-03-06: There is a web page available and the February 20-produced PDF of the first working draft is available for download.]
   The most-complete version of the NIA-34 home page is in German.  The English version simply reports on the DIN delegation to the DIS 29500 Ballot Resolution Meeting.  The Translation Working Group was announced in May, 2007 (the English version translates "Tag" as "date").  Again, I suspect that experts interested in this activity have been preoccupied with Ballot Resolution Meeting review and preparation.  Still, it's been a long time since May 2007.  They may be challenging my personal record for project silence (and my silence usually means very little is happening).
  
[7] SourceForge.net: OpenXML/ODF Translator Add-In for Office.  Project Page, accessed 2008-02-02.
This is the most-visible active project related to translation between ODF and OOXML.  The Fraunhofer Institute and DIN projects rely on this effort [5, 6].  There is documentation, extensive lists of problems and incompatibilities, and ongoing activity.  Although the work is in C# and XSLT, the documentation and test cases are important and valuable for those interested in higher-level understanding of translation issues.  Until or unless the DIN and Fraunhofer activities open up, this is the most-concrete activity available for any kind of reality check.
  
[8] Charles W. Bailey, Jr: Harmonization of Metadata Standards.  Digital Koans (web log), 2008-01-31.
Bailey quotes the introductory statement that summarizes the obstacles to reconciling metadata.
  
[9] Mikael Nilsson (ed.): Harmonization of Metadata Standards, Version 1.0, Deliverable D4.7, PROLEARN, European Commission Sixth Framework Project (IST-507310), 2008-01-21 (via Charles Bailey [8]).  Accessed 2008-02-04 (468kB PDF file).
The context for this report is summarized on this Wiki page.  The PDF File can be downloaded using the link in the Manifest section of the Wiki page.
  
[10] Dennis E. Hamilton: Magical Thinking and the Universal Document Elixir.  Professor von Clueless in the Blunder Dome (web log), 2005-10-17.
It would appear that there is still yearning for this prospect, just as it was claimed technically simple for Microsoft to adopt ODF as a/the native format for Microsoft Office after arranging suitable additions in ODF, if needed.  If we propose to attempt this to some degree of success, we need to make that degree measurable and confirmable. 
   
[11] ITD: OpenOffice.org 2.0 Software.  Open Standards - Frequently Asked Questions, User Resources, Open Document, Open Initiatives.  Official Web Site of the Information Technology Division, Mass.gov.  Accessed 2008-02-06.
  
[12] Rajiv Shah, et al.: Interoperability Testing and Resources?  (discussion thread) OpenDocumentXML.org, accessed 2008-02-07 (via Rob Weir via Gray Knowlton via Doug Mahugh).
There's an RSS feed (no full content though) for this site, which has room for technical matters as well as advocacy and product news.  On the particular thread, there is the only indication I have found that the OpenFormula effort is going to bear fruit in ODF 1.2.  More important to the current discussion, there is identification of some test suite and interoperability work by the Danish IT and Telecom agency.  The English slide set of their report is fascinating.  Although the contributor of the Danish material was also unsuccessful in finding anything from the Fraunhofer Institute and DIN efforts, I am heartened to see this other significant and responsibly-undertaken work.
  
[13] Dennis E. Hamilton: ODF-OOXML: nfoWorks for Harmony?  Professor von Clueless in the Blunder Dome (web log), 2008-02-07 (2008-02-08 touch-ups).
Here are some ideas about being able to implement processors of open-standard document formats that don't use non-harmonized features and are therefore safe-enough with regard to how users employ them.  There are, of course, new open questions that arise and the whole undertaking might be fruitless.
  
[14] Brian Jones: Binary Documentation (.doc, .xls, .ppt) and Translator Project Site are now live.  Brian Jones: Open XML Formats (web log), msdn.com, 2008-02-15.
There are links to the download location for the document formats and some useful comments about maintenance that may be required to those specifications.
  
[15] Stephen McGibbon: OpenXML translator project on SourceForge.  Notes2Self (web log), 2008-02-15.
The home page of the SourceForge project, to be developed by DIaLOGIKa with Microsoft guidance, is linked and the first three milestones are summarized.
  

This is a very difficult article to write.  I thought it would be a quick summary of the situation, with some useful links to the translator projects in NIA-34.  It is very disappointing to find so little public evidence of useful activity under NIA-34 and the Fraunhofer Institute, with the only visible progress still that of the OpenXML/ODF Translator Add-In SourceForge project.

[update 2008-02-08T10:35 -0800: I have finally expressed some pent-up ideas about technical approaches to practical harmonization (and determining if there is useful feasibility) [13].  I've updated this post to cross-reference to that emerging material.  I'm also enjoying an exchange in the comments, though I want to bring any rejoinders of mine back to the thrust of this article, which does not seem to be fully understood (whether or not agreed with).
 update 2008-02-07T12:11 -0800: Thanks to some incoming links to this article, I discovered resources that very much lift my heart with regard to the interoperability efforts of public organizations [12].
update 2008-02-07T10:33 -0800: The statement about semantic-free use of markup for its effect and nothing else was garbled and I reworded it.  Actually, this note explains it better, but I am satisfied enough.  The point is that the application software is not entrusted with the semantics, it is simply expected to do the formatting in a way where whatever the author's intention is happens to be usefully preserved.]

 
Comments:
 
This is one of the best posts I have ever read that describes the gap between human activity and work, and what that has to do with representations of information in computer systems. (I'll tip my own hand here and say that the relationship between the two, while important, is tenuous at best.)

Good piece of exposition, orcmid.
 
 
I think that this posting misses a few critical things:

(1) ODF isn't limited to OpenOffice. If OpenOffice isn't good enough, another ODF application (such as ones from IBM or Corel) should be. If the next version of Office doesn't have a feature or gratuitously increases your training costs via a radical GUI change (as Microsoft has) or arbitrarily decides to drop legacy file format support (as Microsoft has), you're out of luck.

(2) Microsoft has a poor history of displaying and saving files from one version of Office in another. There's no reason to believe this will change, so people are *already* dealing with this. ODF at least provides some hope of escape since the format is set in stone.

(3) the OOXML spec isn't implemented anywhere, even by Microsoft and there's no guarantee that it will implement the official spec and there's ample history that shows that they'll embrace and extend any spec they support. There's very little change with the DOC format situation.

(4) There's zero reason to mass migrate all legacy DOC files. DOC is a de facto standard that's been reverse engineered to death and has several apps that are able to convert this format into a modern format as the document is needed with high accuracy. If an exact visual match is needed, export to PDF is the best option.

(5) People don't care about formats until it bites them where it hurts. Anyone who has a drawer full of 8 track tapes or 5.25 inch floppies or Amiga OS 3.5 inch floppies knows that having the data is pointless unless you can access it properly. Anyone with a MS Word 2.0 document knows how badly even MS Office 2000 mangles it. Word 2.0 isn't that long ago, especially in government where many docs need to be kept for over 20 years.

(6) If your data is transitory and you don't care about storing it, then it doesn't matter what format or application you use. Standardizing transient information reduces flexibility within different departments. The only thing that makes sense to standardize this is to avoid vendor lock-in, but even in this case, standardizing any multi-vendor format is good enough.

In short, I don't see what point you're trying to make.
 
 
OK, I'll bite:
(1) I haven't said that. I have only indicated that at user level it is not the format that people deal with or express much control over (apart from the general choice), it is the software product, and that applies to all of the software products.

(2) I'm not so sure how poor Microsoft's history is, and we should probably calibrate that somehow. However, ODF is *not* set in stone and it is disingenuous to claim so.

(3) I hear this claim a lot. What is the basis for that? Where is it that new documents in Word 2007 when saved are not in OOXML? Documentation please.

(4) I don't think I raised this at all. Where do you see me promoting or even considering mass migration?

(5) You are making my point about what people care about and what is at their level of attention.

(6) I think the summary at the top of the post says all that I intend. I think it is material when one considers interoperating using products of different vendors. I really was not addressing preservation, but I certainly agree that known standards (de facto or public) matter for that.
 
 
Here's my followup:
(1) The point I'm making is that I agree, but that because you're not locked in, you have choice on UI and function, which can't happen with OOXML (which is controlled by one vendor with self interest in vendor lock in). HTML-like formatting is perfectly good for email and it's all that many people need, so why should these users be burdened with something as large and complex as MS Office or OpenOffice? At the opposite end of the spectrum, there are highly technical desktop publishers who need a high degree of control and a lot of the "cute user-friendly" features of MS Office or OpenOffice get in the way. Each needs a different interface and different application, but both can exchange documents if you have standardized the document format.

(2a) Okay. Calibrate that against paper. I have a stack of newspapers that I found between the floors from 1942. Although it's brown, sooty and fragile because it wasn't taken care of, it's still very readable. It paints a very different picture of the time than what's portrayed in the media. Or if you want a software calibration, ASCII text. I have some text documents from my Commodore 64 with my school projects that are still readable. I also have several books with TROFF formatting (created in the 1960s) that still work today on any modern Unix machine. Ditto for LaTeX. Any good document format needs to be at least that good at keeping legacy.

(2b) ODF 1.0 *is* set in stone and so are ODF 1.1, etc. And all are backwards compatible since so many parties involved have a vested interest in not rewriting their tools. That's the whole purpose of standardization.

(3) Here's a reference for Microsoft Office.
 
 
Interesting. OK, let's continue this a bit.

(1) I don't see how OOXML, the format, has anything to do with UI lock-in. I think the people at MindJet would be surprised. I don't want to speculate so far out to whether there is lock-in. I think you and I are reading different things into my comments about what users do, which applies to any UI and how users train themselves to a UI to get the results they want. I suppose in one sense we are agreeing on the phenomenon, but not on what OOXML has to do with it.

(2a) I meant calibration on the extent to which Microsoft has broken its own formats in their successors. I'm with you on paper, and other enduring formats (text, etc.), although some of those have incompatibilities, whether code-page dependencies in text or e-mail-format difficulties.

(2b) Considering that ODF is still incomplete/underspecified and the conformance requirements are next-to-nil, how can you say ODF is set in stone? I just don't follow that and then ...

(3) Your reference didn't come through. I'd like to see it.

(3b) Also, I should have asked this differently. Name one ODF-supporting processor that implements the ODF spec and only the ODF spec, with no content that is un[der]-specified in ODF. That means no use of namespaces not supported in the ODF specification (and no abuse of ones that are).

(7) So let's come back to the prospects for harmonization. What are your views on that: possible? not possible? irrelevant? dangerous? what?
 
 
Not possible unless dangerous. This is not something Microsoft would want, so they won't co-operate, and their sincere co-operation would be absolutely necessary because OOXML is, despite all the pages and pages, seriously underspecified and anyway a "harmonized" format wouldn't fly if Microsoft didn't adopt it.

But there is one condition under which Microsoft *would* want it: If they controlled it. In which case it would just be a way of killing ODF. They'd end up with a "harmonized" format for all the different word processors to use--that only Microsoft Word could use.

Rufus Polson
 
 
Here's the link that got stripped out:
http://surguy.net/articles/ooxml-validation-and-technical-review.xml

 
 
Good heavens, is that all? This is pretty trivial. If it is the only example it is mildly ridiculous.

I agree that it is a bug, and easy to see how it got there. (I bet MSFT didn't have a test case for this until it came up in the comment about the Translator project.)

I figured there was a problem with Microsoft using more than what was in the spec, but this is a bug in the implementation of an initially-uncommon feature in the spec. It is a bug, though.

This is in the section of OOXML on versioning of the format itself where a new attribute introduced by someone making an extension can be tagged to be ignored, preserved, or deleted by an OOXML processor that doesn't understand the extension.

They were not honoring the instruction to preserve them. I wonder if this was fixed in Office 2007 Service Pack 1? I'll see if I can find the test file and check.

I'll also see if the spec. is solid around what should happen if the content of the element that has such an attribute is edited.
 
 
Oops! Not a bug.

OK, I looked into ECMA-376, Part 5, clause 9.1.3, and it is not a bug. The relevant bits are on p.17, lines 5-23 just before clause 9.1.3.1:

"Even in the presence of explicit preservation guidance in a markup specification, any markup editor might choose to discard together all ignored markup without regard to the presence of any PreserveElements or PreserveAttributes attribute. ... [M]arkup consumers shall always accept, but possibly disregard PreserveElements and PreserveAttributes attributes on any element."

It turns out that the PreserveElements and PreserveAttributes stipulations are hints, and it is not mandatory to honor them.

So technically, as much as I don't like it myself, the behavior of Microsoft Office Word 2007 is acceptable either way. There are other nuances, but this is the bottom line of what ECMA-376 says on the topic.
 
 
I have created a blog and site that is specifically about harmonization of office-productivity format usage to an interoperable level. It is nfoWorks: Pursuing Harmony.

I am going to replicate this post there as part of the history of the project. Unfortunately, the comment thread won't travel with them. I must find something creative to do about that.
 
 
It was certainly interesting for me to read the post. Thank you for it. I like such themes and anything that is connected to this matter. BTW, why don't you change design :).
 
 