Blunder Dome Sighting  

Hangout for experimental confirmation and demonstration of software, computing, and networking. The exercises don't always work out. The professor is a bumbler and the laboratory assistant is a skanky dufus.

Why Not .rtfx?

The Suitability of RTF for Open Interchange

I was researching the latest status of the Open Document Standards activity in Massachusetts yesterday.  Much to my amazement, I discovered that RTF 1.7 is now an acceptable document format (scroll down to the very end [1]): 

Guidelines  – Agencies may use the RTF document format for ease of interoperability among different systems. Agencies should be aware that saving in this format usually results in larger file sizes.

Standards and Specifications

Not only is the format accepted, the Commonwealth’s Information Technology Division is now posting its procedures and related materials for download in OpenDocument Format Text (ODT) and RTF (rather than ODT plus PDF, the other still-allowed alternative format). 

The increase in size of RTF over ODF is because RTF files are in pure plain-text, including plain-text encodings of binary data. 

Why, I asked myself, don’t they simply allow Zip to be used for interchange of RTF files, since they are already permitting Zip in the built-in compression of ODF documents?  I suspect it is because zipped RTF is not something that RTF-handling application software is able to treat as a recognized (not to mention default) format.  Too bad, I thought.  10–20x compression is common and, in my experience, the Zip files are less vulnerable to damage in transmission and handling.
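As a rough way to check the compression claim on your own files, here is a minimal Python sketch using only the standard library. The sample data is synthetic and highly repetitive, so the ratio it prints is not representative of real documents; the function name `zip_ratio` and the sample content are illustrative assumptions, not anything from the RTF specifications.

```python
import io
import zipfile

def zip_ratio(rtf_bytes: bytes, name: str = "document.rtf") -> float:
    """Return the original size divided by the size of a Zip archive holding the data."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, rtf_bytes)
    return len(rtf_bytes) / len(buf.getvalue())

# Synthetic stand-in for an RTF file: formatted paragraphs plus a picture whose
# binary data is spelled out as hexadecimal text, as RTF does for embedded binaries.
paragraph = r"\pard\plain\f0\fs24 The quick brown fox jumps over the lazy dog.\par" + "\n"
hex_picture = r"{\pict\pngblip " + "89504e470d0a1a0a" * 2000 + "}\n"
sample = (
    r"{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}" + "\n"
    + paragraph * 500
    + hex_picture
    + "}"
).encode("ascii")

print(f"{len(sample)} bytes, compression ratio about {zip_ratio(sample):.1f}x")
```

To measure a real document, pass the bytes of an actual .rtf file to `zip_ratio` instead of the synthetic sample.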

Availability of RTF Specifications

Microsoft Rich Text Format has been in use for years in conjunction with Microsoft Word [2].  The specifications are available as Microsoft Downloads.  

  • RTF 1.6 was specified in 1999 and the specification is available on-line. 
  • RTF 1.7 is apparently supported by products through Word 2002 (XP) and is available for download.
  • RTF 1.8 includes updates for Word 2003, and
  • the just-issued RTF 1.9 has updates for Word 2007.

My first experience with RTF as an interesting format arose in the back pages of a Microsoft Word Technical Reference book that I had found years ago and have since discarded.  I used that RTF information to produce RTF in plain-text report files from a Borland Paradox 4.5 database used in a 1989 glossary-creation project.  The RTF markup was just part of the non-field report text, so that the generated report would be a pure RTF file saved to disk.  I could then import the material (into a Xerox Workstation document application) with proper formatting for display and submission to a network laser printer (I have participated in this kind of trickery for generating documents from collaborative publishing projects in a number of flavors since). 

Since that time, I haven’t thought much about RTF until I stumbled across the recent announcement of RTF 1.9.  As you can see, there has been consistent availability of the RTF specifications since at least Office 2000 and there was sporadic availability before that.

The Value of .rtfx

Yesterday, I was noodling around making notes for a software beta release when it struck me, for reasons unknown, that the ideal way to eliminate the bulkiness of RTF documents is to incorporate them into Open Packaging Convention (OPC) packages.  An .rtfx format is a perfect format for preservation and lower-cost storage and transmission.

Because the RTF is always extractable, there is no risk of loss of the document, so long as the Zip format and the OPC conventions remain known (and you wouldn’t need to know OPC just to fish out the .rtf part).
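To make the idea concrete, here is a minimal sketch of what such a package might look like, again in Python with only the standard library. No .rtfx specification exists, so the part name `document.rtf`, the relationship type URI under example.org, and the function names are all hypothetical. The content-types and relationships namespaces are the ones OPC defines.

```python
import io
import zipfile

CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="rtf" ContentType="application/rtf"/>
</Types>
"""

# The relationship Type below is a made-up placeholder; a real .rtfx
# definition would have to assign its own URI.
ROOT_RELS = """<?xml version="1.0" encoding="UTF-8"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://example.org/rtfx/relationships/document" Target="document.rtf"/>
</Relationships>
"""

def make_rtfx(rtf_bytes: bytes) -> bytes:
    """Wrap RTF bytes in a minimal OPC-style Zip package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", ROOT_RELS)
        zf.writestr("document.rtf", rtf_bytes)
    return buf.getvalue()

def fish_out_rtf(package: bytes) -> bytes:
    """Recover the RTF with Zip knowledge alone: read the first .rtf entry."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        name = next(n for n in zf.namelist() if n.lower().endswith(".rtf"))
        return zf.read(name)

rtf = rb"{\rtf1\ansi Hello, .rtfx\par}"
assert fish_out_rtf(make_rtfx(rtf)) == rtf
```

Note that `fish_out_rtf` never looks at the content-types or relationships parts, which is the point: any Zip tool can recover the document even if the OPC conventions are forgotten.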

Furthermore, .rtfx is probably among the formats most amenable to introduction as a peer format in software that already supports RTF.  Current versions of Microsoft Word are an ideal case [3].

Even if there is only niche interest in .rtfx (e.g., for retention in document- and records-management systems), that’s all right with me.  My interest in an .rtfx project is because

  • It provides for simple but meaningful application of the Microsoft Open Packaging Conventions directly to a legacy format
  • It illustrates how additional metadata, relational information, and supplemental content (even digital signatures) can be packaged with a non-XML format to provide essential context valuable for interchange, preservation, and specialized use.
  • It allows simple demonstration cases for plug-in and converter development, uncluttered by the heavy work of serious format translation (assuming there is already RTF support that can be relied upon).
  • It allows for simple reference implementations that serve as introductions for those wanting to apply the packaging techniques to more-ambitious undertakings.
  • There is opportunity to discover unexpected applications and opportunities involving OPC and legacy document/information forms.
  • There is a nice fit with where I want to take nfoWare in the introduction of document-processing concepts and toolcraft.  
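As one illustration of the metadata point in the list above, the following Python sketch packages OPC core properties (a Dublin Core title and creator) alongside an RTF part and then reads the title back by following the package relationship. It is a sketch under assumptions: the RTF relationship type under example.org and the part names are hypothetical, while the core-properties namespace, relationship type, and content type are the ones defined for OPC.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

PKG = "http://schemas.openxmlformats.org/package/2006"
CP = PKG + "/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"
CORE_REL = PKG + "/relationships/metadata/core-properties"
RTF_REL = "http://example.org/rtfx/relationships/document"  # hypothetical

CONTENT_TYPES = f"""<?xml version="1.0" encoding="UTF-8"?>
<Types xmlns="{PKG}/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="rtf" ContentType="application/rtf"/>
  <Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>
</Types>
"""

ROOT_RELS = f"""<?xml version="1.0" encoding="UTF-8"?>
<Relationships xmlns="{PKG}/relationships">
  <Relationship Id="rId1" Type="{RTF_REL}" Target="document.rtf"/>
  <Relationship Id="rId2" Type="{CORE_REL}" Target="docProps/core.xml"/>
</Relationships>
"""

def core_properties(title: str, creator: str) -> bytes:
    """Serialize a small core-properties part; ElementTree escapes the text."""
    ET.register_namespace("cp", CP)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{CP}}}coreProperties")
    ET.SubElement(root, f"{{{DC}}}title").text = title
    ET.SubElement(root, f"{{{DC}}}creator").text = creator
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)

def make_rtfx(rtf: bytes, title: str, creator: str) -> bytes:
    """Package the RTF together with its metadata part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", ROOT_RELS)
        zf.writestr("docProps/core.xml", core_properties(title, creator))
        zf.writestr("document.rtf", rtf)
    return buf.getvalue()

def read_title(package: bytes) -> str:
    """Follow the core-properties relationship and return the dc:title text."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        rels = ET.fromstring(zf.read("_rels/.rels"))
        target = next(r.get("Target") for r in rels if r.get("Type") == CORE_REL)
        core = ET.fromstring(zf.read(target))
        return core.findtext(f"{{{DC}}}title")

pkg = make_rtfx(rb"{\rtf1\ansi Glossary\par}", "Glossary", "Example Author")
print(read_title(pkg))
```

The RTF part itself is untouched; the context travels beside it in the package, so software that only understands RTF loses nothing, and software that understands the packaging gains the metadata.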

The value of .rtfx is not as some new world-altering document format.  The definition and implementation of .rtfx is a useful precursor to more interesting developments with other OPC formats, including those of Office Open XML.  It is a way of reducing the difficulty of entry into use of the new formats by providing a light-weight case.

What About the Name?

It is not entirely consistent to speak of .rtfx when the packaged format is not that of an XML encoding.  I don’t know about you, but .rtfz is also off (because it is OPC, not merely Zip) and although .rtfp might be apt, it simply doesn’t roll off the tongue (and has a certain scatological lilt to it).

On the other hand, Microsoft might have some objection, especially if they fear confusion with existing OPC and format usage in ECMA-376 and in the Microsoft Office System 2007.  I guess it might be worthwhile to obtain some simple blessing (starting with requesting that the RTF specifications be brought under the Open Specifications Promise).

Meanwhile, RTFX just seems like the right term, even if it isn’t exactly the same pure notion as the original .---x formats.

So, Who Wants to Play?

I can’t take my focus off of some existing commitments around document-processing systems.  Still, I have been looking for a simple on-ramp that would allow the basics of Open Packaging Conventions to be demonstrated and made adaptable to new areas of document application.  

For me, .rtfx seems like a perfect training-wheels entry point for getting into OPC and the creation of plug-ins and translation fixtures.  In particular, we don’t have to deal with translation issues up front, yet we can demonstrate useful results and establish a base of experience that is valuable for more-elaborate undertakings in handling various document models and their formats.

Let’s talk.

  1. The Commonwealth web site has a complex content-management structure, and the URL deep into the Enterprise Technical Reference Model (ETRM) v. 3.6 might fail. As an alternative, start at the Massachusetts Information Technology Division Open Initiatives Policies page.  Follow the first link, for Enterprise Technical Reference Model … (ETRM v3.6).  Then follow the link to Information Domain (ETRM v3.6).  The quoted passage is at the very end of the page.  The material is also available for direct download in .odt (48k) and in RTF (623k) from this download area.  The full set of Open Initiatives Policies, including the January 2004 Open Standards Policy, is worthy of review for its great pragmatism and focus on benefit to the public as well as attention to economies and efficiencies in operations.
  2. The RTF specifications are provided with sample code and a restrictive license that does not provide for redistribution, possible patent licenses, and so on.  The sample code is designed as part of guidance on how to create converters from external binary formats.  The language has been essentially unchanged as far back as I could find materials on the Microsoft web site.  That includes version 1.3 and bits of 1.5 (the first to provide a 7-bit plain-text encoding of Unicode).  Ideally, these specifications could all be brought under the Open Specification Promise to enable open-source development and community efforts of the kind proposed here.
  3. In essays on the OpenDocument Foundation’s da Vinci project, the parody ACME-376 proof-of-concept is described as producing an XML encoding of RTF format.  Thus it appears that the necessary hooks for plugging in OPC packaging of RTF format are available.  The ODF Translator Project also serves as confirmation of plug-in capability, and its code is available for open-source use.  This, or CodePlex, might even be a good place to undertake .rtfx development.
The challenge, of course, will be to get reasonable integration into the Office UI. Can you make RTFX the default format for new documents in Office? Get an RTFX attachment in Outlook to launch into Word? Post RTFX documents into SharePoint from within Office, digitally sign RTFX documents, etc.?
The whole idea, for me, is to explore all of those questions and see what integration is possible or not and where the seams are.

I also think an .rtfx experiment can provide great guidance for others thinking about integration of peer formats and also peer services (e.g., non-file-system repositories). It just seems like the right-sized laboratory for these questions.

Thanks for adding those great cases. I look forward to suggestions of other ones.


template created 2004-06-17-20:01 -0700 (pdt) by orcmid
$$Author: Orcmid $
$$Date: 10-04-30 22:33 $
$$Revision: 21 $