Tech Brief

OpenDocument Format

The open, XML-encoded file format that competes with Microsoft's Office Open XML.

The OpenDocument Format (ODF) is an open international standard that allows the exchange and retrieval of information in office documents.

ODF uses descriptive XML tags, so that large parts of an ODF document can be understood even without reading the specification.

ODF stores document content in a ZIP archive. Unzipping an ODF file therefore reveals a set of XML streams that can be processed with ordinary XML tools. The ZIP compression keeps the file size small without making access difficult. Most programming and scripting languages, as well as build environments like Apache Ant, provide support for ZIP archives.
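As a minimal sketch of that idea, the Python code below opens an ODF package with the standard zipfile module, lists its XML streams, and parses one of them. The function names (list_xml_streams, read_stream, make_sample_package) are hypothetical, and the sample package built in memory is a deliberately tiny stand-in, not a complete ODF document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_xml_streams(odf_file):
    """Return the names of the XML members inside an ODF package."""
    with zipfile.ZipFile(odf_file) as package:
        return [name for name in package.namelist() if name.endswith(".xml")]


def read_stream(odf_file, member="content.xml"):
    """Parse one XML stream from the package and return its root element."""
    with zipfile.ZipFile(odf_file) as package:
        with package.open(member) as stream:
            return ET.parse(stream).getroot()


def make_sample_package():
    """Build a tiny ODF-like package in memory (illustrative only)."""
    content = (
        '<office:document-content '
        'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
        'xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
        '<office:body><office:text>'
        '<text:p>Hello ODF</text:p>'
        '</office:text></office:body>'
        '</office:document-content>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # The mimetype entry is written first and left uncompressed.
        package.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_STORED,
        )
        package.writestr("content.xml", content)
    return buffer


if __name__ == "__main__":
    package = make_sample_package()
    print(list_xml_streams(package))
    print(read_stream(package).tag)
```

The same two functions also accept a path to a file on disk, for example a document saved by OpenOffice.org.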

Open Standard
ODF was approved by the Organization for the Advancement of Structured Information Standards (OASIS) in 2005 and by the International Organization for Standardization (ISO) in 2006.

The first work on ODF started as early as 1999. StarDivision, a software company in Germany, offered an office suite for DOS and Windows called StarOffice. The StarOffice developers considered vendor-specific, proprietary, binary file formats a dead end, believing that the future instead demanded open, XML-encoded file formats. Sun Microsystems Inc. acquired StarDivision in 1999 and released the StarOffice source code as free software, which became the basis for OpenOffice.org. StarOffice 8 and OpenOffice.org both support ODF.

The ODF Toolkit: OpenDocument Format (ODF files)

Today ODF support is available for all major platforms, including Windows, Linux, Mac OS X, Solaris, OS/2, FreeBSD and Symbian. Multiple vendors offer ODF implementations: Sun Microsystems, IBM Corp., Novell Inc., Red Hat Inc. and Google Inc. Through plug-in technologies, ODF support is also available for Microsoft Office and the Firefox browser. The strong industry support for ODF is evident in the fact that the ODF Alliance already has more than 420 members, including the BBC, Bristol City Council, Corel Corp., EDS Corp., EMC Corp., Google, IBM, MySQL AB, Novell, Oracle Corp., Red Hat, Software AG, Sun Microsystems and the City of Vienna.

Reuse of Standards
A key benefit of ODF for developers is its reuse of existing standards. Instead of reinventing the wheel or using proprietary technologies, ODF integrates and leverages standards like HTML, Dublin Core, SVG, MathML, XForms, XSL-FO, XLink and SMIL. As a consequence, a developer already familiar with any of these standards can apply his or her existing knowledge to ODF. In addition, the conversion of ODF into other standards or the integration of ODF into other applications becomes simpler and standards-based. This reuse of standards also keeps the ODF specification lean and manageable.

ODF also reuses concepts. The definition of a table in a spreadsheet document is almost identical to the definition of a table in a text document. Thus, code written for processing a spreadsheet table can be reused for a text table. This not only reduces complexity and redundancy but also lowers the ODF learning curve.
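A small Python sketch can illustrate this reuse. The hypothetical read_tables function below walks table:table elements and is applied unchanged to a spreadsheet body and to a text body. The two sample documents are hand-written fragments for illustration, and the sketch ignores repeated-cell attributes, header-row groups and nested tables, which real documents may contain.

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def read_tables(root):
    """Return every table as a list of rows, each row a list of cell strings."""
    tables = []
    for table in root.iter("{%s}table" % NS["table"]):
        rows = []
        for row in table.findall("table:table-row", NS):
            cells = []
            for cell in row.findall("table:table-cell", NS):
                paragraphs = cell.findall("text:p", NS)
                cells.append("\n".join("".join(p.itertext()) for p in paragraphs))
            rows.append(cells)
        tables.append(rows)
    return tables


def sample_document(body_kind):
    """Build a sample content tree; body_kind is 'spreadsheet' or 'text'."""
    xml = (
        '<office:document-content'
        ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
        ' xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"'
        ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
        + f'<office:body><office:{body_kind}>'
        + '<table:table table:name="T1"><table:table-row>'
        '<table:table-cell><text:p>a</text:p></table:table-cell>'
        '<table:table-cell><text:p>b</text:p></table:table-cell>'
        '</table:table-row></table:table>'
        + f'</office:{body_kind}></office:body>'
        + '</office:document-content>'
    )
    return ET.fromstring(xml)


if __name__ == "__main__":
    # The same function handles both document types.
    print(read_tables(sample_document("spreadsheet")))
    print(read_tables(sample_document("text")))
```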

For developers integrating ODF into workflows and business processes, for example by using an ODF implementation like OpenOffice.org as a client for back-end services, ODF relies on the XForms technology. The next version, ODF 1.2, supports RDF-based metadata, so that workflow and business process integration also builds on existing standards.

Abundance of Tools
Various development tools for ODF are available or under development. The most basic, but still very powerful, use case is the conversion of ODF content via XSLT. Because Apache Ant has both ZIP and XSLT support, it can be used to convert file formats. OpenOffice.org also provides a very granular and powerful API for the Java technology. Thus, by combining Ant with OpenOffice.org, powerful solutions can be built that include printing and PDF-exporting capabilities.
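The shape of such a conversion can be sketched in a few lines of Python. Note the substitution: Python's standard library has no XSLT processor, so this example walks the XML with ElementTree rather than applying a stylesheet, and it is not the Ant-based approach described above. It maps top-level headings and paragraphs of a text document to simple HTML. The names odt_to_html and make_sample_text_document are hypothetical, and the in-memory sample is a minimal stand-in for a real file.

```python
import html
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odt_to_html(odf_file):
    """Convert top-level headings and paragraphs of a text document to HTML."""
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    body = root.find(f"{{{OFFICE_NS}}}body/{{{OFFICE_NS}}}text")
    if body is None:
        return ""
    lines = []
    for element in body:
        content = html.escape("".join(element.itertext()))
        if element.tag == f"{{{TEXT_NS}}}h":
            # Clamp the outline level to the range HTML headings support.
            level = int(element.get(f"{{{TEXT_NS}}}outline-level", "1"))
            level = min(max(level, 1), 6)
            lines.append(f"<h{level}>{content}</h{level}>")
        elif element.tag == f"{{{TEXT_NS}}}p":
            lines.append(f"<p>{content}</p>")
        # Lists, tables and other elements are skipped in this sketch.
    return "\n".join(lines)


def make_sample_text_document():
    """Build a tiny ODF-like text package in memory (illustrative only)."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
        '<office:body><office:text>'
        '<text:h text:outline-level="2">Title</text:h>'
        '<text:p>Ant &amp; XSLT</text:p>'
        '</office:text></office:body>'
        '</office:document-content>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("content.xml", content)
    return buffer


if __name__ == "__main__":
    print(odt_to_html(make_sample_text_document()))
```

A stylesheet-driven conversion would express the same element-to-element mapping as XSLT templates instead of Python branches.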

In this context it's important to keep in mind that ODF is the default file format in two open source applications: OpenOffice.org and KOffice. Thus, solutions can be developed, tested, deployed and distributed in a very cost-efficient, flexible and platform-independent way.

Another simple way to access ODF content and to modify ODF files is through a Perl module available from CPAN. The Perl module was originally developed for customer projects in order to replace hard-to-manage macro code stored in documents with easier-to-manage centralized solutions. The Perl module allows users to set metadata information or extract paragraphs that contain a specific keyword.
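To make those two operations concrete, here is a rough Python analogue. It is not the CPAN module's API, which is not shown here. All function names are hypothetical. The sketch assumes the title lives in a dc:title element inside office:meta in meta.xml, and that the package already contains an office:meta element.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)  # keep readable prefixes on output


def find_paragraphs(odf_file, keyword):
    """Return the text of every paragraph containing keyword (case-insensitive)."""
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    texts = ("".join(p.itertext()) for p in root.iter(f"{{{NS['text']}}}p"))
    return [text for text in texts if keyword.lower() in text.lower()]


def read_title(odf_file):
    """Return the dc:title stored in meta.xml, or None if it is absent."""
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("meta.xml"))
    return root.findtext("office:meta/dc:title", default=None, namespaces=NS)


def set_title(source, target, title):
    """Copy the package from source to target, replacing the dc:title."""
    with zipfile.ZipFile(source) as src, zipfile.ZipFile(target, "w") as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == "meta.xml":
                root = ET.fromstring(data)
                meta = root.find("office:meta", NS)  # assumed to exist
                element = meta.find("dc:title", NS)
                if element is None:
                    element = ET.SubElement(meta, f"{{{NS['dc']}}}title")
                element.text = title
                data = ET.tostring(root, encoding="utf-8")
            dst.writestr(info, data)  # other members are copied unchanged


def make_sample_document():
    """Build a tiny ODF-like package in memory (illustrative only)."""
    head = f'xmlns:office="{NS["office"]}" xmlns:text="{NS["text"]}" xmlns:dc="{NS["dc"]}"'
    content = (
        f'<office:document-content {head}><office:body><office:text>'
        '<text:p>Invoice 42 is due.</text:p><text:p>Nothing to see here.</text:p>'
        '</office:text></office:body></office:document-content>'
    )
    meta = (
        f'<office:document-meta {head}><office:meta>'
        '<dc:title>Old title</dc:title></office:meta></office:document-meta>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("content.xml", content)
        package.writestr("meta.xml", meta)
    return buffer


if __name__ == "__main__":
    original, updated = make_sample_document(), io.BytesIO()
    set_title(original, updated, "New title")
    print(read_title(updated), find_paragraphs(updated, "invoice"))
```

Because the logic lives in one script rather than in macros embedded in each document, it can be versioned and updated centrally, which is the motivation the paragraph above describes.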

Several open source community projects and vendors are implementing software development kits for ODF. The goal of the open source ODF Toolkit project on OpenOffice.org is to provide a set of tools that simplify the integration of ODF into new and existing applications. First results of the project include a Java API as well as a .NET interface.