Tech Brief

OpenDocument Format

The open, XML-encoded file format that competes with Microsoft's Office Open XML.

The OpenDocument Format (ODF) is an open international standard that allows the exchange and retrieval of information in office documents.

ODF uses descriptive XML tags, so that large parts of an ODF document can be understood even without reading the specification.

ODF stores document content in a ZIP archive. Unzipping an ODF file therefore reveals XML streams that can be processed with ordinary XML tools. The ZIP compression keeps the file size small without making access difficult. Most programming and scripting languages, as well as build environments like Apache Ant, provide support for ZIP archives.
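As a rough illustration of the point above, the following Python sketch builds a tiny ODF-like package in memory and then reads it back using only the standard library. The sample package is deliberately simplified (it omits the manifest a real ODF file carries), and the helper names build_sample_odt, list_streams and read_paragraphs are hypothetical names chosen for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs defined by the ODF specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def build_sample_odt() -> bytes:
    """Build a tiny, simplified ODF-like package in memory (sample data)."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>Hello ODF</text:p>"
        "</office:text></office:body>"
        "</office:document-content>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry is written first and stored uncompressed.
        zf.writestr(zipfile.ZipInfo("mimetype"),
                    "application/vnd.oasis.opendocument.text")
        zf.writestr("content.xml", content,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_streams(data: bytes) -> list[str]:
    """Return the names of the streams stored in the package."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def read_paragraphs(data: bytes) -> list[str]:
    """Return the text of every text:p element found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    package = build_sample_odt()
    print(list_streams(package))
    print(read_paragraphs(package))
```

To try this on a real document, the same two reader functions should work on the bytes of a file saved by an ODF-capable office suite, although real files contain additional streams such as styles.xml and meta.xml.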

Open Standard
ODF was approved by the Organization for the Advancement of Structured Information Standards (OASIS) in 2005 and by the International Organization for Standardization (ISO) in 2006.

The first work on ODF started as early as 1999. StarDivision, a software company in Germany, offered an office suite for DOS and Windows called StarOffice. The StarOffice developers considered vendor-specific, proprietary, binary file formats a dead end, believing that the future instead demanded open, XML-encoded file formats. Sun Microsystems Inc. acquired StarDivision in 1999 and later released the StarOffice source code as free software, which became the basis for OpenOffice.org. StarOffice 8 and OpenOffice.org both support ODF.

The ODF Toolkit: OpenDocument Format (ODF files)

Today ODF support is available for all major platforms, including Windows, Linux, Mac OS X, Solaris, OS/2 and FreeBSD, as well as Symbian. Multiple vendors offer ODF implementations: Sun Microsystems, IBM Corp., Novell Inc., Red Hat Inc. and Google Inc. Through plug-in technologies, ODF support is also available for Microsoft Office and the Firefox browser. The strong industry support for ODF is evident in the fact that the ODF Alliance already has more than 420 members, including the BBC, Bristol City Council, Corel Corp., EDS Corp., EMC Corp., Google, IBM, MySQL AB, Novell, Oracle Corp., Red Hat, Software AG, Sun Microsystems and the City of Vienna.

Reuse of Standards
A key benefit of ODF for developers is its reuse of existing standards. Instead of reinventing the wheel or using proprietary technologies, ODF integrates and leverages standards like HTML, Dublin Core, SVG, MathML, XForms, XSL-FO, XLink and SMIL. As a consequence, developers already familiar with any of these standards can apply their existing knowledge to ODF. In addition, converting ODF into other standards or integrating ODF into other applications becomes simpler and standards-based. This reuse of standards also keeps the ODF specification lean and manageable.

ODF also reuses concepts. The definition of a table in a spreadsheet document is almost identical to the definition of a table in a text document. Thus, code written for processing a spreadsheet table can be reused for a text table. This not only reduces complexity and redundancy but also lowers the ODF learning curve.
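A minimal Python sketch of that reuse: one function walks table:table-row and table:table-cell elements, and the same function is applied to a spreadsheet body and to a text body. The two sample documents are hand-written fragments for illustration, the names wrap and table_rows are hypothetical, and the sketch ignores details such as repeated-column attributes and nested tables.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the ODF specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# One table fragment, shared by both sample documents below.
TABLE_XML = (
    '<table:table table:name="T1">'
    "<table:table-row>"
    "<table:table-cell><text:p>a</text:p></table:table-cell>"
    "<table:table-cell><text:p>b</text:p></table:table-cell>"
    "</table:table-row>"
    "<table:table-row>"
    "<table:table-cell><text:p>c</text:p></table:table-cell>"
    "<table:table-cell><text:p>d</text:p></table:table-cell>"
    "</table:table-row>"
    "</table:table>"
)


def wrap(body_element: str) -> str:
    """Embed the sample table in a document body of the given kind."""
    return (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:table="{TABLE_NS}" xmlns:text="{TEXT_NS}">'
        f"<office:body><office:{body_element}>{TABLE_XML}"
        f"</office:{body_element}></office:body>"
        "</office:document-content>"
    )


SPREADSHEET_DOC = wrap("spreadsheet")  # body of a spreadsheet document
TEXT_DOC = wrap("text")                # body of a text document


def table_rows(content_xml: str) -> list[list[str]]:
    """Return the cell text of every table row, row by row."""
    root = ET.fromstring(content_xml)
    rows = []
    for row in root.iter(f"{{{TABLE_NS}}}table-row"):
        cells = row.findall(f"{{{TABLE_NS}}}table-cell")
        rows.append(["".join(cell.itertext()) for cell in cells])
    return rows


if __name__ == "__main__":
    # The same function handles both document types.
    print(table_rows(SPREADSHEET_DOC))
    print(table_rows(TEXT_DOC))
```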

For developers integrating ODF into workflows and business processes, for example by using an ODF implementation like OpenOffice.org as a client for back-end services, ODF relies on XForms. The next version, ODF 1.2, supports RDF-based metadata, so that workflow and business process integration also builds on existing standards.

Abundance of Tools
Various development tools for ODF are available or under development. The most basic use case, and still a very powerful one, is the conversion of ODF content via XSLT. Because Apache Ant has both ZIP and XSLT support, it can be used to convert file formats. OpenOffice.org also provides a very granular and powerful Java API. Thus, by combining Ant with OpenOffice.org, developers can build powerful solutions that include printing and PDF-exporting capabilities.

In this context it's important to keep in mind that ODF is the default file format in two open source applications: OpenOffice.org and KOffice. Thus, solutions can be developed, tested, deployed and distributed in a very cost-efficient, flexible and platform-independent way.

Another simple way to access ODF content and to modify ODF files is through a Perl module available from CPAN. The Perl module was originally developed for customer projects in order to replace hard-to-manage macro code stored in documents with easier-to-manage centralized solutions. The Perl module allows users to set metadata or extract paragraphs that contain a specific keyword.
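The two operations mentioned above can be sketched in a few lines. The example below is written in Python with the standard library, not with the Perl module, and does not reflect that module's API. It assumes the content.xml and meta.xml streams have already been read from the package as strings. The sample XML and the function names paragraphs_containing and set_title are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by ODF; the title element comes from Dublin Core.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hand-written sample streams, for illustration only.
SAMPLE_CONTENT = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" '
    f'xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    "<text:p>Invoice for March</text:p>"
    "<text:p>Thank you for your order</text:p>"
    "<text:p>Please pay this <text:span>invoice</text:span> soon</text:p>"
    "</office:text></office:body>"
    "</office:document-content>"
)
SAMPLE_META = (
    f'<office:document-meta xmlns:office="{OFFICE_NS}" xmlns:dc="{DC_NS}">'
    "<office:meta><dc:title>Draft</dc:title></office:meta>"
    "</office:document-meta>"
)


def paragraphs_containing(content_xml: str, keyword: str) -> list[str]:
    """Return the text of each paragraph that mentions the keyword."""
    root = ET.fromstring(content_xml)
    texts = ("".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p"))
    # Case-insensitive substring match, a simplification for this sketch.
    return [t for t in texts if keyword.lower() in t.lower()]


def set_title(meta_xml: str, title: str) -> str:
    """Return meta.xml content with dc:title set to the given title."""
    root = ET.fromstring(meta_xml)
    meta = root.find(f"{{{OFFICE_NS}}}meta")
    if meta is None:
        raise ValueError("no office:meta element found")
    node = meta.find(f"{{{DC_NS}}}title")
    if node is None:
        node = ET.SubElement(meta, f"{{{DC_NS}}}title")
    node.text = title
    return ET.tostring(root, encoding="unicode")


def get_title(meta_xml: str) -> str:
    """Read dc:title back, mainly to check the result of set_title."""
    return ET.fromstring(meta_xml).findtext(f".//{{{DC_NS}}}title", "")


if __name__ == "__main__":
    print(paragraphs_containing(SAMPLE_CONTENT, "invoice"))
    print(get_title(set_title(SAMPLE_META, "March invoice")))
```

Writing the modified stream back into the ZIP package is left out here. Keeping such logic in one script, instead of in macros stored inside each document, is the kind of centralization the paragraph above describes.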

Several open source community projects and vendors are implementing software development kits for ODF. The goal of the open source ODF Toolkit project on OpenOffice.org is to provide a set of tools that simplify the integration of ODF into new and existing applications. First results of the project include a Java API as well as a .NET interface.