(This column focuses on the current trends in
technical communication with regard to the economy, domain, technology,
documentation, management, and employment. To write in to this column or
to share your comments, questions, and suggestions, please write to the
column editor, Sita Bhatt)
XML: The Answer to Everything?
By Kay Ethier & Scott Abel
This is the second and last part of this article. The
first part appeared in the May-June issue of Indus.
XML Uses
In the publishing arena, XML is used by authoring and
content management tools. Authors use the XML elements and attributes to
produce documents. Content management tools use the XML elements and
attributes as data that can be retrieved or marked for reuse.
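To make the idea of "elements and attributes as data" concrete, here is a minimal Python sketch using only the standard library. The element names (manual, topic, title, para) and the reuse attribute are invented for illustration and do not come from any particular schema or content management tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names and the "reuse" attribute
# are made up for this example.
DOC = """
<manual>
  <topic id="install" reuse="yes">
    <title>Installing the product</title>
    <para>Run the installer and follow the prompts.</para>
  </topic>
  <topic id="history" reuse="no">
    <title>Release history</title>
    <para>Version 1.0 shipped first.</para>
  </topic>
  <topic id="warranty" reuse="yes">
    <title>Warranty</title>
    <para>The product is covered for one year.</para>
  </topic>
</manual>
"""


def reusable_topics(xml_text):
    """Return (id, title) pairs for every topic marked reuse="yes"."""
    root = ET.fromstring(xml_text)
    found = []
    for topic in root.iter("topic"):
        # The attribute is treated as data: it decides what is retrieved.
        if topic.get("reuse") == "yes":
            found.append((topic.get("id"), topic.findtext("title")))
    return found


if __name__ == "__main__":
    for topic_id, title in reusable_topics(DOC):
        print(topic_id, "-", title)
```

The author only tags the content. A tool can later pull out the topics marked for reuse without knowing anything about how they will be formatted.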
Is this the answer to everything? Well, in the
publishing world the answer is sometimes “no”, because affordable
publishing can sometimes be accomplished without the help of XML, and in
those cases XML would be overkill. However, XML is often the best option for
organizations that take the time to evaluate their content lifecycle and
to examine how much it costs to create, maintain, translate, deliver,
store, reuse, archive, and retire content. A recent study by ZapThink
(“XML in the Content Lifecycle Foundation Report Creating, Managing,
Publishing, Syndicating, and Protecting Content with XML”) found that
the biggest - and most expensive - challenge for most organizations
today is content reuse. The study found that “Producers of content in
the enterprise spend over 60% of their time locating, formatting, and
structuring content and just 40% of their time actually creating it.”
(Source:
www.zapthink.com/report.html?id=ZTR-CL100)
The sad fact is, most organizations don’t know how much
their content creation and management efforts cost them, and so they
assume that XML is not for them. The reality is that the only way to
know whether XML is the right choice for your organization’s publishing
needs is to seek the assistance of a content management expert who can
perform an organizational needs analysis, a content lifecycle analysis,
and an audit of your existing content. Additional services offered by
content management consultants include customer needs analysis, tool
recommendations, and assistance with calculating return on investment.
Analysis often identifies obstacles to change (tools, processes, and
people) that will need to be addressed before you adopt XML as a
publishing solution. Once you know how much your content costs and what
obstacles you’ll face, you can make an informed business decision about
whether to move to XML publishing. XML does provide a lot of options.
Exchanging content, for example, is often easier and more affordable
with XML than it is with proprietary tools like Microsoft Word. Rather
than saving content in a proprietary format, authors can output their
document content into XML and pass it along to colleagues or customers
who need the content but who may use other authoring and publishing
tools. Additionally, XML makes reuse of information easier because
formatting information is kept separate from the content itself.
Separating content from format is one of the biggest productivity gains
an organization can obtain by adopting XML.
XML content may be used to produce one document, and
that same XML content can then be harnessed to create additional
documents, each with a completely different look and feel.
Alternatively, the same XML content can be dynamically served up to
various audiences in different chunks or in different sequences using
other technologies (see “XSLT”, below). This represents a degree of
flexibility that HTML simply doesn’t offer.
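As a rough illustration of one source feeding several deliverables, the sketch below renders the same small XML fragment two ways: once as HTML and once as plain text. It is written in Python with the standard library only; the article, title, and para element names and both rendering functions are assumptions made up for this example, not part of any real publishing tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source content; it carries no formatting at all.
CONTENT = """
<article>
  <title>XML Uses</title>
  <para>Authors produce documents with elements and attributes.</para>
  <para>Content management tools retrieve them as data.</para>
</article>
"""


def to_html(root):
    """Render the content as a simple HTML fragment."""
    parts = ["<h1>%s</h1>" % escape(root.findtext("title"))]
    for para in root.findall("para"):
        parts.append("<p>%s</p>" % escape(para.text))
    return "\n".join(parts)


def to_plain_text(root):
    """Render the same content as plain text with an underlined title."""
    title = root.findtext("title")
    lines = [title.upper(), "=" * len(title)]
    for para in root.findall("para"):
        lines.append(para.text)
    return "\n".join(lines)


if __name__ == "__main__":
    root = ET.fromstring(CONTENT)
    print(to_html(root))
    print()
    print(to_plain_text(root))
```

Neither rendering touches the source. Changing the look of a deliverable means changing a rendering function, not the content.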
Free XML Authoring Tools
There is a wide variety of free XML authoring tools
available for download on the internet. Each has its own strengths and
weaknesses, and no one free tool does it all (in other words, your
mileage may vary).
Check them out and learn as much as you can about XML
authoring before you decide to employ any particular tool:
XML-related Technologies
An XML Research Specialist at Software AG
(www.gca.org/papers/xmleurope2001/papers/bio/s13-1auth2.html) once
exclaimed, “XML doesn’t do anything!” In its purest sense, this is true;
by itself, XML will not magically repurpose content for multiple media or
audiences. XML doesn’t provide formatting in the absence of additional
technologies. To make XML “look good”, or to turn it into a final
deliverable, some assistance from format-conscious technologies is
required. On the other hand, no amount of such formatting technology can
turn ugly-duckling HTML content into a coterie of media swans.
XSL and XSLT
In the HTML world, Cascading Style Sheets (“CSS” files)
make HTML display as desired... in a web browser. Because XML separates
content from its formatting data, you must employ additional
technologies to format XML so that it displays as you wish. XML can
be formatted a few different ways. You can bring XML content into
XML-based tools to change its appearance. (You can also use HTML to
format XML.) The XML transformation language (Extensible
Stylesheet Language Transformations, “XSLT” for short) can adjust XML
output for various display purposes. When you have multiple media in
which you want to present your content, XML is far more flexible than
HTML.
XSLT uses the tags within an XML document to control
formatted output. Formatting XML content can be as simple as adding bold
to a tagged object. The formatting can be as complex as
telling all of the pieces of an invoice, for example, to display in a
certain font, point size, style, etc. in a table and make the table
content “sortable” by any of the tags used in your XML content.
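The invoice idea can be sketched in a few lines. Note that this is not XSLT itself: Python's standard library has no XSLT processor, so the sketch below imitates in plain Python what a stylesheet would do, namely read the tags, sort on whichever one you choose, and emit a formatted table with the item name in bold. The invoice structure, the tag names, and the NUMERIC_TAGS set are all assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice; tag names are invented for illustration.
INVOICE = """
<invoice>
  <item><name>Widget</name><qty>4</qty><price>2.50</price></item>
  <item><name>Gadget</name><qty>1</qty><price>19.99</price></item>
  <item><name>Bracket</name><qty>10</qty><price>0.75</price></item>
</invoice>
"""

# Tags whose values should be compared as numbers rather than as text.
NUMERIC_TAGS = {"qty", "price"}


def sort_key(item, tag):
    """Return the value of the given child tag, as a number if numeric."""
    value = item.findtext(tag)
    return float(value) if tag in NUMERIC_TAGS else value


def invoice_to_html(xml_text, sort_by="name"):
    """Build an HTML table from the invoice, sorted by the chosen tag."""
    root = ET.fromstring(xml_text)
    items = sorted(root.findall("item"), key=lambda it: sort_key(it, sort_by))
    table = ET.Element("table")
    for item in items:
        row = ET.SubElement(table, "tr")
        name_cell = ET.SubElement(row, "td")
        # "Adding bold to a tagged object": every <name> is shown in bold.
        ET.SubElement(name_cell, "b").text = item.findtext("name")
        ET.SubElement(row, "td").text = item.findtext("qty")
        ET.SubElement(row, "td").text = item.findtext("price")
    return ET.tostring(table, encoding="unicode")


if __name__ == "__main__":
    print(invoice_to_html(INVOICE, sort_by="price"))
```

Calling invoice_to_html with sort_by set to "name", "qty", or "price" reorders the rows without any change to the invoice data. In a real project, an XSLT stylesheet run through a processor such as Saxon or Xalan would play the role of this function.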
Free software tools used for XSLT include Saxon and
Xalan (and others). Each allows you to perform transforms without moving
your XML content into a proprietary tool that will “trap” you into using
that tool in the future. Saxon, created by Michael Kay, is available in
several flavors. The “lite” version allows you to do transformations on
any PC running the Java Runtime Environment (JRE). Saxon is available
via saxon.sourceforge.net. The JRE is available from multiple sites,
including www.java.com/en/download/windows_automatic.jsp.
Xalan is an XSLT processor designed to transform XML
documents into HTML, text, or other XML document types and is available
via xml.apache.org/xalan-j, among other sites.
A good resource for more information on working with
XSLT and XML is Mitch Amiano’s “Agile Markup Toolkit”, a free software
collection on CD that is available at no cost. The CD contains
several dozen free software installations and links. Each piece of
software on the CD includes reference information that indicates where it
came from, allowing you to update as new releases become available. Mitch
is a big user of free software, is very involved in the free software
community, and uses the tools he has gathered on this CD.
XSL-FO
Another subset of XSL is XSL-FO. The FO stands for
“formatting objects.” XSL-FO provides a means for formatting XML for
presentation. More information on its capabilities is available at
http://www.w3.org/TR/xsl/.
XQuery
Some companies may be publishing information stored in a
database or even stored as XML. XQuery allows you to query XML, similar
to the way SQL is used to query databases. More information, and a
great overview, are available from Data Direct Technologies at
http://www.datadirect.com/techzone/xml/basics/basics/index.ssp
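To give a feel for querying XML the way SQL queries a database, here is a small Python sketch. It does not use XQuery (Python's standard library does not include an XQuery engine); instead it relies on the limited XPath subset supported by xml.etree.ElementTree, which is enough to show the idea. The library and doc elements, their attributes, and the titles are hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this example.
LIBRARY = """
<library>
  <doc type="manual" year="2003"><title>Installation Guide</title></doc>
  <doc type="manual" year="2005"><title>User Guide</title></doc>
  <doc type="brochure" year="2005"><title>Product Overview</title></doc>
</library>
"""


def titles_of_type(xml_text, doc_type):
    """Roughly: SELECT title FROM doc WHERE type = doc_type.

    doc_type is interpolated into the path without escaping, which is
    acceptable for a sketch but not for untrusted input.
    """
    root = ET.fromstring(xml_text)
    path = "./doc[@type='%s']/title" % doc_type
    return [title.text for title in root.findall(path)]


if __name__ == "__main__":
    print(titles_of_type(LIBRARY, "manual"))
```

A full XQuery processor goes much further than this (joins, sorting, constructing new XML), but the basic pattern of selecting nodes by their tags and attribute values is the same.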
XML Performance
How has XML fared on the web (www.w3.org)? Certainly there are many
XML-driven websites. Check out
Safari, CNN, Fidelity, and Wired, among others. These are dynamically
generated pages with XML behind the scenes. At Fidelity, XML ties
together web and back-end systems to deliver hundreds of thousands of
transactions per hour to its web site customers. Fidelity says it’s
realizing millions of dollars of savings in infrastructure and
development costs by eliminating the need for transformation of data
between the company’s disparate database systems and by reducing (by
50%) the number of web application servers through which customer data
travels.
(Source: www.internetweek.com/newslead01/lead080601.htm).
In publishing, XML has proven beneficial for creating
materials derived from information stored in databases and for publishing
information that developers have created in XML. Some tools can open the
XML and style it, providing paragraph formatting along with page layout
(and in publishing, presentation is everything!). Such tools, which can
automatically style XML, make publishing data easier and more affordable
than traditional publishing methods. However, XML can slow performance
if it is not integrated properly and appropriately planned for. “Research by
IBM Labs shows that even small XML-based documents can increase the CPU
cost of a relational database transaction by up to 10 times in the
absence of a dedicated XML processing engine. The research concluded
that XML parsing could have a ‘potentially fatal impact’ on
high-performance, transaction-oriented database applications that use
XML.”
(Source: www.nwfusion.com/news/2004/0503xmlaccel.html). Hardware vendors are
rushing to develop new gigabit-speed silicon to address the spread of
XML and the processing problems it can sometimes cause.
Again, it’s important to employ a content management
expert with experience in planning and implementing XML solutions before
you adopt XML in your organization. XML is a business solution, not an
IT solution. Employ it only after conducting a thorough
analysis of your organization’s business needs and your
customers’ needs, and after evaluating your content lifecycle. The results
should yield a unified strategy for XML use across your enterprise that
will provide measurable benefits and a positive return on investment.
Conclusion
XML isn’t the universal panacea... but it is often
preferable to the alternatives. Particularly in publishing applications,
where there are so many ways data can be caught up in proprietary
systems, it’s a good idea to use non-proprietary technologies for
content authoring, management, and delivery, and it’s crucial to assess
and quantify the potential paybacks of XML versus HTML systems.
This article was first published in the February 2005
issue of Interface, the newsletter of the STC Hoosier Chapter
(http://www.hoosierstc.org/newsletter/february2005.shtml). It is
reprinted in Indus with permission from Interface, and the content might
have undergone some changes to comply with the editorial policy
of Indus.
(Kay Ethier is an Adobe Certified Expert in
FrameMaker 7.x and several. In 2001, Kay co-authored the book XML
Weekend Crash Course (Wiley/HungryMinds). She has most recently been a
contributing author on Advanced FrameMaker (TIPS Technical Publishing)
and XML and FrameMaker (Apress). Scott Abel is a technical
writing specialist and content management strategist whose strengths lie
in helping organizations improve the way they author, maintain, publish
and archive their information assets.)