Indexing a Multi-Volume Project
November 18, 2003
Forum organized by Virginia Rich
Presentation by Cynthia Berman and Ellen Perry
Notes by Dawn Adams
As the technical field moves more and more
toward using multi-volume reference works, indexing has had to
change along with it. Indexing software manuals already differs
substantially from trade book indexing (traditional
back-of-the-book indexing) in that the manuals are often produced
electronically in PDF or HTML formats rather than print. Now that
many of the manuals are no longer standalone books but volumes in
a set, the complexity of the indexing process has increased
correspondingly, necessitating new processes and project
management guidelines.
"One thing we want to point out is that a
well-formed index is a well-formed index, well-formed in social
science, software, or whatever," said Cynthia Berman, BAEF member and freelance
technical editor, indexer, and writer, formerly with Siebel
Systems. "Indexing has been a real area of growth in editing and
software. Now very few hardbound books are going out any more in
our world. The indexing process is evolving from just creating
the index to creating hierarchies and taxonomies. We work with
content objects and content management systems rather than
books."
Ellen Perry, BAEF member and lead technical editor for
AutoCAD Technical Publications
at Autodesk, Inc., said that one issue can be getting upper
management buy-in for
creating an index in the first place. Once that's accomplished,
convincing the writers to
close the files (i.e., stay out of the manual's
files) long enough for the files to be indexed is the next
hurdle. Perry has had to work
nights and weekends to accommodate writers who need daytime
access to their files.
"You have to gauge upper management support
for help files and the
time needed to edit them," Perry said. "You have to push hard up
front to make sure that
everyone is on board, consider staffing issues, and define the
schedule."
Planning out the process
One of the key elements in making the
indexing process for a
multi-volume work go smoothly is figuring out staffing and
project requirements.
For example:
* Skill set: what kind of skill set are you looking for in an
indexer? Someone who uses dedicated indexing software or who
embeds index entries in a FrameMaker document?
* Toolset: what tools do the indexers need from you? Do you need
to provide an add-on product for FrameMaker or PDF files?
* Work environment: will the indexer work onsite or offsite?
* Project management tool: what will you use to track milestones
and maintain schedules? According to Perry and Berman, the tools
they have used run the gamut from Excel to Word to MS Project.
* Project start: when is a book ready for indexing? For Perry,
indexing is occurring earlier and earlier in the production
process, now at the proofreading stage. Localization (translating
the help files into different languages for different user
markets) often determines her deadlines for indexing. Berman
noted that indexing can begin when a book is 85% done. In her
projects, the documents were often prioritized for indexing by
how they were used.
Maintaining indexing standards
In multi-volume projects especially, it is
vital to have indexing standards
set up. When several indexers are working on different
parts of the same work,
there is huge potential for deviation in how concepts and
products are handled by
different people. Berman recommends having writers follow a
writing model on the front end
and having the indexers follow an indexing style guide on the
back end.
"If you have several indexers working, you want to make sure that
a product or feature is
spelled the same way throughout the index by using a glossary or
something similar," Berman
said. "That will speed up editing and delivery, help with
translating the content, and
help your user build a mental model of how the index and volume
relate."
According to Berman, having the writers follow a writing model
can help to "seed" an
index. The writers deal with important concepts in the same
manner, using standard ways to
refer to standard content elements. This will help the editor to
develop the indexing
style guide as well as enable the indexers to consistently
identify concepts for indexing.
Berman cautioned that only product names and feature names that
have been approved by
marketing should go into the style guide and index.
Some of the important guidelines that belong
in the style guide are:
* What your indexable subjects are (e.g., tasks and concepts, or
one or the
other?)
* How to handle cross-references and double-posting
* What the entry structure should be: for example, gerund, noun
(Printing, reports)
* What the sort sequence is: do cross-references belong at the
top or the bottom of an entry?
* Whether any special formatting is used (e.g., bold or italic)
and how
* Where the index entries are embedded (in the text or the
heading). According to Berman,
the delivery mode has implications for placement of entries. For
example, when using
WebWorks Publisher she had to put entries in headings. And Perry
said that quite often you
have to put entries into a certain location for the localization
team.
Berman emphasized the need for controlled
vocabulary. According to
Berman, using particular terminology not only builds a good
mental model, but it
facilitates information mapping, making it much easier to track
types of information. It
also enables content to be more easily reused later, since the
vocabulary is standardized,
and facilitates the work of the localization team responsible for
translating the
documents and indexes into languages other than English.
"There are advantages to controlled vocabulary," said Berman. "It
is easier to edit,
easier for contractors to get to know the vocabulary. And for
content management systems,
it seeds the metadata and has positive implications for
searching."
In addition, consider up front how you will allow for updates to
the style guide as styles
evolve, Berman said.
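As a minimal sketch of how a glossary could be used to enforce a controlled vocabulary, the Python below flags index headings that use a known variant spelling of an approved term. The glossary contents, the sample entries, and the function names are all hypothetical, and the check assumes that the main heading is everything before the first comma.

```python
def build_variant_map(vocabulary):
    """Map each lowercase spelling (approved or variant) to its approved term."""
    variants = {}
    for approved, alternates in vocabulary.items():
        variants[approved.lower()] = approved
        for alt in alternates:
            variants[alt.lower()] = approved
    return variants


def check_entries(entries, vocabulary):
    """Return (entry, approved_term) pairs for headings not spelled as approved."""
    variants = build_variant_map(vocabulary)
    problems = []
    for entry in entries:
        # Assumption: the main heading is the text before the first comma.
        heading = entry.split(",")[0].strip()
        approved = variants.get(heading.lower())
        # Capitalization differences are reported too.
        if approved is not None and approved != heading:
            problems.append((entry, approved))
    return problems


# Hypothetical glossary: approved term -> known variant spellings.
VOCABULARY = {
    "plug-ins": ["plugins", "plug ins"],
    "toolbars": ["tool bars"],
}

# Hypothetical entries collected from several indexers.
ENTRIES = [
    "plugins, installing",
    "plug-ins, removing",
    "Tool bars, customizing",
    "printing, reports",
]

for entry, approved in check_entries(ENTRIES, VOCABULARY):
    print(f"{entry!r}: use {approved!r}")
```

Headings that are not in the glossary at all, such as "printing" here, pass through unflagged; a stricter check could report those for review instead.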
Merging volumes into a whole
At times, volumes that were indexed
separately will need to have
their indexes merged into a consolidated whole. Perry noted that
once indexes are merged,
there is normally quite a bit of duplication, some mysterious
entries appear, and
sometimes top-level entries will get lost. In the editing
process, Perry recommended
deciding up front whether you're going to maintain your edits for
separate delivery
methods (e.g., for both print and HTML), since some HTML edits
don't apply to print.
"In editing a merged index, you have to decide what tradeoffs you
can live with as an
editor," she said. "You need to make some compromises."
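The duplication Perry describes can be illustrated with a small Python sketch. It merges per-volume indexes into one structure and then lists headings that differ only by case, hyphenation, or spacing, so an editor can decide which form to keep. The data layout, the volume labels, and the function names are assumptions made for illustration; this is not how FrameMaker or any indexing tool stores entries.

```python
from collections import defaultdict


def normalize(heading):
    """Crude comparison key: ignore case, hyphens, and spaces."""
    return heading.lower().replace("-", "").replace(" ", "")


def merge_indexes(volumes):
    """Merge {volume: {heading: [pages]}} into {heading: [(volume, page), ...]}."""
    merged = defaultdict(list)
    for volume, index in volumes.items():
        for heading, pages in index.items():
            for page in pages:
                merged[heading].append((volume, page))
    # Drop repeated locators and sort them by volume, then page.
    return {heading: sorted(set(locs)) for heading, locs in merged.items()}


def near_duplicates(merged):
    """Group headings that share the same normalized key."""
    groups = defaultdict(list)
    for heading in merged:
        groups[normalize(heading)].append(heading)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical indexes for two volumes of the same set.
VOLUMES = {
    "Vol. 1": {"plug-ins": [12, 40], "printing": [7]},
    "Vol. 2": {"plugins": [3], "printing": [7, 19]},
}

merged = merge_indexes(VOLUMES)
print(merged["printing"])
print(near_duplicates(merged))
```

The sketch only reports candidate duplicates. Choosing the surviving heading is still an editorial decision.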
Scheduling and planning are also an integral part of editing a
merged index. According to
Perry, you should not only plan what you will do in your indexing
passes but how long you
will allow for each pass.
"I once had about 17,000 entries to edit in 3 to 4 days," Perry
said. "In the best of all
possible worlds, I would have had a week: you edit subentries,
then cross-references;
check for style adherence; spell-check; and then merge entries
until you drop."
Perry highly recommends using IXgen if you are indexing in
FrameMaker. IXgen pulls all of
the index markers into a list, enabling you to search in the
file. Once in IXgen, you can
edit the text any way you'd like, both deleting and adding index
entries. Once you have
done all your edits, you apply the markers in the Frame document.
Tools of the trade
"Software: it's great to have lots of it,"
Berman said.
As far as software tools go, there are quite a few options
available. According to Berman,
some of the more common ones are FrameMaker with IXgen as a
plug-in; Word; Excel;
dedicated indexing software such as Macrex, Cindex, or Sky Index;
and specially developed
scripts. The index can be delivered in PDF, WebWorks Publisher,
or HTML Help Workshop.
"In FrameMaker, it's easier to generate an index, but harder to
enter and edit index
entries," Berman said. "IXgen is a plug-in that turbo-charges
indexing in Frame. Some say
that Word is easy to use for indexing, but it's absolutely
impossible for multi-volume
works. And custom scripts, while versatile, do require
special skills."
Single-source indexes
Single-source indexes are a special beast,
according to Berman and
Perry. Because a single work may be published in multiple formats
(e.g., print, PDF, and
HTML), there can be implications for the indexing process.
Everything that you take into
account for a multi-volume work applies, along with additional
considerations of marker
placement for the PDF vs. the HTML versions. But no matter what,
all page ranges must go,
said Berman, and using controlled vocabulary becomes even more
important.
Reindexing or reinventing the wheel?
One issue that comes up over and over again
with management is
whether to reindex a work that has been revised or whether to
update the existing index,
said Berman. Indexers generally maintain that it's both easier
and cheaper to reindex,
while managers claim that that cannot be so, she said.
"Managers see indexing as a software function not an analytical
function," Berman said.
"Revision can work, if the writers are using a writing model that
promotes chunking and
labeling such as information mapping. But indexes still can get
stale, so you need clear
delineation of what's new, updated, and deleted."
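One way to picture the delineation Berman asks for: if the content is chunked and labeled, two revisions can be compared label by label. The Python sketch below assumes each revision is available as a mapping from chunk label to chunk text. That representation, the sample labels, and the function name are hypothetical.

```python
def classify_chunks(old, new):
    """Compare {label: text} mappings for two revisions of a document."""
    added = sorted(label for label in new if label not in old)
    deleted = sorted(label for label in old if label not in new)
    updated = sorted(
        label for label in new if label in old and new[label] != old[label]
    )
    return {"new": added, "updated": updated, "deleted": deleted}


# Hypothetical labeled chunks from two releases of a manual.
OLD = {
    "Printing reports": "Choose File > Print.",
    "Exporting data": "Choose File > Export.",
    "Using the legacy viewer": "Open the viewer from the Tools menu.",
}
NEW = {
    "Printing reports": "Choose File > Print, then select a printer.",
    "Exporting data": "Choose File > Export.",
    "Sharing reports": "Choose File > Share.",
}

changes = classify_chunks(OLD, NEW)
print(changes)
```

Entries attached to "deleted" chunks are candidates for removal, and "new" and "updated" chunks are where an indexer would look first. Unchanged chunks can still hold stale terminology, so this narrows the review rather than replacing it.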
Updating an existing index that was created with embedded entries
can be difficult because
writers quite often will do a lot of cutting and pasting.
Sometimes the entry gets pasted,
and sometimes it doesn't.
"If the writer doesn't change the index entry, and the indexer
doesn't have the time to do
a really thorough job, no one may notice that that entry is
wrong," Berman said. "This
gets to be a problem with conditional text in Frame."
In addition, it can also be problematic to identify new material
that needs to be indexed
if you are updating an index for a legacy document; change bars
are not always a good
indicator of new text, Berman said. The flip side of that is that
it can be difficult to
find deleted text and make sure that the index entry has been
deleted. An updated index
can also become stale, as terminology, features, and products all
change.
The bottom line is that it may be quicker to reindex than to
perform a thorough review,
said Berman, but in any case there is almost never enough money,
and almost never enough
time.
Resources:
Indexing Books, Nancy Mulvany, 1994
Indexing From A to Z, Hans Wellisch, 2nd ed., 1995
ANSI/NISO Standard for Monolingual Thesauri
Information Architecture for the World Wide Web, Rosenfeld and
Morville, 2nd ed., 2002
Managing Enterprise Content, Ann Rockley, 2003
The Content Management Bible, Bob Boiko, 2002
What is a Controlled Vocabulary? Karl Fast, Fred
Leise, and Mike Steckel
www.willpower.org
This panel is adapted from a presentation that Perry and
Berman gave with Jan C. Wright in
Vancouver, B.C., June 19-21, 2003 at the annual meeting of the
American Society of Indexers
(ASI) and the Indexing and Abstracting Society of Canada/Société
canadienne pour l'analyse
de documents (IASC/SCAD).