-
INNOVATIONS
A promising new application for LaTeX is as a generator of sophisticated
PDF documents. As well as value-added e-books, we can use
LaTeX for on-the-fly document production.
-
Why LaTeX?
-
-
LaTeX is a stable and robust program, with the best representation of mathematical
text available, support for tables, cross-referencing, and other structures, and the
ability to build a table of contents, index, and bibliography on the fly.
-
-
LaTeX is a programming language whose features include loops, conditionals,
on-the-fly construction of definitions, the ability to write information to
external files and to read external files into the document, text measurement,
arithmetic operations, and much more.
-
-
LaTeX can parse form results,
so either PDF forms or HTML forms can be used as input,
as can database information.
-
-
Importantly, LaTeX commands
can embed PostScript and pdfmark information,
allowing dynamic graphics generation and
hypertext linking as the LaTeX document is produced.
Any of the other features
that can be added to a PDF document can also be written into the
LaTeX macro set and customized according to the input
to the LaTeX file.
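The programming features listed above can be seen in a few lines of LaTeX. The following is only a minimal sketch: the names \rowcount and \headwd, and the command name status-draft, are hypothetical and chosen for illustration.

```latex
\documentclass{article}
\newcount\rowcount      % hypothetical counter for the loop
\newlength{\headwd}     % hypothetical length for the measured text
\begin{document}

% Loop with a conditional inside it: report rows 1 to 4.
\rowcount=1
\loop
  Row \the\rowcount\ is \ifodd\rowcount odd\else even\fi.\par
  \advance\rowcount by 1
\ifnum\rowcount<5 \repeat

% On-the-fly definition: build a command name from text, then use it.
\expandafter\def\csname status-draft\endcsname{Not yet approved}
Status: \csname status-draft\endcsname.\par

% Measure text, then do arithmetic on the result.
\settowidth{\headwd}{Architectural Specification}
The heading is \the\headwd\ wide;
\multiply\headwd by 2
twice that is \the\headwd.

\end{document}
```

Writing to and reading from external files uses the same low-level style, through the TeX primitives \openout, \write, and \input.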
-
Real Life Example, On-the-fly Document Production
-
Database publishing:
Architectural Specifications Example
PDF forms produce output that may be parsed by LaTeX. This discovery
was made in the process of building a proof of concept for an
architectural specifications company that routinely produces
documents as long as 20,000 pages, building each document
from pre-existing units, based on the requirements of the particular
project (a high rise or a shopping center, for example). A PDF form
was designed so that choices may easily be made:
Sample Form
When processed with Acrobat, the form results are written as
text to a new FDF file:
Sample FDF Data
The form file may be "cleared" and reused.
The FDF file is now available for parsing by LaTeX,
which changes the form results into LaTeX commands.
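One way such parsing could work is with a TeX macro whose parameter text mirrors the shape of an FDF field entry. This is a minimal sketch, not the parser used in the project described here. The macro names \parsefield and \fdfvalue are hypothetical, and the sketch assumes that each field arrives in exactly the form << /T (name) /V (value) >>, with no nested parentheses or escape sequences in the value. FDF written by Acrobat may be spaced differently and would need a more tolerant parser.

```latex
\documentclass{article}
\makeatletter
% Hypothetical helper: grab one field written exactly as
%   << /T (name) /V (value) >>
% and define the command \fdf@name to expand to the value.
\def\parsefield<< /T (#1) /V (#2) >>{%
  \expandafter\def\csname fdf@#1\endcsname{#2}}
% Hypothetical accessor for a parsed value.
\newcommand{\fdfvalue}[1]{\csname fdf@#1\endcsname}
\makeatother
\begin{document}

% Two field entries, as they might appear in the FDF data (assumed shape).
\parsefield<< /T (Client) /V (Example Towers) >>
\parsefield<< /T (Project) /V (High rise) >>

Client: \fdfvalue{Client}\par
Project: \fdfvalue{Project}

\end{document}
```

The delimited parameters do the work here: TeX matches the literal characters around #1 and #2, so the field name and value are picked out without any character-by-character scanning.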
Here is the SpecCheckList form, used twice, and producing two new
PDF documents:
First use of Spec Check List Form ==> Parsed ==> Document Produced
Second use of Spec Check List Form ==> Parsed ==> Document Produced
(Tech note: the FDF file is input and parsed, and the results are
written to an .inf file, which is stamped with the client and date so that
it is unique and may be reused. That file is then input back
into the .tex file, where the new definitions are used in the
prepared fields. That is what is happening here:
Sample use of parsed FDF data
)
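The round trip described in the tech note might look like the following. This is a sketch under stated assumptions: the stream name \inffile, the macros \clientid, \infname, \ClientName, and \ProjectType, and the file naming scheme are all hypothetical, and the values are written directly rather than coming from a parsed FDF file.

```latex
\documentclass{article}
\newwrite\inffile                   % hypothetical output stream
\newcommand{\clientid}{ExampleTowers}
% Client- and date-stamped file name, e.g. ExampleTowers-2001-5-14.inf
\edef\infname{\clientid-\the\year-\the\month-\the\day.inf}
\begin{document}

% Write one definition per parsed value to the .inf file.
\immediate\openout\inffile=\infname\relax
\immediate\write\inffile{\string\def\string\ClientName{Example Towers}}
\immediate\write\inffile{\string\def\string\ProjectType{High rise}}
\immediate\closeout\inffile

% Read the definitions back in and use them in prepared fields.
\input{\infname}
Client: \ClientName\par
Project type: \ProjectType

\end{document}
```

Using \string keeps \def and the new command names from being expanded while the line is written, so the .inf file contains ordinary LaTeX definitions that can be input again later.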
As well as populating an existing document with the information
gathered from the form data, as shown in the example above,
the newly generated LaTeX commands
may be used to input the appropriate subdocuments. In this
example, they could be used to build
an entire architectural specification and turn it into PDF,
to be presented to the architectural client and distributed
to subcontractors.
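Selecting subdocuments from generated commands could be sketched as follows. The unit files, the macros \SpecRoofing, \SpecElevators, and \chosen, and the helper \includeunit are all hypothetical. The two definitions near the top stand in for commands that would come from the parsed form results.

```latex
% Two small placeholder units, written to disk so the sketch is complete.
\begin{filecontents}{unit-roofing.tex}
\section{Roofing}
Placeholder text for the roofing unit.
\end{filecontents}
\begin{filecontents}{unit-elevators.tex}
\section{Elevators}
Placeholder text for the elevator unit.
\end{filecontents}
\documentclass{article}
% Stand-ins for definitions produced by parsing the form results.
\def\SpecRoofing{Yes}
\def\SpecElevators{No}
\def\chosen{Yes}
% Hypothetical helper: input unit #2 only if macro #1 matches \chosen.
\newcommand{\includeunit}[2]{\ifx#1\chosen \input{#2}\fi}
\begin{document}
\includeunit{\SpecRoofing}{unit-roofing}
\includeunit{\SpecElevators}{unit-elevators}
\end{document}
```

With these values only the roofing unit is read in. A real specification would list many such units, each switched on or off by the form.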
-
OPPORTUNITIES
-
Starting with an Acrobat or HTML form is useful for database publishing
or on-the-fly report generation of any variety.
Here are only some of the possibilities:
-
-
Automate the building of large custom documents, which can also
have a hyperlinked table of contents, cross-referencing, links to
on-line material, and an automatically generated index.
-
-
Automate the building of graphical data representation. Input to
the form may be numbers or math that can then be used to
generate PostScript graphics on the fly. An example might be
medical reports that show lab results graphically and give custom advice
to the patient based on those results.
-
-
Automate on-line data mining and the presentation of the results.
Consider the uses of this technology in
bioinformatics, for example in sequence and genome analysis.
Genome research projects typically involve a variety of data
(sequences, annotations, analysis results, database links, graphical
images, etc.) that may be distributed over multiple storage locations
and networks. Management, analysis, and communication of this information
may be greatly helped by this automated report generation tool.
An on-line search
of a genetics database yields information that may then be
represented in PostScript in a way that helps the researcher
evaluate the results quickly. This information would be presented in
a PDF file which may have links to further information as well.
-
-
Publishing: track article or book submissions.
Authors fill out a form and a report is built, with the PDF file kept as a record to refer to later.
-
What uses can you imagine for this technology?
-
Background:
How LaTeX, PostScript and PDF Work Together
-
LaTeX output is normally converted to PDF before it is printed.
As an intermediate step, the LaTeX output is converted to PostScript,
which Acrobat Distiller then translates into PDF.
Because of this sequence (LaTeX => PostScript => PDF), and because
PostScript code and pdfmark commands may be added
to a LaTeX file, we can write LaTeX commands that
process the text and then automatically
generate PostScript code using the information that LaTeX has
captured.
The PostScript code may then be passed through verbatim when the
LaTeX output is changed to PostScript, allowing us to use any of
the features available in PostScript, combined with the results
of the LaTeX commands.
Some trivial examples of LaTeX/PostScript interaction, which
nevertheless demonstrate passing information from LaTeX to PostScript:
-
The first example shows a variable-sized PostScript screen positioned behind
LaTeX text. The size of the text is measured with a
LaTeX command and the result is passed to the PostScript code. There
are also "cutouts" in the screen, whose size is determined with a LaTeX macro and
passed to PostScript, which actually makes the screens.
PostScript Cut Out Screen
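A much simpler version of the same idea, without the cutouts, can be sketched as follows. It assumes that dvips is the program converting LaTeX output to PostScript, since the \special syntax used here is specific to dvips. The macro \screened and the length names are hypothetical.

```latex
% Build with the route described above, for example:
%   latex screen.tex, then dvips screen.dvi, then Distiller (or ps2pdf)
\documentclass{article}
\newlength{\scrwd}  % hypothetical registers holding the measured text
\newlength{\scrht}
\newlength{\scrdp}
\makeatletter
% Hypothetical macro: measure #1, then draw a gray screen of that size
% behind it. \strip@pt turns a length into a bare number of points.
\newcommand{\screened}[1]{%
  \settowidth{\scrwd}{#1}%
  \settoheight{\scrht}{#1}%
  \settodepth{\scrdp}{#1}%
  \leavevmode
  % Literal PostScript; the origin is the current position on the page.
  % The first line rescales so that one unit is one TeX point.
  \special{" 72 72.27 div dup scale
    0.85 setgray
    0 \strip@pt\scrdp\space neg
    \strip@pt\scrwd\space
    \strip@pt\scrht\space \strip@pt\scrdp\space add
    rectfill}%
  #1}
\makeatother
\begin{document}
\screened{Short} and \screened{a considerably longer piece of text}
\end{document}
```

Each screen is sized by LaTeX's own measurement of the text, so the PostScript code never needs to know which font or words were used.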
-
The second example shows PostScript color tabs on the side
of chapter opening pages, moved down the page
with each new chapter, in a sample from MATLAB documentation:
PostScript Side Tabs
-
Prelinked PDF Generation
-
Just as with PostScript code, we can also include
pdfmark commands in the body of LaTeX commands, a feature that
allows hypertext links to be generated
based on the information in the text. PostScript and pdfmark
commands can include color and any
other capability found in Acrobat. These commands are passed through
unchanged into the PostScript file and recognized when Distiller turns that
PostScript into PDF.
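As a minimal sketch of passing LaTeX information to pdfmark, the following sets the PDF document information from two LaTeX macros. It again assumes dvips as the converter to PostScript. The macros \reporttitle and \reportclient are hypothetical stand-ins for values captured by LaTeX. Link annotations use the same mechanism with /ANN and a /Rect given in page coordinates, which takes more bookkeeping than is shown here.

```latex
% Build with latex, then dvips, then Distiller (or ps2pdf).
\documentclass{article}
% Stand-ins for values captured by LaTeX, e.g. from parsed form data.
\newcommand{\reporttitle}{Architectural Specification}
\newcommand{\reportclient}{Example Towers}
\begin{document}
% Header code: if the interpreter does not know pdfmark (a printer,
% for example), define it to discard its arguments.
\special{! /pdfmark where {pop}
  {userdict /pdfmark /cleartomark load put} ifelse}
% Document information built from the LaTeX macros above.
\special{ps: [ /Title (\reporttitle)
  /Subject (Prepared for \reportclient)
  /DOCINFO pdfmark}
\section{\reporttitle}
Prepared for \reportclient.
\end{document}
```

The values must not contain unbalanced parentheses, since they are placed inside PostScript strings.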
Since LaTeX commands can
generate custom PostScript code, based on LaTeX's processing
of the text, and since custom PDFmark information may also be
generated with LaTeX code, LaTeX makes an ideal text processing
program for automated report generation, with the report to be presented as a
sophisticated PDF document. The possibilities for the content
and presentation of the PDF document
abound, and we look forward to exploring them.
-
Let us know how these tools
might be useful to you!
-- Amy Hendrickson
info@TeXnology.com
617 738-8029
TeXnology Inc.
Amy Hendrickson
57 Longwood Avenue
Brookline, MA 02446
USA