Thanks for the text paste.
I must say I'm a bit lost on that site:
http://www.foolabs.com/xpdf/home.html
Can you help me locate the PDF-to-text converter specifically, please?
I'm wallowing in files with off-putting and Windows-alien names like
't1lib-1.3.tar.gz'.
Click "Download" to get to
http://www.foolabs.com/xpdf/download.html ,
then scroll down to "Precompiled binaries". The file you want is probably either
xpdf-3.00pl3-win32.zip (1142558 bytes) for Win32 or
xpdf-3.00pl3-dos6.zip (1775202 bytes) for DOS.
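Once the zip is downloaded, the converter should be one of the executables inside it. Here's a minimal Python sketch for finding it without unpacking everything. I'm assuming the archive holds a file named pdftotext.exe somewhere inside (I haven't checked the exact layout), and find_member is just a name I made up for illustration:

```python
import zipfile

def find_member(zip_source, wanted="pdftotext.exe"):
    """Return archive member names whose base name matches `wanted`.

    `zip_source` may be a path or a file-like object. The default
    `wanted` name is an assumption about what the Win32 zip contains.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return [
            name
            for name in archive.namelist()
            # Compare only the last path component, ignoring case.
            if name.replace("\\", "/").rsplit("/", 1)[-1].lower() == wanted.lower()
        ]
```

Calling find_member("xpdf-3.00pl3-win32.zip") should list the path of the converter inside the archive, or return an empty list if my guess about the file name is wrong.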
Here's a chunk of the README for the win32 version, copied without
permission:
--------<begin excerpt>----------
Xpdf
====
version 3.00
2004-jan-22
The Xpdf software and documentation are
copyright 1996-2004 Glyph & Cog, LLC.
Email: [email protected]
WWW: http://www.foolabs.com/xpdf/
The PDF data structures, operators, and specification are
copyright 1985-2003 Adobe Systems Inc.
What is Xpdf?
-------------
Xpdf is an open source viewer for Portable Document Format (PDF)
files. (These are sometimes also called 'Acrobat' files, from
the name of Adobe's PDF software.) The Xpdf project also includes a
PDF text extractor, PDF-to-PostScript converter, and various other
utilities.
Xpdf runs under the X Window System on UNIX, VMS, and OS/2. The non-X
components (pdftops, pdftotext, etc.) also run on Win32 systems and
should run on pretty much any system with a decent C++ compiler.
Xpdf is designed to be small and efficient. It can use Type 1 or
TrueType fonts.
Distribution
------------
Xpdf is licensed under the GNU General Public License (GPL), version
2. In my opinion, the GPL is a convoluted, confusing, ambiguous mess.
But it's also pervasive, and I'm sick of arguing. And even if it is
confusing, the basic idea is good.
In order to cut down on the confusion a little bit, here are some
informal clarifications:
- I don't mind if you redistribute Xpdf in source and/or binary form,
  as long as you include all of the documentation: README, man pages
  (or help files), and COPYING. (Note that the README file contains a
  pointer to a web page with the source code.)
- Selling a CD-ROM that contains Xpdf is fine with me, as long as it
  includes the documentation. I wouldn't mind receiving a sample
  copy, but it's not necessary.
- If you make useful changes to Xpdf, please make the source code
  available -- post it on a web site, email it to me, whatever.
If you're interested in commercial licensing, please see the Glyph &
Cog web site:
http://www.glyphandcog.com/
Compatibility
-------------
Xpdf is developed and tested on a Linux 2.4 x86 system.
In addition, it has been compiled by others on Solaris, AIX, HP-UX,
Digital Unix, Irix, and numerous other Unix implementations, as well
as VMS and OS/2. It should work on pretty much any system which runs
X11 and has Unix-like libraries. You'll need ANSI C++ and C compilers
to compile it.
The non-X components of Xpdf (pdftops, pdftotext, pdfinfo, pdffonts,
pdftoppm, and pdfimages) can also be compiled on Win32 systems. See
the Xpdf web page for details.
If you compile Xpdf for a system not listed on the web page, please
let me know. If you're willing to make your binary available by ftp
or on the web, I'll be happy to add a link from the Xpdf web page. I
have decided not to host any binaries I didn't compile myself (for
disk space and support reasons).
If you can't get Xpdf to compile on your system, send me email and
I'll try to help.
Xpdf has been ported to the Acorn, Amiga, BeOS, and EPOC. See the
Xpdf web page for links.
Getting Xpdf
------------
The latest version is available from:
http://www.foolabs.com/xpdf/
or:
ftp://ftp.foolabs.com/pub/xpdf/
Source code and several precompiled executables are available.
Announcements of new versions are posted to several newsgroups
(comp.text.pdf, comp.os.linux.announce, and others) and emailed to a
list of people. If you'd like to receive email notification of new
versions, just let me know.
Running Xpdf
------------
To run xpdf, simply type:
    xpdf file.pdf
To generate a PostScript file, hit the "print" button in xpdf, or run
pdftops:
    pdftops file.pdf
To generate a plain text file, run pdftotext:
    pdftotext file.pdf
There are four additional utilities (which are fully described in
their man pages):
    pdfinfo   -- dumps a PDF file's Info dictionary (plus some other
                 useful information)
    pdffonts  -- lists the fonts used in a PDF file along with various
                 information for each font
    pdftoppm  -- converts a PDF file to a series of PPM/PGM/PBM-format
                 bitmaps
    pdfimages -- extracts the images from a PDF file
Command line options and many other details are described in the man
pages (xpdf.1, etc.) and the VMS help files (xpdf.hlp, etc.).
-------<end excerpt>-----
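If you end up converting more than a file or two, the pdftotext call from the excerpt can be wrapped in a small script. This is only a sketch under a few assumptions: that pdftotext (or pdftotext.exe) is on your PATH or you pass its location as exe, and that it accepts the output file name as an optional second argument, which is how I remember the man page describing it. The function names are mine, not part of Xpdf.

```python
import subprocess
from pathlib import Path

def pdftotext_command(pdf_path, txt_path=None, exe="pdftotext"):
    """Build the argument list for one pdftotext run.

    If no output path is given, use the PDF's name with a .txt suffix.
    """
    pdf = Path(pdf_path)
    txt = Path(txt_path) if txt_path else pdf.with_suffix(".txt")
    return [exe, str(pdf), str(txt)]

def convert(pdf_path, txt_path=None, exe="pdftotext"):
    """Run pdftotext and return the path of the text file it was asked to write."""
    cmd = pdftotext_command(pdf_path, txt_path, exe)
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(cmd, check=True)
    return Path(cmd[2])
```

For example, convert("file.pdf") would run the converter on file.pdf and ask it to write file.txt next to it. Building the command in its own function keeps it easy to print and inspect before anything is actually run.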
Good Luck!
Rich