
xpdf and Chinese PDF Files

Sunday, August 28th, 2005

It’s been a long time since my last post here…

Many things have changed. For example, now I’ve changed my views on cloning, after watching the movie The Island (but that’s for another day).

Now, I want to share with you how I managed to display Chinese characters in PDF files (specifically, PDF files without embedded fonts; the ones with embedded fonts I haven’t tried yet). Basically, I know of three types of Chinese PDF files: (1) PDFs containing Chinese characters with embedded fonts, (2) PDFs containing Chinese characters without embedded fonts, and (3) PDFs containing a scanned document with Chinese characters on it.

AFAICR, I haven’t actually encountered (1). I have a few PDFs of type (3); they are simple to display because what they contain is an _image_ of the scanned document, not the characters themselves.

The most problematic is (2). Nowadays, most Chinese PDF files are created on the Windows platform, using Adobe Acrobat (or Distiller). What the authors forget is that many students have been enlightened (like me *grin*) and try to use as little Windows as possible. Well, we can’t blame them if all they know is Windows, though…

Now the problem is that the fonts used to make the PDF files are referenced by their own unique (or not-so-unique) names. For example, there is a free Arphic font called "AR PL SungtiL GB" (which stands for Arphic Public License Sungti Light GB, I think). It is similar in shape to the 宋体 that many Chinese PDF files use, but since the PDF asks for the font by name, xpdf can’t find it:

Error: Couldn’t create font for ‘宋体’
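If a PDF uses several such fonts, it can help to collect every name xpdf complains about before editing the configuration. Here is a minimal sketch in Python, assuming the messages have been saved to a string and look like the one quoted above. The function name `missing_fonts`, the sample log, and the "a font" wording variant are my own illustration, not xpdf’s documented output.

```python
import re

# Assumption: error lines look like the message quoted in this post.
# Both straight and curly quotes are accepted, because blog software
# often rewrites ASCII quotes into typographic ones.
FONT_ERROR = re.compile(r"Couldn.t create (?:a )?font for ['‘](.+?)['’]")


def missing_fonts(log_text):
    """Return the distinct font names found in the error lines, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = FONT_ERROR.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Hypothetical sample log, for illustration only.
sample_log = "\n".join([
    "Error: Couldn’t create font for ‘宋体’",
    "Error: Couldn't create a font for 'STSong-Light'",
    "Error: Couldn’t create font for ‘宋体’",
])

for name in missing_fonts(sample_log):
    print(name)
```

Each name printed this way is a candidate for its own font mapping in xpdfrc.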

After googling for an answer (and finding none), I read the xpdfrc(5) manpage again and tried displayNamedCIDFontTT. Well, it worked, and these are the lines I need in /etc/xpdfrc to open the PDF files I got (mostly from CNKI, 中国期刊网):

displayNamedCIDFontTT	宋体		/usr/X11R6/lib/X11/fonts/TTF/gbsn00lp.ttf
displayNamedCIDFontTT	方正综艺简体	/usr/X11R6/lib/X11/fonts/TTF/gkai00mp.ttf
displayNamedCIDFontTT	STSong-Light	/usr/X11R6/lib/X11/fonts/TTF/gbsn00lp.ttf
displayNamedCIDFontTT	STSong-Light,Bold	/usr/X11R6/lib/X11/fonts/TTF/gbsn00lp.ttf
displayNamedCIDFontTT	黑体	/usr/X11R6/lib/X11/fonts/TTF/gkai00mp.ttf
displayNamedCIDFontTT	楷体_GB2312	/usr/X11R6/lib/X11/fonts/TTF/gkai00mp.ttf
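A mapping like this only helps if the font files really exist at those paths, so a quick check may save some head-scratching. The following is a rough sketch in Python, not part of xpdf. The helper names `parse_named_cid_fonts` and `missing_files` are made up for illustration, and it assumes that neither font names nor paths contain spaces, so quoted entries are not handled.

```python
import os


def parse_named_cid_fonts(config_text):
    """Return (font_name, path) pairs from displayNamedCIDFontTT lines.

    Assumption: fields are separated by tabs or spaces and contain no
    embedded whitespace; any other line (comments, other options) is skipped.
    """
    pairs = []
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "displayNamedCIDFontTT":
            pairs.append((fields[1], fields[2]))
    return pairs


def missing_files(pairs):
    """Return the pairs whose font file does not exist on this machine."""
    return [(name, path) for name, path in pairs if not os.path.isfile(path)]


# Two of the lines from this post, plus a comment line that gets skipped.
sample_config = (
    "# Chinese font mappings\n"
    "displayNamedCIDFontTT\t宋体\t\t/usr/X11R6/lib/X11/fonts/TTF/gbsn00lp.ttf\n"
    "displayNamedCIDFontTT\t楷体_GB2312\t/usr/X11R6/lib/X11/fonts/TTF/gkai00mp.ttf\n"
)

for name, path in missing_files(parse_named_cid_fonts(sample_config)):
    print("font file not found for", name + ":", path)
```

To check a real configuration, read /etc/xpdfrc with `open(..., encoding="utf-8")` (assuming the file is saved as UTF-8) and pass its contents to `parse_named_cid_fonts`.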