Readme for conv_doc.pl

Sample external converter script for ht://Dig 3.1.4 and above. It converts MS-Word, PDF, or PostScript files to text (in HTML form) so they can be indexed.

It uses the "catdoc" program to extract text from Word documents, "pdftotext" to extract text from PDFs, and "ps2ascii" to extract text from PostScript.

Written by Gilles Detillieux, based on the parse_word_doc.pl script by Jesse op den Brouw.

External converters have two advantages over external parsers. They are easier to write, because they only have to produce text or HTML rather than parsed index records. The parsing is also more consistent across document types, because the converted output is handled by ht://Dig's own parser.
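
The conversion flow described above can be sketched roughly as follows. This is an illustrative Python sketch, not the conv_doc.pl script itself (which is written in Perl). The function names, the MIME-type table, the plain command invocations, and the latin-1 decoding are all assumptions made for the example; the real script may pass different options to each tool.

```python
import html
import subprocess

# Assumed mapping from MIME type to the extraction tool named in this README.
# These are plain invocations; conv_doc.pl may use other options.
EXTRACTORS = {
    "application/msword": lambda path: ["catdoc", path],
    "application/pdf": lambda path: ["pdftotext", path, "-"],  # "-" writes text to stdout
    "application/postscript": lambda path: ["ps2ascii", path],
}


def command_for(mime_type, path):
    """Return the argument list for the extractor matching mime_type (hypothetical helper)."""
    try:
        return EXTRACTORS[mime_type](path)
    except KeyError:
        raise ValueError(f"no converter for {mime_type}") from None


def text_to_html(text, title):
    """Wrap extracted plain text in a minimal HTML page, escaping markup characters."""
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n" + html.escape(text) + "\n</body></html>\n"
    )


def convert(path, mime_type):
    """Run the extractor on path and return its output as HTML (hypothetical helper)."""
    result = subprocess.run(
        command_for(mime_type, path), capture_output=True, check=True
    )
    # Assumption: lenient single-byte decoding so odd bytes never raise an error.
    text = result.stdout.decode("latin-1")
    return text_to_html(text, title=path)
```

Calling convert("report.pdf", "application/pdf") would require pdftotext to be installed. The two pure helpers, command_for and text_to_html, can be tried without any of the external tools.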