The command-line tools are very easy to use.
Extract Text
pdftotext [options] PDF-file [text-file]
Options include:
-layout : maintain original physical layout
-htmlmeta : generate a simple HTML file, including the meta information
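As a rough sketch of how the two options above might be used, the snippet below runs pdftotext once with each. The file names (report.pdf, report.txt, report.html) are made up for illustration, and the guard simply skips the conversion when pdftotext or the input file is not present.

```shell
# Hypothetical input file; replace with the path to a real PDF.
input="report.pdf"

if command -v pdftotext >/dev/null 2>&1 && [ -f "$input" ]; then
    # Plain text that keeps the page's physical layout (columns, tables).
    pdftotext -layout "$input" report.txt
    # Simple HTML file that includes the document's meta information.
    pdftotext -htmlmeta "$input" report.html
    status="converted"
else
    # Tool or input file missing: do nothing rather than fail.
    status="skipped"
fi
echo "pdftotext example: $status"
```

If the output file name is left out, pdftotext writes to a file named after the PDF with its extension replaced (report.txt here).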
Extract Images
pdfimages [options] PDF-file image-root
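This extracts all the images as Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files. JPEG files are written only when the -j option is given, and only for images that are already stored as JPEG inside the PDF. pdfimages doesn't convert PPM or PBM files into JPEG, so you'll need a separate utility for that.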
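One possible way to handle the conversion step is sketched below. It assumes Netpbm's pnmtojpeg is the separate utility (ImageMagick or another converter would work just as well), and the names report.pdf and img are hypothetical. Both steps are guarded so the script does nothing if a tool or the input file is missing.

```shell
# Hypothetical names; replace with a real PDF and a preferred output prefix.
input="report.pdf"
root="img"

if command -v pdfimages >/dev/null 2>&1 && [ -f "$input" ]; then
    # -j writes images stored as JPEG in the PDF straight out as .jpg files;
    # everything else is written as img-NNN.ppm or img-NNN.pbm.
    pdfimages -j "$input" "$root"
    status="extracted"
else
    status="skipped"
fi

# Convert the remaining PPM/PBM files to JPEG (assumes Netpbm is installed).
converted=0
if command -v pnmtojpeg >/dev/null 2>&1; then
    for f in "$root"-*.ppm "$root"-*.pbm; do
        [ -f "$f" ] || continue          # the glob matched nothing
        pnmtojpeg "$f" > "${f%.*}.jpg"   # img-000.ppm -> img-000.jpg
        converted=$((converted + 1))
    done
fi
echo "pdfimages example: $status, $converted file(s) converted"
```

Because pdfimages numbers every extracted image separately, the converted .jpg files should not collide with the ones written directly by -j.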