pdfbib readme

Max Snauth

<spams@start.no>

Revision History
Revision 0.1	09 October 2006	TB

Table of Contents

About
Requirements
Installation
Usage
Todo
Download
License

About

pdfbib is a CGI script with an Ajax interface that reads the metadata of your PDFs and lets you compare and synchronize it with the data in a (bibliographic) database. You can edit both the PDF metadata and the records in the database. The goal of this project is threefold:

make local indexing and search software more efficient by providing accurate metadata
make sure that the data, which unfortunately has to be stored twice (once in the database and once in the file itself), is at least synchronized
provide a nice interface for doing this boring work

My current configuration is a web server with direct read/write access to the PDFs and to the SQLite 2 database used by bibus, which is IMHO currently the most mature bibliographic database (besides BibTeX and all its related software, of course) and interacts well with OpenOffice.org.

Requirements

As with all my scripts, this CGI relies heavily on Perl modules, which have to be installed for it to work. Some of them are part of standard Perl distributions, some are not. More specifically, you will need:

obviously a web server with
perl CGI installed and working
DBI with the SQLite2 driver (because bibus uses sqlite 2)
PDF::API2 (reading/writing metadata to pdf's)
CGI::Ajax
CAM::PDF (to extract ASCII text from PDFs; not strictly necessary)
the "find" command: out of laziness I use it to search for all PDFs in the path. This should be fixed to support non-Linux users; see the Todo section.

Get the Perl modules from CPAN.
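
Before installing anything, it can help to see which of the modules listed above Perl can already load. The following is a minimal sketch and not part of pdfbib: the sub name missing_modules is made up for illustration, and DBD::SQLite2 is assumed to be the DBI driver meant by "the SQLite2 driver".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Modules named in the requirements list. DBD::SQLite2 is an assumption:
# it is taken to be the DBI driver for SQLite 2 databases.
my @required = qw(CGI DBI DBD::SQLite2 PDF::API2 CGI::Ajax CAM::PDF);

# Return the subset of the given module names that cannot be loaded.
sub missing_modules {
    my @modules = @_;
    my @missing;
    for my $module (@modules) {
        # Turn Some::Module into Some/Module.pm so require accepts a string.
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

my @missing = missing_modules(@required);
if (@missing) {
    print "Missing modules: @missing\n";
}
else {
    print "All required modules are installed.\n";
}
```

Anything reported as missing can then be installed with the cpan client, for example "cpan CAM::PDF".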

Installation

I include my bibliographic database in bibus' format (bibus.dat), but you really should start with an empty one. Get bibus and choose SQLite as the database format. More information on bibus' database format can be found on bibus' community wiki.

Put the main CGI script (pdfbib.pl) into your cgi directory and make sure that it is executable. Then adapt at least the first two of the following parameters:

bibusdat: Where the sqlite database resides. Make sure you have read/write rights to this file.
bibpath: Where your pdfs reside. This can later be adjusted from the user interface. The path is searched recursively.
user and password for database: The shipped example database comes without a password set. The username is berker.
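
A quick way to verify the first two settings is a small stand-alone check like the one below. This is only a sketch under assumptions: the paths are placeholders, the sub name check_settings is hypothetical, and how pdfbib.pl itself declares bibusdat and bibpath may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder values; replace them with the settings from your pdfbib.pl.
my $bibusdat = '/path/to/bibus.dat';
my $bibpath  = '/path/to/pdfs';

# Return a list of human-readable problems. An empty list means both
# settings look usable for the user running this script.
sub check_settings {
    my ($database, $pdf_dir) = @_;
    my @problems;
    push @problems, "database '$database' is not a readable and writable file"
        unless -f $database && -r _ && -w _;
    push @problems, "pdf path '$pdf_dir' is not a readable directory"
        unless -d $pdf_dir && -r _ && -x _;
    return @problems;
}

my @problems = check_settings($bibusdat, $bibpath);
if (@problems) {
    print "Problem: $_\n" for @problems;
}
else {
    print "Settings look fine.\n";
}
```

The result only says something about the user who runs the check, so it is most useful when run as the same user the web server uses to execute the CGI.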


	One naming convention is crucial: each PDF file must be named <Identifier>.pdf, where Identifier is the value of the "identifier" field in bibus' database. This value has to be unique.
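
To illustrate the convention, here is a minimal sketch that derives the identifier from each PDF path and reports identifiers claimed by more than one file. It is not taken from pdfbib.pl: the sub names and the example file names are hypothetical, and it assumes identifiers are compared case-sensitively.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Map each identifier (file name without .pdf) to the paths that carry it.
sub index_by_identifier {
    my @paths = @_;
    my %paths_for;
    for my $path (@paths) {
        my ($identifier, undef, $suffix) = fileparse($path, qr/\.pdf/i);
        next unless $suffix;    # skip anything that is not a .pdf file
        push @{ $paths_for{$identifier} }, $path;
    }
    return \%paths_for;
}

# Identifiers used by more than one file, which breaks the uniqueness rule.
sub duplicate_identifiers {
    my ($paths_for) = @_;
    return grep { @{ $paths_for->{$_} } > 1 } sort keys %$paths_for;
}

# Hypothetical example paths.
my $index = index_by_identifier(
    '/pdfs/Smith2004.pdf',
    '/pdfs/old/Smith2004.pdf',
    '/pdfs/Jones1999.pdf',
);
print "Duplicate identifier: $_\n" for duplicate_identifiers($index);
# prints "Duplicate identifier: Smith2004"
```

Because the path is searched recursively, two files with the same name in different subdirectories would map to the same database record, which is why a check like this can be worth running once.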

Usage

The interface should be self-explanatory with so much Ajax ;-)

Todo

make the whole thing faster, for instance by:
not using "find" but built-in Perl functions
write user guide
provide a mechanism to filter for the PDFs that have missing metadata or database fields
many more?
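
As a starting point for the "find" item, the recursive search could be done with the core module File::Find, which needs no external command. This is a sketch of one possible replacement, not the code currently in pdfbib.pl; the sub name find_pdfs is made up, and matching the .pdf extension case-insensitively is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Recursively collect all regular files ending in .pdf below $dir.
sub find_pdfs {
    my ($dir) = @_;
    my @pdfs;
    find(
        {
            # With no_chdir set, $_ holds the full path of each entry.
            wanted => sub {
                push @pdfs, $File::Find::name if -f $_ && /\.pdf\z/i;
            },
            no_chdir => 1,
        },
        $dir
    );
    my @sorted = sort @pdfs;
    return @sorted;
}

# Usage: perl find_pdfs.pl /path/to/pdfs
if (@ARGV) {
    print "$_\n" for find_pdfs($ARGV[0]);
}
```

Whether this is actually faster than calling "find" would have to be measured, but it removes the dependency on a Unix tool.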

Download

The current release is hosted on SourceForge: http://sourceforge.net/project/showfiles.php?group_id=178613

License

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.