Tuesday, October 15, 2013

Duplex Printing of a PDF Document with T&C on the Back

Once upon a time I was asked to develop a process which can:
  1. Generate new invoices using XML Publisher every morning
  2. Print the invoices to a designated Postscript printer
  3. Duplex print the invoices, with the Terms & Conditions on the back of every page.
  4. Archive the invoices in PDF format (without the T&C) on a storage server.
  5. Use the invoice number as the filename, and organize the archive folders in a year, month, week, day hierarchy.
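
The archive layout in item 5 can be sketched as follows. This is only an illustration, not the code from the original job: the class name ArchivePath, the "week" folder prefix, the zero-padding, and the definition of "week" as the week within the month (days 1 to 7 are week 1, and so on) are all my assumptions.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;

// Hypothetical helper: builds <root>/<year>/<month>/week<n>/<day>/<invoice>.pdf
class ArchivePath {
    static Path forInvoice(String archiveRoot, String invoiceNumber, LocalDate date) {
        // Assumption: "week" means week of the month, counted in blocks of 7 days.
        int weekOfMonth = (date.getDayOfMonth() - 1) / 7 + 1;
        return Paths.get(archiveRoot,
                String.valueOf(date.getYear()),
                String.format("%02d", date.getMonthValue()),
                "week" + weekOfMonth,
                String.format("%02d", date.getDayOfMonth()),
                invoiceNumber + ".pdf");
    }

    public static void main(String[] args) {
        // The invoice number and archive root are made-up sample values.
        System.out.println(forInvoice("/archive", "INV-1001", LocalDate.of(2013, 10, 15)));
    }
}
```

If the business means ISO week of the year instead, only the weekOfMonth line needs to change.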

The PDF generation part is just the beginning: the template is an RTF file with an eye-catching layout and terribly complex logic to display a variety of freight, tax, discount, installment totals, etc. for different countries. Once the concurrent request has completed and the PDF is generated, the remaining parts are all handled by a Java program.

For steps 2 and 3, I was given a one-page T&C PDF file (which uses a smaller-than-the-eye-can-see font size, so that users will not be able to read it without a high-power magnifying glass). I insert this T&C page after every page of the invoice file, so that it becomes every second page, and then rotate the T&C pages by 180 degrees. The merged PDF file is converted to Postscript with a duplex printing instruction in it, and the Postscript file is fed to the printer queue. The rotation is needed because the printer I worked with flips on the short side when duplex printing, so the even pages have to be rotated for the text on both sides to run in the same direction.
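
To make the page arithmetic concrete, here is a small sketch in plain Java, with no PDF library involved. The class and method names are hypothetical; it only shows the order of pages after interleaving and how a 180 degree turn combines with whatever rotation a page already has.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the merged page order and the rotation arithmetic.
class DuplexLayout {
    // Each invoice page is followed by one T&C page, so T&C lands on every even page.
    static List<String> mergedOrder(int invoicePages) {
        List<String> order = new ArrayList<>();
        for (int page = 1; page <= invoicePages; page++) {
            order.add("invoice " + page);
            order.add("T&C");
        }
        return order;
    }

    // Adds a half turn to an existing rotation and keeps the result within 0..359.
    static int rotateBy180(int currentRotation) {
        return Math.floorMod(currentRotation + 180, 360);
    }

    public static void main(String[] args) {
        System.out.println(mergedOrder(2));   // prints [invoice 1, T&C, invoice 2, T&C]
        System.out.println(rotateBy180(270)); // prints 90
    }
}
```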

Actually, I could create the rotated T&C file first and insert it as every second page. But the file is provided by end users, and if the content changes I would rather just replace this file without doing anything further to it.

For PDF splitting and merging, I use Apache PDFBox rather than the iText API. The reason is that iText could decide to change its license so that it becomes a commercial product, which I do not expect from an Apache project.

For the PDF-to-Postscript step, I use pdftops, part of the xpdf package, which is open source and available for almost all platforms. In a Linux/Unix environment one can also use Acrobat Reader for the conversion (acroread -toPostScript -level2 filename), but pdftops has an option to add the duplex printing command to the output file, which acroread does not have.
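
As a rough sketch of how the pdftops call can be wrapped, the helper below builds the command line with the -level2 and -duplex options and fails loudly if the conversion does not succeed. The class name PdfToPs and the error handling are my own additions for illustration; it assumes pdftops is on the PATH.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the pdftops command-line tool.
class PdfToPs {
    // Level 2 Postscript output with the duplex instruction embedded.
    static List<String> command(String pdfFile, String psFile) {
        return Arrays.asList("pdftops", "-level2", "-duplex", pdfFile, psFile);
    }

    // Runs pdftops, waits for it to finish, and treats a non-zero exit code as an error.
    static void convert(String pdfFile, String psFile) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(pdfFile, psFile))
                .inheritIO()   // show pdftops messages on this program's console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("pdftops exited with code " + exitCode);
        }
    }
}
```

Waiting for the exit code matters here because the next step prints the Postscript file, which should not start before the file is complete.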

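import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.pdfwriter.COSWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.PDFMergerUtility;
import org.apache.pdfbox.util.Splitter;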

....
// the inputFile is the invoice PDF file
PDDocument inputPDF = PDDocument.load(inputFile);
int inputPageCount = inputPDF.getNumberOfPages();

// create Merger and a List of pdf files 
PDFMergerUtility merger = new PDFMergerUtility();
List<InputStream> pdfFiles = new ArrayList<InputStream>();

// create splitter and split every page into individual file
Splitter splitter = new Splitter();
splitter.setSplitAtPage(1);
 
// do the splitting
List<PDDocument> splittedPages = splitter.split(inputPDF);
for (int i = 0; i < splittedPages.size(); i++) {
  PDDocument doc = splittedPages.get(i);
  String fileName = splittedFileName(inputFile, i);
  writeDocument(doc, fileName);
  doc.close();

  // add this splitted page, then the T&C page, to the output
  pdfFiles.add(new FileInputStream(new File(fileName)));
  pdfFiles.add(new FileInputStream(TandCFile));
}
// all split pages are written to disk, so the source document can be released
inputPDF.close();

// do the merging
merger.addSources(pdfFiles);
merger.setDestinationFileName(outputFile);
merger.mergeDocuments();

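// rotate every even page (the T&C pages) by 180 degrees
PDDocument pdf = PDDocument.load(new File(outputFile));
List pages = pdf.getDocumentCatalog().getAllPages();
for (int i = 0; i < pages.size(); i++) {
  if (i % 2 == 1) {  // the index is zero-based, so an odd index is an even page
    PDPage page = (PDPage) pages.get(i);
    // findRotation() also resolves an inherited rotation; getRotation() may return null
    page.setRotation((page.findRotation() + 180) % 360);
  }
}
pdf.save(rotatedFile);
pdf.close();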

// convert PDF to Level 2 Postscript using pdftops, with duplex printing instruction
String[] convertCmd = new String[] {"pdftops", "-level2", "-duplex", rotatedFile, postscriptFile};
ProcessBuilder convertBuilder = new ProcessBuilder(convertCmd);
Process convertProc = convertBuilder.start();
// wait until the Postscript file is completely written before it is printed
convertProc.waitFor();

// print the postscript to printer (-P selects the print queue, -h suppresses the banner page)
String[] printCmd = new String[] {"lpr", "-P", "hp4130ps", "-h", postscriptFile};
ProcessBuilder printBuilder = new ProcessBuilder(printCmd);
Process printProc = printBuilder.start();

.....

// set the filename of the splitted page as inputFile-[page no].pdf
private String splittedFileName(String origFile, int pos) {
  return origFile.substring(0, origFile.length() - 4) + "-" + (pos + 1) + ".pdf";
}

// helper method to write splitted file
private void writeDocument(PDDocument doc, String fileName) throws IOException, COSVisitorException {
  FileOutputStream output = null;
  COSWriter writer = null;
  try {
    output = new FileOutputStream(fileName);
    writer = new COSWriter(output);
    writer.write(doc);
  } finally {
    // close the writer first so nothing is written to an already closed stream
    if (writer != null) writer.close();
    if (output != null) output.close();
  }
}
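
One caveat on splittedFileName: it cuts off the last four characters, so it relies on the input name ending in ".pdf" (or another three-letter extension). A slightly more defensive variant, as a sketch only and with a hypothetical class name, strips whatever follows the last dot of the file name and leaves names without an extension alone.

```java
// Hypothetical alternative to splittedFileName that does not assume a 4-character extension.
class SplitFileNames {
    static String forPage(String origFile, int pos) {
        int dot = origFile.lastIndexOf('.');
        int lastSeparator = Math.max(origFile.lastIndexOf('/'), origFile.lastIndexOf('\\'));
        // Only treat the dot as an extension marker if it belongs to the file name,
        // not to one of the parent folders.
        String base = (dot > lastSeparator) ? origFile.substring(0, dot) : origFile;
        return base + "-" + (pos + 1) + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(forPage("invoice.pdf", 0));          // prints invoice-1.pdf
        System.out.println(forPage("/tmp/out.dir/invoice", 1)); // prints /tmp/out.dir/invoice-2.pdf
    }
}
```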
