Monday, December 12, 2005

Silent Print a PDF !! Print PDF programmatically....

Update: Please do not forget to check the other PDF and print related links in the sidebar


If you have come to this page after a Google search on the following terms - "print PDF in a programmatic method", "silent print", "automate PDF print" then it is probable that you have come to the right place. This would be true especially if you want a cheap / free / open source solution to print PDF automatically.

And if you just want the answer to question on "How to prevent users from saving a PDF document ?", then just go into the Adobe Acrobat forums where the issue has been highlighted as a "most frequently asked question". There is another interesting and well-explained article available on why it is futile to prevent website viewers from saving your "secure" PDF files. And good luck, if you plan to leave already without reading the rest of the article.


First of all, take a deep breath of air; you will be going through quite a long ride now.


Let's go through the basics first.

A printer is a device / resource in the context of operating systems. So all printing is done at the behest of the operating system. For a layman, this would mean - the user has control over what to print and what not to print, how to print and whether the print commands can be archived for printing the same document later on. You might ask - how is it possible to store the print commands ? Well, it is possible to have software / print servers that will store such printing instructions. This is usually the case in my friend's company - the print server software stores all submitted documents for a week, after which they are discarded. Point to be noted : to print something the printer will need printer-level instructions, and those instructions can be stored in printer memory to be reused later.
Coming back to the point, silent printing or for that matter even user guided printing requires code to exercise control over the Operating System. So if you want a silent print without any hassles, your success in achieving it will depend on how much control you can exercise over the Operating System from a given environment - "trusted" desktop application based printing, website based printing, "signed" applet based printing...........................................

Now on to the gory details of PDF.

PDF is a Adobe file format that is a tempered down version of Adobe's Postscript format [technically Postscript is a programming as is any other computer programming language].
So if you want to print a PDF file [with or without user interaction], you will have to convert the information in the PDF file to Postscript. This will enable the printer Postscript driver to issue printer specific commands [note the word "commands", this is one reminder that Postscript is a programming language] that produce the required output accurately.
It is a different issue for printers that come without a Postscript driver. Certain printers especially that of HP might come with PCL [ Printer Command Language ] drivers instead. And there comes the additional step of converting the Postscript data to data understood by PCL. This implies that to print a PDF, you'll have to write code to be capable of converting PDF/Postscript to the native format understood by the printer.
One such product capable of doing that is the free Adobe Acrobat Reader. This well-known product does have it's limitations however. Command line options to print the PDF are undocumented / unsupported by Adobe. The Acrobat Reader process is known to start, print and continue running even after a silent print via command line; a bug that's not yet fixed by Adobe. Which means - you should expect to see a blank Acrobat Reader window after the "silent print". There are "beat around the bush" methods to attain certain amount of success using the commandline options - not for those who want "da perfect solution".
Inter Application Communication (IAC PDF reference file available at link) with the free Acrobat Reader cannot be done via OLE automation give the fact that Acrobat Reader is not a OLE server; you can get some luck only with Acrobat Standard or Pro. Only DDE messages can help a programmer write an application to achieve this. DDE will have restrictions that are not faced by an OLE Server. [If you don't know the difference then you musn't be doing this anyway].

Commercial libraries could alleviate this problem, but they might have to installed at the client systems - licenses for such usage are known to be highly priced [Consider this option in an enterprise application]. Such libraries are available for both .Net and Java. Availability of ports to other languages / frameworks will depend on the demand for the ports.

There are other solutions as well - the Acrobat JavaBean to view and print a PDF file. You can keep away from this, given the fact that it is very obsolete and is known to produce smudgy previews and printouts. Very buggy and should not be used unless your customers don't want printouts. In short, keep away even if this "recent" article" seems to help you.

Rumours have it that Sun Microsystems has an in-house library that produces high quality output. The reason for it's unavailability is however unknown. Seems like there are some licensing issues there, which might also be one of the reasons why the org.jdesktop.jdnc.incubator.rbair packages disappeared all of a sudden. If you got excited by having a look at the Java Print Service (which was initially offered to solve all printing problems!! ) and discovered that there is a PDF Docflavor, then it is time to cool down; take a peek at this link. I'd also quote the most important part of the page (right at the end)........

"Just because the PDF DocFlavor exists doesn't mean you can use it to print a PDF file. When you lookup the PrintService for the PDF flavor, it will report unsupported flavor. That means there is no print service for the flavor. Sun doesn't provide one for PDF files. To the best of my knowledge, nobody else makes one available either. Until such a time as someone does, you can't print PDF files using the new Print Service API."

There are open source solutions as well that might get the job done, but I wouldn't recommend any, given the fact that I consider none of them are mature enough to be used in commercial projects with a heavy stake on print output quality. For some exploratory programming you could try JPedal and PDFBox.

You can otherwise think of buying commercial libraries that will rasterize [convert the PDF data into a bitmap] and print the PDF files. And if you want to avoid all this, you'll need to write your own code to rasterize and print the PDF file - the time penalty for doing such a thing is huge [Commercial rasterization libraries sell for $500 and up for developer licenses, so duplicating that effort will take that many man hours]. Links for exploratory activity include: TallComponents and ICEsoft. The list is not comprehensive though.

I have thus outlined the methods to print a PDF file via a local system application.

Now onto that "silly, but not mediocre" request of silent print of an Internet based PDF file via the browser.
First of all, the browser's job is to display HTML and probably print a HTML page and nothing else. If you want it to print a PDF file, you'll need a browser helper object [BHO]. Try coding BHOs for all browsers and you'll know you're asking for manna.
You could rely on the Adobe Acrobat IE BHO/plugin, but it's not foolproof. If you want to know why, then you ought to read even more carefully now - there are umpteen ways/hacks/methods to disrupt that method. No known document exists, which can pinpoint the Internet Explorer and Acrobat BHO settings that will disrupt the plugin's functionality. Simply turning the plugin off, and switching it back on, by using the Acrobat Reader preferences menu can disrupt that functionality !! So much for relying on Acrobat Reader to be a carefully engineered application.

There is however an easier but hack 'n slash way to do it; use a document level JavaScript print action [ for more information have a look at the Adobe Acrobat JavaScript Guide(pdf) and Reference(pdf) ]. The print action is to execute on any valid action - usually a page action. But remember, JavaScript with Acrobat Reader can be disabled by the user. And it is impossible to close Acrobat Reader [not the opened document] after the print is finished - you'll need to "deliberately crash" Reader after the print is done. And if the PDF file opens in a browser window [ on account of the Acrobat Reader BHO], then you can forget about trying to close the PDF document in a programmatic manner. One of the iText examples shows how a servlet could be written to achieve this. Try a demo here! and make sure you run it in different browsers under different settings (for example : Win XP with SP 2 and without SP2), to understand the varying functionality that it provides. Golden rule to be respected in any software : Never make it behave inconsistently; your users could have trouble describing their problems.

You are lucky if you want an IE only solution with certain restrictions - hop onto ActiveX programming using MeadCo's ScriptX. As again, this solution depends on how stable Internet Explorer will turn out to be. Windows XP SP2 will definitely throw up the security message bar regarding the webpage's attempt to execute code (remember that webpages were to display HTML, not run some virus proof of concept).

Some people also like a "PDF Preview" in the browser. My answer would be - "that's not the job of the browser"; write something to help the browser do that - an ActiveX plugin, Browser Helper Object [BHO] or a Java applet is what you want. The Adobe Acrobat IE/Mozilla BHO might do the job for you if the stars were in the right alignment at the time of your birth; please dont ask me to support the article present at the link. And yes, you can print via the ActiveX plugin / Java applet; but it's not easy given the fact that brilliance always has a limit. ActiveX controls / Java applets that do that task come with the "$$" penalty. You should invest time or money to get that done.

And stop cribbing if you still havent ;-) or ask your boss to stop ;-). Atleast Adobe has licensed the PDF format so that you can use it (naughty boy, you thought everything was free, eh?). Imagine your silly client's fate if you had to write your proprietary file format to enforce a thinly veiled DRM idea of his. Frankly, I've had enough of the people asking for solutions like - " How can I get the client to print my PDF document without saving it. My budget is 50$ ". The answer is - you don't do it with basic code. You'll need to write a DRM plugin (which will cost your client moolah; remember nothing is free) or use Adobe LiveCycle Policy Server or buy security solutions from Adobe's partners, which ever fits into your budget and needs.

And on a parting note, "See ya !! Talk to you later !!!! I hope it was a pleasure to be bored by listening to me, as much as the pleasure I felt in boring you ;-) "