Friday, January 27, 2006

Command line printing for Acrobat Reader

Update: Please check the other PDF and print related links in the sidebar

The command line options for Adobe Acrobat (Reader, Standard and Professional) are unsupported and undocumented features.
They're never mentioned anywhere in the documentation provided by Adobe except for the Acrobat Developer FAQ.
The Acrobat Developer FAQ (PDF file) contains some documentation on using the command line options; the relevant information can be found near the end of the document, on page 27.
There's a warning though (and it's from Adobe):
"These are unsupported command lines, but have worked for some developers. There is no documentation for these commands other than what is listed."
Godspeed with your quest.
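
For readers who want to try this from Java, here is a minimal sketch of how such a command line could be assembled. It assumes the `/t` (print to a named printer) switch in the form commonly quoted from the Developer FAQ; the executable path, file name and printer name are hypothetical placeholders, and since the switches are unsupported, none of this is guaranteed to work with your version of the Reader.

```java
import java.util.ArrayList;
import java.util.List;

final class ReaderPrintCommand {

    // Assembles: <readerExe> /t <pdfPath> <printerName>
    // ASSUMPTION: "/t" is the unsupported "print to this printer" switch;
    // verify it against the Developer FAQ for your Reader version.
    static List<String> build(String readerExe, String pdfPath, String printerName) {
        List<String> command = new ArrayList<>();
        command.add(readerExe);
        command.add("/t");
        command.add(pdfPath);
        command.add(printerName);
        return command;
    }

    public static void main(String[] args) {
        // All three values below are hypothetical placeholders.
        List<String> command = build(
                "C:\\Program Files\\Adobe\\Acrobat 7.0\\Reader\\AcroRd32.exe",
                "C:\\temp\\report.pdf",
                "Office Printer");
        System.out.println(command);
        // To actually launch the Reader (Windows only), uncomment:
        // new ProcessBuilder(command).start();
    }
}
```

Passing the arguments as a list lets `ProcessBuilder` handle the quoting of paths with spaces, which is easier than gluing one long command string together.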

Friday, January 20, 2006

Silent print a PDF file in Acrobat Reader (for PDF files served on the Internet)

Update: Please do not forget to check the other PDF and print related links in the sidebar

Some of you might have been awaiting this.
You may be disappointed with the results of this article, though, so don't expect too much - I'll explain what is possible with existing software and code (but not voodoo programming: that's not the kind of code that's bug-free).

  • Acrobat Javascript Guide (pdf file) and Reference (pdf file),
  • Some basic knowledge of Java servlets,
  • Some knowledge of the iText PDF library, or at least some enthusiasm to learn it.

I will demonstrate how to generate a PDF document with the iText PDF library. The document will have embedded Acrobat Javascript in it. Using the Acrobat Javascript commands I will ensure that the PDF document will be printed (to the default printer) when it is opened, followed by an attempt to close the document (which may not succeed when the document is opened inside a browser using the Acrobat Reader plugin or BHO). The PDF with the embedded Acrobat Javascript is served by a Java servlet which allows for the solution to be demonstrated over the internet.

Source Code

The WAR (Web ARchive) file containing the servlet and the iText library can be downloaded here (you can mail me if the download fails). You can deploy it straight away on Apache Tomcat, on any other servlet container, or even on a J2EE application server.

What's going on?

The important stuff is being done here:

The output stream of the servlet to which the PDF document is sent.
ServletOutputStream out = response.getOutputStream();

Create an iText document - that's a PDF document that we're creating.
Document document = new Document();

Create a ByteArrayOutputStream to which the document will be written (the document will NOT be created as a disk file on the server - use FileOutputStream for that).
ByteArrayOutputStream baos = new ByteArrayOutputStream();

try {

We get an instance of PdfWriter. The instance is "listening" to the document object, and is "tied" to the byte array output stream. This means that whenever we add elements to the document, the writer object picks them up and transfers them to the output stream.
PdfWriter writer = PdfWriter.getInstance(document, baos);

I'm setting the viewer preferences so that the user is not able to see the menubar or scrollbar inside the browser when this file is opened using the Acrobat Reader PDF plugin. This could be treated as a "security measure" for preventing users from saving the file or printing the file once again. Be careful - it's a viewer preference and not a security provision. It can be reverted by the user because it's only an indication to the Reader plugin on how the document is to be presented when it is initially loaded.
writer.setViewerPreferences( PdfWriter.HideMenubar | PdfWriter.HideToolbar | PdfWriter.HideWindowUI );

We have to open the document for writing after the header and meta-information have been added. This means that we have to "prepare" the document before we write the user-visible information.

We now add a document level Javascript action so that the entire action is executed when the document is opened. To see what other options are available, you will have to go through the Acrobat Javascript Reference and Guide (links are same as above).
The necessary information can be found in the Doc object provided by the Acrobat Javascript model. The Doc object can be referenced usually by using the "this" object.
"this.print({bUI: false,bSilent: false,bShrinkToFit: true});" +
"\r\n" +

We add some dummy statement to be printed.
I hate to print a blank document, but I don't waste paper.
So, if you are environment conscious, please enter whitespace instead.
Do not (God forbid), remove the line of code below to save on paper. LOL.
document.add(new Chunk("Silent Auto Print"));

You have to close the document when you're done with it.
catch (DocumentException e)

I'm setting the content type of the response so that the browser will recognize that it is going to receive a PDF file. However, do not be too confident about this.
Some browsers, especially IE and Opera, are known to do "content-sniffing" to determine what is to be done with a server's response. And that could change the equation drastically.

Some browsers are known to flip and throw up when they don't know how much data they are going to receive. So it is wise to set the content length header before sending data to the browser.

We wrote the document to a ByteArrayOutputStream. Now flush that stream to the servlet's response object.

Flush the servlet's response object so that the servlet responds to the browser's request.
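
The last few steps (content type, content length, copy the buffer, flush) can be sketched without any servlet container at all. This is only an illustration, not the code from the WAR file: `ResponseLike` is a hypothetical stand-in for the three `HttpServletResponse` methods involved, so that the sketch compiles with nothing but the JDK. In a real servlet you would pass the actual response object to the same sequence of calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class BufferedPdfResponse {

    /** Hypothetical stand-in for the HttpServletResponse methods used below. */
    interface ResponseLike {
        void setContentType(String type);
        void setContentLength(int length);
        OutputStream getOutputStream() throws IOException;
    }

    /** Sends a fully buffered document; headers are set before any byte goes out. */
    static void send(ByteArrayOutputStream baos, ResponseLike response) throws IOException {
        // Tell the browser what is coming (it may still sniff the content).
        response.setContentType("application/pdf");
        // The buffer is complete, so its size is the exact content length.
        response.setContentLength(baos.size());
        OutputStream out = response.getOutputStream();
        // Copy the buffered bytes to the response stream in one go.
        baos.writeTo(out);
        // Push everything to the client.
        out.flush();
    }
}
```

Buffering the whole document first is what makes the content length known up front; if you streamed the PDF directly to the response, you could not set that header reliably.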

A Trial Run

You could go to this demonstration page and see how it works. Be sure to try this out in different environments - with and without Acrobat Reader browser plugin installed, with and without SP2 installed on Windows XP, different browsers (especially Firefox and Opera), and even when the user is able to save the file and then open it. You'll learn quite a lot on why I chose to write the servlet this way and not any other way. Consistency matters a lot when it comes to the internet.

Saturday, January 07, 2006

Root finding algorithms and their "near similarity" to search algorithms

Just don't know how to get started on this.
Two vast topics, and I can't pinpoint the origin of this idea of mine.

This idea sprang up sometime in the 3rd semester of my CS degree during the Applied Mathematics course. No point giving credit to the course because it never exercised my mind.
Back onto the spicy stuff anyway.

The Applied Mathematics course had a section on root-finding algorithms for polynomial equations. I noticed a distinctive similarity between the bisection method to find the root of a polynomial function f(x), and the binary search algorithm to find the position of a key.
For beginners, especially those who are not from the CS/Math stream, this might be a bit confusing, so I'll provide as much detail as possible.

The bisection method is the simplest root-finding algorithm. It finds a root of the equation f(x)=0 by repeatedly narrowing the interval of x values in which the root must lie: start with an interval over which f(x) changes sign, cut it in half, and keep the half over which the sign still changes. Read on if that didn't make sense yet.
Basically, any function f(x) can be treated as a series of values that vary with the variable (or parameter) x. So, if you "feed" in different values for x, you generally get different values for f(x). A root of the function f(x) is a value of x at which f(x) equals 0. You normally don't have a lot of roots for a function f(x) - for a polynomial, the number of real roots is at most the degree of f(x).
To demonstrate the similarity between the concept of root finding of a mathematical function f(x), and searching for a key in a stream of numbers (a stream or sequence of numbers to which binary search or any other search algorithm can be applied), I make one important assumption:

Any stream of numbers [0, 1, 5, 7, 14, 76, 196, 256, 983, 1005, ...] can be represented by an imaginary function f(x) - imaginary as in virtual/hallucination, not (-1)^(1/2).
Any function f(x) will have a corresponding stream of "data" that depends on what values of x have been applied to the function to produce the stream.

We'll put this important stumbling block behind us.
Now for the correlation between the word "root" of a function f(x)=0, and the word "key" of the binary search algorithm.

The "root" of the function f(x)=0, is that value of x that will ensure f(x) will produce a value of 0.
The "key" of the binary search algorithm is a possible value among the series of values produced by application of differing values of x to f(x). If the key element has been found in the stream of values, then our search is successful; if it hasn't been found, then the search is simply unsuccessful.
Finding the key in a stream is the same as solving f(x) = key, that is, finding the root of f(x) - key = 0, for x = 1, 2, 3, 4, 5, ..., where x denotes the position of an element in the stream [if f(x) has the values 12, 45, 65, 54, 43, 78, then f(1) = 12, f(2) = 45, f(3) = 65, f(4) = 54, f(5) = 43, and so on].
If we find the key among the stream of values then we can find out the corresponding position in the stream, which is usually what most CS search algorithms have to do.

Similarity between the bisection method and the binary search algorithm

In both these algorithms, we repeatedly divide the interval of values in half. For the binary search algorithm, the stream of values must be sorted; this plays the role of the bisection method's requirement that it can operate only on a continuous function f(x) (one that changes sign across the interval) !!
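
To make the resemblance concrete, here is a minimal sketch of both algorithms side by side. The class and method names are my own for illustration, and positions are 0-based as in Java arrays (the text above counts from 1). Note how the binary search only ever looks at the sign of "element minus key", just as bisection only looks at the sign of f(x).

```java
import java.util.function.DoubleUnaryOperator;

final class HalvingSearch {

    /** Bisection: f must be continuous on [lo, hi] and change sign across it. */
    static double bisect(DoubleUnaryOperator f, double lo, double hi, double tol) {
        double fLo = f.applyAsDouble(lo);
        double fHi = f.applyAsDouble(hi);
        if (fLo == 0) return lo;
        if (fHi == 0) return hi;
        if ((fLo < 0) == (fHi < 0)) {
            throw new IllegalArgumentException("f(lo) and f(hi) must have opposite signs");
        }
        while (hi - lo > tol) {
            double mid = lo + (hi - lo) / 2;
            if (mid == lo || mid == hi) break;   // interval cannot shrink any further
            double fMid = f.applyAsDouble(mid);
            if (fMid == 0) return mid;
            if ((fLo < 0) == (fMid < 0)) {       // same sign as the left end:
                lo = mid;                        // the root is in the right half
                fLo = fMid;
            } else {
                hi = mid;                        // the root is in the left half
            }
        }
        return lo + (hi - lo) / 2;
    }

    /** Binary search: the "root" of sorted[x] - key over positions x, or -1. */
    static int binarySearch(int[] sorted, int key) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int sign = Integer.compare(sorted[mid], key); // sign of sorted[mid] - key
            if (sign == 0) return mid;
            if (sign < 0) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] stream = {0, 1, 5, 7, 14, 76, 196, 256, 983, 1005};
        System.out.println(binarySearch(stream, 76));                // position of key 76
        System.out.println(bisect(x -> x * x - 2, 0.0, 2.0, 1e-9));  // close to sqrt(2)
    }
}
```

The one real difference is the stopping rule: bisection stops when the interval is narrower than a tolerance, while binary search stops when the interval is empty, because a key that falls between two elements of the stream simply isn't there.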

The binary search algorithm is definitely not the fastest, as we know. But it does have predictable behavior, as we all know - O(log n) comparisons in the worst case.

The question is now of beating this - application of algorithms that can converge on a "key" or the "root of the corresponding function f(x)" faster than binary search or bisection method can.
There are better root-finding methods than the bisection method - Newton's method, the secant method, false position, linear interpolation (the secant method again), polynomial interpolation (Muller's method), inverse quadratic interpolation, and Brent's method. Phew !!!!!!
Of course, not all of them can be applied to searching, and not all of them have predictable big-O behavior.
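
One of these ideas does have a well-known search counterpart: interpolation search, which guesses the key's position by linear interpolation between the two ends of the range, much as false position does for roots. The sketch below is a plain textbook version (the names are mine), shown only to illustrate the point about predictability: it averages O(log log n) probes when the values are spread roughly uniformly, but can degrade to O(n) on skewed data.

```java
final class InterpolationSearch {

    /** Returns the 0-based position of key in the sorted array, or -1. */
    static int search(int[] sorted, int key) {
        int lo = 0;
        int hi = sorted.length - 1;
        // Keep going only while the key can still lie inside [lo, hi].
        while (lo <= hi && key >= sorted[lo] && key <= sorted[hi]) {
            if (sorted[lo] == sorted[hi]) {
                return lo; // all remaining values are equal, and they equal the key
            }
            // Linear interpolation of the key's position; long avoids overflow.
            long offset = ((long) key - sorted[lo]) * (hi - lo)
                    / ((long) sorted[hi] - sorted[lo]);
            int pos = lo + (int) offset;
            if (sorted[pos] == key) return pos;
            if (sorted[pos] < key) lo = pos + 1; else hi = pos - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] uniform = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        int[] skewed = {0, 1, 5, 7, 14, 76, 196, 256, 983, 1005};
        System.out.println(search(uniform, 70)); // found on the first probe
        System.out.println(search(skewed, 76));  // found, but only after creeping up from the left
    }
}
```

On the skewed stream the interpolated guess keeps landing on the leftmost position, so the search crawls forward one element at a time, which is exactly the kind of unpredictability binary search never suffers from.
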
More research is required in this area: certainly in designing data structures/databases that are optimized for such types of searches, and certainly in determining whether certain functions can be readily applied to existing data structures/databases.
Some theoretical problems exist too, and they've thankfully been pointed out by Jim Lyon and Roman Werpachowski. At first glance, these could be ironed out by approximation and tweaking of the algorithms. But it still requires research !! And so I have to end my lecture here.

A discussion of this algorithmic technique can be found at JoS.