Sunday, June 18, 2006

Investment lessons in a single picture



Demonstrated to me by my banker friend Manmohan, at my new company. Fairly self-explanatory.


Summarized as follows:
  • The chance of higher earnings/profit increases with riskier investments.
  • The moment a person gets better earnings from an investment destination, the risk of investing in that destination is effectively reduced, and it is wise to put in/transfer more investments to that destination.
  • The bulk of investments should be in the least risky of all investment destinations.
  • It is generally better to put money in an investment destination where the principal amount is guaranteed to be returned. Direct equities and mutual funds are therefore not recommended as good investment destinations unless well understood.

Tuesday, April 18, 2006

iText + Google Calendar = Neat Printouts!!


Google Calendar

Google Calendar (beta at time of writing) uses the open source library iText for printing calendars. The calendars are generated as PDF files that can be downloaded to the desktop or printed straightaway.
And this is for the programmers who work under pointy-haired Dilberty bosses - the next time you're asked for internet based silent printing of PDF files, ask them to have a look at Google Calendar. You can't get more professional than that. Want anything better? You should think of writing your own proprietary browser (i.e. a desktop app), a browser plugin or an Acrobat Reader plugin. Good luck buddy.


Sunday, February 26, 2006

Adobe Acrobat 7.0 Browser Control Type Library (ActiveX or COM)

Update: Please check out the other PDF related links in the sidebar as well.

Some horrors are erased!! Gasp !!

I came across the Adobe Acrobat 7.0 Browser Control Type Library that could be used to display and print PDF files on desktop (Windows forms) applications [I haven't explored utilizing this in a webpage, so that will have to wait]. This is really good stuff for some people, for it is an ActiveX control that you could utilize to automate certain PDF-related actions, instead of relying on the AcroRd32.exe process commandline options or DDE messaging (or worse - Win32 programming).

However, every approach has its pitfalls.
And someone's already had a problem -

Bah. I can't believe that Adobe hasn't stepped up to hand out a library that will print any PDF with all of the printer settings available to be changed. (You apparently can create an Interop around their Acrobat TypeLib, but its features are limited and it can only print to the default printer.)

Oh well, that's why folks can charge lots of money for libraries I guess.


And if you have funds, then take this advice from an ex-Adobe employee a bit seriously, for he says -

I can give ya a little info on this as a former Adobe Employee and specifically dealing with Acrobat. They do have a SDK that allows you to work with PDFs but its not free, it does work in both managed and unmanaged code and they give some good examples done in C# and VB.net but here again its not free. I would look to some of the free PDF stuff and that may work but this is workable if you have the SDK because you can actually create a PDF object and load in that PDF and then use c# to print it like you would anything else.

Here's the CodeProject article utilizing the Adobe Acrobat 7.0 ActiveX object for communicating with Acrobat Reader, along with the .Net example utilizing the Browser Type Library that is described in the comments.

Ok, want to know more about this library ? Hop on and download the Adobe Acrobat Inter-Application Communication (IAC) Reference from the Adobe site (pdf download). The Acrobat Browser Type Library is provided by the AxAcroPDFLib.AxAcroPDF object. This should be suitable for most of your needs. You can access it by adding a reference to AcroPDF.dll (that resides in the ActiveX directory under the Acrobat application directory) in your IDE environment.

Technical Notes
  • If you don't want to display the PDF document in a Windows Forms application, then you'll have to set the 'Visible' property to false. And if you want to send the PDF file directly to the printer, then you could use one of the methods provided by the control for that purpose - Print, PrintAll, PrintAllFit, PrintPages, PrintPagesFit. That should be suitable for your silent print needs in most cases.
  • You could experience problems when you have to print to the default printer. The default printer seems to be determined by Acrobat when it is loaded, and not by Windows. So changing the default printer mid-way through your application's runtime might cause unexpected behavior.
  • It is impossible to "name" / "determine" the printer that is used to print the document. If you can live with the user having access to all the printer settings (which could change the workflow), the only workaround is to use the PrintWithDialog method (which presents the Acrobat Print Dialog Box, where the user can choose the printer, change the number of copies to be printed, or print to a file).

And to help people searching for solutions to common problems involving Adobe Acrobat (even Reader), I'm posting a link to Experts Exchange's set of time-tested Adobe Acrobat solutions.

Sunday, February 19, 2006

Internet based silent print using scripting languages (client side scripting). Why doesn't it work?

Update: Please do not forget to check the other PDF related links in the sidebar


Backdrop:
You want to allow your site visitors or customers to print documents without popping up a dialog box. You don't want to display a print dialog box because you feel it's a distraction or a deviation from the normal website browsing flow. You're probably right in asking for a silent print feature in the browser object model, especially in the model provided by JavaScript (or any client side scripting language).

In this short article (that would be extended later on, depending on how many people read it and find it useful), I will detail why the browser model doesn't allow for silent printing, or automatic printing over an Internet-based website, with minimum or no user interaction.

Let's get into the details . . . . .

Basically, the job of the browser is to display webpages. Technically, this means that the browser is supposed to do only certain types of jobs - display HTML and allow for user-interaction with dynamic elements in the webpage. The browser is not allowed to do certain things by default - execute an arbitrary snippet of code, download files that it doesn't understand or recognize, create system files that would execute in the background and log user activity.

You could extend the browser through plugins to do some of those jobs. You could write scripts in client-side scripting languages to perform "certain" tasks at the client's computer. But client side scripting languages would have limitations on what a scripter is able to command the client's browser to do.

That's purely because of the nature of the Internet - it's not a trusted zone.
What's a trusted zone ?
Well, basically it is a place where all the accessible resources are trusted by your computer (indirectly, you trust the resources at the place). It could be the local area network (so long as you're sure that your computer will be secure in it), or the network of your business associate or subsidiary (as long as you choose to trust the resources available on the foreign network).


So what's the thing with trusted zones and untrusted zones ?
Trusted zones, as the name says, are places trusted by the client. The resources in such a place are capable of interacting with the resources at the client's place. The degree of such interaction depends totally on the nature and degree of the trust. For example, you consider executables residing on your own drive/computer to be far more trustworthy than the ones you downloaded off www.hackers.com .


How does this affect automatic printing ? Or, why should this concern me ?
Code running in trusted environments has far more privileges than code running from an untrusted source. For example, executables in your computer are capable of deleting certain files from the hard disk, but code running in the browser will not be capable of doing so (until the user has granted that privilege). Printing, or better, automated printing requires that the end-user grant privileges to your code to access his/her printer (that's the user's printer, not yours). The remote printer is a resource that belongs to the remote computer into which it is plugged (even though your office may have paid for it). To access the remote printer, it is required that the remote computer grant privileges to your code to access the remote printer.


Hell, I don't need privileges to access the remote printer. My boss says that the client trusts our website so the client should be able to print without any user interaction. Why should I believe you ?
There are websites out there that are capable of ruining your client's computer, so why not yours ? Is there anything special about your site ? And how does the client know that you are not using his printer to print advertisements (of no value to him) ?
And what is with the minimum user-interaction ? How are you so sure that the printer will always print the correct output ? What if something went wrong during printing ? Why does your client have a problem clicking a button in the print dialog box ?
The print dialog is there to -
  • grant access to your instructions to issue a print command,
  • allow the user to modify any settings that are necessary for obtaining a successful printout.

I know of an environment where you're wrong !!
It's probably on ActiveX based websites. Nothing new there. ActiveX is known to be insecure, and is supported only on MS Internet Explorer. Don't be surprised if half of your users won't be able to access your website.
If you're talking about Java (not JavaScript), then again there is nothing new there. If you are able to produce a signed applet, then it is possible to automate the print process as much as possible. Its mileage depends on the libraries you are using, that is, how powerful they are in their ability to access the printer options and produce a clean printout with most of the settings provided by the programmer (that is, you), which results in less user interaction.
There is also a possibility of Winforms applications being able to print over the Internet. Like the first option mentioned above, Winforms applications seem to work only on Microsoft Internet Explorer. Unlike ActiveX, Winforms applications written using the .Net framework are far more secure. The programmer can request further privileges to be granted to the code during its execution. So, if your program wants to access the printer, it must merely request the necessary privileges. Of course, the user executing the program has to grant the privileges. But this is far better than having a code "lockout" - your code is unable to run because it cannot even request such privileges; your client will have to explicitly make changes to his computer's configuration to allow it to run.


What now?
Silent printing may be the tip of the iceberg. There are far more things that may not be possible using client side scripting languages. Most of them are not even things that should be solved using client side scripting. It's more an issue of whether people have analysed the business problem at hand, and have recognized that a browser based solution may not be the answer. Chances are that poorly architected and poorly designed solutions often call the limits of Javascript (and other scripting languages) into question. In most cases, the solution might be a Windows forms solution (or a background service or a cron job or . . . . . ), or sometimes a change in the business process (hah!).

Friday, January 27, 2006

Command line printing for Acrobat Reader

Update: Please check the other PDF and print related links in the sidebar


The command line options for Adobe Acrobat (Reader, Standard and Professional) are unsupported and undocumented features.
They're never mentioned anywhere in the documentation provided by Adobe except for the Acrobat Developer FAQ.
Here is the link to the Acrobat Developer FAQ (pdf file) that contains some documentation for using command line options; the relevant information can be found at the end of the document, on page 27.
There's a warning though (and it's from Adobe) :
"These are unsupported command lines, but have worked for some developers. There is no documentation for these commands other than what is listed."
God speed with your quest.
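
For what it's worth, here is a rough sketch of how a Java program might assemble such a command line. The /t switch (print a file to a named printer and then exit) is one of the switches usually quoted from that FAQ; I'm taking it on that basis and not from any supported documentation, so verify it against your own Reader version. The executable path, the PDF path and the printer name below are all made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class AcrobatCommandSketch {

    // Builds (but does not run) a command line of the form:
    //   <readerExe> /t <pdfPath> [<printerName>]
    // ASSUMPTION: "/t" is the unsupported print-to-printer switch from the FAQ.
    static List<String> printCommand(String readerExe, String pdfPath, String printerName) {
        List<String> cmd = new ArrayList<>();
        cmd.add(readerExe);
        cmd.add("/t");
        cmd.add(pdfPath);
        if (printerName != null) {
            cmd.add(printerName);
        }
        return cmd;
    }

    // Starts the external process; only meaningful on a machine where the
    // executable named in the command actually exists.
    static Process launch(List<String> cmd) throws IOException {
        return new ProcessBuilder(cmd).start();
    }

    public static void main(String[] args) {
        // Hypothetical paths and printer name, for illustration only.
        List<String> cmd = printCommand(
                "C:\\Program Files\\Adobe\\Acrobat 7.0\\Reader\\AcroRd32.exe",
                "C:\\docs\\report.pdf",
                "Office Printer");
        System.out.println(cmd);
    }
}
```

Passing the arguments as a list (instead of one concatenated string) leaves the quoting of paths with spaces to ProcessBuilder.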


Friday, January 20, 2006

Silent print a PDF file in Acrobat Reader (for PDF files served on the Internet)

Update: Please do not forget to check the other PDF and print related links in the sidebar


Some of you might have been awaiting this.
You may be disappointed with the results of this article though, so don't expect too much - I'll explain what is possible with existing software and code (but not voodoo programming: that's not the kind of code that's bug free).

Prerequisites
  • Acrobat Javascript Guide (pdf file) and Reference (pdf file),
  • Some basic knowledge of Java servlets,
  • Some knowledge of the iText PDF library or at least some enthusiasm to learn it.
Précis

I will demonstrate how to generate a PDF document with the iText PDF library. The document will have embedded Acrobat Javascript in it. Using the Acrobat Javascript commands I will ensure that the PDF document will be printed (to the default printer) when it is opened, followed by an attempt to close the document (which may not succeed when the document is opened inside a browser using the Acrobat Reader plugin or BHO). The PDF with the embedded Acrobat Javascript is served by a Java servlet which allows for the solution to be demonstrated over the internet.

Source Code

The WAR (Web ARchive) file containing the servlet and the iText library can be downloaded here (you can mail me if the download fails). You could straightaway deploy it on Apache Tomcat or any other servlet container or even a J2EE application server.

What's going on?

The important stuff is being done here:

/*
The output stream of the servlet to which the PDF document is sent.
*/
ServletOutputStream out = response.getOutputStream();

/*
Create an iText document - that's a PDF document that we're creating.
*/
Document document = new Document();

/*
Create a ByteArrayOutputStream to which the document will be written (the document will NOT be created as a disk file at the server - use FileOutputStream for that).
*/
ByteArrayOutputStream baos = new ByteArrayOutputStream();

try {

/*
We get an instance of PdfWriter. The instance is "listening" to the document object, and is "tied" to the byte array output stream. This means - whenever we add elements to the document, the writer object picks them up and transfers them to the output stream.
*/
PdfWriter writer = PdfWriter.getInstance(document, baos);

/*
I'm setting the viewer preferences so that the user is not able to see the menubar or scrollbar inside the browser when this file is opened using the Acrobat Reader PDF plugin. This could be treated as a "security measure" for preventing users from saving the file or printing the file once again. Be careful - it's a viewer preference and not a security provision. It can be reverted by the user because it's only an indication to the Reader plugin on how the document is to be presented when it is initially loaded.
*/
writer.setViewerPreferences( PdfWriter.HideMenubar | PdfWriter.HideToolbar | PdfWriter.HideWindowUI );

/*
We have to open the document for writing information after the header and meta-information is added. Which means that we have to "prepare" the document before we write the user-visible information.
*/
document.open();

/*
We now add a document level Javascript action so that the entire action is executed when the document is opened. To see what other options are available, you will have to go through the Acrobat Javascript Reference and Guide (links are same as above).
The necessary information can be found in the Doc object provided by the Acrobat Javascript model. The Doc object can be referenced usually by using the "this" object.
*/
writer.addJavaScript(
"this.print({bUI: false,bSilent: false,bShrinkToFit: true});" +
"\r\n" +
"this.closeDoc();"
);

/*
We add some dummy statement to be printed.
I hate to print a blank document, but I don't waste paper.
So, if you are environment conscious, please enter whitespaces instead.
Do not (God forbid), remove the line of code below to save on paper. LOL.
*/
document.add(new Chunk("Silent Auto Print"));

/*
You have to close the document when you're done with it.
*/
document.close();
}
catch (DocumentException e)
{
e.printStackTrace();
}

/*
I'm setting the content type of the response so that the browser will recognize that it is going to receive a PDF file. However, do not be too confident about this.
Some browsers - especially IE and Opera are known to do "content-sniffing" to determine what is to be done with a server's response. And that could change the equation drastically.
*/
response.setContentType("application/pdf");

/*
Some browsers are known to flip and throw up when they don't know how much data is going to be received by them. So it is wise to set the content length header before sending data to the browser.
*/
response.setContentLength(baos.size());

/*
We wrote the document to a ByteArrayOutputStream. Now flush that stream to the servlet's response object.
*/
baos.writeTo(out);

/*
Flush the servlet's response object so that the servlet responds to the browser's request.
*/
out.flush();
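
The buffer-first, send-later pattern in the last few steps can be tried without a servlet container. The sketch below is only an illustration of that pattern using plain java.io streams: the sendBuffered method and the in-memory "sink" stream are hypothetical stand-ins for the servlet response, and the body is a plain string, not a real PDF.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BufferedResponseSketch {

    // Buffers the whole body in memory first, so that its exact size is
    // known before a single byte is sent to the real output stream.
    // Returns the size that would be used for the Content-Length header.
    static int sendBuffered(byte[] body, OutputStream out) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            baos.write(body);          // in the servlet, iText writes here
            int length = baos.size();  // value for response.setContentLength(...)
            baos.writeTo(out);         // copy the buffered bytes to the response
            out.flush();
            return length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "sink" stands in for the servlet's output stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] body = "Silent Auto Print".getBytes(StandardCharsets.US_ASCII);
        int length = sendBuffered(body, sink);
        System.out.println("Content-Length would be " + length);
    }
}
```

The trade-off is memory: the whole document sits in the buffer until it is sent, which is fine for a one-line PDF but worth a second thought for large documents.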


A Trial Run

You could go to this demonstration page and see how it works. Be sure to try this out in different environments - with and without Acrobat Reader browser plugin installed, with and without SP2 installed on Windows XP, different browsers (especially Firefox and Opera), and even when the user is able to save the file and then open it. You'll learn quite a lot on why I chose to write the servlet this way and not any other way. Consistency matters a lot when it comes to the internet.


Saturday, January 07, 2006

Root finding algorithms and their "near similarity" to search algorithms

Just don't know how to get started on this.
Two vast topics and I can't pinpoint the origin of this idea of mine.

This idea sprang up sometime in the 3rd semester of my CS degree during the Applied Mathematics course. No point giving credit to the course because it never exercised my mind.
Back onto the spicy stuff anyway.

The Applied Mathematics course had a section on root-finding algorithms for polynomial equations. I noticed a distinctive similarity between the bisection method to find the root of a polynomial function f(x), and the binary search algorithm to find the position of a key.
For starters, especially those who are not from the CS/Math stream, this might be a bit confusing so I'll provide extensive details as far as possible.

The bisection method is the simplest root finding algorithm. It finds a root of f(x)=0 by starting from an interval in which the function changes sign (so a root must lie somewhere inside it, provided f is continuous), and then repeatedly halving that interval, each time keeping the half in which the sign change still occurs. Read on if you still didn't understand.
Basically any function f(x) can be treated as a series of values that vary with the variable (or parameter) x. So, if you "feed" in different values for x, you should get different values for f(x). The root of the function f(x) is that value of x at which f(x) equals 0. You normally don't have a lot of roots for a function f(x) - for a polynomial, the number of distinct roots is at most the degree of f(x).
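
For the programmers in the audience, a minimal sketch of the bisection method in Java. The class name, the method name and the tolerance parameter are my own choices for illustration; the sketch assumes that f is continuous on [lo, hi], that lo < hi, and that f(lo) and f(hi) have opposite signs.

```java
import java.util.function.DoubleUnaryOperator;

class BisectionSketch {

    // Returns an approximate root of f inside [lo, hi].
    // ASSUMPTIONS: f is continuous, lo < hi, and f changes sign over [lo, hi].
    static double bisect(DoubleUnaryOperator f, double lo, double hi, double tol) {
        double fLo = f.applyAsDouble(lo);
        double fHi = f.applyAsDouble(hi);
        if (fLo == 0.0) return lo;
        if (fHi == 0.0) return hi;
        if (Math.signum(fLo) == Math.signum(fHi)) {
            throw new IllegalArgumentException("f(lo) and f(hi) must have opposite signs");
        }
        while (hi - lo > tol) {
            double mid = lo + (hi - lo) / 2.0;
            double fMid = f.applyAsDouble(mid);
            if (fMid == 0.0) return mid;
            if (Math.signum(fMid) == Math.signum(fLo)) {
                // The sign change is in the upper half: discard the lower half.
                lo = mid;
                fLo = fMid;
            } else {
                // The sign change is in the lower half: discard the upper half.
                hi = mid;
            }
        }
        return lo + (hi - lo) / 2.0;
    }

    public static void main(String[] args) {
        // Root of x^2 - 2 between 0 and 2, i.e. an approximation of sqrt(2).
        double root = bisect(x -> x * x - 2.0, 0.0, 2.0, 1e-9);
        System.out.println(root);
    }
}
```

Each pass through the loop halves the interval, which is exactly the behavior that will look familiar once binary search enters the picture.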
To demonstrate the similarity between the concept of root finding of a mathematical function f(x), and searching for a key in a stream of numbers (a stream or sequence of numbers to which binary search or any other search algorithm can be applied), I make one important assumption :

Any stream of numbers [0, 1, 5, 7 , 14, 76, 196, 256, 983, 1005,.......] can be represented by an imaginary function f(x) - imaginary as in virtual/hallucination, not (-1)^(1/2).
OR
Any function f(x) will have a corresponding stream of "data" that depends on what values of x have been applied to the function to produce the stream.

We'll put this important stumbling block behind us.
Now for the correlation between the word "root" of a function f(x)=0, and the word "key" of the binary search algorithm.

The "root" of the function f(x)=0, is that value of x that will ensure f(x) will produce a value of 0.
The "key" of the binary search algorithm is a possible value among the series of values produced by application of differing values of x to f(x). If the key element has been found in the stream of values, then our search is successful; if it hasn't been found, then the search is simply unsuccessful.
Finding the key in a stream is the same as finding the root of the function f(x) = key, x = 1,2,3,4,5........ :x denotes the position of an element in the stream [ if f(x) has the values 12, 45 , 65, 54 , 43, 78, then f(1) = 12, f(2) = 45, f(3) = 65 , f(4) = 54, f(5) = 43 , and so on].
If we find the key among the stream of values then we can find out the corresponding position in the stream, which is usually what most CS search algorithms have to do.
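
To make the mapping concrete, here is a small sketch that writes binary search as a hunt for the "root" of g(i) = f(i) - key, where f(i) is the i-th element of a sorted stream. The class and method names are mine, and positions are 0-based here (as Java arrays are), whereas the text above counts from 1.

```java
class KeyAsRootSketch {

    // Looks for an index i at which g(i) = sorted[i] - key equals 0.
    // ASSUMPTION: the array is sorted in ascending order, so g never decreases.
    // Returns the 0-based index of the key, or -1 if g has no "root".
    static int findIndex(int[] sorted, int key) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            long g = (long) sorted[mid] - key; // long avoids int overflow
            if (g == 0) {
                return mid;        // g(mid) == 0: the "root" is found
            } else if (g < 0) {
                lo = mid + 1;      // g is negative here: the root lies to the right
            } else {
                hi = mid - 1;      // g is positive here: the root lies to the left
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] stream = {0, 1, 5, 7, 14, 76, 196, 256, 983, 1005};
        System.out.println(findIndex(stream, 76));
        System.out.println(findIndex(stream, 8));
    }
}
```

The only thing the loop ever inspects is the sign of g at the midpoint, which is also all that the bisection method inspects.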

Similarity between the bisection method and the binary search algorithm

In both these algorithms, we divide the interval of values in half and keep the half that must contain the answer. For the binary search algorithm, the stream of values must be sorted; this plays the role of the bisection method's requirement that f(x) be continuous (with a sign change over the interval), because it lets us tell from a single probe which half to discard !!

The binary search algorithm is definitely not the fastest as we know. But it definitely has predictable behavior as we all know - O(log n).

The question is now of beating this - application of algorithms that can converge on a "key" or the "root of the corresponding function f(x)" faster than binary search or bisection method can.
There are better root-finding methods than bisection method - Newton's method, Secant method, false position, linear interpolation(secant method again), polynomial interpolation(Muller's method), inverse quadratic interpolation, and Brent's method. Phew !!!!!!
Of course, not all of them can be applied to searching, and not all of them have predictable worst-case behavior the way binary search's O(log n) is predictable.
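
One member of that list does have a well known search counterpart: interpolation search, which picks the next probe by linear interpolation between the end points (the same idea as false position) instead of always taking the midpoint. The sketch below is a plain textbook-style version in Java, not something derived in this post; the class and method names are mine, and it assumes a sorted int array.

```java
class InterpolationSearchSketch {

    // Searches a sorted array by estimating where the key "should" be,
    // assuming the values grow roughly linearly with the index.
    // Returns the 0-based index of the key, or -1 if it is absent.
    static int find(int[] sorted, int key) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi && key >= sorted[lo] && key <= sorted[hi]) {
            if (sorted[hi] == sorted[lo]) {
                // Flat interval: nothing to interpolate over.
                return sorted[lo] == key ? lo : -1;
            }
            // Linear interpolation between (lo, sorted[lo]) and (hi, sorted[hi]).
            long num = ((long) key - sorted[lo]) * (hi - lo);
            long den = (long) sorted[hi] - sorted[lo];
            int pos = lo + (int) (num / den);
            if (sorted[pos] == key) {
                return pos;
            } else if (sorted[pos] < key) {
                lo = pos + 1;
            } else {
                hi = pos - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] even = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        int[] skewed = {0, 1, 5, 7, 14, 76, 196, 256, 983, 1005};
        System.out.println(find(even, 70));
        System.out.println(find(skewed, 76));
    }
}
```

On evenly spaced values the first probe tends to land on or near the key, but on a skewed stream like the second one the probes can creep along almost one element at a time. That is the predictability problem in a nutshell.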
More research is required in this area. And certainly in designing the data structures/ databases that are optimized for such types of searches; certainly in determining if certain functions can be readily applied to existing data structures / databases.
Some theoretical problems too exist and they've been thankfully pointed out by Jim Lyon and Roman Werpachowski. At first glance, these could be ironed out by approximation and tweaking of the algorithms. But it still requires research !! And so I have to end my lecture here.

A discussion of this algorithmic technique can be found at JoS.