As a background task for our web application BeeBole, I was looking for an easy and efficient way to produce PDF documents without being stuck with a PostScript-like syntax or being limited in features from a design point of view.
Last week I discovered the tool I was looking for: WKHTMLTOPDF.
By leveraging the WebKit engine through the QtWebKit module, it converts HTML with full CSS support to PDF, the same way you “Save as PDF” from your browser.
In this article, I’ll show you the very first prototype I built of a possible WKHTMLTOPDF integration with our application.
Let’s start by installing this nifty tool the easy way (tested on 64-bit Ubuntu Hardy and Jaunty):
wget http://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-0.8.3-static.tar.bz2
tar -jxvf wkhtmltopdf-0.8.3-static.tar.bz2
sudo aptitude install ia32-libs
Next, you’ll have to make a symbolic link pointing to WKHTMLTOPDF in /usr/local/bin.
sudo ln -s /full_path/WKHTMLTOPDF /usr/local/bin/WKHTMLTOPDF
Et voilà, we are ready to go!
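Since the rest of this article drives the tool from Erlang, a quick way to confirm the symbolic link is visible on the PATH is to look the command up from the Erlang shell. This is only a minimal sketch: the module name wk_check and the function find_converter/1 are made up for illustration, and the name you pass in should be whatever you called the link.

```erlang
-module(wk_check).
-export([find_converter/1]).

%% Look up a command on the PATH of the Erlang VM's environment.
%% Returns {ok, AbsolutePath} or {error, not_found}.
%% Name is a string, e.g. the name given to the symbolic link.
find_converter(Name) ->
    case os:find_executable(Name) of
        false -> {error, not_found};
        Path  -> {ok, Path}
    end.
```

Calling wk_check:find_converter/1 with the link name should return an {ok, Path} tuple if the installation steps worked.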
Here is a way to post an arbitrary HTML portion of the current document to the server and have it converted to PDF on the fly. We will use the following workflow:
- Post the innerHTML to the server
- Write a temporary HTML file with the posted data (file:write_file)
- Convert this file to PDF using the command line WKHTMLTOPDF (os:cmd)
- Read and store the PDF file content to a variable (file:read_file)
- Delete the temporary HTML and PDF files (file:delete)
- Send back the final document to the client using the ‘application/pdf’ content type header
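The file handling and conversion steps of this workflow can be sketched as a single Erlang function. This is an illustration under stated assumptions, not the code behind the link further down: the module name pdf_sketch, the /tmp directory, the microsecond-based file suffix, and the extra Converter argument (which lets you swap in another command) are all choices made for this example.

```erlang
-module(pdf_sketch).
-export([html_to_pdf/1, html_to_pdf/2]).

%% Convert HTML (binary or iolist) to PDF with the command found on the PATH.
%% Assumption: the command is reachable as "wkhtmltopdf"; adjust to your link name.
html_to_pdf(Html) ->
    html_to_pdf(Html, "wkhtmltopdf").

%% Same, with an explicit converter command.
%% Returns {ok, PdfBinary} or {error, Reason} if no PDF was produced.
html_to_pdf(Html, Converter) ->
    Base = filename:join("/tmp", "pdf_sketch_" ++ unique_suffix()),
    HtmlFile = Base ++ ".html",
    PdfFile = Base ++ ".pdf",
    %% Write the posted HTML to a temporary file.
    ok = file:write_file(HtmlFile, Html),
    %% Run the converter as: <converter> input.html output.pdf
    %% The file names are generated here, so no user input reaches the shell.
    _Output = os:cmd(Converter ++ " " ++ HtmlFile ++ " " ++ PdfFile),
    %% Read the generated PDF into a variable.
    Result = file:read_file(PdfFile),
    %% Delete both temporary files, whether or not the conversion worked.
    _ = file:delete(HtmlFile),
    _ = file:delete(PdfFile),
    Result.

%% Hypothetical helper: a suffix built from the current timestamp.
unique_suffix() ->
    {Mega, Sec, Micro} = os:timestamp(),
    lists:flatten(io_lib:format("~p_~p_~p", [Mega, Sec, Micro])).
```

If the converter fails and writes no output file, the function returns {error, enoent} from file:read_file/1 instead of crashing, so the caller can decide what to send back to the client.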
We will be using Mochiweb as our web server (You can follow our tutorial about Mochiweb if you feel lost).
Just create priv/www/index.html and copy the following HTML into it:
http://friendpaste.com/7Kvh0SWRMyCgTMMEOy4x2B
And here is the Erlang server code which will handle the POST request (myapp_web.erl):
http://friendpaste.com/538mWzeOfBKxiE2uScDjT4
No magic trick! Regarding the JavaScript, I’m just filling a text input's value with the innerHTML source and submitting the form to the server.
The Erlang code is really straightforward. We defined a new case clause named ‘pdf’ under the ‘POST’ part.
In order to generate the temporary file names, I’m simply concatenating the current date and time. The important part here is to set the response’s content type to ‘application/pdf’.
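To make these two points concrete, here is a small sketch of how the file names and the response could be built. The module name pdf_response_sketch, both function names, and the /tmp directory are assumptions for illustration; the linked server code may do this differently.

```erlang
-module(pdf_response_sketch).
-export([temp_names/1, pdf_response/1]).

%% Build the temporary HTML and PDF file names from a calendar:datetime()
%% value such as the one returned by calendar:local_time().
temp_names({{Y, Mo, D}, {H, Mi, S}}) ->
    %% Zero-padded YYYYMMDDHHMMSS stamp.
    Stamp = lists:flatten(
              io_lib:format("~4..0B~2..0B~2..0B~2..0B~2..0B~2..0B",
                            [Y, Mo, D, H, Mi, S])),
    {"/tmp/" ++ Stamp ++ ".html", "/tmp/" ++ Stamp ++ ".pdf"}.

%% Build a {Code, Headers, Body} tuple carrying the PDF content type.
pdf_response(PdfBinary) when is_binary(PdfBinary) ->
    {200, [{"Content-Type", "application/pdf"}], PdfBinary}.
```

In the Mochiweb request loop, such a tuple would be passed to Req:respond/1 (or to mochiweb_request:respond/2 in more recent Mochiweb releases). Note that a stamp with one-second resolution gives two requests arriving in the same second the same file names, so anything beyond a prototype would want a more unique suffix.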
Now, we can test our page by connecting to http://127.0.0.1:8000/pdf and clicking the ‘>>>PDF’ button (tested under Firefox).
I hope this helps you start toying with this tool, which is certainly going to replace its licensed counterparts.
I’m really grateful to you guys for pointing this out – it will be so useful to me in my current project.
Awesome! Thanks for posting this.
-JP
I just tried it and the results are great!
I discovered that one extremely nice aspect of this tool is that it will execute jQuery *before* printing, so if you have any table striping etc. it will still work.
Probably the best HTML to PDF solution I have found.
“wkhtmltopdf” is definitely a recommended tool for converting html to pdf. I tried it and it has lotsa useful features.
But too bad our server runs on Ubuntu LTS 6.06, which ships slightly older versions of what wkhtmltopdf requires.
Thanks! Been searching for this for months!!
Works perfectly out of box on linux machine.
I ran it and so far so good, except for one thing.
The PDF output doesn’t display letters or words, only boxes.
My server doesn’t have X Windows installed either.
Is X Windows required in order to supply the fonts?
Thanks
@leo
I encountered a similar problem on my 64-bit box and had to install the 32-bit libraries:
sudo aptitude install ia32-libs
If the static version doesn’t work, you can try to compile it
http://code.google.com/p/wkhtmltopdf/wiki/compilation
Hope it helps
It’s really an interesting tool, but I can’t get it to output my PDFs properly. The kerning is totally screwed up. There is also a bug ticket over there (http://code.google.com/p/wkhtmltopdf/issues/detail?id=72), but the author is unfortunately not really keen on fixing this, and nobody knows what the problem is.