PDF Generation using NodeJS

Contributor - 23 February 2018 - 12 Mins
This blog is about a feature request we received some time ago from one of our clients. The request was to generate a PDF from a given template, filled with dynamic data. A fair request, one might say. In this read, I will briefly take you through the approach we took to generate the PDF.
Oh and just to get you all intrigued before the whole read, you can check out a sample PDF that was generated at: https://github.com/satyendra-singh-talentica/node-pdf/blob/master/output/pdf-1516858170954.pdf
Some PDF features are available at https://github.com/satyendra-singh-talentica/node-pdf/blob/master/pdf-features.PNG
Coming back to the main agenda of the blog, we had to generate a PDF. Naturally, the first thing we did was dig up the most popular libraries for generating PDFs. One important point before we proceed: our project was in NodeJS. So the search for the most popular libraries narrowed down to the following two choices:
o PDFKit
o jsPDF
No doubt these are great libraries with lots of downloads and good reviews, but we had one problem. Both libraries treat a page as a canvas with X and Y coordinates. With this approach, you have to deal with a lot of numbers as x and y values for positioning an image, text, etc. on a page.
We could foresee a lot of numbers in the code that would be difficult to track and hence would slow development down. It’d be a very tedious job to get a sophisticated PDF ready in a short time.
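To make the bookkeeping concrete, here is a minimal sketch of what positioning just three lines of text involves. The layoutLines helper and the sample values are made up for illustration and are not part of PDFKit or jsPDF; each computed position would then feed one drawing call, such as PDFKit's doc.text(text, x, y).

```javascript
// Hypothetical helper (not part of PDFKit or jsPDF): computes where each line
// of text would have to be drawn on a coordinate-based page.
function layoutLines(lines, { x, startY, lineHeight }) {
  return lines.map((text, index) => ({
    text,
    x,
    y: startY + index * lineHeight, // every line needs its own y value
  }));
}

// Made-up sample content and measurements, in PDF points.
const placed = layoutLines(['Invoice', 'Customer: Jane Doe', 'Total: 100'], {
  x: 50,
  startY: 80,
  lineHeight: 18,
});

// Each entry corresponds to one drawing call with explicit coordinates.
console.log(placed);
```

Insert a line in the middle, add a logo or change a font size, and most of these numbers have to be recalculated by hand.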
On a quest for a better and easier solution, we realized that there were a few good Node libraries for converting HTML to PDF, and came up with the following approach to the problem:
1. Generate HTML on the server with dynamic data
2. Convert it to PDF with a good library
The obvious advantage of this approach was that developers used to working with HTML, JS and CSS were very comfortable with it. Another advantage was that we could use Bootstrap for layouts and styling, Font Awesome icons and web fonts, and had better overall control over the positioning and placement of images, text and hyperlinks on the page.
To generate server-side HTML with dynamic data, we chose EJS (Embedded JavaScript). An EJS template looks just like HTML, except that it has placeholders with a special syntax, and the server replaces these placeholders with data to produce the standard HTML that browsers understand. Let me try and explain with a small example:
EJS code + Data = HTML
EJS code:
<h1><%= title %></h1>
Data:
{ title: "PDF Generator" }
Resulting HTML:
<h1>PDF Generator</h1>
For more on EJS, please follow http://ejs.co/
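To show what the placeholder replacement amounts to, here is a toy stand-in for the <%= name %> tag written in plain JavaScript. It is not the ejs package and handles only simple variable names; the renderPlaceholders and escapeHtml names are made up for this sketch. In a real project you would call the ejs package's render function instead.

```javascript
// Toy stand-in for EJS's <%= name %> tag. This is NOT the ejs package.
// Like that tag, it HTML-escapes the value before inserting it.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Replaces every <%= key %> with the escaped value of data[key].
// Simplification: unknown keys become an empty string.
function renderPlaceholders(template, data) {
  return template.replace(/<%=\s*(\w+)\s*%>/g, (match, key) =>
    key in data ? escapeHtml(data[key]) : ''
  );
}

const html = renderPlaceholders('<h1><%= title %></h1>', { title: 'PDF Generator' });
console.log(html); // <h1>PDF Generator</h1>
```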
In our code snippet, “dataGathererService()” gets the data. The data can come from an API or a database.
“htmlGenerator()” takes responsibility for combining an EJS template with the data to produce HTML.
Now that we have the HTML, we have to convert it to PDF. For this purpose, we chose html-pdf-chrome, a Node library that uses Google Chrome to convert HTML to PDF.
Because html-pdf-chrome uses Google Chrome:
It is strongly recommended that you keep Chrome running side-by-side with Node.js. There is a significant overhead starting up Chrome for each PDF generation which can be easily avoided.
It is suggested to use PM2 to ensure Chrome continues to run. If it crashes, it restarts automatically.
To run Chrome headless, read https://developers.google.com/web/updates/2017/04/headless-chrome
I am on Windows and so I keep my Chrome running with the following command:
"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --headless --disable-gpu --enable-logging --remote-debugging-port=9222
“generatePDF()” does its part, converting the HTML to PDF.
The printOptions control the dimensions of the converted PDF.
One small but important thing to note is that the pageHeight is 8in and, in our templete.ejs, the div for each page also has a height of 8in. These two values should match; otherwise you will get a distorted PDF.
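One way to keep the two values in step is to derive both from a single page size. The sketch below is only an illustration: buildPageSetup, the .page class name and the 10in width are assumptions, while paperWidth and paperHeight are the names Chrome's Page.printToPDF uses for the page size in inches, which html-pdf-chrome passes through in printOptions.

```javascript
// Hypothetical helper: derives both the Chrome print options and the CSS for
// the page div from one page size, so the two cannot drift apart.
function buildPageSetup(widthIn, heightIn) {
  return {
    // Shape assumed to match the options object html-pdf-chrome accepts.
    pdfOptions: {
      port: 9222, // port Chrome is listening on
      printOptions: {
        paperWidth: widthIn, // inches
        paperHeight: heightIn, // inches
        marginTop: 0,
        marginBottom: 0,
        marginLeft: 0,
        marginRight: 0,
        printBackground: true, // keep CSS backgrounds in the PDF
      },
    },
    // CSS for the template so that each page div matches the paper size.
    pageCss: `.page { width: ${widthIn}in; height: ${heightIn}in; }`,
  };
}

// 8in is the height used in this post; the 10in width is an arbitrary example.
const setup = buildPageSetup(10, 8);
console.log(setup.pageCss);
console.log(setup.pdfOptions.printOptions.paperHeight);
```

The pdfOptions object would then be passed as the second argument to html-pdf-chrome's create function, and pageCss would go into the template's style block.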
Flow:
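In code, the flow can be sketched roughly as follows. The three function names are the ones used in this post, but their bodies here are placeholders so that the example runs on its own, and createPdf is a made-up name for the function that chains them. See the sample repository for the actual implementation.

```javascript
// Placeholder bodies: the real functions would query a data source, render
// the EJS template and call html-pdf-chrome, respectively.
async function dataGathererService() {
  return { title: 'PDF Generator' }; // stand-in for an API or database call
}

async function htmlGenerator(data) {
  return `<h1>${data.title}</h1>`; // stand-in for rendering the EJS template
}

async function generatePDF(html) {
  // Stand-in for the HTML-to-PDF conversion: returns the HTML bytes as-is.
  return Buffer.from(html, 'utf8');
}

// Hypothetical orchestration: data -> HTML -> PDF.
async function createPdf() {
  const data = await dataGathererService();
  const html = await htmlGenerator(data);
  return generatePDF(html);
}

createPdf().then((pdf) => console.log(`generated ${pdf.length} bytes`));
```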
Try out the sample code from https://github.com/satyendra-singh-talentica/node-pdf
Do comment or write to me at satyendra.singh@talentica.com if you have any questions.
"The challenges mentioned is very realistic and I believe everyone who is trying to genereate PDF on node are facing the same. However the solution provided requires chrome running all the time which won't be suitable for everyone to do on their server. I found the better declarative node based library which doesn't require x,y position and works like charm as of now. Although it requires learning the properties as they are not exact html/css but at least they are in json format so that will be easy for developer. Check http://pdfmake.org".
"Hi Satyendra, I am trying to generate the pdf from a dynamic json file. I know the tags from where I need to pick the data. For this purpose is it good to first make html and then pdf or straight away pdf can be generated?".
"using PDFKit can i create multiple pages in one pdf using node.js or Angular5.Let me know".
"Thank you, this is exactly what I was looking for, surprisingly up to date resource.".
"Really helpful, Thank you for posting. Exactly what I need.".
"Thank you for publishing this tutorial.".
"Thank you for providing the valuable information. Node JS Online Training".
"excelent".
"Thanks for good tutorial.".