Generating a Downloadable Word Document in the Browser

As a developer, generating files for a user to download from a website is a fairly common task. The use cases range from providing a PDF of event tickets or a receipt, to generating a custom image from user-entered data. Regardless of the purpose, a developer is faced with a decision, “Do I generate the file server- or client-side?” In most cases, generated files for user download should be created on the server.

One reason to generate files server-side is security. For example, when an image watermark is applied client-side, anyone can use a proxy or browser developer tools to get the original image. If the watermark is applied server-side, the user never has access to the original image and cannot strip out the watermark.

For most situations, however, the biggest issue is cross-browser compatibility. For example, generating an image with <canvas> doesn’t work in Internet Explorer 8 (IE8), and triggering a download doesn’t fully work in Safari (tested in v7.1.3).

A few months ago I was building a web application to support modern browsers and back to IE9. In the web app, the user fills out a long form (which is persisted with localStorage) and can then export their form responses to a PDF or Microsoft Word file (I’ll use the term “DOC file” for brevity). However, I was not permitted to write any server-side code, so I had to figure out how to generate and download these files completely inside a user’s browser.

To generate the PDF file I used jsPDF, a widely used solution for client-side PDF generation. The provided methods for layout are very low-level, so to use it you need to implement your own line wrapping, text alignment, bullets, etc. I could write an entire blog post on using that library, but in this post I want to talk about generating the DOC file.

There are three parts to this problem, which I will cover in detail below.

1. Generating a basic DOC file

2. Styling the output

3. Triggering a download of the file with a “.doc” extension

 

Generating a basic DOC file

The first thing I looked for was a library to make my work easier. I found a few, but none of them worked for my situation:

• officegen – Node.js only.
• DOCX.js (created by MrRio, who also created jsPDF) – basic prototype, not production-ready.
• DOCX.js fork (by Stephen Hardy) – unworkable license.

I was really hoping to use Stephen Hardy’s DOCX.js library. However, the license states “The licenses…extend only to the software or derivative works that (1) are run on a Microsoft Windows operating system product, and (2) are not Excluded Products” (emphasis mine). This clause means that if you use this library and a random user accesses your site on a Mac (or iPhone or Android phone or Linux or …), you are in violation of the license. For a frontend application that’s a non-starter unless you’re developing for a tightly controlled intranet.

After a bit more research I knew I needed to implement my own solution.

A “.docx” file is, under the hood, just a ZIP file containing a collection of custom XML files and any embedded images or attachments. Finding a cross-browser compatible library to generate a ZIP file is almost as difficult as finding one to generate a DOC file, so I decided that wasn’t the way to go. I knew Word is capable of opening HTML files, so I tried taking an HTML file, renaming it with a “.doc” extension, and opening it with Word. Hallelujah, a solution! Word will render list elements, header elements, italics, etc. properly, so all that is needed is a semantically written HTML string saved with a “.doc” extension.

With jQuery included in the project, generating an HTML string is easy:

var htmlString = $('<html>').html('<body>' +
    '<h1>A word document</h1>' +
    '<p>This is the content of the word document</p>' +
    '</body>'
// .get(0) returns the underlying DOM element; .get() alone returns an array
).get(0).outerHTML;

Generating the <html> and <body> elements is crucial — otherwise Word won’t interpret it as an HTML file and instead will show the raw HTML code as the document content.
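If jQuery isn't available, the same string can be assembled by hand. The following is a minimal sketch, not the code from my project: the helper names escapeHtml and buildDocHtml are hypothetical, and the escaping step is an assumption I'd add because the content comes from user-entered form values.

```javascript
// Hypothetical helper: escapes text so user-entered values cannot break the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: wraps a title and paragraphs in the <html> and <body>
// elements that Word needs in order to treat the file as HTML.
function buildDocHtml(title, paragraphs) {
  var body = '<h1>' + escapeHtml(title) + '</h1>';
  for (var i = 0; i < paragraphs.length; i++) {
    body += '<p>' + escapeHtml(paragraphs[i]) + '</p>';
  }
  return '<html><body>' + body + '</body></html>';
}

var htmlString = buildDocHtml('A word document', [
  'This is the content of the word document'
]);
```

Plain string concatenation keeps the output predictable, which matters here because Word is pickier about the markup than a browser is.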

Styling the output

Word supports a small, outdated subset of CSS. Properties such as background-color (aka highlighting), color, font-size, font-weight, text-align, margin, and padding are supported. CSS3 selectors, as well as properties like float and position, don’t work. You can give elements class names and IDs to target with a stylesheet. Styles can either be inline or embedded in the <head> of the document. I recommend putting styles in the head of the document for a consistent-looking document.
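As a rough illustration of an embedded stylesheet, here is a sketch that only uses the properties listed above. The helper name withStyles, the class name highlight, and the specific sizes and colors are all made up for the example, so treat them as placeholders.

```javascript
// Example rules using only properties mentioned above as supported by Word.
// The selectors and values are illustrative assumptions.
var css = [
  'h1 { font-size: 20pt; font-weight: bold; text-align: center; }',
  'p { font-size: 11pt; color: #333333; margin: 0 0 12pt 0; }',
  '.highlight { background-color: #ffff00; }'
].join('\n');

// Hypothetical helper: embeds the stylesheet in the <head> of the document.
function withStyles(bodyHtml, cssText) {
  return '<html>' +
    '<head><style>' + cssText + '</style></head>' +
    '<body>' + bodyHtml + '</body>' +
    '</html>';
}

var styledHtml = withStyles(
  '<h1>Report</h1><p class="highlight">Important answer</p>',
  css
);
```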

Word has several different visual layouts when editing a document. Most users are used to the “Print Layout”, which mimics how the document will look if printed. By default, when Word interprets an HTML document it will show the document in “Web Layout”. Some users might find this disconcerting, so I recommend forcing Print Layout in your document. To do this, add the following attributes to the <html> element (be sure to include the xmlns namespaces):

• xmlns:office="urn:schemas-microsoft-com:office:office"
• xmlns:word="urn:schemas-microsoft-com:office:word"
• xmlns="http://www.w3.org/TR/REC-html40"

and add the following XML markup inside the <head> of the HTML document:

<xml>
    <word:WordDocument>
        <word:View>Print</word:View>
        <word:Zoom>90</word:Zoom>
        <word:DoNotOptimizeForBrowser/>
    </word:WordDocument>
</xml>

Note: I had a bit of trouble getting jQuery to insert namespaced elements. There might be a plugin or another library to help make this work. I ended up just inserting it into the output HTML string using string manipulation.
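The string manipulation can be as simple as concatenating the pieces. This is a sketch of that idea rather than my exact code; wrapForWord and the two constant names are hypothetical, while the attribute values and XML come from the snippets above.

```javascript
// Namespace attributes for the <html> element, as listed above.
var WORD_HTML_ATTRS =
  'xmlns:office="urn:schemas-microsoft-com:office:office" ' +
  'xmlns:word="urn:schemas-microsoft-com:office:word" ' +
  'xmlns="http://www.w3.org/TR/REC-html40"';

// XML block that asks Word to open the document in Print Layout.
var WORD_VIEW_XML =
  '<xml>' +
    '<word:WordDocument>' +
      '<word:View>Print</word:View>' +
      '<word:Zoom>90</word:Zoom>' +
      '<word:DoNotOptimizeForBrowser/>' +
    '</word:WordDocument>' +
  '</xml>';

// Hypothetical helper: builds the whole document as a string, so no DOM
// library ever has to create the namespaced elements.
function wrapForWord(bodyHtml) {
  return '<html ' + WORD_HTML_ATTRS + '>' +
    '<head>' + WORD_VIEW_XML + '</head>' +
    '<body>' + bodyHtml + '</body>' +
    '</html>';
}

var wordHtml = wrapForWord('<h1>A word document</h1>');
```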

Triggering a download of the file with a .doc extension

Now that I had a document that Word would open properly, I needed to trigger a download in the user’s browser.

If I naively trigger a download, the browser will either assume the file is an HTML file or it will assume it is a plaintext file. That’s not a great solution, so I needed to force, or at least suggest, a “.doc” extension to the user.

The easiest way of triggering a download is using the download attribute of the <a> tag, specifying a data URI of the document as the href value. The value of the download attribute will be the filename of the downloaded file. Unfortunately, this is only supported in Chrome and Firefox at the moment. For modern browsers that don’t support the download attribute (IE10+, Safari), a Blob of the file can be created, and then saved with the window.saveAs() method.

That was a whole lot of information really fast; let’s break it down. A data URI is a representation of binary or string data in a format that can be used as a URL. For instance, an image encoded as a data URI can be used as the src value of an <img> tag. In this case, we want to encode an HTML document:

var htmlDocument = '<html><body>content here</body></html>';
var dataUri = 'data:text/html,' + encodeURIComponent(htmlDocument);
// Output - data:text/html,%3Chtml%3E%3Cbody%3Econtent%20here%3C%2Fbody%3E%3C%2Fhtml%3E

This data URI can now be used as the href value in our <a> tag with the download attribute:

<a download="export.doc" href="data:text/html,%3Chtml%3E%3Cbody%3Econtent%20here%3C%2Fbody%3E%3C%2Fhtml%3E">Download</a>
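Since the document is generated at runtime, the link usually has to be filled in from JavaScript as well. Here is a small sketch of how that might look; toDataUri and prepareDownloadLink are hypothetical names, and the element ID in the usage comment is an assumption.

```javascript
// Hypothetical helper: encodes an HTML string as a data URI.
function toDataUri(htmlDocument) {
  return 'data:text/html,' + encodeURIComponent(htmlDocument);
}

// Hypothetical helper: points an existing <a> element at the generated
// document and suggests a filename through the download attribute.
function prepareDownloadLink(link, htmlDocument, filename) {
  link.setAttribute('download', filename);
  link.setAttribute('href', toDataUri(htmlDocument));
  return link;
}

// Usage in a browser (assumes the page contains <a id="export-link">):
// prepareDownloadLink(
//   document.getElementById('export-link'),
//   '<html><body>content here</body></html>',
//   'export.doc'
// );
```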

A Blob is a container for binary data, usually representing a file. After putting the HTML string content inside a Blob instance, the file can be saved using the window.saveAs() method.


Creating the blob

var htmlString = '<html><body>content here</body></html>';
// Copy each character code into a byte array.
// Note: this assumes every character code fits in one byte (below 256).
var byteNumbers = new Uint8Array(htmlString.length);
for (var i = 0; i < htmlString.length; i++) {
    byteNumbers[i] = htmlString.charCodeAt(i);
}
var blob = new Blob([byteNumbers], {type: 'text/html'});


Saving the file

window.saveAs(blob, 'export.doc');
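One caveat with the byte loop: it stores one character code per byte, so any character with a code above 255 gets truncated. If the form responses may contain such characters, an alternative worth considering is passing the string straight to the Blob constructor, which encodes string parts as UTF-8. This is a sketch of that alternative, not what I shipped; makeDocBlob is a hypothetical name, and whether Word then picks the right encoding without extra hints is something I'd test before relying on it.

```javascript
// Hypothetical helper: wraps an HTML string in a Blob without a manual byte loop.
function makeDocBlob(htmlString) {
  // String parts are encoded as UTF-8 by the Blob constructor,
  // so non-ASCII characters are not truncated.
  return new Blob([htmlString], { type: 'text/html' });
}

var docBlob = makeDocBlob('<html><body>café</body></html>');

// In a browser where saveAs is available, the blob is saved the same way:
// window.saveAs(docBlob, 'export.doc');
```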

This strategy doesn’t work in Safari or IE9. In IE9, Blob and window.saveAs() aren’t supported and can’t be polyfilled. In Safari, the file is opened in a new tab instead of being downloaded, so the user needs to save the file manually. However, Safari will use the MIME type to suggest a file extension, which is “.html” in this case. That’s no good! So another solution is required.

Note: Implementing the download attribute method as well as the Blob method might be excessive, since all browsers that support the download attribute also support Blob. I decided to implement both, but only implementing Blob will work as well.

Downloadify

Downloadify is a Flash-based shim for downloading files in unsupported browsers (not to be confused with downloadify, which is a patched Spotify exploit). The library uses a Flash object to download a file passed to it via JavaScript. It works in any browser that has Flash 10 or newer installed, which means it can be used all the way back to IE6! See a demo here.

If the user doesn’t have Flash installed, an error message should be displayed. Flash support can be checked with the following function:

function isFlashInstalled() {
    var version = window.swfobject.getFlashPlayerVersion();
    // flash v10 is minimum that Downloadify works with. Any
    // lower than that and we can treat it as if it isn't installed
    return version && version.major && version.major >= 10;
}

Forgetting this cost me about two hours wondering why Downloadify wasn’t working in a Virtual Machine that didn’t have Flash installed. Save yourself and your users the trouble and ensure you have a fallback before proceeding.

Downloadify takes a provided button image and displays that in the Flash object. When a user clicks on the Flash object, Flash triggers a download. Once the Downloadify settings are provided you can’t change anything except to delete the entire button and start over. Also, the download cannot be triggered programmatically; only user interaction can start the download.

I’ll let you read the Downloadify documentation to get it working for you, but here are some things to remember:

• Not responsive – since the button image is a static width, the buttons don’t resize well. They also don’t work properly when increasing browser zoom level.
• Use transparent images – using transparent button images allows you to use CSS properties to fully style the button’s background color and hover state.
• Specify a “.doc” filename – that’s the whole point!
• Use a “string” dataType value.

Since Downloadify should be used when the download attribute and window.saveAs() both don’t work, we can use feature detection to identify when we should use Downloadify. However, we also need to use it in Safari, which will report full support for window.saveAs(). Unfortunately, this means we need to fall back on User Agent sniffing to identify Safari browsers and use Downloadify in those cases.
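Put together, the decision could look something like the sketch below. It is an illustration under assumptions rather than my production code: the function names, the strategy labels, and the User Agent pattern are all hypothetical, and detectEnvironment relies on the isFlashInstalled function shown earlier.

```javascript
// Hypothetical helper: Chrome's User Agent string also contains "Safari",
// so it has to be excluded. The pattern is an assumption and may need tuning.
function isSafariUserAgent(ua) {
  return /Safari/.test(ua) && !/Chrome|Chromium/.test(ua);
}

// Hypothetical helper: picks a download strategy from a capabilities object.
function chooseDownloadStrategy(env) {
  // Safari reports saveAs support but opens the file in a new tab,
  // so it is routed to Downloadify before any feature checks.
  if (env.isSafari) {
    return env.hasFlash ? 'downloadify' : 'unsupported';
  }
  if (env.hasDownloadAttribute) { return 'download-attribute'; }
  if (env.hasBlob && env.hasSaveAs) { return 'blob'; }
  if (env.hasFlash) { return 'downloadify'; }
  return 'unsupported';
}

// Browser-only: gathers the capabilities with feature detection.
function detectEnvironment() {
  return {
    isSafari: isSafariUserAgent(navigator.userAgent),
    hasDownloadAttribute: 'download' in document.createElement('a'),
    hasBlob: typeof window.Blob !== 'undefined',
    hasSaveAs: typeof window.saveAs === 'function',
    hasFlash: !!isFlashInstalled()
  };
}

// Usage in a browser:
// var strategy = chooseDownloadStrategy(detectEnvironment());
```

Keeping the decision in a function that takes a plain object makes it easy to check each browser case without opening that browser.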

Summary

Combining these three options for downloading a file results in complete browser coverage while utilizing modern web standards when available. Generating a file client-side has many advantages over server-side generation, and as client machines become more powerful it will become an ever-more attractive option.