Server side Gists with PHP and Smarty

The introduction of PJAX & CSS3 page loads on this site back in January was on the whole a smooth and pleasant learning experience, though admittedly not without a few (largely unexpected) bumps in the road. The last of those was the issue of embedded gists not displaying when a page was loaded asynchronously, which, like all good bugs, only reared its ugly head at the 11th hour after the PJAX logic had been deployed, meaning a quick fix was called for. This article discusses that fix, still in place (ahem) nearly two months later…

Normal gisting

The official way to embed a gist is by inserting a script tag in the DOM where you want the code snippet to appear, like so: <script src="https://gist.github.com/1972983.js"></script>. The contents of this script file are about as simple as it gets: a couple of calls to document.write, one to add a CSS file to the DOM (to ensure the subsequent gist is styled correctly) and one to render out the relevant markup of the gist itself. It's not pretty but looks a bit like this:

Note that of course the embedding mechanism used by gists is determined entirely by GitHub—it is not something I have any control over.

What’s the problem?

So far, so good: simple and effective. During a normal page load your browser encounters each gist and fetches, executes and renders its contents as expected. The issue surfaces when any content containing one of these script tags is loaded asynchronously, which in the case of this website will quite often be when a user clicks through to a code-heavy article from any other page on the site. Although jQuery will ordinarily process and execute script tags, it stubbornly refuses to process any calls to document.write. A little background reading explains why this is the case: invoking it on a ‘closed’ document (one which has already loaded) will overwrite the entire thing with the contents of the document.write call. Nasty.

A Solution

Yes, a solution. It's not pretty in places and not particularly robust, and there are countless other approaches which would doubtless be more elegant, but time was of the essence. To be honest I quite fancied getting my hands dirty anyway. A little bit of mud is fun sometimes.

Truth be told, I'd never been overjoyed at having lots of script tags littered throughout the more code-heavy articles on the site, especially on the odd occasion where gist.github.com would take a second or two to respond and thus block the rest of the page loading (we can't defer the scripts either because of those pesky document.write calls). This seemed like the perfect excuse to shift the responsibility for loading the gists to the server—especially as the result of each embedded snippet is just plain old HTML anyway. After playing around with a few ideas, I settled on the following approach:

  1. Create a custom ‘tag’ I could use in an article to identify a gist I'd like to embed server side
  2. Create a PHP method to fetch and process a gist
  3. Create a Smarty modifier to invoke the method from step 2 on any tags matching the pattern in step 1.

Step one: Creating a custom embed tag

This was pretty trivial: I just had to come up with a suitable pattern which I could use when writing an article to signify a gist which should be embedded by the server. The only real rules here were to make sure that a) the pattern wasn't something which would occur in natural writing, and b) it wasn't already a valid HTML5 tag (or likely to become one). Previously when embedding a gist I'd simply written the script tag by hand, something like <script src="https://gist.github.com/1972983.js"></script>.

I could have just chosen to process every gist server side by looking for script tags pointing to gist.github.com, stripping them out and replacing them with the contents of my (as yet unwritten) PHP method, but I wanted the change to be opt-in rather than blindly processing all existing gists. I went for a WordPress-style embed tag, which ended up looking like this: [gist id=xxxx], where xxxx is the ID of the gist to embed.

Nice and simple, and backwards compatible—meaning I could still embed gists the old fashioned way if I really wanted to.

Step two: Server side processing

Next up was a PHP method to search for these tags within an article and replace them with the contents of the JavaScript file to which they refer. I'm not going to pretend for a moment that the following code is the best way of accomplishing this, and if you're seriously looking at doing something similar then I'd urge you to think about failure handling, caching and the usual downsides of ‘scraping’ vs. using official APIs:

The above method is crude but given time constraints it formed an acceptable basis for a temporary (ahem) solution. It simply replaces any occurrence of [gist id=xxxx] with the contents of the relevant JavaScript file on gist.github.com, along with a bit of quick and dirty post-processing of the HTML in the returned JavaScript to clean it up a little. It's worth noting that we don't execute the returned JavaScript at all (nor can we, this is PHP and the V8 extension didn't exist), but we don't have to because it's just HTML wrapped in document.write calls.
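As a rough illustration of the technique described above (and not the code that actually runs on this site), the sketch below replaces each [gist id=xxxx] tag with the markup found inside the embed script's document.write calls. The function names, the injectable $fetch callable and the assumption that each chunk of markup arrives as a single-quoted document.write('...') string are all hypothetical.

```php
<?php
// Hypothetical sketch: names and structure are illustrative only.

/**
 * Pull the HTML out of a gist embed script without executing it.
 * Assumes each chunk of markup is wrapped as document.write('...').
 */
function extractGistHtml(string $js): string
{
    // Capture the single-quoted argument, allowing backslash-escaped characters.
    $pattern = '~document\.write\(\'((?:\\\\.|[^\'\\\\])*)\'\)~s';
    if (!preg_match_all($pattern, $js, $matches)) {
        return '';
    }
    // stripcslashes() undoes simple escapes such as \" \/ and \n.
    return implode('', array_map('stripcslashes', $matches[1]));
}

/**
 * Replace every [gist id=xxxx] tag in $content with the gist's markup.
 * $fetch maps a gist ID to the script body (or false on failure); it is
 * injectable so the function can be exercised without network access.
 */
function embedGists(string $content, ?callable $fetch = null): string
{
    $fetch = $fetch ?? function (string $id) {
        return @file_get_contents("https://gist.github.com/{$id}.js");
    };

    return preg_replace_callback(
        '/\[gist id=([0-9a-zA-Z]+)\]/',
        function (array $m) use ($fetch): string {
            $js = $fetch($m[1]);
            $html = is_string($js) ? extractGistHtml($js) : '';
            // Leave the tag untouched if nothing usable came back.
            return $html !== '' ? $html : $m[0];
        },
        $content
    );
}
```

Calling embedGists($article->content) would hit gist.github.com once per tag on every render, which is exactly why caching matters in the next step. Note that stripcslashes does not decode JavaScript \uXXXX escapes, so a more careful unescaping step would be needed if the embed script ever used them.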

Step three: Create a Smarty modifier

The last step was to create a simple Smarty modifier to hook everything together. Smarty modifiers are described as:

Modifiers are little functions that are applied to a variable in the template before it is displayed or used in some other context. Modifiers can be chained together.

This sounds good to me: we can write something which we apply to the content of each article to look through it and ‘modify’ it by rewriting any occurrence of [gist id=xxxx] with its correct content. The current implementation of this is as follows:

This implementation is pretty similar to the first stab at the method, albeit with some rather aggressive caching and woefully poor error handling. Its mere presence is a bit of a stain on the otherwise clean paynedigital codebase, but for a temporary (ahem) fix, I can live with it.

All that remains is to invoke this modifier wherever we might want to replace [gist id=xxxx]. In my case this is only ever when rendering an article's full content, meaning a one line change to views/partials/post.tpl was all that was required. From this:

To this:

Now whenever this partial template is rendered Smarty will pass the value of $article->content to our modifier et voila—server side gists!
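To make the moving parts concrete, here is a minimal sketch of what such a modifier with caching could look like. It is an assumption-laden illustration rather than the site's real plugin: the modifier name gists, the gist_ cache key prefix, the helper function names and the use of APCu (with a per-request array as a fallback) are all hypothetical. Only the smarty_modifier_ function prefix follows Smarty's plugin naming convention.

```php
<?php
// Hypothetical Smarty plugin file, e.g. plugins/modifier.gists.php.
// The modifier name "gists" and the "gist_" cache prefix are assumptions.

/** Return the cached string for $key, computing and storing it on a miss. */
function gistCacheRemember(string $key, callable $compute): string
{
    static $local = [];  // per-request fallback when APCu is unavailable
    $useApcu = function_exists('apcu_enabled') && apcu_enabled();

    if ($useApcu) {
        $hit = apcu_fetch($key, $found);
        if ($found) {
            return (string) $hit;
        }
    } elseif (isset($local[$key])) {
        return $local[$key];
    }

    $value = (string) $compute();
    if ($value !== '') {              // do not cache failures
        if ($useApcu) {
            apcu_store($key, $value); // no TTL: stays until purged or evicted
        } else {
            $local[$key] = $value;
        }
    }
    return $value;
}

/** Default renderer: fetch the embed script and unwrap its document.write() calls. */
function gistRenderFromGitHub(string $id): string
{
    $js = @file_get_contents("https://gist.github.com/{$id}.js");
    if (!is_string($js)) {
        return '';
    }
    $pattern = '~document\.write\(\'((?:\\\\.|[^\'\\\\])*)\'\)~s';
    if (!preg_match_all($pattern, $js, $m)) {
        return '';
    }
    return implode('', array_map('stripcslashes', $m[1]));
}

/** Smarty modifier: swap each [gist id=xxxx] tag for cached gist markup. */
function smarty_modifier_gists($content, ?callable $render = null)
{
    $render = $render ?? 'gistRenderFromGitHub';

    return preg_replace_callback(
        '/\[gist id=([0-9a-zA-Z]+)\]/',
        function (array $m) use ($render) {
            $html = gistCacheRemember('gist_' . $m[1], function () use ($render, $m) {
                return $render($m[1]);
            });
            return $html !== '' ? $html : $m[0];  // keep the tag if rendering failed
        },
        (string) $content
    );
}
```

With a plugin along these lines in Smarty's plugins directory, the template would apply it using the usual pipe syntax, for example {$article->content|gists}, where gists is the hypothetical name chosen above. Storing entries without a TTL mirrors the "cache forever" behaviour discussed below; passing a lifetime as the third argument to apcu_store would be one way to let stale gists expire.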

Pitfalls, downsides and alternatives

As I've already mentioned, this approach is not without its shortcomings and I'd struggle to recommend it unless you understand all its drawbacks.

Caching

Without aggressive caching this mechanism can be woefully slow, particularly when you have numerous gists in an article, as PHP has to process each gist synchronously whereas your browser will fetch anything from 2 to 9 resources simultaneously. Conversely, with aggressive caching (such as that employed on this site) there is no way to invalidate a cached gist, meaning that if the target gist changes after it has been cached, it will never be updated unless we manually purge the cache (or APC fills up and evicts it).

Fragility

Scraping solutions like this should really be a last resort. Whilst getting your hands dirty with a bit of low level hackery is strangely rewarding, solutions like this aren't built to last. GitHub only have to change how their embeds work and I'm out of luck. Similarly, post-processing a JavaScript file in PHP is just plain dirty (the PHP V8 extension wasn't available when I did this work, though that may have been interesting).

So, why didn't I use GitHub's perfectly decent API instead? Well, I probably should have, except that you can only get the raw contents of the gist and I rather like how embeds are styled up and nicely syntax highlighted. I was also happy enough to take the risk of GitHub changing how the embeds worked at any moment, since it would only break new gists (old ones are permanently cached) at which point I could simply revise the solution.

Lastly, as you can see, the method doesn't handle failure well at all. A non-200 response from GitHub or a gist in an unexpected format and we're in trouble. Tut tut, shame on me.
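For what it's worth, guarding against those two failure modes need not take much code. The following is only a sketch under stated assumptions: the function names are hypothetical, the three second timeout is arbitrary, and "looks like an embed script" is approximated by checking for a document.write( call in the body.

```php
<?php
// Hypothetical helpers; names and thresholds are illustrative.

/** Return the status code of the final response in a list of raw header lines. */
function lastHttpStatus(array $headers): int
{
    $status = 0;
    foreach ($headers as $line) {
        if (preg_match('~^HTTP/\S+\s+(\d{3})~', $line, $m)) {
            $status = (int) $m[1];  // later status lines belong to followed redirects
        }
    }
    return $status;
}

/** A response is usable only if it is a 200 and looks like an embed script. */
function isUsableGistResponse(int $status, string $body): bool
{
    return $status === 200 && strpos($body, 'document.write(') !== false;
}

/** Fetch a gist embed script; returns null on any failure rather than bad markup. */
function fetchGistScript(string $id): ?string
{
    $context = stream_context_create([
        'http' => ['timeout' => 3, 'ignore_errors' => true],
    ]);
    $body = @file_get_contents("https://gist.github.com/{$id}.js", false, $context);

    // The HTTP stream wrapper fills $http_response_header in the local scope.
    $status = lastHttpStatus($http_response_header ?? []);

    return (is_string($body) && isUsableGistResponse($status, $body)) ? $body : null;
}
```

A caller that receives null can leave the original tag in place, or fall back to a plain link to the gist, instead of caching and rendering whatever came back.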

The end result

The end result is pretty much as you'd expect—normal looking gists, albeit those rendered by the server as opposed to your browser (tiny bonus: JS is no longer needed to load the gist). Since we've eliminated that nasty document.write, our gists now render correctly when a page is loaded via PJAX, and all is well with the world. It isn't the only solution and nor is it the best—but it was quick, simple and fun to experiment with. Let's see how long it lasts before I actually implement a ‘proper’ fix—two months and counting…

Comments

Matt Andrews
This is a nice effort at a fix, but why not simply petition GitHub to rethink how they render embedded Gists instead? The guys there are geniuses and I'm sure they'll see the problems that using document.write causes. Maybe ask them to provide an iframe version or similar to avoid this? Your method is a nice fallback but if they change their code your regex will fail and it'll be a pain to fix it on all the sites using this technique.
dan shearmur
This is pretty cool, document.write sure does suck!

Another approach is to override document.write

I put together a quick hack that seems to work in most modern browsers: https://gist.github.com/1993718
