In the run-up to the latest release, highlight.js 6.0, I decided, for the first time in more than 4 years, to actually look at other highlighting libraries. Sure, I knew of their existence before, but I never felt compelled to do any serious comparison because highlight.js is a fun project and I'm quite happy with the result. In fact, this comparison was also made more for fun than for anything else. I just wondered how good (or bad) highlight.js actually looked among similar libraries.

I decided not to take into account highly subjective things like visual appeal (I'm not a good judge here), installation simplicity and documentation clarity (I don't know how to measure them). I also didn't evaluate the number of supported languages. While it is a measurable quantity, it doesn't mean much for an end user: if a tool doesn't support the language you need, you don't care about the dozens of others that it does support. Instead I concentrated on universally measurable things that make sense to everyone: size, speed and correctness.

Why "completely unfair" then, you ask? Because I knew who'd win before I even started :-).

Contenders

If you go to the trouble of searching the Internet for "javascript syntax highlighter" you'll inevitably stumble upon hordes of posts, all ingeniously similarly titled "N useful/beautiful javascript tools", where N varies from 4 to 20-something. These have been circulating for years but, predictably, aren't a very good source of information because they don't actually evaluate the usefulness or beauty of the solutions they link to.

So I've just picked the names that I've got used to seeing around in blogs and forums where people look for such a tool: highlight.js, SyntaxHighlighter, SHJS and Google Code Prettify.

I've compiled an enterprisey-looking matrix of features supported by these libraries. It isn't intended for comparison per se, because there are different use cases and sometimes a lack of features is a feature too. It's here to give you a general idea of what purpose each one can serve.

                                      highlight.js  SyntaxHighlighter  SHJS  Google Code Prettify
User markup in code snippets          yes           no 1)              yes   yes
Line numbers                          no            yes                no    yes
Striped background                    no            yes                no    yes
Replacing indenting TABs with spaces  yes           yes                no    no
Language detection                    yes           no                 no    yes 2)
Multi-language code                   yes           yes 3)             no    yes
Arbitrary HTML container for code     yes           no                 no    no
HTML5 compatibility 4)                yes           no                 no    no

Notes:

  1. SyntaxHighlighter doesn't support arbitrary markup but has two special features that cover some use-cases: turning URLs into links and highlighting lines of code that require attention.

  2. Prettify doesn't actually do any detection. Instead it employs an interesting approach of generalized highlighting that works independently of the language, though this makes it more prone to errors than the heuristic detection mechanism found in highlight.js.

  3. I wasn't able to configure SyntaxHighlighter to do this but I attribute it to my lack of persistence. It works fine on the demo page.

  4. Surely one couldn't expect to be taken seriously these days without shoving the trendy "HTML5" moniker in somewhere! What it actually means here is that highlight.js automatically recognizes code snippets marked up according to the HTML5 recommendation with <pre><code class="language-something"> .. </code></pre>.
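
To make note 4 a bit more concrete, here is a rough sketch of how a language name could be read from such a class attribute. It is an illustration only, not the actual highlight.js code: the helper name languageFromClass is made up, and the only thing taken from the text above is the "language-" prefix.

```javascript
// Hypothetical helper (not part of any library): extract the language
// name from a class attribute that follows the "language-xxx" convention.
function languageFromClass(className) {
  const prefix = 'language-';
  const classes = String(className || '').split(/\s+/);
  for (const cls of classes) {
    // Require something after the prefix, so a bare "language-" is ignored.
    if (cls.startsWith(prefix) && cls.length > prefix.length) {
      return cls.slice(prefix.length);
    }
  }
  return null; // no explicit language: a highlighter would have to guess
}

console.log(languageFromClass('language-python'));      // python
console.log(languageFromClass('snippet language-css')); // css
console.log(languageFromClass(''));                     // null
```

Returning null rather than a default keeps the two cases apart: an explicitly named language on one side, and a plain <pre><code> block that needs detection on the other.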

Test case

The test page consists of code snippets in 7 popular languages: Python, Ruby, PHP, XML, HTML, CSS and Javascript. The "completely unfair" part of the article shows up here full-scale, since those snippets come from highlight.js' own test suite! Anyway, I think it was a good idea to use them because they were designed to be short and to exercise as many features of a language as possible. Here are four versions of the test case using highlight.js, SyntaxHighlighter, SHJS and Google Code Prettify in all their styled-by-default glory.

Size

All libraries have a way to include only the required language definitions on the page: simple linking to language files, on-demand loading, or packing into a single file. All of them also provide minified/packed production versions of their files. Gzip compression wasn't used, for no particular reason. The following table shows the total size of all the Javascript needed to highlight the test snippets.

           highlight.js  SyntaxHighlighter  SHJS  Google Code Prettify
Size (KB)  16.4          34.6               16.8  19.2

I didn't include CSS in the calculation because it's not actually required: a site can define the highlighting style within its main stylesheet.
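
To put those numbers in proportion, here is a small sketch that expresses each size as a multiple of the highlight.js size. The figures are copied from the table above; the variable and function names are made up for the example.

```javascript
// Sizes in KB, copied from the size table above.
const sizesKB = {
  'highlight.js': 16.4,
  'SyntaxHighlighter': 34.6,
  'SHJS': 16.8,
  'Google Code Prettify': 19.2,
};

// Express every size as a multiple of the chosen baseline library.
function relativeSizes(sizes, baseline) {
  const result = {};
  for (const [name, kb] of Object.entries(sizes)) {
    result[name] = kb / sizes[baseline];
  }
  return result;
}

const ratios = relativeSizes(sizesKB, 'highlight.js');
for (const [name, ratio] of Object.entries(ratios)) {
  console.log(`${name}: ${ratio.toFixed(2)}x`);
}
// highlight.js: 1.00x
// SyntaxHighlighter: 2.11x
// SHJS: 1.02x
// Google Code Prettify: 1.17x
```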

Speed

To be honest, modern browsers have made this test irrelevant. All highlighters are fast enough that highlighting appears to be applied instantly. The only exception was SHJS, which was configured to load language files on demand; in a couple of test runs this left raw un-highlighted code visible for a split second. This says nothing bad about the speed of SHJS itself but rather shows that on-demand loading was a bad idea for the task.

I measured the speed of highlighting using Firebug. It wasn't as straightforward as measuring size because there are more things to take into account here. After some tinkering I decided on the following method:

                        highlight.js  SyntaxHighlighter  SHJS  Google Code Prettify
Load time (ms)          870           1394               1008  1007
Highlighting time (ms)  55            67                 54    72
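
For anyone who wants to repeat a measurement like this without Firebug, a minimal sketch follows. It is not the method used for the numbers above. It assumes a runtime that exposes performance.now() (current browsers and Node.js do), and highlightPage is a made-up stand-in for whatever call triggers highlighting in a given library.

```javascript
// Run `fn` several times and return the median duration in milliseconds.
// The median is less sensitive to one unusually slow run than the mean.
function medianTime(fn, runs = 5) {
  const durations = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    durations.push(performance.now() - start);
  }
  durations.sort((a, b) => a - b);
  const mid = Math.floor(durations.length / 2);
  return durations.length % 2 === 1
    ? durations[mid]
    : (durations[mid - 1] + durations[mid]) / 2;
}

// Hypothetical stand-in for a library's "highlight everything" call:
// it only builds a string of <span> elements to keep the CPU busy.
function highlightPage() {
  let html = '';
  for (let i = 0; i < 10000; i++) {
    html += '<span>' + i + '</span>';
  }
  return html.length;
}

console.log('median highlighting time (ms):', medianTime(highlightPage));
```

Timing the highlighting call alone leaves out script loading, which is why load time and highlighting time are worth reporting separately.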

Richness and correctness

Here is where things get interesting. Size and speed turned out not to affect user experience significantly but the difference in richness and correctness is plainly visible. There won't be any numbers though, just some notes.

I should note that the notion of "correctness" differs from library to library. While there are plain bugs there are also missing features that could be left out deliberately. Here I tried to adhere to my personal views on the subject and you may well be in disagreement with me. That's fine!

SyntaxHighlighter doesn't produce very rich highlighting to begin with. No Python decorators, no Javascript regexps, no CSS @-rules, etc. It also seems to be downright unable to highlight things that require more sophisticated parsing than a regular grammar, like names in function and class definitions. This is not bad in itself. The result still looks useful and leaves fewer places to screw up :-). But there are some issues with correctness anyway:


Not much is highlighted in Javascript.

SHJS looked promising since it uses language definitions from the GNU source-highlight project, and I thought those guys would do their job rather meticulously. But in practice it mishandled highlighting more than any of the others:


Class inheritance (A < B) in Ruby breaks the whole line.

Google Code Prettify works very well in terms of both richness and correctness. It can highlight CSS and Javascript within HTML, and it recognizes Python decorators and Javascript regexps. Speaking of the latter, it was from Prettify that I borrowed ideas on how to implement those in highlight.js.

I've found very few issues with it:


That <not> inside CDATA shouldn't be highlighted as tag.

As for highlight.js, it's pushed down to the end of the comparison for a reason :-). Obviously there won't be any correctness issues, since I used code snippets from its own test suite, which it successfully passes. Of course that doesn't in any way mean it's bug-free. But where the library really stands out is highlighting richness. It just knows much more about languages than the others. Here are just the features visible in this very test case that are unique to highlight.js:

Some of the recognized features (like variables in PHP) are deliberately not styled to maintain visual sanity. Most of these features (and those in other languages) are the result of the elaborate effort of many highlight.js contributors in defining the most intricate parsing rules (just look at the Perl definition, for example).


HTML with embedded Javascript and CSS. All sorts of ways to define tag attributes are supported.

Completely balanced conclusion

If you need a solid syntax highlighter (and don't care about line numbers or striped backgrounds) use highlight.js. It is small, fast, rich and correct!

And if you don't like something about it — contribute!

Comments: 17

  1. mktums

    Congratulations on the new release! =)

  2. David

    Okay, I took a look at your source code, and it looks like a few bits of CSS would give you line numbers and alternating row colors:

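    /* Start a fresh line counter for every <code> block. */
    code {
        counter-reset: code-lines;
    }

    /* Each <span> is treated as one line of code. */
    code span {
        background-color: grey;
        counter-increment: code-lines;
    }

    /* Alternate the background on even-numbered spans. */
    code span:nth-child(even) {
        background-color: lightgrey;
    }

    /* Print the counter in front of every fifth span. */
    code span:nth-child(5n):before {
        content: counter(code-lines) ". ";
    }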

    This is off the top of my head, but that should produce a different set of line numbers per <code> block with alternating colors and only number every 5th line.

  3. tmont

    I'd be interested in hearing your opinion on Sunlight: http://sunlightjs.com/demo.html.

    disclaimer: I'm the author.

    And if you think it sucks, I wouldn't mind hearing some bug reports, too :).

  4. mikesamuel@gmail.com

    Google Code Prettify allows one to specify a language, and the language extensions map to file extensions. So if you know the filename that you're highlighting, you know which language extension to load and which to apply to a given code block. I think it also supports the HTML5 idiom that you use as your definition for HTML 5 compatibility, as long as the code element also has the class prettyprint.

  5. David:

    > Okay, I took a look at your source code, and it looks like a few bits of CSS would give you line numbers and alternating row colors:

    David, your solution won't work because <span>s don't correspond to lines. I tried to do this some time ago with CSS counters, but it gets harder than it seems pretty quickly.

    tmont:

    > I'd be interested in hearing your opinion on Sunlight: http://sunlightjs.com/demo.html

    It looks pretty interesting! I'll try to look at it in detail after my vacation. Thanks!

    mikesamuel@gmail.com:

    > Google Code Prettify allows one to specify a language and the language extensions map to file extensions

    Mike, I know, I used it for CSS in the test. It's pretty obvious that no heuristic can be 100% reliable, so you have to have some way to explicitly specify the language.

    > I think it also supports the HTML5 idiom that you use as your definition for HTML 5 compatibility as long as the code element also has the class prettyprint.

    You can look at it that way. But my goal with highlight.js was more practical: I wanted code marked up according to the spec to be highlighted without additional hassle. An even more practical need was to highlight code in just <pre><code>..</code></pre> without classes, because that's what Markdown does for code fragments.

  6. ActionJake

    An interesting CSS trick for zebra-striping would be to count the number of matches for newline and then create a background gradient with the appropriate color stops. Example below for 5 lines of code.

    .someCodeBlock { background-image: linear-gradient(to bottom, #000, #000 20%, #fff 20%, #fff 40%, #000 40%, #000 60%, #fff 60%, #fff 80%, #000 80%, #000); }

    Should be easy enough to build these gradients up programmatically. Might have to do some adjustments for padding, etc.

    Oh crap! Wrapping! hm... the plot thickens.

  7. ActionJake

    Nevermind, code blocks don't wrap and none of these seem to have wrapping as a feature :) So my suggestion could still work.

  8. leaverou@gmail.com

    Thank you, I've been looking for a syntax highlighter and this looks quite interesting.

    @ActionJake: You just need 2 color stops at 1.2em (assuming your line-height is 1.2) and background-size to set the size of the gradient to 2.4em. Then it's repeated and matches every line.

    I've used this technique in my blog: http://leaverou.me/

  9. emwendelin@openid.com

    Considering how much smaller highlight.js is compared to SyntaxHighlighter, it might be worth switching to.

    Also, +1 to Lea :)

  10. Github uses pygments to highlight syntax. Pygments is running on the server, instead of a pure Javascript client solution. If you’re looking for a Javascript solution check out this review of the various options.

  11. Маниакальный веблог » Completely unfair comparison of …Completely unfair comparison of Javascript syntax highlighters. Маниакальный веблог, 22.05.2011. During the time before latest release of highlight.js 6.0 I …

  12. The first step was to choose which library would actually do the code coloring in the end. I was sure that JavaScript should do it (server-side coloring has, IMO, more drawbacks than benefits, and Gists in turn have the drawback of not showing up in GReader), and Google's Prettify library struck me as the least intrusive and relatively the most widespread (by the way, a lovely comparison of JS libraries is here). Stack Overflow uses it, as does the comments plugin on this site, and it has the added advantage that you don't need to specify the language of the code: Prettify can somehow color anything, although I don't understand exactly how it works. Thanks to the simple syntax (essentially you just add class="prettyprint" to the <pre> element), the plugin for Live Writer is also fundamentally simpler (and working), and adding it to a page is easy as well.

  13. Details for highlight.js ... Step 1. install wp-highlight.js WordPress plugin Step 2. optionally, replace wp/wp-content/wp-highlights/ with a custom configured wp-highlight.js Step 3. review highlight style demo for the plugin setting Step 4. usage: <pre><code class="java">...</code></pre> no-highlight, applescript, bash, cpp, css, http, java, javascript, json, markdown, matlab, objectivec, php, python, ruby, sql, xml

  14. There are a ton of syntax highlighting options available. I came across highlight.js reading the Completely Unfair Comparison of JavaScript Syntax Highlighters. After looking at various options, I decided to give highlight.js a try within the workflow of working with Markdown.

  15. And a comparing blog post on the Software Maniacs Blog.

  16. Raghavendra Samant

    SyntaxHighlighter 3x seems to have improved significantly since this analysis was done.

    You can easily select all and copy the code snippet, prettify XML easily, highlight lines of code, etc., so I found it much more user-friendly!
