Thursday, March 19, 2015

Working with Syntax Highlighter in Blogger Simplified

I’ve been attempting to clean up my blog and make it a little more user friendly for myself and visitors, and given that it’s a blog by a developer, there is bound to be a code sample or two posted on it. When I see a code sample on a site, I like to be able to scroll the code block to see the full thing, or highlight sections to copy if I like. Needless to say, it’s important to display code samples properly on a blog for coding.

When I was first throwing this blog together I searched for the best approach to displaying code samples. Some people were using Syntax Highlighter with some tweaks and troubles here and there; others said the simplest approach was to just create a GitHub Gist and copy the Gist embed URL into your post. The Gist approach seemed the simplest so I decided to go with that to start. Well, now that I tackled that with all the effort it took (none, seriously) I decided to try my hand at Syntax Highlighter.


The Problem

With Gist embeds being so easy to approach, why would you want to consider using Syntax Highlighter? The Gist approach has some really good benefits:
  1. Gists reside in GitHub which means the code is not stored in your blog.
  2. It also means you can update your Gist and the updated version will appear in your blog.
  3. It’s really easy to copy the embed code and paste it in your blog.
My issue, however, is that I sometimes want to display little blurbs of code throughout a post. When I show an example of something I like to explain the blocks individually before throwing an entire solution at your face. To that point, Syntax Highlighter has some advantages:
  1. You don’t need a GitHub account to include code samples on your blog.
  2. If you want to display a bunch of little blocks of code, you don’t have to create a bunch of little Gists.
  3. It’s fewer steps. 
The fewer steps thing for me is key. Navigating to GitHub to create Gists for each sample I need, then back to my post, then back to Gist, then back to my post, just seemed too annoying, to be honest.

I find that making something more straightforward or quicker to use often leads to increased use. Apple has built an empire on this concept. The biggest drawback to me, though, for Syntax Highlighter, was that all the instructions I saw said you have to switch into HTML mode of your post to create a <pre> tag to include your code. And, yes, that’s a problem for me because, yes, I can be that lazy.

First Things First

First, to get it out of the way, most of the information I have seen says that Syntax Highlighter has a problem with Dynamic Templates in Blogger. I don’t have a Dynamic Template, so I can’t comment on this point. Second, read up on the details of Syntax Highlighter at http://alexgorbatchev.com/SyntaxHighlighter/, especially the Themes and Autoloader.

As a third and final point, specific to this solution, I utilize the jQuery library so we will have to include that. Syntax Highlighter does not require jQuery to function, but I am doing some work with it to make working with Syntax Highlighter in Blogger easier, so it’s necessary for this.

Now, most instructions I have seen require copying and pasting a bunch of script includes into your header and, while this solution involves some of that, I wanted better performance than that offers. When you are dealing with multiple programming languages - JavaScript, XML, C# - you have to include the brush scripts for each one. Thankfully, a bunch of other people wanted to avoid having all that load right away too, so, being the great supporter he is, Alex Gorbatchev created an Autoloader script to solve this.

With the Autoloader script, you specify an array of “brush mappings” that map the brush types to the script. Only when you use a specific brush in your page is the brush script loaded. You still have to load the core JS files required, including the additional SHAutoloader.js, but this greatly simplifies and speeds up the page load.
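To make the idea of "brush mappings" concrete, here is a rough sketch of how a mapping string can be read: every token but the last is a brush alias, and the last token is the script URL. This is a simplified illustration only, not the Autoloader's actual source, and the function names `parseBrushMappings` and `scriptsToLoad` are hypothetical.

```javascript
// Simplified illustration of brush mappings; not the Autoloader's actual source.
// Each mapping is "alias alias ... url": the last token is the script URL.
function parseBrushMappings(mappings) {
    var table = {};
    mappings.forEach(function (mapping) {
        var parts = mapping.trim().split(/\s+/);
        var url = parts.pop();
        parts.forEach(function (alias) {
            table[alias] = url;
        });
    });
    return table;
}

// Given the brush names actually used on a page, return the unique scripts to load.
function scriptsToLoad(table, brushesOnPage) {
    var urls = [];
    brushesOnPage.forEach(function (brush) {
        var url = table[brush];
        if (url && urls.indexOf(url) === -1) {
            urls.push(url);
        }
    });
    return urls;
}

var table = parseBrushMappings([
    'c-sharp csharp http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCSharp.js',
    'javascript jscript js http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJScript.js'
]);

// A page that only uses "js" and "jscript" blocks needs just one brush script.
var needed = scriptsToLoad(table, ['js', 'jscript']);
console.log(needed);
```

The point of the lookup table is that a page with only JavaScript samples never requests the C# brush at all.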

My Solution

With that in mind, to kick off this solution you need to load up your Blogger Template into HTML edit mode. Follow instructions HERE or below:
  1. From your Blogger dashboard, click the Blog name you want to work with. This will take you to the Blog Overview page.
  2. On the left menu, select Template.
  3. Beneath your Template preview, click the Edit HTML button.
Next, scroll to the </head> tag in the template. You can find this quickly by pressing Ctrl+F and searching for "</head>". Right before this tag we need to paste the core JavaScript and CSS needed. This is going to include the jQuery core as well. Alex graciously hosts the resources for Syntax Highlighter on a public server so you can use his hosted links. Read up on it here.
[pre class="brush:xml" title="jQuery core, Syntax Highlighter Core and Default CSS, Syntax Highlighter Core JS, and the Autoloader"]
<script src='http://code.jquery.com/jquery-1.11.2.min.js' type='text/javascript'/>
<link href='http://alexgorbatchev.com/pub/sh/current/styles/shCore.css' rel='stylesheet' type='text/css'/>
<link href='http://alexgorbatchev.com/pub/sh/current/styles/shThemeDefault.css' rel='stylesheet' type='text/css'/>
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shCore.js' type='text/javascript'/>
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shAutoloader.js' type='text/javascript'/>
[/pre]
Without Autoloader, we would have to include an additional script for each brush we want to use. In my example case, it would end up adding the following to the above:
[pre class="brush:xml" title="Additional brush scripts"]
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCSharp.js' type='text/javascript'/>
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJScript.js' type='text/javascript'/>
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushXml.js' type='text/javascript'/>
[/pre]
Because we are using the Autoloader, though, we can ignore those. We add "brush mappings" later to deal with them.

Next, to activate Syntax Highlighter, there are 2 possible options you can choose:
  1. The typical approach of just setting up Syntax Highlighter, then editing your post HTML to add your <pre> tags and code blocks.
  2. My approach allowing you to add a "pre token" in the blog HTML, eliminating the need to edit post HTML.

Option 1

For option 1, the following block would be all you need, and you can add it right after your script includes from above, right before the </head> tag. It's also worth mentioning that, if you stop here and don't implement Option 2, you don't need jQuery as it's only used for my additional approach.
[pre class="brush:jscript;html-script:true;" title="Syntax Highlighter activation"]
<script type="text/javascript">
    SyntaxHighlighter.config.bloggerMode = true;
    SyntaxHighlighter.autoloader(
    'c-sharp csharp http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCSharp.js',
    'javascript jscript js http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJScript.js',
    'xml xhtml xslt html http://alexgorbatchev.com/pub/sh/current/scripts/shBrushXml.js'
    );
    SyntaxHighlighter.all();
</script>
[/pre]
There are 3 pieces to the above code:
  1. bloggerMode: For Syntax Highlighter to work with Blogger, you must set “bloggerMode” to true. It’s pretty self-explanatory, but this allows the script to work within Blogger.
  2. autoloader(): This function sets the array of “brush mappings”. In my block, I am mapping the brushes for C#, JavaScript, and XML/XHTML. Each mapping is its own string and only the brushes I use on the page will load. Read more about this here.
  3. all(): This tells Syntax Highlighter to process all the HTML on the page. Essentially, make my code pretty.

Option 2

As I mentioned, I wanted a more simplified approach to adding my sample code to my page, rather than having to edit the HTML, find my spot, and add my <pre> tag and code block each time. And since Blogger’s “Compose” mode gives you no way to insert a <pre> tag, I came up with a token-replace approach.

Using jQuery and a JavaScript RegEx, I take the post body, and look for a [pre] token that also has a “class” specified which is used to indicate the brush for Syntax Highlighter, along with other config options. It also must have a closing [/pre] token for this to work. Everything in between the tokens is then expected to be code.
[pre class="brush:xml" title='Code block in Blogger "Compose" mode']
&#91;pre class="brush:jscript" title="Test code block"&#93;
function doSomething() {
    alert('Syntax Highlighter Fun');
}
&#91;/pre&#93;
[/pre]
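The "class" value on the token is a semicolon-separated list of name:value pairs, with "brush" being the one that picks the language. As a small illustration of that format, and assuming nothing about how Syntax Highlighter parses it internally, a hypothetical helper `parseBrushClass` could split it like this:

```javascript
// Hypothetical helper: split a class value such as
// "brush:jscript;html-script:true;" into name/value pairs.
// This only illustrates the format; it is not Syntax Highlighter's own parser.
function parseBrushClass(classValue) {
    var options = {};
    classValue.split(';').forEach(function (pair) {
        var idx = pair.indexOf(':');
        if (idx === -1) {
            return; // skip empty or malformed segments
        }
        var name = pair.slice(0, idx).trim();
        var value = pair.slice(idx + 1).trim();
        if (name) {
            options[name] = value;
        }
    });
    return options;
}

var opts = parseBrushClass('brush:jscript;html-script:true;');
console.log(opts.brush);          // jscript
console.log(opts['html-script']); // true (as a string)
```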
With my script, the tokens are replaced with actual <pre> and </pre> tags, with the class attribute, and any other attributes specified, added. Your code block is then in between. Additionally, because we are dealing with Blogger "Compose" mode, there are <br> tags inserted in your code block, as well as double <br> tags after the block in some cases. To clean this up, I have another RegEx object that removes <br> tags at the end of the line from within the code block, and a condition to clean up double breaks that follow the code block.
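The token replacement and the <br> cleanup described above can be tried out on a plain string, without Blogger or jQuery. The sketch below reuses the same two patterns as my template script but written unencoded; the function name `replacePreTokens` and the sample markup are assumptions for illustration, and the temp div step is left out.

```javascript
// Same patterns as the template script, but written unencoded,
// since this sketch runs outside of a Blogger template.
var tokenRegX = /\[pre (class=".*?".*?)\]([\s\S]*?)\[\/pre\]/gi;
var trailingBrRegX = /<br\/?>$/gim;

// Hypothetical, DOM-free version of the token replacement.
function replacePreTokens(html) {
    return html.replace(tokenRegX, function (match, attrs, code) {
        // Drop the <br> that "Compose" mode appends to each line of code.
        var cleaned = code.replace(trailingBrRegX, '');
        return '<pre ' + attrs + '>\n' + cleaned + '\n</pre>';
    });
}

// Assumed shape of the post body markup after typing a block in "Compose" mode.
var composed =
    '[pre class="brush:jscript" title="Test code block"]<br>\n' +
    'function doSomething() {<br>\n' +
    "    alert('Syntax Highlighter Fun');<br>\n" +
    '}<br>\n' +
    '[/pre]';

var replaced = replacePreTokens(composed);
console.log(replaced);
```

The attributes captured from the token are copied onto the <pre> tag untouched, which is why the title and any other options survive the swap.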

As an additional aside, if you want to add an HTML entity to your code like I did above with the square brackets ("[" and "]"), you can use the "&#[code];" method, and an additional string.replace will swap the encoded "&" that Blogger puts there in "Compose" mode to an actual "&", allowing the proper character to display in your block (Note: I wrote about an alternative approach to this here: Properly adding Javascript in Blogger templates). The above example then becomes the below in your post:
[pre class="brush:jscript" title="Test code block"]
function doSomething() {
    alert('Syntax Highlighter Fun');
}
[/pre]
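The ampersand swap mentioned above is a one-line replace. As a minimal sketch, assuming "Compose" mode stores a typed "&#91;" as "&amp;#91;" in the post markup, and using a hypothetical function name `restoreTypedEntities`:

```javascript
// Blogger "Compose" mode stores a typed "&" as "&amp;", so a typed "&#91;"
// reaches the page markup as "&amp;#91;". Undo that one level of encoding.
function restoreTypedEntities(html) {
    return html.replace(/&amp;/g, '&');
}

var stored = '&amp;#91;pre class="brush:jscript"&amp;#93;';
var restored = restoreTypedEntities(stored);
console.log(restored); // &#91;pre class="brush:jscript"&#93;
```

Once the markup holds "&#91;" again, the browser renders it as a literal "[" inside the code block.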
All of this RegEx replace action is done in a temp div at execution time to keep from affecting the page too much. After this token-replace occurs, we then take the Syntax Highlighter code, from Option 1, and include it immediately after. This entire block, wrapped in a jQuery document-ready function, then goes before the </body> tag in the template.
[pre class="brush:jscript;html-script:true;" title="Complete Syntax Highlighter script"]
<script type="text/javascript">
    (function($) {
        $(function() {
            var regX = /\[pre (class=".*?".*?)\]([\s\S]*?)\[\/pre\]/mig;
            var preRegX = /&lt;br\/?&gt;$/mig;
            $('div.post-body').each(function() {
                if (regX.test($(this).html())) {
                    var htmlStr = $(this).html().replace(regX,"&lt;pre $1&gt;\n$2\n&lt;/pre&gt;");
                    var tmp = $('<div></div>');
                    tmp.html(htmlStr);
                    tmp.find('pre').each(function() {
                        var preStr = $(this).html().replace(preRegX,'');
                        preStr = preStr.replace(/&amp;amp;/g,'&amp;');
                        $(this).html(preStr);
                        // nextSibling detects TextNode - jQuery.next() does not
                        var n1 = $(this)[0].nextSibling;
                        var n2 = (n1 ? n1.nextSibling : null);
                        if ((n1 != null &amp;&amp; n2 != null) &amp;&amp; (n1.tagName == n2.tagName)) {
                            n2.parentNode.removeChild(n2);
                        }
                    });
                    $(this).html(tmp.html());
                }
            });
            SyntaxHighlighter.config.bloggerMode = true;
            SyntaxHighlighter.autoloader(
            'c-sharp csharp http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCSharp.js',
            'javascript jscript js http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJScript.js',
            'xml xhtml xslt html http://alexgorbatchev.com/pub/sh/current/scripts/shBrushXml.js'
            );
            SyntaxHighlighter.all();
        });
    })(jQuery);
</script>
[/pre]
One other important note to make is that Blogger uses an XML parser to process your template markup, so some characters like angle brackets and ampersands have to be encoded in the script. When you paste the above code block and try to save the template, you might get an XML parsing error. If you do, just encode the appropriate characters with the proper HTML entity encoding (refer to this chart).
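If you would rather not encode a script by hand, the three characters involved can be swapped with a few replace calls before pasting. This is only a sketch with a hypothetical helper name, `encodeForTemplate`; it covers "&", "<" and ">" and nothing else.

```javascript
// Hypothetical helper: encode the characters an XML parser would trip on.
// "&" must be handled first, otherwise the "&" in "&lt;" would be re-encoded.
function encodeForTemplate(script) {
    return script
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

var line = 'if ((n1 != null && n2 != null) && (n1.tagName == n2.tagName)) {';
console.log(encodeForTemplate(line));
// if ((n1 != null &amp;&amp; n2 != null) &amp;&amp; (n1.tagName == n2.tagName)) {

console.log(encodeForTemplate('var preRegX = /<br\\/?>$/mig;'));
// var preRegX = /&lt;br\/?&gt;$/mig;
```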

Update: An alternative approach to using the entity references is to use CDATA sections in your JavaScript. Read my other post for information on this: Properly adding Javascript in Blogger templates.

With this extra bit of setup in place I can now create a blog post, add my [PRE] tokens with the specified brushes and titles for my code blocks, save, and publish without touching the HTML of the post. Now, I am sure there are situations I haven't accounted for, and hiccups that might be encountered, but this should account for most scenarios I need. And I think it's a pretty good start.

13 comments:

  1. When I pasted your code on my blog template, BLOGGER throw this error :
    Error parsing XML, line 1008, column 31: Element type "br" must be followed by either attribute specifications, ">" or "/>".

    Can you help me to fix it?
    screenshot : http://i.imgur.com/QjfIbrh.png


  2. more words
    I did replace the code with HTML Codes in the chart you gave, but got error in this line, and can not update (cause i dont know javascript).

    if ((n1 != null && n2 != null) && (n1.tagName == n2.tagName)) {
                                n2.parentNode.removeChild(n2);
                            }

    Replies
    1. For the issue on the BR tag, replace the less-than and greater-than signs with their encoded counterparts, "&lt;" and "&gt;", in the code. For the second issue, replace the ampersand signs "&" with the encoded value "&amp;" as well. This means, where you have "&&" it should read "&amp;&amp;".

      It's a little silly, right now, because the template markup is basically an XML document parsed by Blogger/Google to render the content. Because it's an XML document, I am going to test an alternative that might allow us to avoid this encoding going forward.

  3. I added a new post that talks about a better way to add Javascript to a Blogger template so you can avoid the entity encoding that I mention. Read it here: http://beendaved.blogspot.com/2015/07/properly-adding-javascript-in-blogger.html.

    Using CDATA Sections inside your script tags, you can avoid the parsing errors you would receive otherwise.

  4. Hi,

    Thanks for good explanation. I've noticed that "<script src='http://code.jquery.com/jquery-1.11.2.min.js' type='text/javascript'/>" line breaks "Dynamic" template, although works fine with "Simple" or "Awesome Inc." templates.

    Replies
    1. I haven't looked into it, but it could simply be because a version of jQuery is already being loaded into those templates. If that is the case, that line can be omitted, as it will cause a jQuery version conflict. And since a version would already be included, it would be unnecessary to include another.

  5. Ok, newbie here, be gentle. Managed to follow all the instructions, but to highlight C# code block would the class be
    [pre class="brush:csharp" title="my first C# test highlight"]

    looking at the syntaxhighlighter page, it looks to me like I could use c-sharp or c# also, is that correct?

    lastly, my blogger template seems to block the colour high lighting, is this because I haven't got the [pre] class header correct?

    Replies
    1. Funny enough, that first sentence is something I saw often when jumping between platforms. We're all newbies at some point. I always try to be gentle enough. ;)

      For the brush, you can use either "csharp" or "c-sharp". I prefer "csharp", but it's a personal preference. The line you pasted looks accurate. As for the highlighting, is the script executing and creating the code block for you? Do you have the correct brush script file included on your page? Remember, you have to tell SyntaxHighlighter which brushes you want to support (refer to Option 1 at top). If you forget to load/include a brush script, it won't do anything.

      Do you have a sample page available for public viewing that I could inspect and help with?

  6. Excellent article! But it needs a minor correction: according to my own experience, and this Stack Overflow answer here, all of the "Syntax Highlighter activation" code must come AFTER any "<pre...>" statements in your post body. Therefore, you must place the "Syntax Highlighter activation" code near the end of your HTML template, just before "</body>" instead of just before "</head>".

    Replies
    1. Thanks for the comment and suggestion. However, in the case of the approach used here, it's actually not a concern. You can tell, because my blog is set up exactly how I specify in the posts for this, and it's working. ;)

      The reason is the way the implementation takes effect. In the link you provided for Stack Overflow, it's the raw method for loading and applying Syntax Highlighter, with no modifications. When you call "SyntaxHighlighter.all()" it immediately tries to locate the "<pre>" tags and create the script blocks in the HTML. It would make sense, then, to have this block at the bottom, because you want to wait for the entire page to load. SyntaxHighlighter is not framework dependent, so you don't need something like jQuery for it to work, so this approach makes sense.

      If you look at my complete script block a little up, however, you'll see it's wrapped in a jQuery "Ready" function using the shorthand approach of "$(function() {})". This is because, before I call the "SyntaxHighlighter.all()" function, I actually modify my HTML and do a find-replace of a "[pre]" token to make it a "<pre>" tag. This approach is what allows me to stay in WYSIWYG mode in Blogger, and not have to go into source view to add the SyntaxHighlighter blocks. After I modify my HTML and make all the "<pre>" tags, I then call "SyntaxHighlighter.all()" which executes on the entire document like it should. And because all of this is wrapped in the jQuery "Ready" function, it happens after the document has loaded the HTML.

  7. Thanks again for this article. By the way, I built off of your information and took it one step further--customizing background colors and things for the syntax highlighting. See here: http://www.electricrcaircraftguy.com/2016/10/syntaxhighlighter.html.

  8. Ohh Man, Thanks for very detailed information. Though I'm already using the syntax highlighter in my website http://www.dotnet4techies.com/, I have extended where it saves my time by enclosing the code in [pre][/pre] expressions. Thank you very much once again.

  9. Thanks for finding this out. I tested
