Friday, April 26, 2013

String Extensions: Split Qualified

Splitting on a string is easy.
Respecting qualified (quoted) strings can be hard.
Identifying escaped characters in qualified strings is very tricky.
Splitting on a qualified string that takes escape characters into account is really difficult!

Unit Tests

[Theory]
[InlineData(null,                   new string[0])]
[InlineData("",                     new string[0])]
[InlineData("hello world",          new[] { "hello", "world" })]
[InlineData("hello   world",        new[] { "hello", "world" })]
[InlineData("\"hello world\"",      new[] { "\"hello world\"" })]
[InlineData("\"hello  world\"",     new[] { "\"hello  world\"" })]
[InlineData("hello \"goodnight moon\" world", new[]
{
    "hello", 
    "\"goodnight moon\"", 
    "world", 
})]
[InlineData("hello \"goodnight \\\" moon\" world", new[]
{
    "hello", 
    "\"goodnight \\\" moon\"", 
    "world", 
})]
[InlineData("hello \"goodnight \\\\\" moon\" world", new[]
{
    "hello", 
    "\"goodnight \\\\\"", 
    "moon\"", 
    "world", 
})]
public void SplitQualified(string input, IList<string> expected)
{
    var actual = input
        .SplitQualified(' ', '"')
        .ToList();
 
    Assert.Equal(expected.Count, actual.Count);
 
    for (var i = 0; i < actual.Count; i++)
        Assert.Equal(expected[i], actual[i]);
}

String Extension Methods

public static IEnumerable<string> SplitQualified(
    this string input, 
    char separator, 
    char qualifier, 
    StringSplitOptions options = StringSplitOptions.RemoveEmptyEntries, 
    char escape = '\\')
{
    if (String.IsNullOrWhiteSpace(input))
        return new string[0];
 
    var results = SplitQualified(input, separator, qualifier, escape);
 
    return options == StringSplitOptions.None
        ? results
        : results.Where(r => !String.IsNullOrWhiteSpace(r));
}
 
private static IEnumerable<string> SplitQualified(
    string input, 
    char separator, 
    char qualifier, 
    char escape)
{
    var separatorIndexes = input
        .IndexesOf(separator)
        .ToList();
 
    var qualifierIndexes = input
        .IndexesOf(qualifier)
        .ToList();
 
    // Remove Escaped Qualifiers
    for (var i = 0; i < qualifierIndexes.Count; i++)
    {
        var qualifierIndex = qualifierIndexes[i];
        // Count the consecutive escape characters directly before the qualifier.
        // An even count (including zero) means the escapes cancel each other
        // out, so the qualifier is real and stays in the list.
        var escapeCount = 0;
        for (var j = qualifierIndex - 1; j >= 0 && input[j] == escape; j--)
            escapeCount++;
 
        if (escapeCount % 2 == 0)
            continue;
 
        qualifierIndexes.RemoveAt(i);
        i--;
    }
 
    // Remove Qualified Separators
    if (qualifierIndexes.Count > 1)
        for (var i = 0; i < separatorIndexes.Count; i++)
        {
            var separatorIndex = separatorIndexes[i];
 
            for (var j = 0; j < qualifierIndexes.Count - 1; j += 2)
            {
                if (separatorIndex <= qualifierIndexes[j])
                    continue;
 
                if (separatorIndex >= qualifierIndexes[j + 1])
                    continue;
 
                separatorIndexes.RemoveAt(i);
                i--;
                break; // This separator is gone; no need to check other pairs.
            }
        }
 
    // Split String On Separators
    // Track where the next segment starts instead of using index 0 as a
    // sentinel, so a leading or trailing separator is handled correctly.
    var startIndex = 0;
    foreach (var separatorIndex in separatorIndexes)
    {
        yield return input.Substring(startIndex, separatorIndex - startIndex);
 
        startIndex = separatorIndex + 1;
    }
 
    // Everything after the last separator (the whole input if there were none).
    yield return input.Substring(startIndex);
}
 
public static IEnumerable<int> IndexesOf(
    this string input, 
    char value)
{
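    // Only skip null or empty input; a whitespace-only string can still
    // contain the character being searched for (for example, a space).
    if (!String.IsNullOrEmpty(input))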
    {
        var index = -1;
        do
        {
            index++;
            index = input.IndexOf(value, index);
 
            if (index > -1)
                yield return index;
            else
                break;
        }
        while (index < input.Length);
    }
}

Enjoy,
Tom

Saturday, April 13, 2013

Report Unhandled Errors from JavaScript

Logging and aggregating error reports is one of the most important things you can do when building software: 80% of customer issues can be solved by fixing 20% of the top-reported bugs.

Almost all websites have at least some form of error logging on their servers, but what about the client side of those websites? People have a tendency to brush over best practices for client side web development because "it's just some scripts." That is WRONG! Your JavaScript is your client application. It is how users experience your website, and as such it needs the same attention and maintenance as any other rich desktop application.

So then, how do you actually know when your users are experiencing errors in their browser? If you are like the vast majority of websites out there...

You don't know about JavaScript errors, and it's time to fix that!

window.onerror

Browsers do offer a way to get notified of all unhandled exceptions: the window.onerror event handler. You can assign a function to this global event handler, and it will receive three parameters: the error message, the URL of the file in which the script broke, and the line number where the exception was thrown.

window.onerror = function myErrorHandler(errorMsg, url, lineNumber) {
  // TODO: Something with this exception!
  // Just let default handler run.
  return false;
}
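
As a minimal sketch of what that TODO might do, the handler below collects the three arguments into a plain object and queues it for later sending. The names buildErrorReport and pendingReports are hypothetical and not part of any library, and the typeof window guard is only there so the snippet can also be loaded outside a browser.

```javascript
// Hypothetical queue of reports waiting to be sent to the server.
var pendingReports = [];

// Collect the three onerror arguments into one plain object.
function buildErrorReport(errorMsg, url, lineNumber) {
  return {
    message: String(errorMsg),
    url: url || '(unknown)',
    line: typeof lineNumber === 'number' ? lineNumber : null
  };
}

function myErrorHandler(errorMsg, url, lineNumber) {
  pendingReports.push(buildErrorReport(errorMsg, url, lineNumber));
  // Returning false lets the browser's default handler run as well.
  return false;
}

// Only wire up the handler when running in a browser.
if (typeof window !== 'undefined') {
  window.onerror = myErrorHandler;
}
```

Queuing the reports instead of sending them immediately keeps the handler itself trivial, which matters because an exception thrown inside an error handler is easy to lose.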

StackTrace.js

JavaScript can throw exceptions like any other language. Browser debugging tools often show you a full stack trace for unhandled exceptions, but gathering that information programmatically is trickier. To learn a bit more about the JavaScript language and how to gather this information yourself, I suggest taking a look at this article by Helen Emerson. However, in practice I would strongly suggest you use a more robust tool...

StackTrace.js is a very powerful library that will build a fully detailed stack trace from an exception object. It has a simple API and cross browser support, it handles fringe cases, and it is very lightweight and unobtrusive to your other JS libraries.

try {
    // error producing code
} catch(error) {
   // Returns stacktrace from error!
   var stackTrace = printStackTrace({e: error});
}
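
If StackTrace.js might not be loaded on every page, one possible fallback is to read the non-standard (but widely supported) error.stack property. This is only a sketch: getTrace is a hypothetical helper name, and the fallback lines will not be normalized across browsers the way the library's output is.

```javascript
// Return the stack trace as an array of lines.
// Prefers StackTrace.js when its global printStackTrace function is present.
function getTrace(error) {
  if (typeof printStackTrace === 'function') {
    return printStackTrace({ e: error });
  }
  // Fallback: error.stack is non-standard, so check that it exists first.
  if (error && typeof error.stack === 'string') {
    return error.stack.split('\n');
  }
  return [];
}

// Usage: capture the trace of a caught exception.
var capturedTrace = [];
try {
  throw new Error('example failure');
} catch (error) {
  capturedTrace = getTrace(error);
}
```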

Two Big Problems

  1. The window.onerror callback does not contain the actual error object.

This is a big problem because without the error object you cannot rebuild the stack trace. The error message is always useful, but file names and line numbers will be completely useless once you have minified your code in production. Currently, the only way you can bring additional information up to the onerror callback is to try catch any exceptions that you can and store the error object in a closure or global variable.

  2. If you globally try catch event handlers it will be harder to use a debugger.

It would not be ideal to wrap every single piece of code that you write in an individual try catch block. And if you wrap your generic event handling methods in try catch blocks instead, those catch blocks will interrupt your debugger when you are working with code in development.

Currently my suggestion is to wrap the generic event handling methods, but to only deploy those interceptors with your minified or production code.
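
The idea of catching an exception just long enough to remember its error object can be sketched as a small wrapper. This is an illustration only: captureErrors and lastError are hypothetical names, and the wrapper rethrows so that the exception still reaches window.onerror.

```javascript
// Holds the most recent error caught by a wrapped function (hypothetical name).
var lastError = null;

// Wrap fn so that any exception it throws is remembered and then rethrown.
function captureErrors(fn) {
  return function wrapped() {
    try {
      // Preserve the original `this`, arguments, and return value.
      return fn.apply(this, arguments);
    } catch (e) {
      lastError = e;
      throw e; // Still surfaces as an unhandled error.
    }
  };
}

// Usage: wrap a handler that fails.
var risky = captureErrors(function (name) {
  throw new Error('failed for ' + name);
});
```

Because the wrapper rethrows, a debugger set to break on caught exceptions will stop inside it, which is the trade-off described above.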

jQuery Solution

This global error handling implementation for jQuery and ASP.NET MVC is only 91 lines of JavaScript and 62 lines of C#.

Download JavaScriptErrorReporter from GitHub

To get as much information as possible, you need to wire up to three things:
(Again, I suggest that you only include this when your code is minified!)

  1. window.onerror
  2. $.fn.ready
  3. $.event.dispatch

Here is the meat of those wireups:

var lastStackTrace,
    reportUrl = null,
    prevOnError = window.onerror,
    prevReady = $.fn.ready,
    prevDispatch = $.event.dispatch;
 
// Replace the global methods with our wrappers.
window.onerror = onError;
$.fn.ready = readyHook;
$.event.dispatch = dispatchHook;
 
function onError(error, url, line) {
    var result = false;
    try {
        // If there was a previous onError handler, fire it.
        if (typeof prevOnError == 'function') {
            result = prevOnError(error, url, line);
        }
        // If the report URL is not loaded, load it.
        if (reportUrl === null) {
            reportUrl = $(document.body).attr('data-report-url') || false;
        }
        // If there is a report URL, send the stack trace there.
        if (reportUrl !== false) {
            var stackTrace = getStackTrace(error, url, line, lastStackTrace);
            report(error, stackTrace);
        }
    } catch (e) {
        // Something went wrong, log it.
        // Check through window so a missing console cannot throw here.
        if (window.console && console.log) {
            console.log(e);
        }
    } finally {
        // Clear the wrapped stack so it does not get reused.
        lastStackTrace = null;
    }
    return result;
}
 
function readyHook(fn) {
    // Call the original ready method, but with our wrapped interceptor.
    return prevReady.call(this, fnHook);
 
    function fnHook() {
        try {
            fn.apply(this, arguments);
        } catch (e) {
            lastStackTrace = printStackTrace({ e: e });
            throw e;
        }
    }
}
 
function dispatchHook() {
    // Call the original dispatch method.
    try {
        // Pass the original return value through to the caller.
        return prevDispatch.apply(this, arguments);
    } catch (e) {
        lastStackTrace = printStackTrace({ e: e });
        throw e;
    }
}

Identifying Duplicate Errors

One last thing to mention is that when your stack trace arrives on the server it will contain file names and line numbers. The inconsistency of these numbers will make it difficult to identify duplicate errors. I suggest that you "clean" the stack traces by removing this extra information when trying to create a unique error hash.

private static readonly Regex LineCleaner
    = new Regex(@"\([^\)]+\)$", RegexOptions.Compiled);
 
private int GetUniqueHash(string[] stackTrace)
{
    var sb = new StringBuilder();
 
    foreach (var stackLine in stackTrace)
    {
        var cleanLine = LineCleaner
            .Replace(stackLine, String.Empty)
            .Trim();
 
        if (!String.IsNullOrWhiteSpace(cleanLine))
            sb.AppendLine(cleanLine);
    }
 
    return sb
        .ToString()
        .ToLowerInvariant()
        .GetHashCode();
}
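
For readers who would rather see the cleaning step in JavaScript, here is a rough equivalent of the regular expression above. The sample stack lines are made up for illustration and assume each line ends with a parenthesized location. cleanStackTrace is a hypothetical helper, and the hashing step is left out.

```javascript
// Same pattern as the C# LineCleaner: a trailing "(...)" group.
var lineCleaner = /\([^)]+\)$/;

// Strip trailing locations, trim, lowercase, and drop empty lines.
function cleanStackTrace(stackTrace) {
  var cleaned = [];
  for (var i = 0; i < stackTrace.length; i++) {
    var line = stackTrace[i].replace(lineCleaner, '').trim().toLowerCase();
    if (line.length > 0) {
      cleaned.push(line);
    }
  }
  return cleaned.join('\n');
}

// Two made-up traces that differ only in file name and line numbers.
var first = cleanStackTrace([
  'onClick (http://example.com/app.min.js:1:2345)',
  'dispatch (http://example.com/jquery.min.js:3:4816)'
]);
var second = cleanStackTrace([
  'onClick (http://example.com/app.v2.min.js:1:2399)',
  'dispatch (http://example.com/jquery.min.js:3:4816)'
]);
// first and second are now the same string, so they would hash identically.
```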

Integration Steps

This article was meant to be more informational than tutorial, but if you are interested in trying to apply this to your site, here are the steps that you would need to take:

  1. Download JavaScriptErrorReporter from GitHub.
  2. Include StackTrace.js as a resource in your website.
  3. Include ErrorReporter.js as a resource in your website.
    • Again, to prevent it interfering with your JavaScript debugger, I suggest only including this resource when your scripts are being minified.
  4. Add a report error action to an appropriate controller. (Use the ReportError action on the HomeController as an example.)
  5. Add a "data-report-url" attribute with the fully qualified path to your report error action to the body tag of your pages.
  6. Log any errors that your site reports!

Enjoy,
Tom
