ScriptDom parsing and NoViableAltExceptions

If you have ever tried to debug a program that used the TSql Script Dom to parse some T-SQL you will know that the process is extremely slow and this is due to the volume of NoViableAltExceptions (and others) that are thrown and then caught. Because these are first chance exceptions they are being handled and it is the way that the script dom interacts with Antlr and the Lexer that they use. When you debug a program what happens is you have two processes, process one is the debuger, this starts (or attaches) to process two, the debugee.

The debugger calls a windows function WaitForDebugEvent typically in a “while(true)” loop (everyone should write a windows debugger at some point in their lives you learn so much, in fact put down ssms and go write your first debugger loop: https://msdn.microsoft.com/en-us/library/windows/desktop/ms681675(v=vs.85).aspx). The debugee app is then run and when something interesting like an exception or a dll is loaded/unloaded the debuggee is paused (i.e. all threads stopped), then WaitForDebugEvent returns and the debugger can look at the child process and either do something or call WaitForDebugEvent again. Even if the debugger doesn’t care about the exceptions the debugee is still paused and when you parse T-SQL under a debugger, even if you tell Visual Studio to ignore the exceptions, the debugee is still paused for every exception just so Visual Studio (or your home baked debugger) can decide that it wants to ignore that exception and the debugee is started up again.

What this means for an app that throws lots of first chance exceptions is a constant start, stop, start, stop which is so so painful for performance - it is basically impossible to debug a TSql Script Dom parse on a large project, I typically debug a project with like one table and one proc and hope it gives me everything I need or do other tricks like letting the parsing happen without a debugger attached then attach a debugger at the right point after the parsing has happened but then again I don’t have to debug the TSql Lexer’s!

So where is this leading?

I was wondering what effect these first chance exceptions had on T-SQL and even in normal operations where we don’t have a debugger attached, is there something we can do to speed up the processing?

The first thing I wanted to do was to try to reproduce a NoViableAltException, I kind of thought it would take me a few goes but actually the first statement I wrote caused one:

“select 1;”

This got me curious so I tried just:

“select 1”

Guess what? no NoViableAltException the second time - this didn’t look good, should we remove all the semi-colon’s from our code (spoiler no!).

Ok so we have a reproducable query that causes a first chance exception, what if we parse this like 1000 times and see the times and then another 1000 times with the semi-colon replaced with a space (so it is the same length)?

Guess what? The processing without the semi-colon took just over half the time of the queries with semi-colons, the average time to process a small query with a semi-colon in took 700ms and the query without the semi-colon took 420ms so much faster but who cares about 300 milli seconds? it is less than 1 second and really won’t make much difference in the overall processing time to publish a dacpac.

I thought I would just have one more go at validating a real life(ish) database so I grabbed the world wide importers database and scriptied out the objects and broke it into batches, splitting on GO and either leaving semi-colons or removing all semi-colons - when I had semi-colons in the time it took to process was 620 ms and there were 2403 first chance exceptions. The second run without semi-colons which would likely create invalid sql in some cases - took 550 ms and there were still 1323 first chance exceptions, I think if we could get rid of all teh first chance exceptions the processing would be much faster but ho hum - to handle the first chance exceptions you just need a fast CPU and not to be a process that is being debugged.