
How do you debug your spark-dotnet app in Visual Studio?

When you run an application using spark-dotnet, you need to use spark-submit to start a Java virtual machine, which starts the spark-dotnet driver, which then runs your program. That leaves us with a problem: how do we write our programs in Visual Studio and press F5 to debug?

There are two approaches. The first is one I have used for years with dotnet when I want to debug something that is challenging to get a debugger attached to - think apps which spawn other processes and then fail in their startup routine. You can add a Debugger.Launch() call to your program, then when spark executes it a prompt will be displayed and you can attach Visual Studio to your program. (As an aside, I used to do this manually a lot by writing an __asm int 3 into an app to get it to break at an appropriate point. Great memories, but luckily we don’t need to do that anymore :)
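As a rough sketch of the Debugger.Launch() approach, something like the following should work. The namespace, the app name, the trivial query and the WAIT_FOR_DEBUGGER environment variable are all made up for illustration; the variable is only there so that normal runs are not interrupted by the prompt, and it assumes the process started by spark-submit inherits the environment of your shell.

```csharp
using System;
using System.Diagnostics;
using Microsoft.Spark.Sql;

namespace HelloSpark
{
    internal static class Program
    {
        private static void Main(string[] args)
        {
            // Hypothetical opt-in switch (not part of spark-dotnet): only ask for
            // a debugger when WAIT_FOR_DEBUGGER=1 is set in the environment.
            bool wantDebugger =
                Environment.GetEnvironmentVariable("WAIT_FOR_DEBUGGER") == "1";

            if (wantDebugger && !Debugger.IsAttached)
            {
                // Asks the OS to launch and attach a debugger to this process;
                // on Windows this is where the "choose a debugger" prompt appears.
                Debugger.Launch();
            }

            // From here on it is an ordinary spark-dotnet program, so any
            // breakpoint set below this line can be hit once attached.
            SparkSession spark = SparkSession
                .Builder()
                .AppName("debug-sketch")
                .GetOrCreate();

            // Placeholder work, just to have something to step through.
            DataFrame df = spark.Sql("SELECT 1 AS id");
            df.Show();

            spark.Stop();
        }
    }
}
```

The Debugger.IsAttached check means the prompt is skipped when you are already running under Visual Studio, so the same build can be used with either approach.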

The second approach is to start the spark-dotnet driver in debug mode. Instead of launching your app, it starts and listens for incoming requests - you can then run your program as normal (F5), set a breakpoint and your breakpoint will be hit.

Change your spark-submit command to:

spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master local microsoft-spark-2.4.x-0.4.0.jar debug

Then sit back and you should see something like this:

[Screenshot: spark-dotnet debug mode, https://the.agilesql.club/assets/images/spark/spark-dotnet-debug-mode.png]


spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master local microsoft-spark-2.4.x-0.4.0.jar debug
19/08/20 17:21:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/08/20 17:21:22 INFO DotnetRunner: Starting DotnetBackend with .
19/08/20 17:21:23 INFO DotnetRunner: Port number used by DotnetBackend is 5567
* .NET Backend running debug mode. Press enter to exit *

Then you can press F5 in Visual Studio and you’ll end up with the console output from spark, the console output from your app (assuming it is a console app!) and the Visual Studio debugger with everything you would expect, like breakpoints, watches, etc:

[Screenshot: spark-dotnet F5 debugging, https://the.agilesql.club/assets/images/spark/debug-f5.png]

Now, I didn’t tell you but it gets even more exciting - once you kill your app you can execute more apps against the same spark instance, so you no longer have to create a new spark instance every time you want to run an app - wowsers, just think how much faster that will make your integration tests :)

