If you write code to SQL Server then you might be interested in this: at the end of 2017 I wrote a tSQLt tdd training course which has helped over 300 people learn both tSQLt and how to apply TDD practices to their SQL Server T-SQL development, you can join the course at https://courses.agilesql.club. The course is free if you are happy to wait 10 weeks to complete it, with 1 lesson being made available per week - if you are in more of a hurry or you would like to help support the project you can purchase the course which makes it instantly available to you.

What collation variables take on in T-SQL

I was asked an interesting question about collations in sql server recently about where variables in a script got their collation’s from. I really wasn’t too sure of the answer, I generally try to ensure that databases are in the company default collation although I realise that this is not always possible.

Given how important this is I thought I would investigate it a little:

The first question I had was what collaction is the actual T-Sql script compiled in, is it the database or server?

Given this script (guess how the sensitive and non-sensitive databases have been setup!):

use sensitive

declare @var int
select @VAR = 1

go
use nonsensitive

declare @var int
select @VAR = 1

On a server that is case sensitive, both fail:

Msg 137, Level 15, State 1, Line 14
Must declare the scalar variable “@VAR”.

On a server that is case insensitive, I get:

Command(s) completed successfully.

So the actual script is compiled in the collation of the server.

What about the collation of variables? Are the values of variables defined in script, the same as the database or server?

use sensitive

select case when ‘i’ = ‘I’ then ‘equal’ else ‘not equal’ end as equality_case_sensitive_db;
go
use nonsensitive

select case when ‘i’ = ‘I’ then ‘equal’ else ‘not equal’ end as equality_non_case_sensitive_db;
go

For this, the case sensitive one shows “not equal” and the case insensitive one shows “equal”

Great so values are set using the database.

Next I wanted to know if values populated from a table followed the database collation or used the collation that the table used, if it was different from the database. First of all in the sensitive database I check a value in a table with a collation that is not case sensitive:

use sensitive;
go

if object_id(‘ins’) is not null
drop table ins;
go

create table ins (
data varchar(25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
)
go
insert into ins
select ‘i’

declare @I varchar(25)
select @I = ‘I’

declare @i_from_data varchar(25)
select top 1 @i_from_data = data from ins

select case when @i = @i_from_data then ‘equal’ else ‘not equal’ end as equality;

This resulted in “not equal”

Then in the case insensitive database I check whether a value read from a case sensitive table is equal:

use nonsensitive

if object_id(‘ins’) is not null
drop table ins;
go

create table ins (
data varchar(25) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL
)
go
insert into ins
select ‘i’

go

declare @I varchar(25)
select @I = ‘I’

declare @i_from_data varchar(25)
select top 1 @i_from_data = data from ins

select case when @i = @i_from_data then ‘equal’ else ‘not equal’ end as equality;

This resulted in “equal”, then I wanted to check if two values read from a table and compared to each other used their own collation of the database collation:

use sensitive

if object_id(‘ins’) is not null
drop table ins;
go

create table ins (
data_ci varchar(25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
data_cs varchar(25) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL
)
go
insert into ins
select ‘i’, ‘I’

go

declare @I varchar(25)
select @I = ‘I’

declare @i_ci varchar(25)
declare @i_cs varchar(25)
select top 1 @i_ci = data_ci, @i_cs = data_cs from ins

select case when @i_ci = @i_cs then ‘equal’ else ‘not equal’ end as equality;

“not equal”, hmm so is it that that the database collation affects it?

use nonsensitive

if object_id(‘ins’) is not null
drop table ins;
go

create table ins (
data_ci varchar(25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
data_cs varchar(25) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL
)
go
insert into ins
select ‘i’, ‘I’

go

declare @i_ci varchar(25)
declare @i_cs varchar(25)
select top 1 @i_ci = data_ci, @i_cs = data_cs from ins

select case when @i_ci = @i_cs then ‘equal’ else ‘not equal’ end as equality;

“equal” so the collation of data read in from a table into a variable takes on the collation of the database!

If we now look at comparing values to columns and values, in the insensitive database we get a “not equal” when comparing a value to a case sensitive column:

use nonsensitive

if object_id(‘ins’) is not null
drop table ins;
go

create table ins (
data_ci varchar(25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
data_cs varchar(25) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL
)
go
insert into ins
select ‘i’, ‘i’

go

declare @I varchar(25)
select @I = ‘I’

select top 1 case when data_cs = @I then ‘equal’ else ‘not equal’ end as equality
from ins

We get the same result in the case sensitive database which makes sense.

So to summarize:

  • scripts are compiled using the server collation so variable and table names etc follow what the server is set to.
  • variable values use the database default, although this is only really evident if you do a comparison with something.
  • variable values read from a table also uses the database default.
  • comparing values to a column result in the collation that is on the column or table if it is not set on the column, taking precedence.

Finally comparing two columns with different collations is not possible as you get an error in sql to say that they do not match.