Comparisons are odious — unless it’s in SQL.

One of the most frustrating things to debug is data type conversion errors when selecting a bunch of rows from one SQL table to another. In the real world, tables may have hundreds of columns, and Microsoft SQL’s error reporting is… laughable? If you’re importing 20000 rows in a query that runs for 10 minutes, nothing is as heartbreaking as seeing this message:

(0 rows affected)
Msg 8114, Level 16, State 5, Line 39
Error converting data type nvarchar to numeric.

There are only so many ways to troubleshoot this, but one way is to ensure that the types of the columns in the source and destination match exactly. Doing this by hand for hundreds of columns is un-fun, so I wrote a stored procedure to do it for me:

CREATE OR ALTER PROCEDURE DiffObjects 
	@o1 nvarchar(max),
	@o2 nvarchar(max)
AS
BEGIN
	SET NOCOUNT ON;

	if OBJECT_ID(@o1) is null
	begin
		select concat(@o1, ' does not exist') message
		return
	end

	if OBJECT_ID(@o2) is null
	begin
		select concat(@o2, ' does not exist') message
		return
	end

	select distinct iif(c2.object_id is null, OBJECT_NAME(c1.object_id), 'both') object, c1.name, c1.column_id, c2.column_id,
		iif(c1.system_type_id <> c2.system_type_id, 'X', '') as system_type_id, iif(c1.user_type_id <> c2.user_type_id, 'X', '') as user_type_id, iif(c1.max_length <> c2.max_length, 'X', '') as max_length, iif(c1.precision <> c2.precision, 'X', '') as precision, iif(c1.scale <> c2.scale, 'X', '') as scale, iif(c1.collation_name <> c2.collation_name, 'X', '') as collation_name, iif(c1.is_nullable <> c2.is_nullable, 'X', '') as is_nullable, iif(c1.is_ansi_padded <> c2.is_ansi_padded, 'X', '') as is_ansi_padded, iif(c1.is_rowguidcol <> c2.is_rowguidcol, 'X', '') as is_rowguidcol, iif(c1.is_identity <> c2.is_identity, 'X', '') as is_identity, iif(c1.is_computed <> c2.is_computed, 'X', '') as is_computed, iif(c1.is_filestream <> c2.is_filestream, 'X', '') as is_filestream, iif(c1.is_replicated <> c2.is_replicated, 'X', '') as is_replicated, iif(c1.is_non_sql_subscribed <> c2.is_non_sql_subscribed, 'X', '') as is_non_sql_subscribed, iif(c1.is_merge_published <> c2.is_merge_published, 'X', '') as is_merge_published, iif(c1.is_dts_replicated <> c2.is_dts_replicated, 'X', '') as is_dts_replicated, iif(c1.is_xml_document <> c2.is_xml_document, 'X', '') as is_xml_document, iif(c1.xml_collection_id <> c2.xml_collection_id, 'X', '') as xml_collection_id, iif(c1.default_object_id <> c2.default_object_id, 'X', '') as default_object_id, iif(c1.rule_object_id <> c2.rule_object_id, 'X', '') as rule_object_id, iif(c1.is_sparse <> c2.is_sparse, 'X', '') as is_sparse, iif(c1.is_column_set <> c2.is_column_set, 'X', '') as is_column_set, iif(c1.generated_always_type <> c2.generated_always_type, 'X', '') as generated_always_type, iif(c1.generated_always_type_desc <> c2.generated_always_type_desc, 'X', '') as generated_always_type_desc, iif(c1.encryption_type <> c2.encryption_type, 'X', '') as encryption_type, iif(c1.encryption_type_desc <> c2.encryption_type_desc, 'X', '') as encryption_type_desc, iif(c1.encryption_algorithm_name <> c2.encryption_algorithm_name, 'X', '') as encryption_algorithm_name, iif(c1.column_encryption_key_id <> c2.column_encryption_key_id, 'X', '') as column_encryption_key_id, iif(c1.column_encryption_key_database_name <> c2.column_encryption_key_database_name, 'X', '') as column_encryption_key_database_name, iif(c1.is_hidden <> c2.is_hidden, 'X', '') as is_hidden, iif(c1.is_masked <> c2.is_masked, 'X', '') as is_masked, iif(c1.graph_type <> c2.graph_type, 'X', '') as graph_type, iif(c1.graph_type_desc <> c2.graph_type_desc, 'X', '') as graph_type_desc
		from sys.columns c1
		left outer join sys.columns c2 on c2.object_id = iif(c1.object_id = OBJECT_ID(@o1), OBJECT_ID(@o2), OBJECT_ID(@o1))
			and c1.name = c2.name
	where c1.object_id in (OBJECT_ID(@o1), OBJECT_ID(@o2)) and (c2.object_id is null or (
			c1.column_id <> c2.column_id or c1.system_type_id <> c2.system_type_id or c1.user_type_id <> c2.user_type_id or c1.max_length <> c2.max_length or c1.precision <> c2.precision or c1.scale <> c2.scale or c1.collation_name <> c2.collation_name or c1.is_nullable <> c2.is_nullable or c1.is_ansi_padded <> c2.is_ansi_padded or c1.is_rowguidcol <> c2.is_rowguidcol or c1.is_identity <> c2.is_identity or c1.is_computed <> c2.is_computed or c1.is_filestream <> c2.is_filestream or c1.is_replicated <> c2.is_replicated or c1.is_non_sql_subscribed <> c2.is_non_sql_subscribed or c1.is_merge_published <> c2.is_merge_published or c1.is_dts_replicated <> c2.is_dts_replicated or c1.is_xml_document <> c2.is_xml_document or c1.xml_collection_id <> c2.xml_collection_id or c1.default_object_id <> c2.default_object_id or c1.rule_object_id <> c2.rule_object_id or c1.is_sparse <> c2.is_sparse or c1.is_column_set <> c2.is_column_set or c1.generated_always_type <> c2.generated_always_type or c1.generated_always_type_desc <> c2.generated_always_type_desc or c1.encryption_type <> c2.encryption_type or c1.encryption_type_desc <> c2.encryption_type_desc or c1.encryption_algorithm_name <> c2.encryption_algorithm_name or c1.column_encryption_key_id <> c2.column_encryption_key_id or c1.column_encryption_key_database_name <> c2.column_encryption_key_database_name or c1.is_hidden <> c2.is_hidden or c1.is_masked <> c2.is_masked or c1.graph_type <> c2.graph_type or c1.graph_type_desc <> c2.graph_type_desc
		))
	order by c1.column_id
END

It’s actually a very simple stored procedure, where all of the work is being done by a single select statement. But that select statement looks a little tricky at first. To understand it, it’s best to ignore the repetitive parts where I have to reference each column from the sys.columns table and focus on the join and where clauses.

In plain English, the query selects every row from sys.columns for both of the object names passed in to the procedure. You can use it compare two tables, a view and a table, etc. For each column in each object, it attempts to join the corresponding column from the other object by name:

left outer join sys.columns c2 on c2.object_id = iif(c1.object_id = OBJECT_ID(@o1), OBJECT_ID(@o2), OBJECT_ID(@o1))
			and c1.name = c2.name

If the column is present in both objects, the output clauses above will show an ‘X’ for each value that doesn’t agree between the two objects. The first column shows which object the column appears in, or ‘both’ if it appears in both objects.

And that’s pretty much it! It’s already saved me a bunch of time since it simplifies my workflow for modifying my tables to agree. I make a change, re-run the stored proc, and see the differences disappear in real time. Nice!

I hope you found this useful, I published the source for the stored proc as a gist, which you can find here. If you want to know more about what I’m up to, follow me on LinkedIN or check out my GitHub.

Scott Means's Blog

Leave a Reply Cancel reply