What is the easiest way to compare data in a SQL Server table and fix any differences?

Sync Table Data

It is the sort of problem that you often get; you have 2 tables and you want to make sure that the data matches, however there are millions of rows and lots of columns to deal with. Aghhh. Nightmare. So you start writing a script comparing the tables row by row, column by column looking for differences…surely there must be a simple way to do this.

Yes there is, but you probably do not know it is even there. The tool is called tablediff and it is located here:

“%ProgramFiles%\Microsoft SQL Server\100\COM\”

Note that the 100 refers to the SQL Server version, so it may be slighlty different however if you do a search on the Microsoft SQL Server directory you will find tablediff.exe

A Quick example

ok, what we will do is copy the contents of sys.databases to a table in tempdb, and delete the system databases from our list as well as update the create_date and collation for the example table. (I am doing this so that if you forget to go and drop the tempdb.dbo.databases later on it will automagically dissapear next time you restart sql server when tempdb is recreated).

SELECT * INTO tempdb.dbo.databases FROM sys.databases
DELETE FROM tempdb.dbo.databases WHERE database_id in(1,2,3,4)
UPDATE tempdb.dbo.databases SET create_date = GETDATE()
UPDATE tempdb.dbo.databases SET collation_name = 'Nonsence_General_CI_AS'
SELECT * FROM tempdb.dbo.databases

So running tablediff

tablediff.exe -sourceserver (local) -sourcedatabase master -sourceschema sys -sourcetable databases -destinationserver (local) -destinationdatabase tempdb -destinationschema dbo -destinationtable databases

Table [master].[sys].[databases] on (local) and Table [tempdb].[dbo].[databases]
on (local) have 11 differences.
 Err database_id
 Src. Only 1
 Src. Only 2
 Src. Only 3
 Src. Only 4
 Mismatch 5
 Mismatch 6
 Mismatch 8
 Mismatch 10
 Mismatch 11
 Mismatch 14
 Mismatch 15
 The requested operation took 0.0624 seconds.

Great so you can see that I deleted the 4 system databases entries, and there is a mismatch on the other rows. Oops looks like I didnt specify -c to show column information.

Table [master].[sys].[databases] on (local) and Table [tempdb].[dbo].[databases]
on (local) have 11 differences.
 Err database_id Col
 Src. Only 1
 Src. Only 2
 Src. Only 3
 Src. Only 4
 Mismatch 5 collation_name create_date
 Mismatch 6 collation_name create_date
 Mismatch 8 collation_name create_date
 Mismatch 10 collation_name create_date
 Mismatch 11 collation_name create_date
 Mismatch 14 collation_name create_date
 Mismatch 15 collation_name create_date
 The requested operation took 0.1092 seconds.

That is better, I can now clearly see which columns are different.

I wish there was a way I could fix these mismatches… if only there was a generate fix flag. Oh but there is! -f which will generate a fix script.

-- Source Options --
 -sourceserver Source Host
 -sourcedatabase Source Database
 -sourceschema Source Schema Name
 -sourcetable Source Table or View
 -sourceuser Source Login
 -sourcepassword Source Password
 -sourcelocked Lock the source table/view durring tablediff
-- Destination Options --
 -destinationserver Destination Host
 -destinationdatabase Destination Database
 -destinationschema Destination Schema Name
 -destinationtable Destination Table or View
 -destinationuser Destination Login
 -destinationpassword Destination Password
 -destinationlocked Lock the destination table/view durring tablediff
-- Misc Options --
 -t Timeout
 -c Column Level Diff
 -f Generate Fix SQL (You may also specify a file name and path)
 -q Quick Row Count
 -et Specify a table to put the errors into
 -dt Drop the error table if it exists
 -o Output file
 -b Number of bytes to read for blob data types
 -strict Strict compare of source and destination schema
 -rc Number of retries
 -ri Retry interval

tablediff Utility
-bf number_of_statements

Leave a Reply