Help, I have lost a SQL data file and I have no backup

This week I stumbled across a post on the forum that was quite interesting in terms of recovery in dire situations. The op basically had lost a SQL data file and was in a situation where they had no backups. The question asked could they rebuild the missing data file, which unfortunately is not possible as the very nature of the problem is data loss.

Anyway the loss of a data file = data loss, and the best route is to use the backups. You there. Yes reader I am talking to you. BACKUPS, BACKUPS and did I mention BACKUPS!!! (Also do not forget to test them!!!)

When you create a database using the defaults you will get a data file (Test.mdf) and a log file (Test_log.ldf). There are several posts on the internet for last resort scenarios when you have lost a log file, but what happens for a data files. If you lose Test.mdf then as you can imagine you are pretty much dead in the water.

Now if you have additional data files you can still recover to some degree the database. In SQL 2005 on wards a new restore process was made possible called Piecemeal Restores. In which you can restore the database in pieces, providing that you first restored the PRIMARY file group.

Well this got me thinking about the ability to offline a file, and so this post will show how to do just that.

Example on a non production server for a missing non primary file group file.

	CREATE DATABASE [Test]
	 ON  PRIMARY
	( NAME = N'Test', FILENAME = N'C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test.mdf' , SIZE = 5120KB , FILEGROWTH = 1024KB ),
	( NAME = N'Test_PRIMARY', FILENAME = N'C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test_PRIMARY.ndf' , SIZE = 17408KB , FILEGROWTH = 1024KB ),
	 FILEGROUP [NONPRIMARY]
	( NAME = N'Test_NONPRIMARY', FILENAME = N'C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test_NONPRIMARY.ndf' , SIZE = 17408KB , FILEGROWTH = 1024KB )
	 LOG ON
	( NAME = N'Test_log', FILENAME = N'C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test_log.ldf' , SIZE = 3072KB , FILEGROWTH = 10%)
	GO
	USE [Test]
	CREATE TABLE [dbo].[Found]([i] INT) ON [Primary]
	CREATE TABLE [dbo].[Lost]([i] INT) ON [NONPRIMARY]
	GO
	INSERT INTO [dbo].[Found]
	SELECT TOP (1000000)  ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) FROM sys.all_objects x,sys.all_objects y
	INSERT INTO [dbo].[Lost]
	SELECT TOP (1000000)  ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) FROM sys.all_objects x,sys.all_objects y

In the example above a database is created with two additional files, one sits on the PRIMARY file group and the other on a file group called NONPRIMARY. Then two tables are created and populated with 1 Million rows each.

Once this is done, stop the SQL Server (MSSQLSERVER) service for the instance and delete the file called Test_NONPRIMARY.ndf before starting SQL again.

Upon resuming you can check the logs to confirm, or run the following:

ALTER DATABASE [Test] SET ONLINE

This will throw the following error:

Msg 5120, Level 16, State 5, Line 1
Unable to open the physical file “C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test_NONPRIMARY.ndf”. Operating system error 2: “2(The system cannot find the file specified.)”.
Msg 5181, Level 16, State 5, Line 1
Could not restart database “Test”. Reverting to the previous status.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.

Surprised? Well we did just delete that. Lets tell SQL Server to offline that file.

	ALTER DATABASE [Test] MODIFY FILE (NAME = 'Test_NONPRIMARY', OFFLINE);
SELECT * FROM sys.master_files WHERE [database_id] = DB_ID('Test');
	ALTER DATABASE [Test] SET ONLINE;

Excellent so that worked, and when checked the table on our intact files called dbo.Found has 1 Million rows.

	SELECT COUNT(*) FROM [Test].[dbo].[Found]--1000000
	GO
	SELECT COUNT(*) FROM [Test].[dbo].[Lost]

The lost table throws this:

Msg 8653, Level 16, State 1, Line 1
The query processor is unable to produce a plan for the table or view ‘lost’ because the table resides in a file group which is not online.

 

Example on a non production server for a missing primary file group file.

When you attempt the same process as above but delete Test_PRIMARY.ndf you will get an error when you try to offline it.

Msg 5077, Level 16, State 1, Line 1
Cannot change the state of non-data files or files in the primary filegroup.

However you can trick SQL into doing it, by creating a dummy database. Stop SQL, Copy your database files somewhere. Then start SQL and create a dummy database following the same layout but this time the file that was/is on the Primary file group put on a dummy one.

 

CREATE DATABASE [Test]
	ON  PRIMARY
( NAME = N'Test', FILENAME = N'C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test.mdf' , SIZE = 5120KB , FILEGROWTH = 1024KB ),
	FILEGROUP [ItDoesNotMatter] --CHANGE the Test_PRIMARY file to be on the ItDoesNotMatter filegroup
( NAME = N'Test_PRIMARY', FILENAME = N'C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test_PRIMARY.ndf' , SIZE = 17408KB , FILEGROWTH = 1024KB ),
	FILEGROUP [NONPRIMARY]
( NAME = N'Test_NONPRIMARY', FILENAME = N'C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test_NONPRIMARY.ndf' , SIZE = 17408KB , FILEGROWTH = 1024KB )
	LOG ON
( NAME = N'Test_log', FILENAME = N'C:\SQL\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test_log.ldf' , SIZE = 3072KB , FILEGROWTH = 10%)
GO

Once done, stop sql, copy your original files over the dummy files. Making sure to delete the dummy Test_Primary.ndf else it will error like this.

Msg 5173, Level 16, State 1, Line 1
One or more files do not match the primary file of the database. If you are attempting to attach a database, retry the operation with the correct files. If this is an existing database, the file may be corrupted and should be restored from a backup.
Msg 5181, Level 16, State 5, Line 1
Could not restart database “Test”. Reverting to the previous status.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.

Then carry on as previous, except get ready for the corruption that is going to be caused by losing a vital piece of the database.

ALTER DATABASE [Test] MODIFY FILE (NAME = 'Test_PRIMARY', OFFLINE);
ALTER DATABASE [Test] SET ONLINE;--The Service Broker in database "Test" will be disabled because the Service Broker GUID in the database () does not match the one in sys.databases ().
DBCC CHECKDB('Test') WITH ALL_ERRORMSGS, NO_INFOMSGS;
/*
ALTER DATABASE [Test] SET SINGLE_USER;
ALTER DATABASE [Test] SET EMERGENCY;
DBCC CHECKDB('Test',repair_allow_data_loss) WITH ALL_ERRORMSGS, NO_INFOMSGS;
ALTER DATABASE [Test] SET MULTI_USER;
GO
SELECT COUNT(*) FROM [Test].[dbo].[Found]--1000000
GO
SELECT COUNT(*) FROM [Test].[dbo].[Lost]
*/

In my example the recovery completed, but of my two tables I could only access 144`304 from the dbo.found table, with dbo.lost being inaccessible like I had deleted it. I did not delete the file with the dbo.lost data in, but the corruption has meant that I might have well have.  Where the corruption occurs and what is lost/saved is mostly down to luck.

The database is no longer production viable after these steps, and this post is just showing a way to get data out.

I will finish by saying that this post is really a last chance saloon, and you would be crazy to do this for any other reasons than that. (Unless you are strange and you like playing with corruption in test environments like me).

BACKUPS BACKUPS BACKUPS