Database fails to generate a Checkpoint

This is one of the many strange errors you may encounter in SQL Server. A likely trigger is a database that was recently restored from a server where replication was enabled. That is the closest match to what happened in my case, although the system DBs were not restored as part of the exercise.

So, here’s what the error message will scare you with:
One or more recovery units belonging to database 'mydatabase' failed to generate a checkpoint. This is typically caused by lack of system resources such as disk or memory, or in some cases due to database corruption. Examine previous entries in the error log for more detailed information on this failure.

The log scan number (643555:241:0) passed to log scan in database 'mydatabase' is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup.

A couple of options to start with:

  1. Check sys.databases to see whether log_reuse_wait_desc is reported as REPLICATION for the database. If so, we must work to get rid of it.
    SELECT name, log_reuse_wait_desc FROM sys.databases WHERE log_reuse_wait_desc = 'REPLICATION'
  2. If you’re sure the DB does not participate in replication, execute this stored procedure to remove replication from it.
    EXEC sp_removedbreplication 'mydatabasename'
  3. If, after executing #2, the DB still shows REPLICATION under log_reuse_wait_desc in sys.databases, try marking the transactions as replicated using this command (run it in the context of the affected database):
    EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time = 0, @reset = 1
  4. If you’re still hitting this error after #3, the advice would be to install the Replication component using setup.exe, add the DB to a publication (use any dummy table), and then remove replication using the same command as in #2.

Of course, before going through this series of steps, you’ll have to verify a few things specific to your environment:

  • Database integrity has been checked and validated.
  • The error log has been checked and contains no other errors apart from this one.
  • Server resources (disk and memory) are available in plenty.
  • A clean restore was performed when this DB was brought online.
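
The checks in this list can be sketched in T-SQL roughly as follows. The database name is the placeholder from the error text above, and sp_readerrorlog is a widely used but undocumented procedure, so treat this as a starting point rather than a definitive script.

```sql
-- Assumption: 'mydatabase' is the placeholder name from the error message.

-- Integrity check: suppress informational messages, report every error found.
DBCC CHECKDB (N'mydatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

-- Search the current SQL Server error log (log 0, type 1) for other
-- entries that mention the database.
EXEC sp_readerrorlog 0, 1, N'mydatabase';
GO

-- Confirm the database state and what is holding up log reuse.
SELECT name, state_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'mydatabase';
```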

Hope this helps!

Happy learning.

Aman Kharbanda

Why is my transaction log file growing rapidly?

The headline should already have given you a brief idea of the context of this issue.

We run into this issue time and again when the log settings are not in order, and we end up getting ‘Error: 9002, Severity: 17, State: 2 Transaction Log of XYZ database is full’.

With this error, SQL Server may mark databases as suspect because there is no space left for the transaction log to expand.

Probable causes and adverse effects of this issue
i) A very large transaction log file, which can cause transactions to fail and start rolling back.
ii) Transactions may take a long time to complete.
iii) Performance issues may occur.
iv) Blocking may occur.

How do I stop it from eating up all my disk space? What should I do?
1) Truncate the inactive part of your transaction log by performing an immediate log backup. (The inactive part of the log contains only completed transactions, so SQL Server no longer needs it during the recovery process. SQL Server reuses this truncated, inactive space instead of letting the transaction log continue to grow and use more space.)
2) Shrink the transaction log file. (A log backup truncates the log, which frees space inside the file for reuse, but it does not reduce the size of the file on disk, so you have to opt for the shrink task to get that space back.)
3) Prevent the log file from growing unexpectedly:
    a) Expand the log file up front so it has room to work in, as long as there is space on the drive.
    b) Configure the auto-grow setting (you still need to be careful with the free space on disk).
    c) Consider changing the recovery model to Simple (you can change the recovery model from Full to Simple if you do not want to use transaction log backups during a disaster recovery operation).
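
As a rough sketch of the steps above: the database name XYZ comes from the error text, while the logical log file name XYZ_log, the backup path, and all sizes are assumptions to adjust for your environment.

```sql
-- Diagnose first: what is preventing log reuse, and how full is each log?
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'XYZ';
DBCC SQLPERF (LOGSPACE);
GO

-- 1) Back up the log so the inactive portion can be truncated.
BACKUP LOG XYZ TO DISK = N'X:\Backup\XYZ_log.trn';  -- hypothetical path
GO

-- 2) Shrink the physical log file to a target size in MB.
USE XYZ;
GO
DBCC SHRINKFILE (XYZ_log, 1024);  -- XYZ_log is an assumed logical name
GO

-- 3a) Pre-size the log (the new size must be larger than the current size).
ALTER DATABASE XYZ MODIFY FILE (NAME = XYZ_log, SIZE = 4096MB);
GO
-- 3b) Use a fixed growth increment rather than a percentage.
ALTER DATABASE XYZ MODIFY FILE (NAME = XYZ_log, FILEGROWTH = 512MB);
GO

-- 3c) Only if log backups are not needed for recovery:
-- ALTER DATABASE XYZ SET RECOVERY SIMPLE;
```

If log_reuse_wait_desc reports something other than LOG_BACKUP (for example ACTIVE_TRANSACTION or REPLICATION), a log backup alone will not free the space, and that wait reason needs to be dealt with first.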

Hope this clarifies.

Aman Kharbanda

Move the DB Files without taking the database into offline mode

Over time you may see that the drive hosting the DB is running out of space. The usual response is to approach the Wintel/Storage teams to add another LUN and allocate more disk space, or to ask the application teams to housekeep some of their data.

What if neither is possible and you are stuck in a situation where moving the files to another, bigger drive without any downtime becomes your ultimate challenge? Yes, management hates the word ‘Downtime’ 🙂

We all must have used the attach/detach method, which involves detaching a database, moving the files, and then re-attaching the database. There is another method, which involves taking a database offline, running the ALTER DATABASE command to change file locations, moving the files, and bringing the database back online.
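
For comparison, the offline method might look like the following sketch. The database name, logical file name, and paths are hypothetical, and the file copy itself happens outside SQL Server.

```sql
-- Take the database offline, rolling back any open transactions.
ALTER DATABASE Test SET OFFLINE WITH ROLLBACK IMMEDIATE;
GO

-- Point the catalog at the new location; this takes effect the next
-- time the database is started.
ALTER DATABASE Test
MODIFY FILE (NAME = Testdata, FILENAME = 'Y:\data.ndf');
GO

-- ...move X:\data.ndf to Y:\data.ndf at the OS level here...

-- Bring the database back online from the new location.
ALTER DATABASE Test SET ONLINE;
```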

Common limitation with these methods: the database has to be offline.

There is another approach that avoids this situation: DBCC SHRINKFILE (logical_file_name, EMPTYFILE).

First, create a new file using the ALTER DATABASE command, then move the data using the DBCC SHRINKFILE command with the EMPTYFILE option.
Second, use the ALTER DATABASE command to remove the empty file.

Excerpt from BOL:
EMPTYFILE – Migrates all data from the specified file to other files in the same filegroup. Because the Database Engine no longer allows data to be placed in the empty file, the file can be removed by using the ALTER DATABASE statement.

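USE Test;
-- Create a data file and assume it contains data.
ALTER DATABASE Test
ADD FILE (
    NAME = Testdata,
    FILENAME = 'X:\data.ndf',
    SIZE = 100MB
);
GO
-- Empty the data file.
DBCC SHRINKFILE (Testdata, EMPTYFILE);
GO
-- Remove the data file from the database.
ALTER DATABASE Test REMOVE FILE Testdata;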
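
The BOL example empties a file that was only just added. For the move described in this post, the direction is reversed: the new file goes on the bigger drive and the old file is the one that gets emptied. A minimal sketch, assuming a hypothetical old file OldData and new file NewData in the same filegroup:

```sql
USE Test;
GO
-- 1. Add a file on the new drive, in the same filegroup as the old file.
ALTER DATABASE Test
ADD FILE (
    NAME = NewData,                 -- hypothetical logical name
    FILENAME = 'Y:\newdata.ndf',    -- hypothetical path on the bigger drive
    SIZE = 100MB
) TO FILEGROUP [PRIMARY];
GO

-- 2. Migrate the data out of the old file; the database stays online.
DBCC SHRINKFILE (OldData, EMPTYFILE);
GO

-- 3. Check how much of each file is still in use (pages / 128 = MB).
SELECT name,
       size / 128 AS size_mb,
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb
FROM sys.database_files;
GO

-- 4. Drop the old file once it is empty.
ALTER DATABASE Test REMOVE FILE OldData;
```

Two limits are worth keeping in mind: the primary data file cannot be removed, and this technique applies to data files, not to the transaction log.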

Simple, isn’t it? Hope this helps.

Aman Kharbanda

Shrink DB takes more time?

We often see DB shrink tasks take more time than we estimated. Imagine the shrink operation is going smoothly when you first look at its progress, and a few hours later, when you look at the percentage again, you find that the shrink has stalled.

What could be the reason for this slowness? You will want to see whether there are any blocked processes at the back end, or any other reasons for this very slow progress.
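
One way to check both progress and blocking is to query sys.dm_exec_requests for the shrink session. This is a sketch; the LIKE filter assumes the shrink shows up with a command name starting with "Dbcc", which is how shrink phases are usually reported.

```sql
-- Progress, current wait, and blocker (if any) for running DBCC commands.
SELECT r.session_id,
       r.command,
       r.percent_complete,
       r.status,
       r.wait_type,
       r.wait_time,
       r.blocking_session_id,            -- non-zero means the shrink is blocked
       r.total_elapsed_time / 60000 AS elapsed_min
FROM sys.dm_exec_requests AS r
WHERE r.command LIKE N'Dbcc%';
```

Running it a few minutes apart shows whether percent_complete is still moving, and a non-zero blocking_session_id points at the session to investigate.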

Have a look at perfmon and check the disk performance counter Avg. Disk sec/Transfer to understand whether there are any disk performance issues (look at the drives where the data/log files are hosted), and get hold of the Windows team if there are.
The criterion suggested by Microsoft: if Avg. Disk sec/Transfer is larger than 0.09 (the counter is in seconds, so 90 ms), it indicates a disk performance issue.

Also, enable a trace and verify what kind of data your shrink task is moving. When BLOB data comes into the picture, the shrink operation is bound to take more time than expected.

A blog article by Paul Randal explains why LOB data makes shrink run slowly.

What to do to get space released to the OS quickly?
Instead of running a database shrink, execute DBCC SHRINKFILE and release space in small chunks. Say that after the application data housekeeping the DB has 40GB of free space: don’t go for all 40GB in one shot. Try 5GB batches and run them in multiple iterations.
Also, shrink works in batches of ~32 pages, so it can be cancelled and only the last batch will be rolled back. This means we can start it, kill it, and restart it at a later time.
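
The batch approach can be sketched as a simple loop. The database name, logical file name, and target size are hypothetical, and dynamic SQL is used only to pass the computed size to DBCC SHRINKFILE.

```sql
USE Test;  -- hypothetical database
GO
DECLARE @file       sysname = N'Testdata';  -- hypothetical logical file name
DECLARE @target_mb  int = 60000;            -- final size we want to reach
DECLARE @step_mb    int = 5120;             -- release about 5GB per iteration
DECLARE @current_mb int;
DECLARE @sql        nvarchar(400);

-- Current file size: sys.database_files reports 8KB pages, so / 128 = MB.
SELECT @current_mb = size / 128
FROM sys.database_files
WHERE name = @file;

WHILE @current_mb > @target_mb
BEGIN
    -- Next stop: one step smaller, but never below the final target.
    SET @current_mb = CASE WHEN @current_mb - @step_mb < @target_mb
                           THEN @target_mb
                           ELSE @current_mb - @step_mb
                      END;

    SET @sql = N'DBCC SHRINKFILE (' + QUOTENAME(@file, '''') + N', '
             + CAST(@current_mb AS nvarchar(20)) + N');';
    EXEC sys.sp_executesql @sql;
END;
```

Because each iteration is an independent shrink, the loop can be stopped between batches and restarted later; it simply picks up from the file's current size.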

Best Practices
1) DBCC SHRINKFILE and DBCC SHRINKDATABASE use very small transactions. Avoid running them at the same time as DDL statements such as index rebuilds, which may require a schema lock and cause waits.
2) DB shrink is not generally recommended, since it causes a lot of fragmentation and can ultimately become a major culprit in slowing down your SQL Server.
3) Consider using trace flag -T2548, which skips the LOB compaction step during the shrink (LOB data is compacted by default during a shrink operation). With LOB compaction disabled, the shrink time is reduced. (The recommendation is to verify this in a test environment first, before going to PRD servers.)

Hope this helps.

Aman Kharbanda