Thursday, January 14, 2010

Setting up Change Data Capture in SQL Server 2008

By: Tim Chapman

Change Data Capture (CDC) is a new native feature in SQL Server 2008 Enterprise Edition.
——————————————————————————————
Auditing database data is no easy task, but it’s absolutely required for most industries. In a previous column, I showed how you can use some native SQL Server functionality to do a decent job of capturing data changes.
The inherent problem with capturing these data changes is that it requires triggers on the base table. These triggers can either immediately enter the data into auditing tables or use Service Broker functionality to capture the data at a later time. In either event, the initial capture has to be done in the scope of the original transaction. In SQL Server 2008, the Change Data Capture (CDC) feature allows you to capture data changes much more easily and without the use and overhead of triggers.

Setting up CDC

First, I want to create a database for the use of testing my scenarios. The statement below creates a database named CaptureChanges on your database instance:
CREATE DATABASE CaptureChanges
GO
To use CDC, I’ll need to enable it at the database level. The statement below calls the sp_cdc_enable_db system stored procedure, which enables CDC for the current database scope:
USE CaptureChanges
GO
EXEC sys.sp_cdc_enable_db
GO
To illustrate how CDC captures data changes, I’ll need a table to perform data changes. The script below creates a table named SalesHistory and inserts data into it:
IF OBJECT_ID('SalesHistory') IS NOT NULL
DROP TABLE SalesHistory
GO
CREATE TABLE [dbo].[SalesHistory]
(               
            SaleID int IDENTITY(1,1) PRIMARY KEY,               
            Product varchar(10) NULL,               
            SaleDate datetime NULL,               
            StatusID TINYINT NULL,               
            SalePrice money NULL
)
            GO
SET NOCOUNT ON
BEGIN TRANSACTION
            DECLARE @i INT
            SET @i = 1
            WHILE (@i <=500)
            BEGIN               
              INSERT INTO [SalesHistory](Product, SaleDate, SalePrice, StatusID)               
              VALUES ('Computer', DATEADD(ww, @i, '3/11/1919'), DATEPART(ms, GETDATE()) + (@i + 57), 1)               
                       
              INSERT INTO [SalesHistory](Product, SaleDate, SalePrice, StatusID)               
              VALUES('BigScreen', DATEADD(ww, @i, '3/11/1927'), DATEPART(ms, GETDATE()) + (@i + 13),5)               
                       
              INSERT INTO [SalesHistory](Product, SaleDate, SalePrice, StatusID)               
              VALUES('PoolTable', DATEADD(ww, @i, '3/11/1908'), DATEPART(ms, GETDATE()) + (@i + 29),8)               
                       
              SET @i = @i + 1
                       
            END
COMMIT TRANSACTION
GO
At this point, nothing in the SalesHistory table has been captured, as it is yet to be enabled for CDC. To use CDC for a particular table, the table must contain a Primary Key constraint or a unique constraint to uniquely identify the rows in the table. In the example above, I use an IDENTITY column for my Primary Key constraint.
To enable the table for data capturing, I’ll need to run a system stored procedure and pass in the table name. (The SQL Server Agent should be running when you run this stored procedure.)
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name   = N'SalesHistory',
@role_name     = NULL,
@supports_net_changes = 1
GO
Now I’m going to update an arbitrary set of records in the SalesHistory table. Because I now have CDC enabled for this table, the data changes should be captured by the CDC system.
UPDATE s
SET SalePrice = SalePrice + 1
FROM SalesHistory s
where SaleID % 13 = 0
If the CDC schema had not already existed in the database, the above procedure call would create it. All CDC tables will belong to this schema.
When the CapturesChanges table is enabled for CDC, two SQL Server Agent jobs are created. One job uses the LogReader to capture changes while the other job is used to clean up older messages. The job names created are listed below:
cdc.CaptureChanges_capture
cdc.CaptureChanges_cleanup
CDC has system tables that are created when CDC is enabled. The main one to concern yourself with at first is the actual data that stores the audited data. A SalesHistory_CT table is created when I enable the SalesHistory table for CDC. Data changes are captured in this table, along with a few housekeeping columns that CDC uses for reporting changes to data. Notice that this table belongs to the cdc schema.

Credits: www.techrepublic.com

0 comments:

Post a Comment