Two system views (sys.data_spaces and sys.destination_data_spaces) give misleading information about the physical location of data after a partitioning MERGE and before an INDEX REBUILD operation on a partitioned table. In SQL Server 2012 SP1 CU6, the script below (run in SQLCMD mode; set the DataDrive and LogDrive variables for the runtime environment) creates a test database with file groups and files to support a partitioned table. The partition function and scheme spread the test data across four file groups; an empty partition, file group and file are maintained at the start and end of the range. A problem occurs after the SWITCH and MERGE RANGE operations: the views sys.data_spaces and sys.destination_data_spaces show the logical, not the physical, location of data.
--=================================================================================
-- PartitionLabSetup_RangeRight.sql
-- 001. Create test database
-- 002. Add file groups and files
-- 003. Create partition function and scheme
-- 004. Create and populate a test table
--=================================================================================
USE [master]
GO
-----------------------------------------------------------------------------------
-- 001 - Create Test Database
-----------------------------------------------------------------------------------
:SETVAR DataDrive "D:\SQL\Data\"
:SETVAR LogDrive "D:\SQL\Logs\"
:SETVAR DatabaseName "workspace"
:SETVAR TableName "TestTable"

-- Drop if exists and create Database
IF DATABASEPROPERTYEX(N'$(databasename)','Status') IS NOT NULL
BEGIN
    ALTER DATABASE $(DatabaseName) SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    DROP DATABASE $(DatabaseName)
END

CREATE DATABASE $(DatabaseName)
ON
(   NAME = $(DatabaseName)_data,
    FILENAME = N'$(DataDrive)$(DatabaseName)_data.mdf',
    SIZE = 10,
    MAXSIZE = 500,
    FILEGROWTH = 5 )
LOG ON
(   NAME = $(DatabaseName)_log,
    FILENAME = N'$(LogDrive)$(DatabaseName).ldf',
    SIZE = 5MB,
    MAXSIZE = 5000MB,
    FILEGROWTH = 5MB ) ;
GO
-----------------------------------------------------------------------------------
-- 002. Add file groups and files
-----------------------------------------------------------------------------------
--:SETVAR DatabaseName "workspace"
--:SETVAR TableName "TestTable"
--:SETVAR DataDrive "D:\SQL\Data\"
--:SETVAR LogDrive "D:\SQL\Logs\"
DECLARE @nSQL NVARCHAR(2000) ;
DECLARE @x INT = 1;
WHILE @x <= 6
BEGIN
    -- One file group and one file per iteration: TestTable_fg1..fg6 / TestTable_f1..f6
    -- DataDrive already ends with a backslash, so none is added before the file name
    SELECT @nSQL =
        'ALTER DATABASE $(DatabaseName)
         ADD FILEGROUP $(TableName)_fg' + RTRIM(CAST(@x AS CHAR(5))) + ';
         ALTER DATABASE $(DatabaseName)
         ADD FILE
         (   NAME= ''$(TableName)_f' + CAST(@x AS CHAR(5)) + ''',
             FILENAME = ''$(DataDrive)$(TableName)_f' + RTRIM(CAST(@x AS CHAR(5))) + '.ndf''
         )
         TO FILEGROUP $(TableName)_fg' + RTRIM(CAST(@x AS CHAR(5))) + ';'
    EXEC sp_executesql @nSQL;
    SET @x = @x + 1;
END
-----------------------------------------------------------------------------------
-- 003. Create partition function and scheme
-----------------------------------------------------------------------------------
--:SETVAR TableName "TestTable"
--:SETVAR DatabaseName "workspace"
USE $(DatabaseName);

CREATE PARTITION FUNCTION $(TableName)_func (int)
AS RANGE RIGHT FOR VALUES ( 0, 15, 30, 45, 60 );

CREATE PARTITION SCHEME $(TableName)_scheme
AS PARTITION $(TableName)_func
TO ( $(TableName)_fg1, $(TableName)_fg2, $(TableName)_fg3,
     $(TableName)_fg4, $(TableName)_fg5, $(TableName)_fg6 );
-----------------------------------------------------------------------------------
-- Create TestTable
-----------------------------------------------------------------------------------
--:SETVAR TableName "TestTable"
--:SETVAR BackupDrive "D:\SQL\Backups\"
--:SETVAR DatabaseName "workspace"
CREATE TABLE [dbo].$(TableName)(
    [Partition_PK] [int] NOT NULL,
    [GUID_PK] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime] NULL,
    [CreateServer] [nvarchar](50) NULL,
    [RandomNbr] [int] NULL,
    CONSTRAINT [PK_$(TableName)] PRIMARY KEY CLUSTERED
    (
        [Partition_PK] ASC,
        [GUID_PK] ASC
    ) ON $(TableName)_scheme(Partition_PK)
) ON $(TableName)_scheme(Partition_PK)

ALTER TABLE [dbo].$(TableName) ADD CONSTRAINT [DF_$(TableName)_GUID_PK] DEFAULT (newid()) FOR [GUID_PK]
ALTER TABLE [dbo].$(TableName) ADD CONSTRAINT [DF_$(TableName)_CreateDate] DEFAULT (getdate()) FOR [CreateDate]
ALTER TABLE [dbo].$(TableName) ADD CONSTRAINT [DF_$(TableName)_CreateServer] DEFAULT (@@servername) FOR [CreateServer]
-----------------------------------------------------------------------------------
-- 004. Create and populate a test table
-- Load TestTable Data - Seconds 0-59 are used as the Partitioning Key
--:SETVAR TableName "TestTable"
SET NOCOUNT ON;
DECLARE @Now DATETIME = GETDATE()
-- Insert rows for one minute so every second (0-59) is represented
WHILE @Now > DATEADD(minute,-1,GETDATE())
BEGIN
    INSERT INTO [dbo].$(TableName) ([Partition_PK] ,[RandomNbr])
    VALUES ( DATEPART(second,GETDATE()) ,ROUND((RAND() * 100),0) )
END
-----------------------------------------------------------------------------------
-- Confirm table partitioning - http://lextonr.wordpress.com/tag/sys-destination_data_spaces/
SELECT N'DatabaseName' = DB_NAME()
    , N'SchemaName' = s.name
    , N'TableName' = o.name
    , N'IndexName' = i.name
    , N'IndexType' = i.type_desc
    , N'PartitionScheme' = ps.name
    , N'DataSpaceName' = ds.name
    , N'DataSpaceType' = ds.type_desc
    , N'PartitionFunction' = pf.name
    , N'PartitionNumber' = dds.destination_id
    , N'BoundaryValue' = prv.value
    , N'RightBoundary' = pf.boundary_value_on_right
    , N'PartitionFileGroup' = ds2.name
    , N'RowsOfData' = p.[rows]
FROM sys.objects AS o
INNER JOIN sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
INNER JOIN sys.partitions AS p
    ON o.[object_id] = p.[object_id]
INNER JOIN sys.indexes AS i
    ON p.[object_id] = i.[object_id]
    AND p.index_id = i.index_id
INNER JOIN sys.data_spaces AS ds
    ON i.data_space_id = ds.data_space_id
INNER JOIN sys.partition_schemes AS ps
    ON ds.data_space_id = ps.data_space_id
INNER JOIN sys.partition_functions AS pf
    ON ps.function_id = pf.function_id
LEFT OUTER JOIN sys.partition_range_values AS prv
    ON pf.function_id = prv.function_id
    AND p.partition_number = prv.boundary_id
LEFT OUTER JOIN sys.destination_data_spaces AS dds
    ON ps.data_space_id = dds.partition_scheme_id
    AND p.partition_number = dds.destination_id
LEFT OUTER JOIN sys.data_spaces AS ds2
    ON dds.data_space_id = ds2.data_space_id
ORDER BY DatabaseName
    ,SchemaName
    ,TableName
    ,IndexName
    ,PartitionNumber
--=================================================================================
-- SECTION 2 - SWITCH OUT
-- 001 - Create TestTableOut
-- 002 - Switch out partition in range 0-14
-- 003 - Merge range 0-29
-----------------------------------------------------------------------------------
-- 001. TestTableOut
:SETVAR TableName "TestTable"
IF OBJECT_ID('dbo.$(TableName)Out') IS NOT NULL
    DROP TABLE [dbo].[$(TableName)Out]

CREATE TABLE [dbo].[$(TableName)Out](
    [Partition_PK] [int] NOT NULL,
    [GUID_PK] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime] NULL,
    [CreateServer] [nvarchar](50) NULL,
    [RandomNbr] [int] NULL,
    CONSTRAINT [PK_$(TableName)Out] PRIMARY KEY CLUSTERED
    (
        [Partition_PK] ASC,
        [GUID_PK] ASC
    )
) ON $(TableName)_fg2;
GO
-----------------------------------------------------------------------------------
-- 002 - Switch out partition in range 0-14
--:SETVAR TableName "TestTable"
ALTER TABLE dbo.$(TableName)
SWITCH PARTITION 2 TO dbo.$(TableName)Out;
-----------------------------------------------------------------------------------
-- 003 - Merge range 0-29
--:SETVAR TableName "TestTable"
ALTER PARTITION FUNCTION $(TableName)_func()
MERGE RANGE (15);
-----------------------------------------------------------------------------------
-- Confirm table partitioning
-- Original source of this query - http://lextonr.wordpress.com/tag/sys-destination_data_spaces/
SELECT N'DatabaseName' = DB_NAME()
    , N'SchemaName' = s.name
    , N'TableName' = o.name
    , N'IndexName' = i.name
    , N'IndexType' = i.type_desc
    , N'PartitionScheme' = ps.name
    , N'DataSpaceName' = ds.name
    , N'DataSpaceType' = ds.type_desc
    , N'PartitionFunction' = pf.name
    , N'PartitionNumber' = dds.destination_id
    , N'BoundaryValue' = prv.value
    , N'RightBoundary' = pf.boundary_value_on_right
    , N'PartitionFileGroup' = ds2.name
    , N'RowsOfData' = p.[rows]
FROM sys.objects AS o
INNER JOIN sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
INNER JOIN sys.partitions AS p
    ON o.[object_id] = p.[object_id]
INNER JOIN sys.indexes AS i
    ON p.[object_id] = i.[object_id]
    AND p.index_id = i.index_id
INNER JOIN sys.data_spaces AS ds
    ON i.data_space_id = ds.data_space_id
INNER JOIN sys.partition_schemes AS ps
    ON ds.data_space_id = ps.data_space_id
INNER JOIN sys.partition_functions AS pf
    ON ps.function_id = pf.function_id
LEFT OUTER JOIN sys.partition_range_values AS prv
    ON pf.function_id = prv.function_id
    AND p.partition_number = prv.boundary_id
LEFT OUTER JOIN sys.destination_data_spaces AS dds
    ON ps.data_space_id = dds.partition_scheme_id
    AND p.partition_number = dds.destination_id
LEFT OUTER JOIN sys.data_spaces AS ds2
    ON dds.data_space_id = ds2.data_space_id
ORDER BY DatabaseName
    ,SchemaName
    ,TableName
    ,IndexName
    ,PartitionNumber
The table below shows the results of the ‘Confirm Table Partitioning’ query, before and after the MERGE.
The T-SQL code below illustrates the problem.
-----------------------------------------------------------------------------------
-- PartitionLab_RangeRight
USE workspace;
DROP TABLE dbo.TestTableOut;

USE master;
ALTER DATABASE workspace REMOVE FILE TestTable_f3 ;
-- ERROR
--Msg 5042, Level 16, State 1, Line 1
--The file 'TestTable_f3 ' cannot be removed because it is not empty.

ALTER DATABASE workspace REMOVE FILE TestTable_f2 ;
-- Works surprisingly!!

use workspace;
ALTER INDEX [PK_TestTable] ON [dbo].[TestTable] REBUILD PARTITION = 2;
--Msg 622, Level 16, State 3, Line 2
--The filegroup "TestTable_fg2" has no files assigned to it. Tables, indexes, text columns, ntext columns, and image columns cannot be populated on this filegroup until a file is added.
--The statement has been terminated.
If you run ALTER INDEX REBUILD before trying to remove files from File Group 3, it works. Rerun the database setup script then the code below.
-----------------------------------------------------------------------------------
-- RANGE RIGHT
-- Rerun PartitionLabSetup_RangeRight.sql before the code below
USE workspace;
DROP TABLE dbo.TestTableOut;

ALTER INDEX [PK_TestTable] ON [dbo].[TestTable] REBUILD PARTITION = 2;

USE master;
ALTER DATABASE workspace REMOVE FILE TestTable_f3;
-- Works as expected!!
The file in File Group 2 appears to contain data but it can be dropped. Although the system views report the data in File Group 2, it still physically resides in File Group 3 and isn’t moved until the index is rebuilt. With a RANGE RIGHT function, the left file group (File Group 2) is the one retained when merging ranges.
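One way to cross-check where the pages actually live, rather than where the partition scheme says they should live, is to look at allocation units. The query below is a sketch and was not part of the original investigation. It assumes the workspace database and dbo.TestTable from the setup script, and it relies on sys.allocation_units.data_space_id, which is documented as the file group in which an allocation unit resides. Whether it would have exposed the mismatch described above on this build is untested here.

```sql
USE workspace;

-- Sketch: file group per allocation unit for each partition of dbo.TestTable.
-- Assumes the objects created by PartitionLabSetup_RangeRight.sql exist.
SELECT  p.partition_number,
        fg.name      AS AllocationFileGroup,
        au.type_desc AS AllocationType,
        au.total_pages,
        p.[rows]
FROM    sys.partitions AS p
INNER JOIN sys.allocation_units AS au
        -- IN_ROW_DATA (1) and ROW_OVERFLOW_DATA (3) link through hobt_id,
        -- LOB_DATA (2) links through partition_id
        ON (au.[type] IN (1, 3) AND au.container_id = p.hobt_id)
        OR (au.[type] = 2       AND au.container_id = p.partition_id)
INNER JOIN sys.filegroups AS fg
        ON au.data_space_id = fg.data_space_id
WHERE   p.[object_id] = OBJECT_ID(N'dbo.TestTable')
ORDER BY p.index_id, p.partition_number, au.[type];
```

Comparing AllocationFileGroup here with PartitionFileGroup from the ‘Confirm Table Partitioning’ query, immediately after the MERGE and again after the REBUILD, should show whether the two sources agree.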
RANGE LEFT would have retained the data in File Group 3, where it already resided, so no INDEX REBUILD is necessary to effectively complete the MERGE operation. The script below implements the same partitioning strategy (data distribution between partitions) on the test table but uses different boundary definitions and RANGE LEFT.
--=================================================================================
-- PartitionLabSetup_RangeLeft.sql
-- 001. Create test database
-- 002. Add file groups and files
-- 003. Create partition function and scheme
-- 004. Create and populate a test table
--=================================================================================
USE [master]
GO
-----------------------------------------------------------------------------------
-- 001 - Create Test Database
-----------------------------------------------------------------------------------
:SETVAR DataDrive "D:\SQL\Data\"
:SETVAR LogDrive "D:\SQL\Logs\"
:SETVAR DatabaseName "workspace"
:SETVAR TableName "TestTable"

-- Drop if exists and create Database
IF DATABASEPROPERTYEX(N'$(databasename)','Status') IS NOT NULL
BEGIN
    ALTER DATABASE $(DatabaseName) SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    DROP DATABASE $(DatabaseName)
END

CREATE DATABASE $(DatabaseName)
ON
(   NAME = $(DatabaseName)_data,
    FILENAME = N'$(DataDrive)$(DatabaseName)_data.mdf',
    SIZE = 10,
    MAXSIZE = 500,
    FILEGROWTH = 5 )
LOG ON
(   NAME = $(DatabaseName)_log,
    FILENAME = N'$(LogDrive)$(DatabaseName).ldf',
    SIZE = 5MB,
    MAXSIZE = 5000MB,
    FILEGROWTH = 5MB ) ;
GO
-----------------------------------------------------------------------------------
-- 002. Add file groups and files
-----------------------------------------------------------------------------------
--:SETVAR DatabaseName "workspace"
--:SETVAR TableName "TestTable"
--:SETVAR DataDrive "D:\SQL\Data\"
--:SETVAR LogDrive "D:\SQL\Logs\"
DECLARE @nSQL NVARCHAR(2000) ;
DECLARE @x INT = 1;
WHILE @x <= 6
BEGIN
    -- One file group and one file per iteration: TestTable_fg1..fg6 / TestTable_f1..f6
    -- DataDrive already ends with a backslash, so none is added before the file name
    SELECT @nSQL =
        'ALTER DATABASE $(DatabaseName)
         ADD FILEGROUP $(TableName)_fg' + RTRIM(CAST(@x AS CHAR(5))) + ';
         ALTER DATABASE $(DatabaseName)
         ADD FILE
         (   NAME= ''$(TableName)_f' + CAST(@x AS CHAR(5)) + ''',
             FILENAME = ''$(DataDrive)$(TableName)_f' + RTRIM(CAST(@x AS CHAR(5))) + '.ndf''
         )
         TO FILEGROUP $(TableName)_fg' + RTRIM(CAST(@x AS CHAR(5))) + ';'
    EXEC sp_executesql @nSQL;
    SET @x = @x + 1;
END
-----------------------------------------------------------------------------------
-- 003. Create partition function and scheme
-----------------------------------------------------------------------------------
--:SETVAR TableName "TestTable"
--:SETVAR DatabaseName "workspace"
USE $(DatabaseName);

CREATE PARTITION FUNCTION $(TableName)_func (int)
AS RANGE LEFT FOR VALUES ( -1, 14, 29, 44, 59 );

CREATE PARTITION SCHEME $(TableName)_scheme
AS PARTITION $(TableName)_func
TO ( $(TableName)_fg1, $(TableName)_fg2, $(TableName)_fg3,
     $(TableName)_fg4, $(TableName)_fg5, $(TableName)_fg6 );
-----------------------------------------------------------------------------------
-- Create TestTable
-----------------------------------------------------------------------------------
--:SETVAR TableName "TestTable"
--:SETVAR BackupDrive "D:\SQL\Backups\"
--:SETVAR DatabaseName "workspace"
CREATE TABLE [dbo].$(TableName)(
    [Partition_PK] [int] NOT NULL,
    [GUID_PK] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime] NULL,
    [CreateServer] [nvarchar](50) NULL,
    [RandomNbr] [int] NULL,
    CONSTRAINT [PK_$(TableName)] PRIMARY KEY CLUSTERED
    (
        [Partition_PK] ASC,
        [GUID_PK] ASC
    ) ON $(TableName)_scheme(Partition_PK)
) ON $(TableName)_scheme(Partition_PK)

ALTER TABLE [dbo].$(TableName) ADD CONSTRAINT [DF_$(TableName)_GUID_PK] DEFAULT (newid()) FOR [GUID_PK]
ALTER TABLE [dbo].$(TableName) ADD CONSTRAINT [DF_$(TableName)_CreateDate] DEFAULT (getdate()) FOR [CreateDate]
ALTER TABLE [dbo].$(TableName) ADD CONSTRAINT [DF_$(TableName)_CreateServer] DEFAULT (@@servername) FOR [CreateServer]
-----------------------------------------------------------------------------------
-- 004. Create and populate a test table
-- Load TestTable Data - Seconds 0-59 are used as the Partitioning Key
--:SETVAR TableName "TestTable"
SET NOCOUNT ON;
DECLARE @Now DATETIME = GETDATE()
-- Insert rows for one minute so every second (0-59) is represented
WHILE @Now > DATEADD(minute,-1,GETDATE())
BEGIN
    INSERT INTO [dbo].$(TableName) ([Partition_PK] ,[RandomNbr])
    VALUES ( DATEPART(second,GETDATE()) ,ROUND((RAND() * 100),0) )
END
-----------------------------------------------------------------------------------
-- Confirm table partitioning - http://lextonr.wordpress.com/tag/sys-destination_data_spaces/
SELECT N'DatabaseName' = DB_NAME()
    , N'SchemaName' = s.name
    , N'TableName' = o.name
    , N'IndexName' = i.name
    , N'IndexType' = i.type_desc
    , N'PartitionScheme' = ps.name
    , N'DataSpaceName' = ds.name
    , N'DataSpaceType' = ds.type_desc
    , N'PartitionFunction' = pf.name
    , N'PartitionNumber' = dds.destination_id
    , N'BoundaryValue' = prv.value
    , N'RightBoundary' = pf.boundary_value_on_right
    , N'PartitionFileGroup' = ds2.name
    , N'RowsOfData' = p.[rows]
FROM sys.objects AS o
INNER JOIN sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
INNER JOIN sys.partitions AS p
    ON o.[object_id] = p.[object_id]
INNER JOIN sys.indexes AS i
    ON p.[object_id] = i.[object_id]
    AND p.index_id = i.index_id
INNER JOIN sys.data_spaces AS ds
    ON i.data_space_id = ds.data_space_id
INNER JOIN sys.partition_schemes AS ps
    ON ds.data_space_id = ps.data_space_id
INNER JOIN sys.partition_functions AS pf
    ON ps.function_id = pf.function_id
LEFT OUTER JOIN sys.partition_range_values AS prv
    ON pf.function_id = prv.function_id
    AND p.partition_number = prv.boundary_id
LEFT OUTER JOIN sys.destination_data_spaces AS dds
    ON ps.data_space_id = dds.partition_scheme_id
    AND p.partition_number = dds.destination_id
LEFT OUTER JOIN sys.data_spaces AS ds2
    ON dds.data_space_id = ds2.data_space_id
ORDER BY DatabaseName
    ,SchemaName
    ,TableName
    ,IndexName
    ,PartitionNumber
--=================================================================================
-- SECTION 2 - SWITCH OUT
-- 001 - Create TestTableOut
-- 002 - Switch out partition in range 0-14
-- 003 - Merge range 0-29
-----------------------------------------------------------------------------------
-- 001. TestTableOut
:SETVAR TableName "TestTable"
IF OBJECT_ID('dbo.$(TableName)Out') IS NOT NULL
    DROP TABLE [dbo].[$(TableName)Out]

CREATE TABLE [dbo].[$(TableName)Out](
    [Partition_PK] [int] NOT NULL,
    [GUID_PK] [uniqueidentifier] NOT NULL,
    [CreateDate] [datetime] NULL,
    [CreateServer] [nvarchar](50) NULL,
    [RandomNbr] [int] NULL,
    CONSTRAINT [PK_$(TableName)Out] PRIMARY KEY CLUSTERED
    (
        [Partition_PK] ASC,
        [GUID_PK] ASC
    )
) ON $(TableName)_fg2;
GO
-----------------------------------------------------------------------------------
-- 002 - Switch out partition in range 0-14
--:SETVAR TableName "TestTable"
ALTER TABLE dbo.$(TableName)
SWITCH PARTITION 2 TO dbo.$(TableName)Out;
-----------------------------------------------------------------------------------
-- 003 - Merge range 0-29
:SETVAR TableName "TestTable"
ALTER PARTITION FUNCTION $(TableName)_func()
MERGE RANGE (14);
-----------------------------------------------------------------------------------
-- Confirm table partitioning
-- Original source of this query - http://lextonr.wordpress.com/tag/sys-destination_data_spaces/
SELECT N'DatabaseName' = DB_NAME()
    , N'SchemaName' = s.name
    , N'TableName' = o.name
    , N'IndexName' = i.name
    , N'IndexType' = i.type_desc
    , N'PartitionScheme' = ps.name
    , N'DataSpaceName' = ds.name
    , N'DataSpaceType' = ds.type_desc
    , N'PartitionFunction' = pf.name
    , N'PartitionNumber' = dds.destination_id
    , N'BoundaryValue' = prv.value
    , N'RightBoundary' = pf.boundary_value_on_right
    , N'PartitionFileGroup' = ds2.name
    , N'RowsOfData' = p.[rows]
FROM sys.objects AS o
INNER JOIN sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
INNER JOIN sys.partitions AS p
    ON o.[object_id] = p.[object_id]
INNER JOIN sys.indexes AS i
    ON p.[object_id] = i.[object_id]
    AND p.index_id = i.index_id
INNER JOIN sys.data_spaces AS ds
    ON i.data_space_id = ds.data_space_id
INNER JOIN sys.partition_schemes AS ps
    ON ds.data_space_id = ps.data_space_id
INNER JOIN sys.partition_functions AS pf
    ON ps.function_id = pf.function_id
LEFT OUTER JOIN sys.partition_range_values AS prv
    ON pf.function_id = prv.function_id
    AND p.partition_number = prv.boundary_id
LEFT OUTER JOIN sys.destination_data_spaces AS dds
    ON ps.data_space_id = dds.partition_scheme_id
    AND p.partition_number = dds.destination_id
LEFT OUTER JOIN sys.data_spaces AS ds2
    ON dds.data_space_id = ds2.data_space_id
ORDER BY DatabaseName
    ,SchemaName
    ,TableName
    ,IndexName
    ,PartitionNumber
The table below shows the results of the ‘Confirm Table Partitioning’ query, before and after the MERGE.
The data in the File and File Group to be dropped (File Group 2) has already been switched out; File Group 3 contains the data so no index rebuild is needed to move data and complete the MERGE.
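To see how the remaining key values map onto partitions under the current boundaries, the $PARTITION function can be applied to the table directly. This is only a sketch; it assumes the workspace database, dbo.TestTable and the TestTable_func partition function created by the setup script, and it reports the logical partition for each key, not the file group.

```sql
USE workspace;

-- Sketch: logical partition number, key range and row count per partition,
-- computed from the data itself using the current partition function.
SELECT  $PARTITION.TestTable_func(Partition_PK) AS PartitionNumber,
        MIN(Partition_PK) AS MinKey,
        MAX(Partition_PK) AS MaxKey,
        COUNT(*)          AS RowsOfData
FROM    dbo.TestTable
GROUP BY $PARTITION.TestTable_func(Partition_PK)
ORDER BY PartitionNumber;
```

Running it before and after the MERGE makes the renumbering of partitions visible alongside the key ranges each one holds.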
RANGE RIGHT would not be a problem in a ‘Sliding Window’ if the same file group were used for all partitions. With multiple file groups, creating and dropping partitions introduces a dependency on full index rebuilds. Larger tables are typically the ones partitioned, and a full index rebuild might be an expensive operation on them. I’m not sure how a RANGE RIGHT partitioning strategy could be implemented, with an ascending partitioning key, using multiple file groups without having to move data. Using a single file group (multiple files) for all partitions within a table would avoid physically moving data between file groups; no index rebuild would be necessary to complete a MERGE and the system views would accurately reflect the physical location of data.
If a RANGE RIGHT partition function is used, the data is physically in the wrong file group after the MERGE assuming a typical ascending partitioning key, and the 'Data Spaces' system views might be misleading. Thanks to Manuj and Chris for a lot of help investigating this.
NOTE 10/03/2014 - The solution
The solution is so easy it's embarrassing: I was using the wrong boundary points for the MERGE (both RANGE LEFT and RANGE RIGHT) to get rid of historic data.
-- Wrong Boundary Point Range Right
--ALTER PARTITION FUNCTION $(TableName)_func()
--MERGE RANGE (15);
-- Wrong Boundary Point Range Left
--ALTER PARTITION FUNCTION $(TableName)_func()
--MERGE RANGE (14);
-- Correct Boundary Points for MERGE
ALTER PARTITION FUNCTION $(TableName)_func()
MERGE RANGE (0); -- or -1 for RANGE LEFT
The empty, switched out partition (on File Group 2) is then MERGED with the empty partition maintained at the start of the range and no data movement is necessary. I retract the suggestion that a problem exists with RANGE RIGHT Sliding Windows using multiple file groups and apologize :-)
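Rather than hard-coding the boundary, the lowest remaining boundary value can be looked up from sys.partition_range_values before merging. The snippet below is a hedged sketch, not the script used above: the variable @LowestBoundary is hypothetical, the function name TestTable_func comes from the setup script, and it assumes partition 2 has already been switched out so that both partitions being merged are empty.

```sql
USE workspace;

-- Hypothetical variable: lowest remaining boundary of the lab partition function.
DECLARE @LowestBoundary int;

SELECT  @LowestBoundary = MIN(CAST(prv.value AS int))
FROM    sys.partition_functions AS pf
INNER JOIN sys.partition_range_values AS prv
        ON pf.function_id = prv.function_id
WHERE   pf.name = N'TestTable_func';

-- Before any MERGE this is 0 in the RANGE RIGHT lab and -1 in the RANGE LEFT lab.
SELECT @LowestBoundary AS LowestBoundary;

-- Merge the empty partition at the start of the range with its empty neighbour.
ALTER PARTITION FUNCTION TestTable_func()
MERGE RANGE (@LowestBoundary);
```

Checking the value first is the useful part: it makes it harder to merge an interior boundary by mistake, which is what caused the data movement described earlier.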