en-UShe-IL
You are here:   Blog
Register   |  Login

Blog Archive:

Maximize
* Can be used in order to search for older blogs Entries

Search in blogs


Blog Categories:

Maximize
* Can be used in order to search for blogs Entries by Categories

Blog Tags:

Maximize
* Can be used in order to search for blogs by keywords

TNWikiSummit


Awared MVP 


 


Microsoft® Community Contributor 


Microsoft® Community Contributor


 Read first, before you you use the blog! Maximize
מאי7

Written by: ronen ariely
07/05/2015 14:17 RssIcon

Background

Hello,

It all started with an answer that I wrote in the forum. I wrote a short post on the issue in this link. A CTO of one of my client saw the forum thread, and was interested in the actual storing format. He found the answer for DateTime type but not for DateTime2 type. I wasted more than one hour trying to find the answer myself, using Google and Bing and other limited closed interfaces/groups of Microsoft. Finally, I came to decision that it will be faster for me, and it will be a nice exercise, to examine the data myself to find the solution.

In this blog I will show step by step how I examined the DateTime2 data, in order to find the actual storing format. The basic process is based on reverse engineer by examining the binary data that we get in CONVERT to VARBINARY(16)

-- check the data as binary show us the format behind the scenes:
DECLARE @d DATETIME2 = '2016-02-27 00:00:00.000' -- My next birthday :-)
SELECT CONVERT(VARBINARY(16),@d)
GO

* Result: 0x070000000000143B0B

The goals in this blog are: (1) understand how DateTime2 stored in the database, (2) to provide a small glimpse on the process that we can use in order to examine what is going behind the scenes in the Page Data File (for specific data type).

  

Preparation

As a preparation, I chose to create a new database (ExamineDateTime2), and a new table (MyDateTime2).

create database ExamineDateTime2
GO
 
use ExamineDateTime2
GO
 
create table MyDateTime2 (
    MyDate0 datetime2(0),
    MyDate1 datetime2(1),
    MyDate2 datetime2(2),
    MyDate3 datetime2(3),
    MyDate4 datetime2(4),
    MyDate5 datetime2(5),
    MyDate6 datetime2(6),
    MyDate7 datetime2(7))
GO

  

USER-SPECIFIED precision

Datetime2 is flexible and include an optional "user-specified precision" parameter, which ranged between 0 through 7. The default precision is 7 digits.

I decided to examine the binary data using different precisions, with specific Date and Time.

DECLARE @d DATETIME2 = '2016-02-27 00:00:00.000' -- My next birthday :-)
DECLARE @d0 DATETIME2(0) = CONVERT(DATETIME2(0),@d)
DECLARE @d1 DATETIME2(1) = CONVERT(DATETIME2(1),@d)
DECLARE @d2 DATETIME2(2) = CONVERT(DATETIME2(2),@d)
DECLARE @d3 DATETIME2(3) = CONVERT(DATETIME2(3),@d)
DECLARE @d4 DATETIME2(4) = CONVERT(DATETIME2(4),@d)
DECLARE @d5 DATETIME2(5) = CONVERT(DATETIME2(5),@d)
DECLARE @d6 DATETIME2(6) = CONVERT(DATETIME2(6),@d)
DECLARE @d7 DATETIME2(7) = CONVERT(DATETIME2(7),@d)
 
SELECT 'DATETIME2(0)', @d0, CONVERT(VARBINARY(16),@d0), DATALENGTH(@d0), DATALENGTH(CONVERT(VARBINARY(16),@d0)) UNION ALl
SELECT 'DATETIME2(1)', @d1, CONVERT(VARBINARY(16),@d1), DATALENGTH(@d1), DATALENGTH(CONVERT(VARBINARY(16),@d1)) UNION ALl
SELECT 'DATETIME2(2)', @d2, CONVERT(VARBINARY(16),@d2), DATALENGTH(@d2), DATALENGTH(CONVERT(VARBINARY(16),@d2)) UNION ALl
SELECT 'DATETIME2(3)', @d3, CONVERT(VARBINARY(16),@d3), DATALENGTH(@d3), DATALENGTH(CONVERT(VARBINARY(16),@d3)) UNION ALl
SELECT 'DATETIME2(4)', @d4, CONVERT(VARBINARY(16),@d4), DATALENGTH(@d4), DATALENGTH(CONVERT(VARBINARY(16),@d4)) UNION ALl
SELECT 'DATETIME2(5)', @d5, CONVERT(VARBINARY(16),@d5), DATALENGTH(@d5), DATALENGTH(CONVERT(VARBINARY(16),@d5)) UNION ALl
SELECT 'DATETIME2(6)', @d6, CONVERT(VARBINARY(16),@d6), DATALENGTH(@d6), DATALENGTH(CONVERT(VARBINARY(16),@d6)) UNION ALl
SELECT 'DATETIME2(7)', @d7, CONVERT(VARBINARY(16),@d7), DATALENGTH(@d7), DATALENGTH(CONVERT(VARBINARY(16),@d7))
GO

  

Let’s examine the binary value in the result:

TypeOriginal Displaying FormatBinaryOriginal
LENGTH
Binary
LENGTH
DATETIME2(0)2016-02-27 00:00:00.00000000x00000000143B0B67
DATETIME2(1)2016-02-27 00:00:00.00000000x01000000143B0B67
DATETIME2(2)2016-02-27 00:00:00.00000000x02000000143B0B67
DATETIME2(3)2016-02-27 00:00:00.00000000x0300000000143B0B78
DATETIME2(4)2016-02-27 00:00:00.00000000x0400000000143B0B78
DATETIME2(5)2016-02-27 00:00:00.00000000x050000000000143B0B89
DATETIME2(6)2016-02-27 00:00:00.00000000x060000000000143B0B89
DATETIME2(7)2016-02-27 00:00:00.00000000x070000000000143B0B89

  

It is easy to see that all the binary values have the same closing value 143B0B with several zeros before. The obvious differences that we can see are in the first byte (each byte represented as 2 characters in the binary displayed format).

My conclusion is that the first byte give us the precision value (00 throw 07).

* This remind me that Variable-Length data like VARCHAR and NVARCHAR types uses 2 byte to store the length. But, is this the same behavior?!? Something look strange with the DATALENGTH.

 

CONVERT to Binary vs Data store

As we can see in the last two columns in the result above, DateTime2 takes 6, 7 or 8 bytes, depending on the precision (the BOL documentation confirms it) , but the binary data uses 1 extra byte!

We can check and confirm this by checking the actual page in the database file. Let’s insert some data to our pre-prepared table and examine the database pages that store our data in the data file:

DECLARE @d DATETIME2 = '2016-02-27 00:00:00.000' -- My next birthday :-)
INSERT MyDateTime2 (MyDate0,MyDate1,MyDate2,MyDate3,MyDate4,MyDate5,MyDate6,MyDate7)
select @d,@d,@d,@d,@d,@d,@d,@d
GO
 
select * from MyDateTime2
GO

 

First I need to find which pages store our new data (I am inserting only with the PageID and PageType columns):

DBCC IND('ExamineDateTime2', 'dbo.MyDateTime2', 0);
GO

PageFIDPagePIDPageType
129710
12961

 

The PageType with the value 1 is a data page, and that is the page that store our current data (296 in my case). Let’s examine the content of this page.

DBCC TRACEON(3604, -1);
DBCC PAGE(ExamineDateTime2, 1, 296, 3);
GO

 

In the result of the above query you can find the information regarding each row (in our case we have only one row), and each column. As expected the result confirm that DateTime2 takes 6, 7 or 8 bytes depending witch precision we used.

Slot 0 Column 1 Offset 0x4 Length 6 Length (physical) 6
MyDate0 = 2016-02-27 00:00:00      
Slot 0 Column 2 Offset 0xa Length 6 Length (physical) 6
MyDate1 = 2016-02-27 00:00:00.0    
Slot 0 Column 3 Offset 0x10 Length 6 Length (physical) 6
MyDate2 = 2016-02-27 00:00:00.00   
Slot 0 Column 4 Offset 0x16 Length 7 Length (physical) 7
MyDate3 = 2016-02-27 00:00:00.000  
Slot 0 Column 5 Offset 0x1d Length 7 Length (physical) 7
MyDate4 = 2016-02-27 00:00:00.0000 
Slot 0 Column 6 Offset 0x24 Length 8 Length (physical) 8
MyDate5 = 2016-02-27 00:00:00.00000
Slot 0 Column 7 Offset 0x2c Length 8 Length (physical) 8
MyDate6 = 2016-02-27 00:00:00.000000
Slot 0 Column 8 Offset 0x34 Length 8 Length (physical) 8
MyDate7 = 2016-02-27 00:00:00.0000000

 

Why the data length in the table shorter than the same data while using CONVERT to binary?

While using tables, we store the data in a columns, which include headers. The information regarding the precision is the same for all the values in the same column (not like VARCHAR/NVARCHAR types). Therefore there is no need to store this information in each value again. It stored once for each column, and it is not part of the stored row data.

Most if not all documentation, regarding how the data stored, use the procedure of CONVERT to binary, assuming that the binary information will be the same as in the data file. This is a common mistake, which lots of great professional posts do not focus on. It leads us to think that by checking the result of CONVERT to binary, we always looks at the data, as it is store in the database. In fact the CONVERT returns result that fit our use, according to built-in rules.

In our case the convert action had to add one extra byte, since: (1) the precision must be part of the value as I will show later, (2) using the binary value we do not have the column’s headers to get this information. The precision give us the information regarding the rounding that we done in order to get this value. Without this information we will have no way to understand what date & time this value represent.

Let’s execute the same script as above with the date '2016-02-27 11:11:11.111' (I added time):

-- Let’s execute the same script as above with the date '2016-02-27 11:11:11.111'
DECLARE @d DATETIME2 = '2016-02-27 11:11:11.111' -- My next birthday :-)
DECLARE @d0 DATETIME2(0) = CONVERT(DATETIME2(0),@d)
DECLARE @d1 DATETIME2(1) = CONVERT(DATETIME2(1),@d)
DECLARE @d2 DATETIME2(2) = CONVERT(DATETIME2(2),@d)
DECLARE @d3 DATETIME2(3) = CONVERT(DATETIME2(3),@d)
DECLARE @d4 DATETIME2(4) = CONVERT(DATETIME2(4),@d)
DECLARE @d5 DATETIME2(5) = CONVERT(DATETIME2(5),@d)
DECLARE @d6 DATETIME2(6) = CONVERT(DATETIME2(6),@d)
DECLARE @d7 DATETIME2(7) = CONVERT(DATETIME2(7),@d)
 
SELECT 'DATETIME2(0)', @d0, CONVERT(VARBINARY(16),@d0), DATALENGTH(@d0), DATALENGTH(CONVERT(VARBINARY(16),@d0)) UNION ALl
SELECT 'DATETIME2(1)', @d1, CONVERT(VARBINARY(16),@d1), DATALENGTH(@d1), DATALENGTH(CONVERT(VARBINARY(16),@d1)) UNION ALl
SELECT 'DATETIME2(2)', @d2, CONVERT(VARBINARY(16),@d2), DATALENGTH(@d2), DATALENGTH(CONVERT(VARBINARY(16),@d2)) UNION ALl
SELECT 'DATETIME2(3)', @d3, CONVERT(VARBINARY(16),@d3), DATALENGTH(@d3), DATALENGTH(CONVERT(VARBINARY(16),@d3)) UNION ALl
SELECT 'DATETIME2(4)', @d4, CONVERT(VARBINARY(16),@d4), DATALENGTH(@d4), DATALENGTH(CONVERT(VARBINARY(16),@d4)) UNION ALl
SELECT 'DATETIME2(5)', @d5, CONVERT(VARBINARY(16),@d5), DATALENGTH(@d5), DATALENGTH(CONVERT(VARBINARY(16),@d5)) UNION ALl
SELECT 'DATETIME2(6)', @d6, CONVERT(VARBINARY(16),@d6), DATALENGTH(@d6), DATALENGTH(CONVERT(VARBINARY(16),@d6)) UNION ALl
SELECT 'DATETIME2(7)', @d7, CONVERT(VARBINARY(16),@d7), DATALENGTH(@d7), DATALENGTH(CONVERT(VARBINARY(16),@d7))
GO

 

If we will ignore the precision information (the first byte), then we can see that the value 4F9D00143B0B and the value 172506143B0B looks the same: both length 5 bytes after removing the precision byte, and without the information regarding the precision, we cannot go back and get the date & time of this value.

* The reason that we got different value is because the precision is different. Later in the article, I'll show that the value 0x004F9D00143B0B represents the DateTime ‘2016-02-27 11:11:11.0000000’ while the value 0x01172506143B0B represent the DateTime ‘2016-02-27 11:11:11.1000000’, and the only way to get it, is using the precision byte.

Type

Original Displaying Format

Binary

Original
LENGTH

Binary
LENGTH

DATETIME2(0)

2016-02-27 11:11:11.0000000

0x004F9D00143B0B

6

7

DATETIME2(1)

2016-02-27 11:11:11.1000000

0x01172506143B0B

6

7

DATETIME2(2)

2016-02-27 11:11:11.1100000

0x02E7723D143B0B

6

7

DATETIME2(3)

2016-02-27 11:11:11.1110000

0x03077D6602143B0B

7

8

DATETIME2(4)

2016-02-27 11:11:11.1110000

0x0446E20018143B0B

7

8

DATETIME2(5)

2016-02-27 11:11:11.1110000

0x05BCD608F000143B0B

8

9

DATETIME2(6)

2016-02-27 11:11:11.1110000

0x065863586009143B0B

8

9

DATETIME2(7)

2016-02-27 11:11:11.1110000

0x0770E173C35D143B0B

8

9

 

My conclusion: CONVERT to binary dose not return the data as it stored in the database! This is a mistake to use it before we confirm that the data format the same way, or we explain the differences.

 

Variable-length types vs DateTime2 (which is fixed length type)

I mentioned above that the idea of using the first byte in order to store the precision remind me variable-length types, like VARCHAR or NVARCHAR, but the behavior is actually opposite.

Variable-length data types have a length that varies on a record-by-record basis. For example VARCHAR (10) might take up to 10 bytes in one record, but it might only take 3 bytes in another, depending on the stored value. The storage size is “the actual length of the data entered” + “2 bytes”. These two bytes are written in the row data, but in a different section. The row data include: (1) two bytes that indicate the number of variable-length columns stored in the record. (2) A series of two-byte values that form the variable-length offset array, one for each column, pointing to the byte index in the record where the related column data ends (this part is the extra 2 byte that we do not get in the CONVERT result). (3) Finally, we have the actual variable-length columns. These 2 bytes are very important for improving the way SQL Server use the data, but it is not needed for the interpretation of the data value. When we check the size of the data using the DATALENGTH, we do not get the extra 2 byte, therefore the DATALEGTH of CONVERT Variable-Length data to binary smaller than the actual size in row data! it does not include the 2 bytes for the length.

The DateTime2 type is a fixed length type. The precision information dose not stored with the row data. Using CONVERT to binary, the information is added to the value. Therefore the DATALEGTH of CONVERT DateTime2 to binary bigger than the actual size in row data! it include the precision as part of the value, even so it is not stored in the row data in the data file.

 

Examine the actual table data in the page file:

I decided to check the idea if converted information can help us in this discussion. It is faster and simpler to examine the CONVERT result than to examine the data directly from the data file. This will allow me to examine data directly without insert it to a table.

In order to do this I will add one more row to the table, and check the data in the data file using DBCC PAGE.

DECLARE @d DATETIME2 = '2016-02-27 11:11:11.111' -- My next birthday :-)
INSERT MyDateTime2 (MyDate0,MyDate1,MyDate2,MyDate3,MyDate4,MyDate5,MyDate6,MyDate7)
select @d,@d,@d,@d,@d,@d,@d,@d
GO
 
select * from MyDateTime2
GO

 

MyDate0

MyDate1

MyDate2

MyDate3

2016-02-27 00:00:00

2016-02-27 00:00:00.0

2016-02-27 00:00:00.00

2016-02-27 00:00:00.000

2016-02-27 11:11:11

2016-02-27 11:11:11.1

2016-02-27 11:11:11.11

2016-02-27 11:11:11.111


MyDate4

MyDate5

MyDate6

2016-02-27 00:00:00.0000

2016-02-27 00:00:00.00000

2016-02-27 00:00:00.000000

2016-02-27 11:11:11.1110

2016-02-27 11:11:11.11100

2016-02-27 11:11:11.111000


MyDate7

2016-02-27 00:00:00.0000000

2016-02-27 11:11:11.1110000

 

Let’s examine the actual data at the second row in the page data (slot 1 is the second row while slot 0 is the first row, as the counting start from 0)

DBCC PAGE(ExamineDateTime2, 1, 296, 1);
GO
/*
Slot 1, Offset 0x9f, Length 63, DumpStyle BYTE
Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 63
Memory Dump @0x000000001180A09F
0000000000000000:   10003c00 4f9d0014 3b0b1725 06143b0b e7723d14  ..<.O...;..%..;.çr=.
0000000000000014:   3b0b077d 6602143b 0b46e200 18143b0b bcd608f0  ;..}f..;.Fâ...;.¼Ö.ð
0000000000000028:   00143b0b 58635860 09143b0b 70e173c3 5d143b0b  ..;.XcX`  .;.pásÃ].;.
000000000000003C:   080000                                           ...   
*/

 

This is the actual data for the second row in the page:

10003c00 4f9d0014 3b0b1725 06143b0b e7723d14
3b0b077d 6602143b 0b46e200 18143b0b bcd608f0
00143b0b 58635860 09143b0b 70e173c3 5d143b0b
080000                                     

To make it simpler to examine, I will write it as one long string. This is the actual data in the data file for the second row.

10003c004f9d00143b0b172506143b0be7723d143b0b077d6602143b0b46e20018143b0bbcd608f000143b0b586358600
9143b0b70e173c35d143b0b080000

Now I can split the data logically and explain each part of the data:

* SQL Server stores numbers with the low order byte first, so the bytes must be switched around to get the position. Thus 0x0800 = 0x-08-00 becomes 0x-00-08 or 0x0008 (8 in decimal).

10-00-3c00-4f9d00143b0b172506143b0be7723d143b0b077d6602143b0b46e20018143b0bbcd608f000143b0b5863586
009143b0b70e173c35d143b0b-080000

 

Bytes value

Binary value

Decimal

explanation

10

10

1

The first byte is a status byte that contains info on how to parse the row contents.

00

00

0

The next byte is currently unused

3c00

003c

60

The next two bytes stores the byte index position of where to find the number of columns

 

* We can jump to the 60th byte to make sure the value

080000

000008

8

As expected we have 8 columns in the table :-)

Let’s go back to the data. In our case all the information between the “index position” and the “number of columns” is the actual columns data. For more information regarding how rows data store in the page I will write another post, if I will have time.

4f9d00143b0b-172506143b0b-e7723d143b0b-077d6602143b0b-46e20018143b0b-bcd608f000143b0b-586358600914
3b0b-70e173c35d143b0b

 

 

 

 

4f9d00143b0b


 

 

 

172506143b0b

 

 

 

e7723d143b0b

 

 

 

077d6602143b0b

 

 

 

46e20018143b0b

 

 

 

bcd608f000143b0b

 

 

 

5863586009143b0b

 

 

 

70e173c35d143b0b

 

 

 


As we can see, the data match the data in the result of CONVERT to binary except of the extra byte at the start, which I explain above why this information is added.

By the way, this is the first row data (spited logically with dash):
10-00-3c00-000000143b0b-000000143b0b-000000143b0b-00000000143b0b-00000000143b0b-0000000000143b0b-0000000000143b0b-0000000000143b0b-080000                                     

 

My conclusion: CONVERT to binary from DateTime2 add one extra byte in the start, but the rest of the data is in the same format as it is stored in the data file. The precision information dose not stored with the row value itself, as expected from Fixed-Length data type.

OK... it is time to find the format of the actual data and how we can get the Date and Time from that data.

  

Date part in the DateTime2 data format

If you notice, in all the examples, I used the same date: 2016-02-27, and all the binary data that we got ended with the same 3 bytes: 143b0b. Moreover, the DateTime type format is well known. DateTime stored in the database as 4 bytes that represent the Time followed by four bytes that represent the date. At this point it was obvious to me that DateTime2 uses the same idea. It stored as date and time in separate parts. I just need to confirm this assumption with some testing. Next, I will need to find the way to get the actual time and actual date from these binaries parts.

-- Date part in the DateTime2 data format
-- 9999 days up from the min value
DECLARE @d DATETIME2 = '0001-01-01 00:00:00.0000000'
select top 99999
    DATEADD(DAY, NumberF, @d), CONVERT(VARBINARY(16),DATEADD(DAY, NumberF, @d))
from _ArielyAccessoriesDB.dbo.ArielyNumbersTbl
order by NumberF
GO
 
-- 9999 days down from the max  value
DECLARE @d DATETIME2 = '9999-12-31 00:00:00.0000000'
select top 99999
    DATEADD(DAY, -NumberF, @d), CONVERT(VARBINARY(16),DATEADD(DAY, -NumberF, @d))
from _ArielyAccessoriesDB.dbo.ArielyNumbersTbl
order by NumberF
GO

  

Original value

Binary

0001-01-02 00:00:00.0000000

0x070000000000010000

0001-01-03 00:00:00.0000000

0x070000000000020000

0001-01-04 00:00:00.0000000

0x070000000000030000

.
.
.

.
.
.

9999-12-28 00:00:00.0000000

0x070000000000D7B937

9999-12-29 00:00:00.0000000

0x070000000000D8B937

9999-12-30 00:00:00.0000000

0x070000000000D9B937

 

The result of the above two queries confirm that the last 3 bytes in the binary value represent the date.

Converting the binary data into actual dates

SQL Server stores numbers, which represent the times that a specific “time interval” passed from a specific value (the zero value). For the dates our time interval is day and the zero value is 0001-01-01. Therefore, the data that stored in the database is the number of days from the first day.

Trying to convert the value from hexadecimal to decimal will show us that this is not the right format. We can see that the first value is 010000 and not 000001 as expected from the number one in hexadecimal. This is more confusing than just reverse the order. The binary value is not actually hexadecimal number! It is 3 characters in base 256 (each two characters together represent one byte, which can get the value from 00 to FF). Moreover, SQL Server stores these base 256 numbers with the low order byte first, so the bytes must be switched around to get the real hexadecimal number. Thus the last value in the table above is D9B937 which is built of 3 chars D9, B9, 37.  This value becomes 37b9d9. Now we can convert this number to decimal and get the number of days from the first day. Let’s examine the last value in the table above 0x070000000000D9B937, and we will go step by step to get the actual date that this value represent.

Step 1: switch around the bytes in order to get the text that represent the hexadecimal value of the date part. At this time I will do it manually: D9B937 -> 37b9d9

Step 2: I will convert our text to binary type:

-- Convert text that represent hexadecimal number to binary type
-- style 2 mean that our text dose not include the prefix ‘0x’, as in our case.
-- style 1 mean that our text already include the prefix ‘0x’.
SELECT CONVERT (BINARY (3),'37b9d9', 2)
GO
-- 0x37B9D9

 

In order to convert the hexadecimal number that we got above to decimal, we can simply multiply it by 1.  SQL Server use implicit convert while calculating math operation on two different types (if it can do it, otherwise it raise an error). For example text ‘2’ multiple by number 1.1 returns the type of the number 2.2 (decimal), hexadecimal multiple by integer (INT/BIGINT) returns integer.

Since our number might be too big for INT I will use BIGINT for the convert:

SELECT 0x37B9D9 * convert(BIGINT,1)
GO
-- 3652057

  

So we got 3652057 times our “time interval” which is day. Therefore our actual date is:

declare @FirstValueDateTime DATETIME2 = '0001-01-01 00:00:00.000000'
select DATEADD(DAY, 3652057, @FirstValueDateTime)
GO
-- 9999-12-30 00:00:00.0000000

 

We can compare this result to the value in the table above, and see that as expected we got the original date.

I wrote a simple function that dose all these steps together. It get text as input and return the decimal number which is the times that the “interval time” passed. I will use this function from now on. 

IF OBJECT_ID (N'dbo.DateTimeTicks', N'FN') IS NOT NULL
    DROP FUNCTION FlipBytes;
GO
CREATE FUNCTION dbo.DateTimeTicks(@In NVARCHAR(20))
RETURNS BIGINT
AS
BEGIN
    DECLARE @Len INT = len(@In)/2, @i INT = 0;
    DECLARE @FlipedText NVARCHAR(20) =  ''
    WHILE @i < @Len BEGIN
        SET @FlipedText = @FlipedText + SUBSTRING(@In,len(@In) - (2*@i) - 1,2)
       SET @i = @i + 1
    END;
 
    RETURN CONVERT(binary(16),'0x' + RIGHT(N'00000000000000000000000000000000' + @FlipedText,32),1) * CONVERT(BIGINT,1)
 
END;
GO

 

My conclusion: the date represented in the stored value in the last 3 bytes. It represent an integer which is the number of days passed from the first day 0001-01-01

  

Time part in the DateTime2 data format

Let’s examine the binary data while we change only the time.

-- Time part in the DateTime2 data format
-- the best precision for DateTime2 is 7 whch give us time 100 nanosecond accuracy.
-- I will loop throw the time using 100 nanosecond time interval
DECLARE @d DATETIME2 = '0001-01-01 00:00:00.0000000'
select top 10000
    DATEADD(nanosecond, 100*NumberF, @d) Text_Format,
    CONVERT(VARBINARY(16),DATEADD(nanosecond, 100*NumberF, convert(DATETIME2(0),@d))) DateTime2_0,
    CONVERT(VARBINARY(16),DATEADD(nanosecond, 100*NumberF, convert(DATETIME2(1),@d))) DateTime2_1,
    CONVERT(VARBINARY(16),DATEADD(nanosecond, 100*NumberF, convert(DATETIME2(2),@d))) DateTime2_2,
    CONVERT(VARBINARY(16),DATEADD(nanosecond, 100*NumberF, convert(DATETIME2(3),@d))) DateTime2_3,
    CONVERT(VARBINARY(16),DATEADD(nanosecond, 100*NumberF, convert(DATETIME2(4),@d))) DateTime2_4,
    CONVERT(VARBINARY(16),DATEADD(nanosecond, 100*NumberF, convert(DATETIME2(5),@d))) DateTime2_5,
    CONVERT(VARBINARY(16),DATEADD(nanosecond, 100*NumberF, convert(DATETIME2(6),@d))) DateTime2_6,
    CONVERT(VARBINARY(16),DATEADD(nanosecond, 100*NumberF, convert(DATETIME2(7),@d))) DateTime2_7
from _ArielyAccessoriesDB.dbo.ArielyNumbersTbl
order by NumberF

 

The part in the stored data that represent the time in DateTime2 type, can take 3 to 5 bytes depending on the precision. In order to get the value that represent the time, I removed the last 3 bytes, which are the date. Next, I will remove the first byte, which is the precision part. All the rest is the time data.

The interpretation procedure to get the actual time, from the stored data is the same as we did with the Date's data. First I will switch around the bytes in order convert SQL Server data format into hexadecimal value. Next, we will convert the hexadecimal into integer (BIGINT). The result will be the times that our “time interval” passed from the first time, which is 00:00:00.00000. For this procedure I will simply use the same function that I used in the previous paragraph.

The last step in getting the actual time from the stored data is a bit more complex. We need to multiply the number of “time interval” by the “Time interval” in order to get the actual time. The “time interval” in DateTime2 depend on the precision. There is no way to understand the stored data and to get the actual time without knowing what the precision is! This is why we had to get this information as part of the value that CONVERT DateTime2 to binary returns.

Important! In this post and during my testing I am using the function DATEADD in order to get the actual time and date. In the final step, I am adding the time or date that we found, to the first date (0001-01-01) or the first time (00:00:00.00000). Unfortunately, DATEADD function work with INT and not BIGINT. The interpretation procedure of the time might returns a value that is too big for INT type, so we have to use BIGINT or we need to find a way to work around this limitation. Best option is to use CLR function, which allowed a simple function! Obviously (or should I say hopefully), under the hood SQL Server do not use the same procedure that I am using in this post.

Next code show the complete procedure of interpretation the stored value and get the actual Date and time in the displaying format that used to.

DECLARE     @FullText        nvarchar(20)   =   [sys].[fn_varbintohexstr](0x0770E173C35D143B0B);
DECLARE     @RESULT          DATETIME2      =   '0001-01-01 00:00:00.0000000'
DECLARE     @DateText        nvarchar(6)    =   RIGHT(@FullText, 6)
DECLARE     @TimeText        nvarchar(10)   =   SUBSTRING(@FullText,5,LEN(@FullText) - 4 - 6)
DECLARE     @ResultPercision INT            =   SUBSTRING(@FullText,3,2)
DECLARE     @TimeTicks       BIGINT         =   dbo.DateTimeTicks(@TimeText)
--select @FullText, @ResultPercision Percision, dbo.DateTimeTicks(@DateText) DateTicks, dbo.DateTimeTicks(@TimeText) TimeTicks
SET @RESULT = DATEADD(day, dbo.DateTimeTicks(@DateText), @RESULT)
-- This part we can calculate only if we have the precision!
-- this is why the precision must be part of the value that return from the CONVERT DateTime2 to Binary
select @RESULT =
    CASE
        when @ResultPercision = 0 then DATEADD(SECOND,      @TimeTicks,     @RESULT)   --OK
        when @ResultPercision = 1 then DATEADD(MILLISECOND, 100*@TimeTicks, @RESULT)   --OK
        when @ResultPercision = 2 then DATEADD(MILLISECOND, 10*@TimeTicks,  @RESULT)   --OK
        when @ResultPercision = 3 then DATEADD(MILLISECOND, @TimeTicks,     @RESULT)   --OK
        -- The complicity in the next part is only a result of the limitation of the DATEADD function.
        -- The function use INT and our values for precision more then 3 might be too big for INT!
        -- My solution was to brack the action and add seconds first, and then the nanosecond
        -- I did not check if ths fit the max value! if it is not, then we need to brack to another action
        -- hours first
        when @ResultPercision = 4 then
            DATEADD(SECOND,  (@TimeTicks*1000)%CONVERT(BIGINT,2160000000000)/10000000,
                DATEADD(nanosecond, 100*((@TimeTicks*1000) - (((@TimeTicks*1000)%CONVERT(BIGINT,2160000000000)/10000000)*10000000)),     @RESULT)
            )
        when @ResultPercision = 5 then
            DATEADD(SECOND,  (@TimeTicks*100)%CONVERT(BIGINT,2160000000000)/10000000,
                DATEADD(nanosecond, 100*((@TimeTicks*100) - (((@TimeTicks*100)%CONVERT(BIGINT,2160000000000)/10000000)*10000000)),     @RESULT)
            )
        when @ResultPercision = 6 then
            DATEADD(SECOND,  (@TimeTicks*10)%CONVERT(BIGINT,2160000000000)/10000000,
                DATEADD(nanosecond, 100*((@TimeTicks*10) - (((@TimeTicks*10)%CONVERT(BIGINT,2160000000000)/10000000)*10000000)),     @RESULT)
            )
        when @ResultPercision = 7 then
            DATEADD(SECOND,  @TimeTicks%CONVERT(BIGINT,2160000000000)/10000000,
                DATEADD(nanosecond, 100*(@TimeTicks - ((@TimeTicks%CONVERT(BIGINT,2160000000000)/10000000)*10000000)),     @RESULT)
            )
    END
SELECT @RESULT
GO

  

Summarize and Conclusions

In this post I showed the format that DateTime2 stored in the database. I showed that is stored as two separated parts (1) first represents the time and can take 3 to 5 bytes, (2) second part represent the date and always take 3 bytes. I showed how to interpret each part in order to get the actual Date and Time. The post focused on the reverse engineer’s process, examining the stored data, in order to understand the format it uses. This post is a small glimpse to the procedure. During the process we got several assumptions, which I confirmed, and several important conclusions.

Let’s go over the main points:

Ø  DateTime2 stored as two separated parts (1) first part represents the time and can take 3 to 5 bytes, (2) second part represent the date and always take 3 bytes.

o   Each part formatted as a number in base 256, but in reverse order. The low order byte first, so the bytes must be switched around to get the real hexadecimal number. Thus the value D9B937 built of 3 characters D9, B9, 37.  Switching the order, this value becomes the hexadecimal number 37b9d9.

o   The value give us number of times a specific “time interval” passed from the minimum value. If the stored data represents the number X, and the “time interval” is Y, and the minimum value is Z, than our stored data represent the actual value of Z + (X * Y).

Ø  The precision information dose not stored with the row value itself, as expected from Fixed-Length data type.

Ø  CONVERT to Binary does not necessarily return the same data or the same format as the stored data! It is always better to check the stored data directly using DBCC PAGE

Ø  CONVERT to binary from DateTime2 add one extra byte in the start, but the rest of the data is in the same format as stored in the data page file. Thus we get three parts that represent: (1) the precision, (2) the time, (3) the date.

Ø  In order to interpret the actual time we must have the precision value.

Ø  Date part in the stored data use days as “time interval”

Ø  Each byte represented as 2 characters in the binary displayed format.

Ø  The function DATEADD use INT as input. To work with DATETIME2, It is highly recommended to create a CLR function, which use BIGINT as input.

 

I hope that this post was fun and useful :-)

See you at my next post
Ronen,

 

 

Appendix A: About the author

Ronen Ariely is a senior consultant, architect, and a programmer. He has more than 15 years of experience in variety of programming languages and technologies, managing and leading development teams, and SQL & BI enterprise level solutions.

Ronen is active in network communities in the field of Programming, SQL Server, T-SQL. He serve as Moderator at MSDN communities (aka pituach), at the local interface in Hebrew and at the global interface in English. He is the founder and co-leader of the Microsoft TechNet Wiki Ninjas groups on Facebook. Ronen is also writing technical blogs, MS-TechNet articles, which won the TechNet Guru Competition, multiple times.

Blog - http://ariely.info/Blog/tabid/83/
Twitter - https://twitter.com/pitoach
Facebook - https://www.facebook.com/ronen.ariely
MSDN profile - https://social.msdn.microsoft.com/profile/pituach


Appendix B: Comments

Ø  The procedure described it the post, was done in on SQL Server 2014 (Enterprise Edition). It should be the same on any earlier version of SQL Server that supports DateTime2 probably, but we can not know what will be on the next version! There is no guarantee that Microsoft will not change the Stored format in the future. Please remember that DBCC PAGE is undocumented command.

Ø  In 1752 Britain switched from the Julian to the Gregorian calendar, which we are using today. There were some missing days in the British calendar in 1752 when the adjustment was finally made from the Julian calendar. September 3, 1752 to September 13, 1752 were skipped. This is the reason that DateTime start from the year 1753.

Ø  DateTime2 ranged between ‘0001-01-01 00:00:00’ through ‘9999-12-31 23:59:59.9999999’. Therefore, DateTime2 include several dates that are not exist in the Gregorian calendar. These are virtual dates that can leads us to unexpected results! For example the date '1752-09-04'

declare @D datetime2 = '1752-09-04'
GO – OK

Ø  During my original testing I executed several more testing, using different time intervals and different precisions. This post bring only the basic steps. Real reverse engineer work, might be complex and take long time. Experience and prior knowledge of similar cases can reduce our job dramatically. We can get assumptions from prior knowledge. For example. In exploring the DateTme2 I was already familiar with other types like DateTime, which help me a lot. Next time I will explore a type, probably I will use less than third of the time that I used in this case.

 

Appendix C: References and more information

Date displaying format, vs Date storing format, written by: Ronen Ariely
http://ariely.info/Blog/tabid/83/EntryId/161/Date-displaying-format-vs-Date-storing-format.aspx

Microsoft SQL Server 2012 Internals: Special Storage, By Kalen Delaney and Craig Freeman
https://www.microsoftpressstore.com/articles/article.aspx?p=2225060

SQL Server Storage Internals 101, by Mark S Rasmussen
https://www.simple-talk.com/sql/database-administration/sql-server-storage-internals-101/

Inside the Storage Engine: Anatomy of a record, By: Paul Randal
http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-record/

Deciphering a SQL Server Data Page
http://improve.dk/deciphering-a-sql-server-data-page/

The ultimate guide to the DateTime datatypes
http://www.karaszi.com/sqlserver/info_datetime.asp

BOL - datetime2 (Transact-SQL)
https://msdn.microsoft.com/en-us/library/bb677335.aspx?f=255&MSPPError=-2147217396

More undocumented fun: DBCC IND, DBCC PAGE, and off-row columns
http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/12/13/more-undocumented-fun_3a00_-dbcc-ind_2c00_-dbcc-page_2c00_-and-off_2d00_row-columns.aspx


Appendix D: SQL Server Data page structure

If you read this post, then you must already know that SQL Server use basic unit of IO named PAGE (each page size is 8192 bytes). The page has a basic structure consisting of (1) a header (96 bytes), and (2) body (8096 bytes). There are different types of pages like data, GAM, SGAM, etc. The actual data stored in data pages. The body include: (1) the actual data row content, and (2) row offsets table (an array noting the locations of the individual rows in the content section).


SQL Server Page Data Structure