Friday, December 17, 2010

Concept how index works in SQL SERVER 2005

Concepts of Page and indexes
In SQL Server, the page size is 8 KB.Tables and indexes are stored as a collection of 8-KB pages This means SQL Server databases have 128 pages per megabyte.Each page begins with a 96-byte header that is used to store system information about the page. This information includes the page number, page type, the amount of free space on the page, and the allocation unit ID of the object that owns the page.Data rows are put on the page serially, starting immediately after the header. A row offset table starts at the end of the page, and each row offset table contains one entry for each row on the page. Each entry records how far the first byte of the row is from the start of the page. The entries in the row offset table are in reverse sequence from the sequence of the rows on the page.





Row cannot span over the pages.It can only cover 8060 bytes This is done whenever an insert or update operation increases the total size of the row beyond the 8060 byte limit. When a column is moved to a page in the ROW_OVERFLOW_DATA allocation unit, a 24-byte pointer on the original page in the IN_ROW_DATA allocation unit is maintained. If a subsequent operation reduces the row size, SQL Server dynamically moves the columns back to the original data page

Extents are the basic unit in which space is managed. An extent is eight physically contiguous pages, or 64 KB. This means SQL Server databases have 16 extents per megabyte.


1 Extent = 64 KB = 8 Page

Per mb 16 Extents

Two Types:

MIX Extent means more than one object of database is holding these extents

UNIFORM Extent means that Each pages under the extents are hold by single Objcet of database


A new table or index is generally allocated pages from mixed extents.

Table Organization in SQL SERVER

A table is contained in one or more partitions and each partition contains data rows in either a heap or a clustered index structure. The pages of the heap or clustered index are managed in one or more allocation units, depending on the column types in the data rows.

Indexed views have the same storage structure as clustered tables.

When a heap or a clustered table has multiple partitions, each partition has a heap or B-tree structure that contains the group of rows for that specific partition. For example, if a clustered table has four partitions, there are four B-trees; one in each partition.

Cluster tables is always implementing a B Tree index and Heap table will not have any order of rows

select au.*,p.*,DDPS.used_page_count*8,si.name,si.*,so.* from sys.objects so
INNER JOIN sys.indexes si
ON so.object_id = si.object_id
INNER JOIN sys.dm_db_partition_stats DDPS
ON DDPS.Object_id = si.object_id
INNER JOIN sys.partitions P
ON P.Object_id = si.object_id
INNER JOIN sys.allocation_units AU
ON AU.Container_ID = p.Partition_ID
where so.type='U'

HEAP TABLE

A heap is a table without a clustered index.IF Heap table has multiple partitions then each heap table has its own structure for that partitions.

The column first_iam_page in the sys.system_internals_allocation_units system view points to the first IAM page in the chain of IAM pages that manage the space allocated to the heap in a specific partition..Table scans or serial reads of a heap can be performed by scanning the IAM pages to find the extents that are holding pages for the heap. Because the IAM represents extents in the same order that they exist in the data files.

CLUSTERED TABLE

In SQL Server, indexes are organized as B-trees. Each page in an index B-tree is called an index node. The top node of the B-tree is called the root node. The bottom level of nodes in the index is called the leaf nodes. Any index levels between the root and the leaf nodes are collectively known as intermediate levels. In a clustered index, the leaf nodes contain the data pages of the underlying table. The root and intermediate level nodes contain index pages holding index rows. Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf level of the index. The pages in each level of the index are linked in a doubly-linked list.
Same like heap clustered has each structure for each partition.





NON LCUSTERED INDEXES

Nonclustered indexes have the same B-tree structure as clustered indexes but the underlying data pages are not in sorted order.Nonclustered indexes can be defined on a table or view with a clustered index or a heap. Each index row in the nonclustered index contains the nonclustered key value and a row locator. This locator points to the data row in the clustered index or heap having the key value.

The row locators in nonclustered index rows are either a pointer to a row or are a clustered index key for a row, as described in the following:

If the table is a heap, which means it does not have a clustered index, the row locator is a pointer to the row. The pointer is built from the file identifier (ID), page number, and number of the row on the page. The whole pointer is known as a Row ID (RID).

If the table has a clustered index, or the index is on an indexed view, the row locator is the clustered index key for the row. If the clustered index is not a unique index, SQL Server makes any duplicate keys unique by adding an internally generated value called a uniqueifier. This four-byte value is not visible to users. It is only added when required to make the clustered key unique for use in nonclustered indexes. SQL Server retrieves the data row by searching the clustered index using the clustered index key stored in the leaf row of the nonclustered index.

The page collections for the B-tree are anchored by root_page pointers in the sys.system_internals_allocation_units





I got one solution from Mr Pinal dave and its like

cluster index is always stored with table and non clustered is stored seperately infact if you read abouve things throughly i m sure you will come to know that both indexes are storing what things under there leaf nodes..