.NET Tutorials, Forums, Interview Questions And Answers
Welcome :Guest
 
Sign In
Register
 
Win Surprise Gifts!!!
Congratulations!!!


Top 5 Contributors of the Month
satyapriyanayak
Ashutosh Jha
ASPEvil
Mahesh

Home >> Articles >> DataBase >> Post New Resource Bookmark and Share   

 Subscribe to Articles

Indexes in SQL Server

Posted By:Lakhan Pal Garg       Posted Date: May 10, 2010    Points: 25    Category: DataBase    URL: http://lakhangarg.blogspot.com  

An index in SQL Server assists the database engine with locating records, just like index in book helps you locate information quickly. Indexes are structured to facilitate the rapid return of result sets. The characteristics of an index affect its use of system resources and its lookup performance.
 

Introduction

An index in SQL Server assists the database engine with locating records, just like index in book helps you locate information quickly. Indexes are structured to facilitate the rapid return of result sets. The characteristics of an index affect its use of system resources and its lookup performance. The query optimizer uses an index if it will increase query performance. The use of indexes is necessary as without indexes, a query causes SQL Server to search all records in a table (table scan) in order to find matches. Therefore, proper usage of indexes is of prime importance from performance point of view.

Working of Indexes

An index in SQL Server assists the database engine with locating records, just like index in book helps you locate information quickly. Indexes are structured to facilitate the rapid return of result sets. The characteristics of an index affect its use of system resources and its lookup performance. The query optimizer uses an index if it will increase query performance. The use of indexes is necessary as without indexes, a query causes SQL Server to search all records in a table (table scan) in order to find matches. Therefore, proper usage of indexes is of prime importance from performance point of view.
An index is structured by the SQL Server Index Manager as a Balanced Tree (or B-Tree). An index contains an entry with one or more columns (search key) from each row in a table. A B-Tree) is stored in sorted order on the search key, and can be searched efficiently on any leading subset of that search key. For example, an index on columns A, B and C of a particular table can be searched efficiently on A, on A, B and on A, B, C. For optimal query performance, create indexes on columns in a table that are commonly used in queries. Do not create indexes for every column in a table, because too many indexes will negatively impact performance.
Example: In the pubs sample database provided with SQL Server, the employee table has an index on the emp_id column. When someone executes a statement to find data in the employee table based on a specified emp_id value, SQL Server query processor recognizes the index for the emp_id column and uses the index to find the data.Tables with indexes require more storage space in the database. Also, commands that insert, update or delete data can take longer and require more processing time to maintain the indexes. Therefore, before creating an index you must be sure that the increased query performance afforded by the index outweighs the additional computer resources necessary to maintain the index.

Index Architecture in SQL

Like tables, indexes are data structures that occupy space in the database. All indexes in SQL Server are physically built upon a B-tree index structures, which are stored on 8-KB index pages. Each index page has a page header followed by the index rows. Each index row contains a key value and a pointer to either a lower-level index page or an actual data row. Each page in an index is also referred to as an index node. The top node of the B-tree is called the root node. The bottom layer of nodes in an index are called the leaf nodes. Any index levels between the root and the leaves are collectively known as intermediate levels or nodes. Pages in each level of the index are linked together in a doubly-linked list.

Types of Indexes

The two primary types of indexes that SQL Server supports are :

. Clustered Index

. Non-Clustered Index

Clustered Indexes

There can be only one clustered index on a table or view. When a clustered index is created on a table, the data associated with that table is read, sorted and physically stored back to the database in the same order as the index search key. Because data for the table can only be persisted to storage in one order without causing duplication, the restriction of one clustered index applies. Retrieval of SQL Server data based on key search with a clustered index requires no pointer jump in order to retrieve the associated data page. This is because the leaf level of the clustered index is, in fact, the associated data page. As already mentioned, the leaf level (and consequentially the data for the table or indexed view) is physically sorted and stored in the same order as the search key. Because the leaf level of the clustered index contains the actual 8-KB data pages of the table, the row data of the entire table is physically arranged on the disk drive in the order determined by the clustered index. This provides a potential I/O performance advantage when fetching a significant number of rows from this table (at least greater than 64 KB) based on the value of the clustered index, because sequential disk I/O is being used. That is why it is important to pick the clustered index on a table based on a column that will be used to perform range scans to retrieve a large number of rows. The fact that the rows for table associate with a clustered index have to be sorted and stored in the same order as the index search key has the following implications:
. When you create a clustered index, the table is copied, the data in the table is sorted, and then the original table is deleted. Therefore, enough empty space must exist in the database to hold a copy of the data.
. By default, the data in the table is sorted when the index is created. However, if the data is already sorted in the correct order, the sort operation is automatically skipped. This can have a dramatic effect in speeding up the index creation process.
. Also, building clustered indexes on sorted data requires much less I/O. This is because the data does not have to be copied, sorted, stored back to the database, then the old table data deleted. Instead, the data is left in the extents where it was originally allocated. When you create a primary key constraint in a table that does not contain a clustered index, SQL Server will use the primary key column for the clustered index key. If the clustered index already exists in a table, a non-clustered index is created on the column defined with the primary key constraint. A column defined as a primary key is a useful index as the column values are guaranteed to be unique. Unique values create smaller B-Trees than redundant values and thus make more efficient lookup structures. If you create a non-unique clustered index on a column that contains redundant values, SQL Server creates a unique value on the redundant columns to serve as secondary sort key. To avoid the additional work requirement to maintain unique values on redundant rows, favor clustered indexes for columns defined with primary key constraint.

Non Clustered Indexes

On a table or view you can create 250 non-clustered indexes or 249 non-clustered indexes and one clustered index. You must first create a unique clustered index on a view before you can create non-clustered indexes. This restriction however does not apply to tables. Non clustered indexes are most useful for fetching few rows with good selectivity from large SQL Server tables based on a specific key value. As already mentioned, non clustered indexes are binary trees formed out of 8-KB index pages. The bottom, or leaf level, of the binary tree of index pages contains all the data from the columns that comprised that index. When a non clustered index is used to retrieve information from a table based on a match with the key value, the index B-tree is traversed until a key match is found at the leaf level of the index. A pointer jump is made if columns from the table are needed that did not form part of the index. This pointer jump will likely require a non sequential I/O operation on the disk. It might even require the data to be read from another disk, especially if the table and its accompanying index B-trees are large in size. For each row returned for an SQL query that involves searching with a non clustered index, at least one pointer jump is required. The overhead associated with each pointer jump is the reason that non clustered indexes are better suited for processing queries that return only one or a few rows from a table. Queries that require a range of rows are better served with a clustered index. If a clustered index does not exist on a table, the table is unsorted and is called a heap. A non-clustered index created on a heap contains pointer to table rows. Each entry in an index page contains a RowID (RID). The RID is a pointer to a table row in a heap, and it consists of a page number, a file number and a slot number. If a clustered index exists on a table, the index pages of a non-clustered index contain clustered index keys rather than RIDs. An index pointer, whether it is RID or an index key is called a bookmark Note: The preferred way to build indexes on large tables is to start with the clustered index and then build the non clustered indexes. In this way, no non clustered indexes will need to be rebuilt due to the data moving. When dropping all indexes, drop the non clustered indexes first and the clustered index last. That way, no indexes need to be rebuilt.

Unique Indexes

When an index is defined as unique, the index keys and the corresponding column values must be unique. Using a unique constraint is another way to ensure uniqueness within a table. Unique index can also be applied to a group of columns (a composite of columns). The composite column unique index must maintain distinctiveness. For example: a unique index defined on a lastname column and a Social Security Number column must not contain 'NULL' values in both the columns for more than one record. SQL Server automatically creates a unique index for a column or columns defined with primary key or unique constraint.

Composite Indexes

A composite index is any index that uses more than one column in a table for its index key. Composite indexes can improve query performance by reducing input/output (I/O) operations, because a query on a combination of columns contained in index will be located entirely in the index. When the result of a query is obtained from the index without having to rely on the underlying table, the query is considered covered and the index is considered COVERING INDEX. In other words, Covering Index is a non-clustered index that is built upon all of the columns required to satisfy a SQL query, both in selection criteria and the 'WHERE' predicate. SQL Server 2005 provides 'INCLUDE' option with which the columns mentioned in the 'Include' clause are actually stored in the index data pages though data is not being indexed on those columns. This saves on round trips to the database to get the rest of the data that is not in the composite index.

Indexed Views

Indexed views are views whose results are persisted in the database and indexed for fast access. As with any other views, indexed views depend on base tables for their data. Such dependency means that if you change a base table contributing to an indexed view, the indexed view might become invalid. For example, renaming a column that contributes to a view invalidates the view. To prevent such problems, SQL Server supports creating views with schema binding. Schema binding prohibits any table or column modification that would invalidate the view. Any indexed view you create with the View Designer automatically gets schema binding, because SQL Server requires that indexed views have schema binding. Schema binding does not mean you cannot modify the view; it means you cannot modify the underlying tables or views in ways that would change the view's result set. Also, indexed views, like indexes on computed columns, must be deterministic, precise, and must not contain text, ntext, or image columns. Indexed views work best when the underlying data is infrequently updated. The maintenance of an indexed view can be higher than the cost of maintaining a table index. If the underlying data is updated frequently, then the cost of maintaining the indexed view data may outweigh the performance benefits of using the indexed view. Indexed views improve the performance of these types of queries:
. Joins and aggregations that process many rows.
. Join and aggregation operations that are frequently performed by many queries.

The first index created on a view must be a unique clustered index. After the unique clustered index has been created, you can create additional non-clustered indexes. All indexes on a view are dropped if the view is dropped. All non-clustered indexes on the view are dropped if the clustered index is dropped. Non-clustered indexes can be dropped individually. Dropping the clustered index on the view removes the stored result set, and the optimizer returns to processing the view like a standard view. Although only the columns that make up the clustered index key are specified in the CREATE UNIQUE CLUSTERED INDEX statement, the complete result set of the view is stored in the database. As in a clustered index on a base table, the B-tree structure of the clustered index contains only the key columns, but the data rows contain all of the columns in the view result set.

Selection of Indexes

The choice of indexes significantly affects the amount of disk I/O generated and, subsequently, performance. Non-clustered indexes are good for retrieval of a small number of rows and clustered indexes are good for range-scans. This section provides the guidelines on determining when to create an index and deciding which index properties to configure for optimal performance:
. Exact Match Queries: For queries that use the where clause for finding a specific value, a clustered index is a good choice for exact match queries if the where clause returns a distinct value. Use a non-clustered index for an exact match query that does not returns a unique record.
. Wildcard Queries: Queries using 'LIKE' starting with percent sign '%' are not aided by indexes because index keys start with a specific character or numeric value.
. Range Queries: Clustered indexes are an excellent choice for this type of query because the index pages are physically sorted in sequential order.
. Covered Queries: You should keep clustered index keys as small as possible because non-clustered indexes use the clustered index key as their bookmark. If you need a wide index for a covered query, create a non-clustered index as covering index.

Conclusion

Indexes are structured to facilitate the rapid return of result sets. The indexes should be created keeping in mind the type of data, the number of records, the number of columns and especially the queries which will be used generally to fetch data from such tables. Use of suitable type of indexes can really help us improve performance of our queries.


 Subscribe to Articles

     

Further Readings:

Responses

No response found. Be the first to respond this post

Post Comment

You must Sign In To post reply
Find More Articles on C#, ASP.Net, Vb.Net, SQL Server and more Here

Hall of Fame    Twitter   Terms of Service    Privacy Policy    Contact Us    Archives   Tell A Friend