이번에는 인덱스를 고르는 방법에 대한 글 입니다.
저도 아직 다 안 읽어봤는데... 역시 시간과 능력 되는 만큼만 번역해 놓겠습니다.
How to Select Indexes for Your SQL Server Tables |
by Manish Rawat
Indexing is one of the most crucial elements in increasing the performance of SQL Server. A well-written query will not show its effectiveness unless powered by an appropriate index or indexes on the table(s) used in a query, especially if the tables are large.
Indexes exist to make data selection faster, so the focus of this article is on ways you can select the best indexes for faster data retrieval. This is done in a two-step process. (인덱스는 데이터의 검색을 빠르게 한다. 그래서, 이 글에선 빠른 데이터 검색을 위한 가장 좋은 인덱스를 선택하는 방법에 촛점을 맞춘다. 이것은 두 단계로 이루어진다.)
Indexing can be quite a challenging task if you are not familiar with your databases, the relationships between tables, and how queries are written in stored procedures and embedded SQL. How many companies you have worked for have a proper ERD diagram of their databases and have followed the textbook method style of programming? In the real world, time is often limited, resulting in poor SQL Server database performance. If you have been tasked with optimizing a database's performance (at least to a respectable level), or you want to be proactive with your databases to prevent potential future performance issues, following these steps should help you in tuning tables, just as they have helped me. These steps are applicable at any stage of project, even if a deadline is just around the corner.
Step One (Gathering Information) Interact with the people who know about the database and its table structures. (데이터베이스와 그 데이터베이스의 테이블 구조에 대해 아는 사람들과 대화를 나눈다.) If you know it already, that’s great. This is very important and makes your life easier. (만약 이미 알고 있다면 좋은 일이다. 이것은 매우 중요하며, 작업을 쉽게 만들어 준다.)
1) Identify key tables, based on: (아래의 내용을 기반으로 KEY 테이블을 찾는다.)
2) Identify the most frequently called stored procedures/queries and list all of the tables used by them. (가장 자주 사용되는 저장 프로시저나 쿼리들을 찾아내고, 이 저장 프로시저나 쿼리에서 사용되는 테이블을 목록화한다.)
3) Get the SQL Profiler trace of :
It is advisable to write down information you collect in a document for later retrieval.
4) Before we dive into analyzing the information gathered, here are few things to keep in mind while tuning your tables:
Preferred SQL Server Index Types When you use Query Analyzer to produce a graphical execution plan, you will notice that there are several different ways SQL Server uses indexes. 1) Clustered Index Seek: A Clustered Index Seek uses the seeking ability of indexes to retrieve rows directly from a clustered index. In most cases, they provide the best performance on SELECT statements. In Query Analyzer, go to pubs database. Type following query: SELECT * FROM authors WHERE au_id LIKE'2%' Highlight the query and press. (Cntl + L) or highlight the query and press F5. You will see the following in the “Estimated Execution Plan” tab. Take a close look at the Argument section of the above illustration. Notice that the “UPKCL_auidind” clustered index is used to retrieve the data. 2) Index Seek: An Index Seek uses a non-clustered index to retrieve data, and in some ways, acts like a clustered index. This is because all of the data retrieved is fetched from the leaf layer of the non-clustered index, not from any of the data pages. You often see this behavior in a covering index. In Query Analyzer, go to pubs database and type following query: SELECT title_id, title FROM titles WHERE title LIKE 't%' Highlight the query and press. (Cntl + L) or highlight the query and press F5. You will see the following in the “Estimated Execution Plan” tab: In the Argument section in the above illustration, note that the “titleind” non-clustered index is used to retrieve the data. 3) Bookmark Lookup: A Bookmark Lookup uses a non-clustered index to select the data. It starts with an index seek in the leaf nodes of the non-clustered index to identify the location of the data from the data pages, then it retrieves the necessary data directly from the data pages. Leaf nodes of non-clustered indexes contain row locator that point to the actual data on data pages. In Query Analyzer, go to the pubs database. Type following query: SELECT * FROM titles WHERE title LIKE 't%' Highlight the query and press. (Cntl + L) or highlight the query and press F5. You will see the following in the “Estimated Execution Plan” tab. In the Argument section of the Index Seek, notice that the "titlecind" non-clustered index is used, but once the data pages are identified from looking them up in the leaf pages of the non-clustered index, then a Bookmark Lookup must be performed. Again, a Bookmark Lookup is when the Query Optimizer has to lookup the data from the data pages in order to retrieve it. In the Argument section of the Bookmark Lookup, note that a Bookmark Lookup called "Bmk1000" is used. This name is assigned automatically by the Query Optimizer. 4) Scans: Scans (Table scans, Index scan, and Clustered Index scans) are usually bad unless the table has very few rows and the Query Optimizer determines that a table scan will outperform the use of an available index. Watch out for scans in your execution plans. In Query Analyzer, go to pubs database and type the following query: SELECT * FROM employee WHERE hire_date > '1992-08-01' Highlight the query and press. (Cntl + L) or highlight the query and press F5. You will see the following in the “Estimated Execution Plan” tab: Notice that in this case, a Clustered Index Scan was performed, which means that every row in the clustered index had to be examined to fulfill the requirements of the query. Now that we understand some of the basics of how to read Query Execution Plans, lets take a look at some additional information that you will find useful when analyzing queries for proper index use:
This query brings all SP’s having text “authors” in their code. Note that this query might bring extra SP’s, for example, if a stored procedure uses text in a comment.
Step Two: What to Do Once You Have Gathered the Necessary Information Actions for Key Tables For static tables (tables that rarely, if ever change), you can be liberal on the number of indexes that can be added. As mentioned earlier, too many indexes can degrade the performance of highly transactional tables, but this does not apply to tables whose data will not change. The only consideration possible could be disk space. Set all index fill factors on static tables to 100 in order to minimize disk I/O for even better performance. For highly transactional tables, try to limit the number of indexes. Always keep in mind that a non-clustered index contains the clustered index key. Because of this, limit the number of columns on your clustered index in order to keep their size small. Any index for busy transactional tables has to be highly justifiable. Choose the fill factor with caution (usually 80 to 90%) for these indexes in order to avoid potential page splitting. For tables used lot in stored procedures/embedded SQL, these tables play an important role in the total application lifetime as they are called most often. So they require special attention. What is important is look at how tables are being accessed in queries in order to eliminate scans and convert them into seeks. Watch the logical I/O used by "Set Statistics IO on" to help you determine which queries are accessing the most data. Less logical I/O is better than more. Choose clustered index with caution. Depending on how transactional the table is, choose a higher fill factor. For tables with index size greater then data size implies a lot of indexes, so review indexes and make sure their existence is useful and justified. For the Top 10 or 15 largest tables, keep this fact in mind when creating indexes for these types of tables, as their indexes will also be large. Also check to see if the tables are static or non-static, which is helpful information when deciding what columns need to be indexed. For the most frequently called Stored procedures/Embedded SQL, See the Query plans and Logical I/O page use. SQL Profiler Trace is a very good tool. It tracks calls getting executed in SQL Server at any given point of time, their execution time, I/O reads, user logins, executing SQL statement, etc. It can also be used as debugging tool. An analysis of a Profiler trace is important to identify slow running queries. You can set the duration to > 100ms to see queries which take more then 100 milliseconds to execute.
Using a Covering Index + Non-clustered Index Uses Clustered Index as a Row Locator Example 1 In the Orders table the NorthWind database, there currently is a non-clustered index on the ShippedDate column. Try running the following: SELECT ShippedDate, shipcity FROM orders WHERE ShippedDate > '8/6/1996' The query plan of statement produces a Clustered Index Scan. Now add the column shipcity to the non-clustered index on ShippedDate. CREATE INDEX [ShippedDate] ON [dbo].[Orders] ([ShippedDate], [ShipCity]) WITH DROP_EXISTING Now run the query again. This time, the query plan of statement produces an Index Seek. This magic happened because all fields (ShippedDate and ShipCity) in the SELECT and the WHERE clauses are part of an index. Example 2 In the Titles table of the Pubs database, check out the following execution plan for this query: SELECT title_id, title FROM titles WHERE title LIKE 't%' Notice that the execution plan shows an Index Seek, not a Bookmark Lookup (which is what you usually find with a non-clustered index). This is because the non-clustered index on the Title column contains a clustered index key Title_Id, and this SELECT has only Title_Id, Title in the SELECT and in the WHERE clause.
Now the Hard Part Starts By following the simple steps outlined here, you get useful pieces of information in no time that can help you improve the performance of your database. After you have data in front of you put it to work. These minor inputs will bring a majority of your results.
Published with express written permission of the author. Copyright 2002. |
'programming > MSSQL' 카테고리의 다른 글
[Tip]mdf파일 만으로 DB 복구하는 방법 (0) | 2003.11.12 |
---|---|
[튜닝] index... (0) | 2003.10.14 |
SQL Server에서 Blocking을 줄이기 위한 전략!! (4) | 2003.10.06 |