sql-server – 堆上的非集群索引与集群索引的性能

This 2007 White Paper将组织为聚簇索引的表上的各个select / insert / delete / update和range select语句的性能与组织为具有与CI表相同的键列上的具有非聚簇索引的堆的表进行比较。

通常,集群索引选项在测试中表现更好,因为只有一个结构要维护,因为不需要书签查找。

本文未涉及的一个潜在有趣的情况是将堆上的非聚簇索引与聚簇索引上的非聚簇索引进行比较。在这种情况下,我预计堆可能甚至在NCI叶级别更好地执行SQ​​L Server具有直接跟踪RID而不是需要遍历聚簇索引。

有谁知道在这方面进行过类似的正式测试,结果如何?

要检查您的请求,我创建了以下方案的2个表:

> 790万条记录,表示余额信息。
>一个身份字段从1到790万
>一个数字字段将记录分组在大约500k组。

名为heap的第一个表在字段组中获得非聚集索引。第二个称为“聚类”的表在一个名为key的顺序字段上得到一个聚集索引,在该字段组中有一个非聚簇索引

测试在具有2个超线程内核,4Gb内存和64位Windows 7的I5 M540处理器上运行。

Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64) 
Apr  2 2010 15:48:46 
Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)  

Update on 9 Mar 2011: I did a second more extensive benchmark by running the following .net code and logging Duration, CPU, Reads, Writes and RowCounts in Sql Server Profiler. (The CommandText used will be mentioned in the results.)

NOTE: CPU and Duration are expressed in milliseconds

  • 1000 queries
  • zero CPU queries are eliminated from the results
  • 0 rows affected are eliminated from the results
int[] idList = new int[] { 6816588, 7086702, 6498815 ... }; // 1000 values here.
using (var conn = new SqlConnection(@"Data Source=myserver;Initial Catalog=mydb;Integrated Security=SSPI;"))
            {
                conn.Open();
                using (var cmd = new SqlCommand())
                {
                    cmd.Connection = conn;
                    cmd.CommandType = CommandType.Text;
                    cmd.CommandText = "select * from heap where common_key between @id and @id+1000"; 
                    cmd.Parameters.Add("@id", SqlDbType.Int);
                    cmd.Prepare();
                    foreach (int id in idList)
                    {
                        cmd.Parameters[0].Value = id;

                        using (var reader = cmd.ExecuteReader())
                        {
                            int count = 0;
                            while (reader.Read())
                            {
                                count++;
                            }
                            Console.WriteLine(String.Format("key: {0} => {1} rows", id, count));
                        }
                    }
                }
            }

End of Update on 9 Mar 2011.

SELECT性能

要检查性能号码,我在堆表上执行了以下查询一次,并在集群表上执行了一次:

select * from heap/clust where group between 5678910 and 5679410
select * from heap/clust where group between 6234567 and 6234967
select * from heap/clust where group between 6455429 and 6455729
select * from heap/clust where group between 6655429 and 6655729
select * from heap/clust where group between 6955429 and 6955729
select * from heap/clust where group between 7195542 and 7155729

该基准测试的结果是堆:

rows  reads CPU   Elapsed 
----- ----- ----- --------
1503  1510  31ms  309ms
401   405   15ms  283ms
2700  2709  0ms   472ms
0     3     0ms   30ms
2953  2962  32ms  257ms
0     0     0ms   0ms

Update on 9 Mar 2011:
cmd.CommandText = "select * from heap where group between @id and @id+1000";

  • 721 Rows have > 0 CPU and affect more than 0 rows
Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1001      69788    6368         -         
Cpu            15        374      37   0.00754
Reads        1069      91459    7682   1.20155
Writes          0          0       0   0.00000
Duration   0.3716   282.4850 10.3672   0.00180

End of Update on 9 Mar 2011.

对于表群集,结果是:

rows  reads CPU   Elapsed 
----- ----- ----- --------
1503  4827  31ms  327ms
401   1241  0ms   242ms
2700  8372  0ms   410ms
0     3     0ms   0ms
2953  9060  47ms  213ms
0     0     0ms   0ms

Update on 9 Mar 2011:
cmd.CommandText = "select * from clust where group between @id and @id+1000";

  • 721 Rows have > 0 CPU and affect more than 0 rows
Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1001      69788    6056         -
Cpu            15        468      38   0.00782
Reads        3194     227018   20457   3.37618
Writes          0          0       0       0.0
Duration   0.3949   159.6223 11.5699   0.00214

End of Update on 9 Mar 2011.

SELECT WITH JOIN表现

cmd.CommandText =“select * from heap / clust h join keys k on h.group = k.group where hGroup between @id and @id 1000”;

该基准测试的结果是堆:

873行有> 0 CPU并影响超过0行

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1009       4170    1683         -
Cpu            15         47      18   0.01175
Reads        2145       5518    2867   1.79246
Writes          0          0       0   0.00000
Duration   0.8215   131.9583  1.9095   0.00123

此基准测试的结果是针对集群的:

865行有> 0 CPU并影响超过0行

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1000       4143    1685         -
Cpu            15         47      18   0.01193
Reads        5320      18690    8237   4.97813
Writes          0          0       0   0.00000
Duration   0.9699    20.3217  1.7934   0.00109

UPDATE性能

第二批查询是更新语句:

update heap/clust set amount = amount + 0 where group between 5678910 and 5679410
update heap/clust set amount = amount + 0 where group between 6234567 and 6234967
update heap/clust set amount = amount + 0 where group between 6455429 and 6455729
update heap/clust set amount = amount + 0 where group between 6655429 and 6655729
update heap/clust set amount = amount + 0 where group between 6955429 and 6955729
update heap/clust set amount = amount + 0 where group between 7195542 and 7155729

这个基准的堆结果:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
1503  3013  31ms  175ms
401   806   0ms   22ms
2700  5409  47ms  100ms
0     3     0ms   0ms
2953  5915  31ms  88ms
0     0     0ms   0ms

Update on 9 Mar 2011:
cmd.CommandText = "update heap set amount = amount + @id where group between @id and @id+1000";

  • 811 Rows have > 0 CPU and affect more than 0 rows
Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1001      69788    5598       811         
Cpu            15        873      56   0.01199
Reads        2080     167593   11809   2.11217
Writes          0       1687     121   0.02170
Duration   0.6705   514.5347 17.2041   0.00344

End of Update on 9 Mar 2011.

该基准测试的结果为:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
1503  9126  16ms  35ms
401   2444  0ms   4ms
2700  16385 31ms  54ms
0     3     0ms   0ms 
2953  17919 31ms  35ms
0     0     0ms   0ms

Update on 9 Mar 2011:
cmd.CommandText = "update clust set amount = amount + @id where group between @id and @id+1000";

  • 853 Rows have > 0 CPU and affect more than 0 rows
Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1001      69788    5420         -
Cpu            15        594      50   0.01073
Reads        6226     432237   33597   6.20450
Writes          0       1730     110   0.01971
Duration   0.9134   193.7685  8.2919   0.00155

End of Update on 9 Mar 2011.

删除基准

我运行的第三批查询是delete语句

delete heap/clust where group between 5678910 and 5679410
delete heap/clust where group between 6234567 and 6234967
delete heap/clust where group between 6455429 and 6455729
delete heap/clust where group between 6655429 and 6655729
delete heap/clust where group between 6955429 and 6955729
delete heap/clust where group between 7195542 and 7155729

这个基准的堆的结果:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
1503  10630 62ms  179ms
401   2838  0ms   26ms
2700  19077 47ms  87ms
0     4     0ms   0ms
2953  20865 62ms  196ms
0     4     0ms   9ms

Update on 9 Mar 2011:
cmd.CommandText = "delete heap where group between @id and @id+1000";

  • 724 Rows have > 0 CPU and affect more than 0 rows
Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts     192      69788    4781         -
Cpu            15        499      45   0.01247
Reads         841     307958   20987   4.37880
Writes          2       1819     127   0.02648
Duration   0.3775  1534.3383 17.2412   0.00349

End of Update on 9 Mar 2011.

此次基准测试的结果为:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
1503  9228  16ms  55ms
401   3681  0ms   50ms
2700  24644 46ms  79ms
0     3     0ms   0ms
2953  26955 47ms  92ms
0     3     0ms   0ms

Update on 9 Mar 2011:

cmd.CommandText = "delete clust where group between @id and @id+1000";

  • 751 Rows have > 0 CPU and affect more than 0 rows
Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts     144      69788    4648         -
Cpu            15        764      56   0.01538
Reads         989     458467   30207   6.48490
Writes          2       1830     127   0.02694
Duration   0.2938  2512.1968 24.3714   0.00555

End of Update on 9 Mar 2011.

INSERT基准

基准测试的最后一部分是执行insert语句。

插入到heap / clust(…)
 值(…),
        (…),
        (…),
        (…),
        (…),
        (……)

这个基准的堆的结果:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
6     38    0ms   31ms

Update on 9 Mar 2011:

string str = @"insert into heap (group, currency, year, period, domain_id, mtdAmount, mtdAmount, ytdAmount, amount, ytd_restated, restated, auditDate, auditUser)
                    values";

                    for (int x = 0; x < 999; x++)
                    {
                        str += string.Format(@"(@id + {0}, 'EUR', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test'),  ", x);
                    }
                    str += string.Format(@"(@id, 'CAD', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test') ", 1000);

                    cmd.CommandText = str;
  • 912 statements have > 0 CPU
Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1000       1000    1000         -
Cpu            15       2138      25   0.02500
Reads        5212       7069    6328   6.32837
Writes         16         34      22   0.02222
Duration   1.6336   293.2132  4.4009   0.00440

End of Update on 9 Mar 2011.

这个基准测试的结果是:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
6     50    0ms   18ms

Update on 9 Mar 2011:

string str = @"insert into clust (group, currency, year, period, domain_id, mtdAmount, mtdAmount, ytdAmount, amount, ytd_restated, restated, auditDate, auditUser)
                    values";

                    for (int x = 0; x < 999; x++)
                    {
                        str += string.Format(@"(@id + {0}, 'EUR', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test'),  ", x);
                    }
                    str += string.Format(@"(@id, 'CAD', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test') ", 1000);

                    cmd.CommandText = str;
  • 946 statements have > 0 CPU
Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1000       1000    1000         -      
Cpu            15       2403      21   0.02157
Reads        6810       8997    8412   8.41223
Writes         16         25      19   0.01942
Duration   1.5375   268.2571  6.1463   0.00614

End of Update on 9 Mar 2011.

结论

虽然使用集群&访问表进行更多的逻辑读取。非聚簇索引(使用非聚簇索引时)性能结果如下:

> SELECT语句是可比的
> UPDATE语句更快,聚集索引就位
> DELETE语句更快,聚集索引就位
>使用聚集索引的INSERT语句更快

当然,我的基准测试对于特定类型的表和非常有限的查询非常有限,但是我认为基于这些信息,我们可以开始说,在表上创建聚簇索引实际上总是更好。

Update on 9 Mar 2011:

从附加结果可以看出,有限测试的结论在每种情况下都不正确。

现在的结果表明,从集群索引中获益的唯一语句是update语句。其他语句在具有聚簇索引的表上慢了约30%。

一些额外的图表,我绘制了堆对群集的查询的加权持续时间。

您可以看到insert语句的性能配置文件非常有趣。尖峰是由几个数据点造成的,这些数据点需要更长的时间才能完成。

End of Update on 9 Mar 2011.

翻译自:https://stackoverflow.com/questions/5094400/performance-of-non-clustered-indexes-on-heaps-vs-clustered-indexes

转载注明原文:sql-server – 堆上的非集群索引与集群索引的性能