how to analysis – 享受代码

When looking at the wait statistics being tracked by SQL Server, it's important that these
wait types are eliminated from the analysis, allowing the more problematic waits in the
system to be identified. One of the things I do as a part of tracking wait information is to
maintain a script that filters out the non-problematic wait types, as shown in Listing 1.2.'

Troubleshooting SQL Server
A Guide for the Accidental DBA

--1,wait types,if for i/o,then sys.dm_io_virtual_file_stats
--2,sqlserver perfmon counts view

SELECT TOP 10
wait_type ,
max_wait_time_ms wait_time_ms ,
signal_wait_time_ms ,
wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms ,
100.0 * wait_time_ms / SUM(wait_time_ms) OVER ( )
AS percent_total_waits ,
100.0 * signal_wait_time_ms / NULLIF(SUM(signal_wait_time_ms) OVER ( ),0)
AS percent_total_signal_waits ,
100.0 * ( wait_time_ms - signal_wait_time_ms )
/ SUM(wait_time_ms) OVER ( ) AS percent_total_resource_waits
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0 -- remove zero wait_time
AND wait_type NOT IN -- filter out additional irrelevant waits
( 'SLEEP_TASK', 'BROKER_TASK_STOP', 'BROKER_TO_FLUSH',
'SQLTRACE_BUFFER_FLUSH','CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT',
'LAZYWRITER_SLEEP', 'SLEEP_SYSTEMTASK', 'SLEEP_BPOOL_FLUSH',
'BROKER_EVENTHANDLER', 'XE_DISPATCHER_WAIT', 'FT_IFTSHC_MUTEX',
'CHECKPOINT_QUEUE', 'FT_IFTS_SCHEDULER_IDLE_WAIT',
'BROKER_TRANSMITTER', 'FT_IFTSHC_MUTEX', 'KSOURCE_WAKEUP',
'LAZYWRITER_SLEEP', 'LOGMGR_QUEUE', 'ONDEMAND_TASK_QUEUE',
'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT', 'BAD_PAGE_PROCESS',
'DBMIRROR_EVENTS_QUEUE', 'BROKER_RECEIVE_WAITFOR',
'PREEMPTIVE_OS_GETPROCADDRESS', 'PREEMPTIVE_OS_AUTHENTICATIONOPS',
'WAITFOR', 'DISPATCHER_QUEUE_SEMAPHORE', 'XE_DISPATCHER_JOIN',
'RESOURCE_QUEUE' )
ORDER BY wait_time_ms DESC


CXPACKET
Often indicates nothing more than that certain queries are executing with parallelism;
CXPACKET waits in the server are not an immediate sign of problems

SOS_SCHEDULER_YIELD
This may indicate that the server is under CPU press

THREADPOOL
requiring an increase in the number of
CPUs in the server, to handle a highly concurrent workload, or it can be a sign of
blocking


LCK_*
These wait types signify that blocking is occurring in the system and that sessions
have had to wait to acquire a lock of a specific typ
This problem can be investigated further using the information
in the sys.dm_db_index_operational_stats


PAGEIOLATCH_*, IO_COMPLETION, WRITELOG
These waits are commonly associated with disk I/O bottlenecks, though the root
cause of the problem may be,PAGEIOLATCH_* waits are
specifically associated with delays in being able to read or write data from the database
files. WRITELOG waits are related to issues with writing to log files.
These waits
should be evaluated in conjunction with the virtual file statistics as well as Physical
Disk performance counters


PAGELATCH_*
A lot of times
PAGELATCH_* waits are associated with allocation contention issues. One of
the best-known allocations issues associated with PAGELATCH_* waits occurs in
tempdb when the a large number of objects are being created and destroyed in
tempdb and the system experiences contention on the Shared Global Allocation
Map (SGAM), Global Allocation Map (GAM), and Page Free Space (PFS) pages in the
tempdb database.




LATCH_*
Determining the
specific latch class that has the most accumulated wait time associated with it can
be found by querying the sys.dm_os_latch_stats DMV.

ASYNC_NETWORK_IO
This wait is often incorrectly attributed to a network bottleneck,In fact, the most
common cause of this wait is a client application that is performing row-by-row
processing of the data being streamed from SQL Server as a result set (client accepts
one row, processes, accepts next row, and so on). Correcting this wait type generally
requires changing the client-side code so that it reads the result set as fast as possible,
and then performs processing



After fixing any problem in the server, in order to validate that the problem has indeed
been fixed, the wait statistics being tracked by the server can be reset using the code in
Listing 1.3.

DBCC SQLPERF('sys.dm_os_wait_stats', clear)


--------------disk I/O bottleneck.-----------
will provide cumulative physical I/O statistics, the number of reads
and writes on each data file, and the number of reads and writes on each log file, for
the various databases in the instance, from which can be calculated the ratio of reads to
writes. This also shows the number of I/O stalls and the stall time associated with the
requests, which is the total amount of time sessions have waited for I/O to be completed
on the file.
whether heavy-read or heavy-write, and at the average latency associated with the I/O, as this will
direct further investigation and possible solutions.


SELECT DB_NAME(vfs.database_id) AS database_name ,

vfs.database_id ,
vfs.FILE_ID ,
io_stall_read_ms / NULLIF(num_of_reads, 0) AS avg_read_latency ,
io_stall_write_ms / NULLIF(num_of_writes, 0)
AS avg_write_latency ,
io_stall / NULLIF(num_of_reads + num_of_writes, 0)
AS avg_total_latency ,
num_of_bytes_read / NULLIF(num_of_reads, 0)
AS avg_bytes_per_read ,
num_of_bytes_written / NULLIF(num_of_writes, 0)
AS avg_bytes_per_write ,
vfs.io_stall ,
vfs.num_of_reads ,
vfs.num_of_bytes_read ,
vfs.io_stall_read_ms ,
vfs.num_of_writes ,
vfs.num_of_bytes_written ,
vfs.io_stall_write_ms ,
size_on_disk_bytes / 1024 / 1024. AS size_on_disk_mbytes ,
physical_name
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf ON vfs.database_id = mf.database_id
AND vfs.FILE_ID = mf.FILE_ID
ORDER BY avg_total_latency DESC



SELECT * FROM sys.dm_os_performance_counters

DECLARE @CounterPrefix NVARCHAR(30)
SET @CounterPrefix = CASE WHEN @@SERVICENAME = 'MSSQLSERVER'
THEN 'SQLServer:'
ELSE 'MSSQL$' + @@SERVICENAME + ':'
END ;
-- Capture the first counter set
SELECT CAST(1 AS INT) AS collection_instance ,
[OBJECT_NAME] ,
counter_name ,
instance_name ,
cntr_value ,
cntr_type ,
CURRENT_TIMESTAMP AS collection_time
INTO #perf_counters_init
FROM sys.dm_os_performance_counters
WHERE ( OBJECT_NAME = @CounterPrefix + 'Access Methods'
AND counter_name = 'Full Scans/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Access Methods'
AND counter_name = 'Index Searches/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Buffer Manager'
AND counter_name = 'Lazy Writes/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Buffer Manager'
AND counter_name = 'Page life expectancy'
)
OR ( OBJECT_NAME = @CounterPrefix + 'General Statistics'
AND counter_name = 'Processes Blocked'
)
OR ( OBJECT_NAME = @CounterPrefix + 'General Statistics'
AND counter_name = 'User Connections'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Locks'
AND counter_name = 'Lock Waits/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Locks'
AND counter_name = 'Lock Wait Time (ms)'
)

OR ( OBJECT_NAME = @CounterPrefix + 'SQL Statistics'
AND counter_name = 'SQL Re-Compilations/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Memory Manager'
AND counter_name = 'Memory Grants Pending'
)
OR ( OBJECT_NAME = @CounterPrefix + 'SQL Statistics'
AND counter_name = 'Batch Requests/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'SQL Statistics'
AND counter_name = 'SQL Compilations/sec'
)
-- Wait on Second between data collection
WAITFOR DELAY '00:00:01'
-- Capture the second counter set
SELECT CAST(2 AS INT) AS collection_instance ,
OBJECT_NAME ,
counter_name ,
instance_name ,
cntr_value ,
cntr_type ,
CURRENT_TIMESTAMP AS collection_time
INTO #perf_counters_second
FROM sys.dm_os_performance_counters
WHERE ( OBJECT_NAME = @CounterPrefix + 'Access Methods'
AND counter_name = 'Full Scans/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Access Methods'
AND counter_name = 'Index Searches/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Buffer Manager'
AND counter_name = 'Lazy Writes/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Buffer Manager'
AND counter_name = 'Page life expectancy'
)
OR ( OBJECT_NAME = @CounterPrefix + 'General Statistics'
AND counter_name = 'Processes Blocked'
)
OR ( OBJECT_NAME = @CounterPrefix + 'General Statistics'
AND counter_name = 'User Connections'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Locks'
AND counter_name = 'Lock Waits/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Locks'
AND counter_name = 'Lock Wait Time (ms)'
)
OR ( OBJECT_NAME = @CounterPrefix + 'SQL Statistics'
AND counter_name = 'SQL Re-Compilations/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'Memory Manager'
AND counter_name = 'Memory Grants Pending'
)
OR ( OBJECT_NAME = @CounterPrefix + 'SQL Statistics'
AND counter_name = 'Batch Requests/sec'
)
OR ( OBJECT_NAME = @CounterPrefix + 'SQL Statistics'
AND counter_name = 'SQL Compilations/sec'
)
-- Calculate the cumulative counter values
SELECT i.OBJECT_NAME ,
i.counter_name ,
i.instance_name ,
CASE WHEN i.cntr_type = 272696576
THEN s.cntr_value - i.cntr_value
WHEN i.cntr_type = 65792 THEN s.cntr_value
END AS cntr_value
FROM #perf_counters_init AS i
JOIN #perf_counters_second AS s
ON i.collection_instance + 1 = s.collection_instance
AND i.OBJECT_NAME = s.OBJECT_NAME
AND i.counter_name = s.counter_name
AND i.instance_name = s.instance_name
ORDER BY OBJECT_NAME
-- Cleanup tables
DROP TABLE #perf_counters_init
DROP TABLE #perf_counters_second


The two Access Methods counters provide information about the ways that tables
are being accessed in the database. The most important one is the Full Scans/sec
counter, which can give us an idea of the number of index and table scans that are
occurring in the SYSTEM

In general, I want the number
of Index Searches/sec to be higher than the number of Full Scans/sec by a factor
of 800–1000. If the number of Full Scans/sec is too high, refer to Chapter 5, Missing
Indexes to determine if there are missing indexes in the database, resulting in excess
I/O operations.

Page Life Expectancy (PLE) which is the number of seconds a page will remain
in the data cache
the question VALUE <= (max server memory/4)*300s,
Writes/sec, which are page flushes from the buffer cache outside of the normal CHECKPOINT
process, then the server is most likely experiencing data cache memory pressure,
which will also increase the disk I/O being performed by the SQL SERVER,At this point
the Access Methods counters should be investigated to determine if excessive table or
index scans are being performed on the SQL SERVER


The General Statistics\Processes Blocked, Locks\Lock Waits/sec,
If these counters return a value other
than zero, over repeated collections of the data, then blocking is actively occurring in one
of the databases , Blocking should be used to
troubleshoot the problems further
 sp_configure ('show advanced options')
 
 The higher the number of SQL Compilations/
sec in relation to the Batch Requests/sec, the more likely the SQL Server is
experiencing an ad hoc workload that is not making optimal using of plan caching. The
higher the number of SQL Re-Compilations/sec in relation to the Batch Requests/
sec, the more likely it is that there is an inefficiency in the code design that is forcing
a recompile of the code being executed in the SQL Server. In either case, investigation
of the Plan Cache
 show advanced options
 
 The Memory Manager\Memory Grants Pending performance counter provides
information about the number of processes waiting on a workspace memory grant in
the instance.If this counter has a high value,there may be query inefficiencies in the instance that are causing excessive
memory grant requirements, for example, large sorts or hashes that can be resolved by
tuning the indexing or queries being executed