死循环引起的连接池耗尽

用户反应系统响应慢,报连接池超过最大连接数错误,上去看服务器,CPU,内存都很正常。过一段时间系统又自动恢复。通过以下方法排查连接数:
sp_configure ‘user connections’
—-查看系统连接数:
SELECT d.name,b.login_name,COUNT(1) ccount,m.text  FROM
sys.dm_exec_connections a
JOIN sys.dm_exec_sessions b ON a.session_id = b.session_id –AND b.login_name IN('OA','brm')
JOIN sys.databases d ON b.database_id = d.database_id
cross apply  sys.dm_exec_sql_text(a.most_recent_sql_handle)m
GROUP BY d.name,b.login_name,m.text
ORDER BY d.name,b.login_name,ccount DESC

—-或者看明细

SELECT b.session_id,d.name,b.login_name,m.text,
a.client_net_address,a.connect_time,a.last_read,a.last_write
FROM sys.dm_exec_connections a
JOIN sys.dm_exec_sessions b ON a.session_id = b.session_id –AND b.login_name IN('OA','brm')
JOIN sys.databases d ON b.database_id = d.database_id
cross apply  sys.dm_exec_sql_text(a.most_recent_sql_handle)m
where client_net_address != '<local machine>'
order by a.last_read desc

 
发现排在第一位的连接数景有数千,意味着一个会话内发生了上千次的连接,这是很异常的,分析sql语句来源定位异常sql语句:
SELECT * FROM sys.dm_exec_sql_text(0x03000600B1C9EA687B3F2801CA9F00000100000000000000)
–看sql语句
SELECT TOP 50
qs.total_worker_time/qs.execution_count/1000. as [平均消耗CPU 时间(ms)],
total_worker_time/1000 AS [总消耗CPU 时间(ms)],execution_count [运行次数],
SUBSTRING(qt.text,qs.statement_start_offset/2+1,
(case when qs.statement_end_offset = -1
then DATALENGTH(qt.text)
else qs.statement_end_offset end -qs.statement_start_offset)/2 + 1)
as [查询语句], qt.text [所在存储过程],
qt.dbid, dbname=db_name(qt.dbid),
qt.objectid,object_name(qt.objectid,qt.dbid) ObjectName
FROM sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as qt
WHERE qs.last_execution_time >=’2012-01-10 10:00′ AND dbid = 6
AND qs.sql_handle = ’0x03000600B1C9EA687B3F2801CA9F00000100000000000000′
发现最大连接数为596,599?