SQL Server troubleshooting – tempdb utilization

Goal

To obtain information about tempdb database from utilization perspective

My way of achieving it

Using some of the DMV/DMF that come with SQL Server and overall advises (scripts taken from different sources over internet that all refers to SQL Server DMVs in Action Better Queries with Dynamic Management Views by Ian W. Stirk)  from peers colleagues from internet customized to fit my needs  and my own way of troubleshooting. For me the most important thing while troubleshooting is to have the result of the investigation side by side with what it means since in most of the cases I am not dealing with this everyday and my memory about this topic is as good as the number of occurrences I had throughout my working experience. In simple terms for tempdb database as for other user databases we can find information about usage using specific DMVs and DMFs. The utilization information (tempdb) provided by SQL Server comes under the form of

  • present utilization
    • current space occupied by all the objects that uses tempdb – user objects, internal objects, version store, mixed extent
    • current space each session is currently using in tempdb because of running tasks
    • current space each session is currently using in tempdb because of row versioning
  • past utilization
    • used space in tempdb by all the sessions since they were established

The output of the script is ordered by the space still used and all the columns along the table gives me all the information that I need in order to decide the next steps or to say whom, when,  why in regards to tempdb utilization.

tempdbtroubleshooting

The script can be downloaded from here

SQL Server troubleshooting – filegroup warning alerts

Goal

To obtain information that will help me troubleshoot, solve these type of alerts

My way of achieving it

Using tsql and xp_cmdshell. From time to time and depending of the monitoring solution in place alerts related to filegroup for a database is something common throughout the day of the dba and I wanted to have something that will help me while solving those type of alerts. The script that I am using for solving these alerts doesn’t mean that we shouldn’t take further actions anymore is just an way to tell me if a deeper investigation or look should be made. The output of the script is below and the script requires the database name and the threshold for which you want the script to calculate the new values. In most of the cases the threshold is 80 but some implementations can configure the threshold to 90 which means that the alerts will come when the used percentage of the database is >=90.

These are the places that needs to be modified to reflect your database and your threshold

SET @dbname = ‘AdventureWorks’ — database name
SET @threshold = 50 — threshold

As you can see below

 

fg_warning

the script shows the current info and increases the maxsize for database files belonging to the PRIMARY filegroup with at least 2%. The main goal of the increase is to have a maxsize that will make the used space in percent to be smaller than threshold. When the comment for each database file specifies that the maxsize is bigger than the free space on that drive we need to have a look and possible to start the increase disk procedure for that location.

The script can be downloaded from here

SQL Server troubleshooting – logical disk free space and transaction log backup alerts

Goal

To obtain information that will help me troubleshoot, solve these type of alerts

My way of achieving it

Using tsql and xp_cmdshell. Sometimes when I have to troubleshoot why we ran out of free space on the drive where transaction log files are for our databases I start by checking what other files are stored in the same path and check which of them are subject to returning their free space (only transaction log files) to the OS in order to get rid of this alert. This is handy when for some reason one of the database tlog file grew because of an ad hoc query or because of the usual activity against the database and now it takes all the space. This doesn’t mean that we should not configure the drive hosting our tlog files with the right size it  only offer an workaround for creating space quick for other databases in order for these to grow if they need. Along with logical disk free space alerts most of the monitoring implementations are creating also backup alerts if the default location cannot accommodate the size of the backup which means that we have to find space somewhere else to take that backup and truncate that file to return the free space to the operating system.

Because in some cases I am connecting to the instance remotely the need for having something in one place without using along SSMS, remote disk management, windows explorer using administrative shares and so on becomes a must if the other tools cannot be used due to firewall configuration between the server on which we use SSMS to connect to that instance. The script output displays what we can do to free some space, information related to the log space usage for our database and how tlog file is configured  along with information about the places where the temporary backup can be stored. Due to the fact that someone might have configured log shipping for our database and the script is considering default path the one in which last week backups went the most there is no WARNING or logic saying that this database CANNOT be backed up to different locations that script provides. This means that I have to always check if the database is configured in a log shipping configuration and is a primary database anytime I am using the script.

The output of the script is similar with the output below depending of what you entered and the database options.

ldfsandtlogb

The script generates commands for backing up the tlog for our databases and shrinking the files. If shrink is not successful then display all open transactions in order to help you find the transaction for which the shrink cannot be done if you entered a path in the script.

The script is generating the commands for backing up tlog, showing you the tlog usage and its configuration together with places where you can take the backup if you entered a database name in the script.

The script can be downloaded from here

SQL Server troubleshooting – first steps in troubleshooting performance of some TSQL statements (part 3)

Goal

To obtain information about the job names that ran during that interval

My way of achieving it

By querying the history table and by displaying the information in the native format followed by the display using the rows as columns. The last way of displaying the information might have in some cases more meaning that the information that is displayed by default when we check the job history.

The output looks similar to this

JobsInfoPivot

The script can be downloaded from here

SQL Server troubleshooting – first steps in troubleshooting performance of some TSQL statements (part 2)

Goal

To obtain the nested list of TSQL statements (SELECT) from the most inner one to the outer one

My way of achieving it

Using the approach explained here for parsing parenthesis expressions.  As you will about to see I am using his unchanged code in my script together with my own additions to extract the select statements and to check for open parenthesis which are not closed. The existence of any open close parenthesis in the comments will break the cast to XML of the code that we want to troubleshoot.  That is why along the code that parses parenthesis expressions I added some logic to remove the comments. After some tests I realized that I need an workaround or better said a real solution for comments removal from the code and the solution came under the form of ScriptDom namespace which provides Microsoft SQL Server-specific Parser and Script Generator.  I left in the tsql script the logic for removing comments but if after execution of the script you still receive errors like

Msg 9400, Level 16, State 1, Line 387
XML parsing: line 227, character 8, unexpected end of input

Then please use the powershell script to clean the code  from comments and once you have it paste it to the tsql script for a successful cast to XML.

The output of the script , below is an example, shows in messages the list of statements from the most inner one to the outer one in an easy format that can be taken and executed step by step and we can see for each the execution plans and statistics and how long it takes and we can modify them to aim for some improvement. Of course that this might look like trivial for an experienced dba but I believe is a great way to troubleshoot in the beginning after none of the steps explained here solved the issue. This approach has its benefits as  other approaches have theirs and there are times when we have to use one or another.

innertooutter

The scripts can be downloaded from:

    • tsql script – has limited logic for comments removal
  • ps script to remove comments – in case we still receive errors when executing the tsql script
    • the powershell script above requires 2 dll files in order to work, Microsoft.Data.Schema.ScriptDom.dll, Microsoft.Data.Schema.ScriptDom.Sql.dll

SQL Server troubleshooting – first steps in troubleshooting performance of someTSQL statements

Goal

Road map of steps that I can use to troubleshoot the performance of a TSQL statement or a SP and keep track of the things discovered on the way for future reference

My way of achieving it

Using scripts that I gathered or taken from what other people shared over time over the internet and by scripts that I created.

first steps troubleshootingV1

1. The script can be found in this book, SQL Server DMVs in Action by Ian W. Stirk right under — Listing 3.6 Identifying indexes used by a given routine
2. The script can be downloaded from here
3. The script can be downloaded from here
4. The script can be downloaded from here
5. The script can be downloaded from here
6. Check the execution plan of the routine in XML format and inside search after PARAMETERCOMPILEDVALUE. Check if the values from there are identical with the values that the application is using or the job.

In most of the cases the index optimization or updating the statistics will help us and maybe will solve the issue but in other cases as the post stated the above steps are just the beginning of troubleshooting.

SQL Server troubleshooting – what happened with my instance

Goal

Obtaining information about what happened during my last restart of SQL Server instance and if this was a normal restart or an unexpected restart

My way of achieving it

Using SSMS, xp_cmdshell and Wevtutil  we can achieve the goal and we can overcome some of the challenges we face in case we are after the same information using some of the other approaches. I have chosen to not use  SQL Server Error logs because of the

– cycling of the logs during each SQL server instance restart or by using the sp_cycle_errorlog in our environments

– information inside the logs doesn’t contain information outside of SQL Server

I want to mention that the execution time of the script is influenced by the information captured and kept in the windows application logs and the reason for which I decided to display by default 1 sql server restart times.

In case of a clustered instance we have to provide the credentials that will be used to ran the Wevutil remotely and another mention as important as first one is that the script is working only for 2 nodes cluster and in case we would like to run it on a clustered instance that has more than 2 nodes then we need to tweak the script a little bit.

The output of the script is shown below

irmi

We can use the information to check what happened during last restart and see what types of errors , critical, warnings appeared in the last 60 minutes before the last restart in the system and application log.

The script can be downloaded from here

SQL Server troubleshooting – what happened with my database

Goal

Obtaining in the same SSMS window, information about why a database might have been unavailable or having its status changed.

My way of achieving it

Almost all of the monitoring deployed solutions these days raise tickets every time a database becomes unavailable and from the DBA operational point of view this means that we must connect and check what happened with the database and take the required actions. Of course that depending of the monitoring implementation some tickets will be closed automatically if during the second check the database becomes online but this doesn’t mean that we don’t need to try to see why the database status changed. Although we can do our investigation using GUI tools that SQL Server provides this approach has some limitations that the script used here tries to overcome. Below are some of the limitations:

– the account we use to connect to the server where SQL Server instance is running might not have rights to access the location where default trace files are stored which will make almost impossible the usage of SQL Server Profiler

– filtering or searching of SQL Server Error log files using SQL Server Log File Viewer was not designed to search or filter after multiple strings in the same time which makes the filtering or searching of the logs after string a and after string b impossible.

Because of this and other limitations I turned my attention and I tried to find other ways of searching and filtering SQL Server Error Log files and default trace and display the required information in only one window.

The output of the script in some cases will provide us enough information to see what happened while in other cases might give us only the name of the logins that were performing activities during that time.

dbstatus

The main benefit of this approach is that we can have in one window the information pertaining to that database from SQL Server Error log files and the default trace.

The script can be downloaded from here

SQL Server troubleshooting using performance counters

Goal

Sometimes when I need to troubleshoot one local or remote SQL Server instance I need to have information from inside SQL Server instance but also information outside of  it in order to start my investigation or take some conclusions. Most of the time this is a very time consuming operation and not always straight forward hence I the need to have something that I can re-utilize over and over again in these situations.

My way of achieving it

After searching and seeing what other people are doing when it comes to this I decided that I can combine some of the ideas found and put my own ideas in one script that I can use to have access to performance counters outside of  SQL Server but obtainable from SQL Server. I already mentioned in another post and I would like to mention it again that every script that I will post might have flaws or shortcomings and should be perceived as such. More than that the scripts can be considered the result of collective effort of different people  from the internet since I am taking and using what they were sharing over the internet.

The output of the script provides this type of information but the script can be modified to return the kind of information that you would like to have and use.  For me is important when troubleshooting a SQL Server instance to know:

– the processor utilization

– available memory

– disk utilization for the drives where sql server has files

– network utilization

Counter    Value
“\\WIN-666BDQE0KVL\Memory\Commit Limit”    4292546560
“\\WIN-666BDQE0KVL\Memory\Available MBytes”    1511
“\\WIN-666BDQE0KVL\Network Interface(Intel[R] PRO_1000 MT Network Connection)\Output Queue Length”    0
“\\WIN-666BDQE0KVL\Network Interface(isatap.{F5634C4F-D7A9-4921-924B-C112B6BC5377})\Output Queue Length”    0
“\\WIN-666BDQE0KVL\Network Interface(Local Area Connection* 11)\Output Queue Length”    0
“\\WIN-666BDQE0KVL\Network Interface(Intel[R] PRO_1000 MT Network Connection)\Bytes Total/sec”    0
“\\WIN-666BDQE0KVL\Network Interface(isatap.{F5634C4F-D7A9-4921-924B-C112B6BC5377})\Bytes Total/sec”    0
“\\WIN-666BDQE0KVL\Network Interface(Local Area Connection* 11)\Bytes Total/sec”    0
“\\WIN-666BDQE0KVL\Processor(_Total)\% User Time”    -1
“\\WIN-666BDQE0KVL\Processor(_Total)\% Privileged Time”    0
“\\WIN-666BDQE0KVL\Processor(_Total)\% Processor Time”    0
“\\WIN-666BDQE0KVL\LogicalDisk(C:)\Current Disk Queue Length”    0
“\\WIN-666BDQE0KVL\LogicalDisk(G:)\Current Disk Queue Length”    0
“\\WIN-666BDQE0KVL\Process(_Total)\Page File Bytes”    2915581952
“\\WIN-666BDQE0KVL\Process(sqlservr#3)\Page File Bytes”    434040832
“\\WIN-666BDQE0KVL\Process(sqlservr#3)\% User Time”    0
“\\WIN-666BDQE0KVL\Process(sqlservr#3)\% Privileged Time”    0
“\\WIN-666BDQE0KVL\Process(sqlservr#3)\% Processor Time”    0

typeperf

I will not explain here how the above output should be used since the goal of the post was only to provide a method to obtain performance counters outside of  SQL Server

The script can be downloaded from here