SQL Server troubleshooting – tempdb utilization

Goal

To obtain information about tempdb database from utilization perspective

My way of achieving it

Using some of the DMV/DMF that come with SQL Server and overall advises (scripts taken from different sources over internet that all refers to SQL Server DMVs in Action Better Queries with Dynamic Management Views by Ian W. Stirk)  from peers colleagues from internet customized to fit my needs  and my own way of troubleshooting. For me the most important thing while troubleshooting is to have the result of the investigation side by side with what it means since in most of the cases I am not dealing with this everyday and my memory about this topic is as good as the number of occurrences I had throughout my working experience. In simple terms for tempdb database as for other user databases we can find information about usage using specific DMVs and DMFs. The utilization information (tempdb) provided by SQL Server comes under the form of

  • present utilization
    • current space occupied by all the objects that uses tempdb – user objects, internal objects, version store, mixed extent
    • current space each session is currently using in tempdb because of running tasks
    • current space each session is currently using in tempdb because of row versioning
  • past utilization
    • used space in tempdb by all the sessions since they were established

The output of the script is ordered by the space still used and all the columns along the table gives me all the information that I need in order to decide the next steps or to say whom, when,  why in regards to tempdb utilization.

tempdbtroubleshooting

The script can be downloaded from here

SQL Server troubleshooting – logical disk free space and transaction log backup alerts

Goal

To obtain information that will help me troubleshoot, solve these type of alerts

My way of achieving it

Using tsql and xp_cmdshell. Sometimes when I have to troubleshoot why we ran out of free space on the drive where transaction log files are for our databases I start by checking what other files are stored in the same path and check which of them are subject to returning their free space (only transaction log files) to the OS in order to get rid of this alert. This is handy when for some reason one of the database tlog file grew because of an ad hoc query or because of the usual activity against the database and now it takes all the space. This doesn’t mean that we should not configure the drive hosting our tlog files with the right size it  only offer an workaround for creating space quick for other databases in order for these to grow if they need. Along with logical disk free space alerts most of the monitoring implementations are creating also backup alerts if the default location cannot accommodate the size of the backup which means that we have to find space somewhere else to take that backup and truncate that file to return the free space to the operating system.

Because in some cases I am connecting to the instance remotely the need for having something in one place without using along SSMS, remote disk management, windows explorer using administrative shares and so on becomes a must if the other tools cannot be used due to firewall configuration between the server on which we use SSMS to connect to that instance. The script output displays what we can do to free some space, information related to the log space usage for our database and how tlog file is configured  along with information about the places where the temporary backup can be stored. Due to the fact that someone might have configured log shipping for our database and the script is considering default path the one in which last week backups went the most there is no WARNING or logic saying that this database CANNOT be backed up to different locations that script provides. This means that I have to always check if the database is configured in a log shipping configuration and is a primary database anytime I am using the script.

The output of the script is similar with the output below depending of what you entered and the database options.

ldfsandtlogb

The script generates commands for backing up the tlog for our databases and shrinking the files. If shrink is not successful then display all open transactions in order to help you find the transaction for which the shrink cannot be done if you entered a path in the script.

The script is generating the commands for backing up tlog, showing you the tlog usage and its configuration together with places where you can take the backup if you entered a database name in the script.

The script can be downloaded from here

SQL Server database restore – preparations and tests that we must perform before starting the restore

Goal

To obtain information about the target database files , to check the backup file that we have to use, to estimate the space required and to have a list with steps done and a list with steps to do

My way of achieving it

Using linked server and xp_cmdshell. The restore operation of a SQL Server database is something that requires most of the time careful attention especially when it comes to big databases when the restore operations is time consuming and we would like to have all in place before the restore. Not only that requires our full attention, the restore requires also preparation which is in many cases time consuming and prone to errors and mistakes or something that usually I forget to check. Because of this I decided to create a script that will check some of the things required for a potentially successful restore. The script performs the following verifications:
– checks if the bak file is accessible from the target instance
– checks the drive information of the target instance for space available for the restore
– checks if target instance is older than source instance since backups from newer versions of SQL cannot be restored on old versions
– checks the full bak file for information about the size of the files that will be restored
– check if target database exists on target instance
– check if we have space for individual database files restores and if not provides some alternative locations for relocation
– generate the restore command, is very basic one that needs to be changed in case target database doesn’t exist on target instance

Before using the script we must modify the @sourceinstance, @targetinstance, @sourcedb and @targetdb with the right information.

Th output of the script is below

restoreprerequistes

The script can be downloaded from here

Display group membership (including nested groups) of a domain account using SQL Server

Goal

To obtain a list of groups that a domain account is member of.

My way of achieving it

Using SSMS and Ad Hoc Distributed Queries. In order to use AD Hoc Distributed queries temporary you must enable it using sp_configure.

Sometimes when we need to troubleshoot why a domain account doesn’t have the requested permissions we have to check his group membership. Most of the time the access to SQL server is given by making an user member of a specific active directory domain group and have that group added as a login or by adding that domain account as a login. Although most of the time this action will solve the request there might be cases when the users say that they don’t have the same rights as other users. This post address this situation by discovering the group membership of those domain accounts. Although we can use net user command with domain switch in a command prompt window

– net user domain_account /domain

the problem is that this will not return any nested groups.

When comparing 2 domain accounts for differences that they might have in SQL Server in regards to permissions we want to be sure that those accounts have the same group membership.

The output of the script is below and can be downloaded from here

groupmembership

 

SQL Server troubleshooting – what happened with my instance

Goal

Obtaining information about what happened during my last restart of SQL Server instance and if this was a normal restart or an unexpected restart

My way of achieving it

Using SSMS, xp_cmdshell and Wevtutil  we can achieve the goal and we can overcome some of the challenges we face in case we are after the same information using some of the other approaches. I have chosen to not use  SQL Server Error logs because of the

– cycling of the logs during each SQL server instance restart or by using the sp_cycle_errorlog in our environments

– information inside the logs doesn’t contain information outside of SQL Server

I want to mention that the execution time of the script is influenced by the information captured and kept in the windows application logs and the reason for which I decided to display by default 1 sql server restart times.

In case of a clustered instance we have to provide the credentials that will be used to ran the Wevutil remotely and another mention as important as first one is that the script is working only for 2 nodes cluster and in case we would like to run it on a clustered instance that has more than 2 nodes then we need to tweak the script a little bit.

The output of the script is shown below

irmi

We can use the information to check what happened during last restart and see what types of errors , critical, warnings appeared in the last 60 minutes before the last restart in the system and application log.

The script can be downloaded from here

SQL Server troubleshooting – what happened with my database

Goal

Obtaining in the same SSMS window, information about why a database might have been unavailable or having its status changed.

My way of achieving it

Almost all of the monitoring deployed solutions these days raise tickets every time a database becomes unavailable and from the DBA operational point of view this means that we must connect and check what happened with the database and take the required actions. Of course that depending of the monitoring implementation some tickets will be closed automatically if during the second check the database becomes online but this doesn’t mean that we don’t need to try to see why the database status changed. Although we can do our investigation using GUI tools that SQL Server provides this approach has some limitations that the script used here tries to overcome. Below are some of the limitations:

– the account we use to connect to the server where SQL Server instance is running might not have rights to access the location where default trace files are stored which will make almost impossible the usage of SQL Server Profiler

– filtering or searching of SQL Server Error log files using SQL Server Log File Viewer was not designed to search or filter after multiple strings in the same time which makes the filtering or searching of the logs after string a and after string b impossible.

Because of this and other limitations I turned my attention and I tried to find other ways of searching and filtering SQL Server Error Log files and default trace and display the required information in only one window.

The output of the script in some cases will provide us enough information to see what happened while in other cases might give us only the name of the logins that were performing activities during that time.

dbstatus

The main benefit of this approach is that we can have in one window the information pertaining to that database from SQL Server Error log files and the default trace.

The script can be downloaded from here

SQL Server troubleshooting using performance counters

Goal

Sometimes when I need to troubleshoot one local or remote SQL Server instance I need to have information from inside SQL Server instance but also information outside of  it in order to start my investigation or take some conclusions. Most of the time this is a very time consuming operation and not always straight forward hence I the need to have something that I can re-utilize over and over again in these situations.

My way of achieving it

After searching and seeing what other people are doing when it comes to this I decided that I can combine some of the ideas found and put my own ideas in one script that I can use to have access to performance counters outside of  SQL Server but obtainable from SQL Server. I already mentioned in another post and I would like to mention it again that every script that I will post might have flaws or shortcomings and should be perceived as such. More than that the scripts can be considered the result of collective effort of different people  from the internet since I am taking and using what they were sharing over the internet.

The output of the script provides this type of information but the script can be modified to return the kind of information that you would like to have and use.  For me is important when troubleshooting a SQL Server instance to know:

– the processor utilization

– available memory

– disk utilization for the drives where sql server has files

– network utilization

Counter    Value
“\\WIN-666BDQE0KVL\Memory\Commit Limit”    4292546560
“\\WIN-666BDQE0KVL\Memory\Available MBytes”    1511
“\\WIN-666BDQE0KVL\Network Interface(Intel[R] PRO_1000 MT Network Connection)\Output Queue Length”    0
“\\WIN-666BDQE0KVL\Network Interface(isatap.{F5634C4F-D7A9-4921-924B-C112B6BC5377})\Output Queue Length”    0
“\\WIN-666BDQE0KVL\Network Interface(Local Area Connection* 11)\Output Queue Length”    0
“\\WIN-666BDQE0KVL\Network Interface(Intel[R] PRO_1000 MT Network Connection)\Bytes Total/sec”    0
“\\WIN-666BDQE0KVL\Network Interface(isatap.{F5634C4F-D7A9-4921-924B-C112B6BC5377})\Bytes Total/sec”    0
“\\WIN-666BDQE0KVL\Network Interface(Local Area Connection* 11)\Bytes Total/sec”    0
“\\WIN-666BDQE0KVL\Processor(_Total)\% User Time”    -1
“\\WIN-666BDQE0KVL\Processor(_Total)\% Privileged Time”    0
“\\WIN-666BDQE0KVL\Processor(_Total)\% Processor Time”    0
“\\WIN-666BDQE0KVL\LogicalDisk(C:)\Current Disk Queue Length”    0
“\\WIN-666BDQE0KVL\LogicalDisk(G:)\Current Disk Queue Length”    0
“\\WIN-666BDQE0KVL\Process(_Total)\Page File Bytes”    2915581952
“\\WIN-666BDQE0KVL\Process(sqlservr#3)\Page File Bytes”    434040832
“\\WIN-666BDQE0KVL\Process(sqlservr#3)\% User Time”    0
“\\WIN-666BDQE0KVL\Process(sqlservr#3)\% Privileged Time”    0
“\\WIN-666BDQE0KVL\Process(sqlservr#3)\% Processor Time”    0

typeperf

I will not explain here how the above output should be used since the goal of the post was only to provide a method to obtain performance counters outside of  SQL Server

The script can be downloaded from here

Interpreting sp_WhoIsActive stored procedure output for beginers

I will start my first post by apologizing in advance for any mistakes that most probably I will do but I hope that I will learn and educate myself on the way.

The idea behind this post came to me after I first heard about the stored procedure that Adam Machanic wrote and after I saw how useful was when troubleshooting or seeing what is happening to the SQL Server. Because depending on our role in the company and our day to day activities,  I realized that I need to have something that will refresh the meaning of the output till this output will become a second nature for me.  This was my attempt to take some of the information that he already made available in his blog ( http://sqlblog.com/blogs/adam_machanic/archive/2011/04/01/a-month-of-monitoring-part-1-of-30-a-brief-history-of-monitoring.aspx ) and put them in a format that in the beginning was easier for me to understand and communicate it to other colleagues. As with everything that I will post here this was my attempt and of course it has some flaws, some shortcomings but overall I believe it makes some sense for someone that is learning the output of this stored procedure and how to use it in the beginning. In order to use it we have to follow these steps

1. Create the stored procedure using the latest version of the SP from here http://sqlblog.com/files/default.aspx

2. Create a powershell script using the code provided at the end of this post.

3. Run the stored procedure with the parameters that you want but using this output column list. This is a prerequisite because the last columns depending on the parameters received by the SP will contain more than one line and the script cannot parse it correctly. We want to have those columns at the end in order to select all the columns expect those ones.

exec sp_whoisactive
@output_column_list = ‘[session_id][dd hh:mm:ss.mss][dd hh:mm:ss.mss (avg)][physical_io][reads][physical_reads][writes][tempdb_allocations][tempdb_current][CPU][context_switches][used_memory]
[physical_io_delta][reads_delta][physical_reads_delta][writes_delta][tempdb_allocations_delta][tempdb_current_delta][CPU_delta][context_switches_delta][used_memory_delta][tasks]
[status][wait_info][locks][tran_start_time][tran_log_writes][open_tran_count][blocking_session_id][blocked_session_count][percent_complete][host_name][login_name][database_name]
[program_name]start_time][login_time][request_id][collection_time][additional_info][sql_text][sql_command]’

4. Run the powershell script but not using the ISE. We will be prompted to select the command that you ran in SSMS and press enter after you made it available in the clipboard

PS C:\Users\Administrator\Downloads> .\sp_whoisactive10august2015.ps1
Select the command that you ran in SSMS and copy it in order to be available in the clipboard (Ctrl+C).
Press Enter to continue …:

5. After pressing enter we will have to provide the output of the command  by providing also the column names. Usually I am selecting only one row and all the column headers except the additional_info, sql_text and sql_command since these as we mentioned before, sometimes, have more lines and are not parsed right by the script

6. After pressing enter again the output is parsed and it provides more information about the columns and what those means

Below is a picture with step 4, 5 and 6 and because the script is not displaying the query the first line seems to be out of context since we are not pasting the query that is captured in the columns (sql_text or sql_command).

sp_whoisactive

The script can be downloaded from here