Friday, June 22, 2012

Create failed for user 'xx' error

"Create failed for user 'xx'. User or role already exists in the current database" is likely to occur when users were not properly mapped. it is fixed by running sp_change_users_login stored procedure



USE DatabaseName
GO

EXEC sp_change_users_login 'Update_One', 'myuser', 'myuser'
GO


This procedure, with the action 'Update_One', links the specified user in the current database to an existing SQL Server login. Both the user and the login must be specified.
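
Before fixing, you can list all the orphaned users in the current database by running the same procedure with the 'Report' action:

USE DatabaseName
GO

EXEC sp_change_users_login 'Report'
GO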

Thursday, June 21, 2012

Pivot tables in SQL Server 2008


A pivot table is a frequently used method of summarizing and displaying data, especially report data, by grouping and aggregating values.
For example, given the data from the Orders table - OrderID, CustomerID, OrderDate etc. - I can create a pivot table to display the number of orders for each year listed in the database:
SELECT * FROM (SELECT OrderID, YEAR(OrderDate) As OrderYear FROM Orders) Data
PIVOT(
COUNT(OrderID) 
FOR OrderYear IN([2008], [2009], [2010])) As YearlyOrderQty


will produce something like this:

2008 2009 2010
152 408 270

which would be quite difficult to accomplish without the use of a pivot table.

SELECT * FROM (SELECT CustomerID, OrderID, YEAR(OrderDate) As OrderYear FROM Orders) Data
PIVOT(
COUNT(OrderID) 
FOR OrderYear IN([2008], [2009], [2010])) As YearlyOrderQty

will produce a list of customers with the number of orders for each customer per year:
CustomerID 2008 2009 2010
ALFKI 0 3 3
ANATR 1 2 1
ANTON 1 5 1
AROUT 2 7 4
BERGS 3 10 5
BLAUS 0 4 3
BLONP 3 7 1
BOLID 1 1 1

The query below is similar to the one just above, but it returns the company name instead of the customer ID:



SELECT * FROM (SELECT c.CompanyName, o.OrderID, YEAR(o.OrderDate) As OrderYear 
FROM Orders as o INNER JOIN Customers as c ON o.CustomerID=c.CustomerID) Data
PIVOT(
COUNT(OrderID) 
FOR OrderYear IN([2008], [2009], [2010])) As YearlyOrderQty

CompanyName 2008 2009 2010
Alfreds Futterkiste 0 3 3
Ana Trujillo Emparedados y helados 1 2 1
Antonio Moreno Taquería 1 5 1
Around the Horn 2 7 4
Berglunds snabbköp 3 10 5
Blauer See Delikatessen 0 4 3
Blondesddsl père et fils 3 7 1
Bólido Comidas preparadas 1 1 1

Monday, June 18, 2012

Optimizing stored procedures

1) Add SET NOCOUNT ON - to prevent your stored procedure from returning the number of rows affected by each statement, thus reducing network traffic.
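
A minimal sketch (usp_GetCustomerOrders is a hypothetical procedure against the Northwind Orders table):

CREATE PROCEDURE usp_GetCustomerOrders
    @CustomerID nchar(5)
AS
BEGIN
    SET NOCOUNT ON -- suppress the "(N row(s) affected)" messages
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
END
GO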

2) Use fully qualified object and table names - preceding an object name with its schema name (e.g., dbo.Orders) saves SQL Server from searching all schemas for the object, thus reducing the time it takes for a stored procedure to run.

3) Avoid having SQL Server search the master database for your stored procedure by ensuring that none of the procedure names are prefixed with "sp_"; use "usp_" instead, to indicate that it is a user stored procedure, not a system one.

4) If a procedure returns an integer value, use the RETURN statement to return a single integer value as opposed to returning the value as part of a recordset.

The RETURN statement exits unconditionally from a stored procedure, so the statements following RETURN are not executed. Though the RETURN statement is generally used for error checking, you can use this statement to return an integer value for any other reason. Using the RETURN statement can boost performance because SQL Server will not create a recordset.
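
A minimal sketch (usp_GetOrderCount is a hypothetical procedure):

CREATE PROCEDURE usp_GetOrderCount
    @CustomerID nchar(5)
AS
BEGIN
    SET NOCOUNT ON
    DECLARE @cnt int
    SELECT @cnt = COUNT(*) FROM dbo.Orders WHERE CustomerID = @CustomerID
    RETURN @cnt -- a single integer is returned without creating a recordset
END
GO

-- the caller picks up the integer like this:
DECLARE @OrderCount int
EXEC @OrderCount = usp_GetOrderCount 'ALFKI'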

5) Replace EXECUTE statements for dynamic SQL with sp_executesql. sp_executesql makes your code more reusable because it takes parameters, and parameterization greatly reduces the risk of SQL injection.

The execution plan of a dynamic statement can be reused only if each and every character, including case, spaces, comments and parameters, is the same for two statements. For example, if we execute a parameterized batch and then execute the same batch with a different value for the parameter, the execution plan created the first time around will be reused for the different parameter value. The reuse of the existing compiled plan results in improved performance.
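
For example, instead of concatenating the value into the dynamic string, pass it as a parameter:

DECLARE @sql nvarchar(200)
SET @sql = N'SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @CustomerID'

EXEC sp_executesql @sql, N'@CustomerID nchar(5)', @CustomerID = N'ALFKI'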

6) When checking for the existence of a particular record, use IF EXISTS (SELECT 1) instead of IF EXISTS (SELECT *), since it minimizes data processing.
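
For example:

IF EXISTS (SELECT 1 FROM dbo.Orders WHERE CustomerID = 'ALFKI')
    PRINT 'Customer has orders'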

7) Add indexes to the fields most often used in WHERE clauses and for JOINS. Indexing the right fields can help to significantly improve the performance.

8) Utilize TRY-CATCH blocks; they are much more efficient than the old way of checking for errors after each SQL statement.

BEGIN TRY
--Your t-sql code goes here
END TRY
BEGIN CATCH
--Your error handling code goes here
END CATCH


9) Utilize temp tables. For instance, if the same query runs multiple times, run it only once, dumping the results into a temp table, and then query the temp table instead.
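
A minimal sketch (#CustomerOrders is a hypothetical temp table):

-- run the expensive query once, saving the results
SELECT CustomerID, COUNT(*) As OrderCount
INTO #CustomerOrders
FROM dbo.Orders
GROUP BY CustomerID

-- reuse the saved results as many times as needed
SELECT * FROM #CustomerOrders WHERE OrderCount > 10
SELECT AVG(OrderCount) FROM #CustomerOrders

DROP TABLE #CustomerOrders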

10) Use EXISTS instead of JOINs. If the table you are joining to does not contribute any columns to your select list, the join is consuming unnecessary I/O resources in the background.
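
For example, to return only customers that have orders, without pulling any columns from Orders:

SELECT c.CustomerID, c.CompanyName
FROM Customers as c
WHERE EXISTS (SELECT 1 FROM Orders as o WHERE o.CustomerID = c.CustomerID)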

11) Avoid using DISTINCT. Needing DISTINCT may be a sign of a bad table design or of a join that produces duplicate rows.

12) Limit the SELECT list by returning only the columns you need, since returning too many columns can have a drastic effect on your query. Not only does it increase your chances of bookmark lookups (or key lookups), but the extra network and disk latency also add to the query time. Not to mention you will be squeezing more data into your buffer cache.

13) Use the fewest tables possible to compose your SELECT list. For example, say you need to join two tables to get your result set. If one of the tables contains all the fields needed for the SELECT list, even though the same field(s) could also be taken from the other table, always return the values from that one table only. Doing so limits the number of I/O operations necessary to produce your result.

14) Index temp tables. Temp tables are treated just like permanent tables by SQL Server: they can have indexes and statistics. The only downside is that they often cause statement recompiles when the result sets differ; to counter this, read up on reducing temp table recompiles or use table variables if you have to.
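
A minimal sketch (#Results and IX_Results_CustomerID are hypothetical names):

CREATE TABLE #Results (OrderID int, CustomerID nchar(5))
-- ... populate #Results ...
CREATE NONCLUSTERED INDEX IX_Results_CustomerID ON #Results (CustomerID)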

15) Break down large stored procedures into several sub-procedures and call them from a controlling stored procedure.

The stored procedure will be recompiled when any structural changes are made to a table or view referenced by the stored procedure (an ALTER TABLE statement, for example), or when a large number of INSERTS, UPDATES or DELETES are made to a table referenced by a stored procedure. So, if you break down a very large stored procedure into several sub-procedures, there's a chance that only a single sub-procedure will be recompiled, while other sub-procedures will not.

16) Try to avoid using temporary tables inside your stored procedures.

Using temporary tables inside stored procedures reduces the chances of reusing the execution plan.

17) Try to avoid using DDL (Data Definition Language) statements inside your stored procedure.

Using DDL statements inside stored procedures also reduces the chances of reusing the execution plan.

18) Add the WITH RECOMPILE option to the CREATE PROCEDURE statement if you know that your query will vary each time it is run from the stored procedure.

The WITH RECOMPILE option prevents reusing the stored procedure execution plan, so SQL Server does not cache a plan for this procedure and the procedure is always recompiled at run time. Using the WITH RECOMPILE option can boost performance if your query will vary each time it is run from the stored procedure, because in this case the wrong execution plan will not be used.
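
For example (usp_SearchOrders is a hypothetical procedure whose optimal plan depends heavily on the parameter values):

CREATE PROCEDURE usp_SearchOrders
    @StartDate datetime,
    @EndDate datetime
WITH RECOMPILE -- no plan is cached; a fresh plan is compiled on every execution
AS
BEGIN
    SET NOCOUNT ON
    SELECT OrderID, CustomerID, OrderDate
    FROM dbo.Orders
    WHERE OrderDate BETWEEN @StartDate AND @EndDate
END
GO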

19) Use SQL Server Profiler to determine which stored procedures have been recompiled too often.

To check whether a stored procedure has been recompiled, run SQL Server Profiler and trace the event in the "Stored Procedures" category called "SP:Recompile". You can also trace the event "SP:StmtStarting" to see at what point in the procedure it is being recompiled. When you identify these stored procedures, you can take corrective actions to reduce or eliminate the excessive recompilations.

Thursday, June 14, 2012

Troubleshooting database slowness

Check Task Manager to make sure SQL Server is not consuming more than 50% of the available RAM. If CPU usage is high, check the sysprocesses table in the master database for the connections that are using SQL Server - pay attention to the ones with abnormally high CPU time, and filter on spid > 50 to exclude all system connections.

This query will show which processes have been holding their connections for more than 3 hours, along with the queries they ran and any blocking information:

SELECT   spid,
         sqltext.text as [SQL], 
         blocked As [Process Blocked],
         loginame as [User], 
         hostname as [Hostname], 
         program_name as [Application], 
         cpu as [CPU], 
         physical_io as [Physical IO], 
         memusage as [Memory Usage], 
         open_tran as [Open Transactions],
         last_batch as [Time Last Ran],
         DB_NAME(sysprocesses.dbid) As [Database Name]
FROM     sysprocesses  
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sqltext
WHERE    spid > 50 
AND Datediff(hh,last_batch,Getdate()) > 3 
ORDER BY last_batch 


Next, check sys.dm_exec_query_stats for the top 10 longest running queries:

SELECT DISTINCT TOP 10
t.TEXT QueryName,
s.execution_count AS ExecutionCount,
s.max_elapsed_time AS MaxElapsedTime,
ISNULL(s.total_elapsed_time / 1000 / NULLIF(s.execution_count, 0), 0) AS AvgElapsedTime,
s.creation_time AS LogCreatedOn,
ISNULL(s.execution_count / NULLIF(DATEDIFF(s, s.creation_time, GETDATE()), 0), 0) AS FrequencyPerSec,
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_sql_text( s.sql_handle ) t
ORDER BY s.max_elapsed_time DESC, ExecutionCount DESC


Also examine SQL Server logs to see if anything unusual has occurred.

Monday, June 11, 2012

Index defragmentation

Fragmentation can be defined as any condition that causes more than the optimal amount of disk I/O to be performed when accessing a table, or causes individual disk I/Os to take longer. For optimal performance of SQL queries, the data pages of tables should be as contiguous as possible and packed as fully as possible. Fragmentation breaks this rule, thus reducing the performance of the queries. There are two levels of fragmentation: file system level (physical disk fragmentation) and index level fragmentation.

Physical disk fragmentation is the fragmentation of the database file in the file system, which occurs when the file system cannot allocate contiguous space for it. As a result, the disk head has to move back and forth when reading from the database files. This happens when the database file is stored on the same disk as other files, such as OS, application, or log files, or when the database grows frequently and in small chunks. The way to avoid it is to allocate enough space initially and to set the growth option to allocate larger chunks instead of small ones.

Before running the Windows defragmentation tool, SQL Server needs to be stopped; otherwise the database files will be skipped.

Index fragmentation can be reduced by selecting an appropriate fill factor when creating the index.
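
For example, a fill factor of 80 leaves 20% of each leaf-level page empty for future inserts, postponing page splits (IX_Orders_OrderDate is a hypothetical index name):

CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
ON dbo.Orders (OrderDate)
WITH (FILLFACTOR = 80)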

There are two ways to defragment indexes: rebuild and reorganize.

When you reorganize an index, SQL Server physically reorders the leaf-level pages to match the logical order of the leaf nodes. The process does not allocate new pages but only uses existing ones, compacting the index. Reorganization uses minimal resources and does not block queries or updates.

Reorganization should only be used on lightly fragmented indexes (no more than 30% fragmentation); heavily fragmented indexes should be rebuilt.

The SQL below reorganizes the specified index:

ALTER INDEX index_name 
ON DBName.TableName
REORGANIZE


The important thing to keep in mind is that if you try to reorganize or rebuild an index with few pages, most likely nothing will change.

Rebuilding an index involves dropping the original index and building a new one, so the index starts clean, with minimal fragmentation. An index can be rebuilt offline or online. Rebuilding is done the same way as reorganizing, except the REBUILD keyword is used instead of REORGANIZE:

ALTER INDEX index_name 
ON DBName.TableName
REBUILD


If you want to add some options to your rebuild, use the WITH clause:

ALTER INDEX index_name 
ON DBName.TableName 
REBUILD WITH (
  FILLFACTOR = 60,
  ONLINE = ON
)

Friday, June 8, 2012

Analyzing index fragmentation

The sys.dm_db_index_physical_stats function is used to determine which indexes are fragmented and the extent of their fragmentation.

The function takes the following parameters: database_id, object_id, index_id, partition_number, mode.

database_id - an integer value that represents the ID number of your database
object_id - an integer value that represents your table or view
index_id - an integer that represents the ID of your index
partition_number - an integer that represents the partition number of the object
mode - can be either LIMITED, SAMPLED, or DETAILED

LIMITED: Scans the smallest number of pages, which means this is the fastest mode. The LIMITED mode is equivalent to NULL and DEFAULT.
SAMPLED: Scans 1% of all pages. If an index contains fewer than 10,000 pages, then DETAILED mode is used.
DETAILED: Scans all index pages, which means this is the slowest mode, but most accurate.

You must specify all five parameters, even if their values are null.

Here are some important values returned by sys.dm_db_index_physical_stats function:

index_type_desc - index type (clustered, non-clustered, etc.)
avg_fragmentation_in_percent - percentage of the logical index that is fragmented
fragment_count - number of fragments in the leaf level
page_count - number of index or data pages.

An index always has at least one fragment (fragment_count). The maximum number of fragments that an index can have is equal to the number of pages (page_count). For example, an index that is made up of 3 pages can at the most have 3 fragments. The larger the fragment, the less disk I/O that is required. Ideally, the avg_fragmentation_in_percent value should be as close to zero as possible.

Microsoft recommends that you reorganize your index if the avg_fragmentation_in_percent value is less than or equal to 30% and rebuild the index if the value is greater than 30%.

I am using the following query to retrieve the fragmentation data:

SELECT OBJECT_NAME(ps.object_id) As TableName, i.name As IndexName, ps.index_type_desc As IndexType,
       ps.index_depth, ps.index_level, ps.avg_fragmentation_in_percent, ps.fragment_count, ps.page_count 
FROM sys.dm_db_index_physical_stats(DB_ID('Northwind'), null, null, null, 'LIMITED') as ps
INNER JOIN sys.indexes i ON ps.object_id = i.object_id AND ps.index_id = i.index_id 
WHERE i.name IS NOT NULL

Thursday, June 7, 2012

On index statistics

If an index is not being used much, it will have a lot of free space in its pages, thus occupying a lot more disk space than it should, and it may be worth compacting it (via rebuild or reorganize) to get the disk space back. At the same time, if an index is barely used, there may be no point in spending resources to remove fragmentation.

Another thing to consider is that a non-clustered index affects performance in a number of ways, one of them being the extra I/O it takes to maintain the index. For example, whenever a new record is added to the table, a matching record is inserted into each non-clustered index; whenever a record is deleted, the matching index records must be deleted as well; and when a clustered key value is changed during an update, the matching records in the non-clustered indexes must be updated too.

There is a dynamic management view, sys.dm_db_index_usage_stats, that can help us determine whether a particular index is being used. This view does not keep historical data; it holds data only since the last time the database was opened, and the counters are lost once the database is closed or the SQL Server service is restarted.

The SQL below will retrieve the data for a particular database:

SELECT * FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID('DatabaseName')


Below are some useful fields from sys.dm_db_index_usage_stats view:

user_seeks - number of seeks, either looking up a single row or doing a range scan
user_scans - number of scans, e.g. a select * operation performed on the table
user_lookups - number of bookmark lookups, where a non-clustered index does not fully cover a query and additional columns must be retrieved from the base table row
user_updates - number of times the index was used in update operations (this includes inserts, updates, and deletes)

Some useful queries:

--get database name based on database_id
select * from sysdatabases where dbid=7


--get object name based on object_id
SELECT name
FROM sys.objects
WHERE object_id = 37575172


--retrieves database_id of a current database
Select db_id()


--retrieves an index name based on object_id and index_id from sys.dm_db_index_usage_stats view
SELECT name FROM sys.indexes WHERE object_id = @object_id and index_id = @index_id


--retrieves index statistics, including table name to which the index belongs, and index name
SELECT object_name(iu.object_id, iu.database_id) as tablename, i.name as IndexName, iu.user_seeks,iu.user_scans,iu.user_lookups,iu.user_updates,iu.last_user_seek, iu.last_user_scan,iu.last_user_lookup,iu.last_user_update 
FROM sys.dm_db_index_usage_stats as iu
LEFT JOIN sys.indexes as i ON iu.index_id=i.index_id AND iu.object_id=i.object_id
WHERE iu.database_id = DB_ID('Northwind')


The query below will return a list of all indexes, with corresponding table names, that have not been used by SQL Server since the service started:

SELECT  ObjectName = OBJECT_NAME(ind.object_id),  IndexName = ind.name
FROM Sys.Indexes ind            
INNER JOIN Sys.Objects obj ON obj.object_id = ind.object_id
WHERE OBJECTPROPERTY(obj.object_id,'IsUserTable') = 1            
AND NOT EXISTS ( SELECT 1 FROM Sys.dm_db_index_usage_stats usg WHERE usg.object_id = ind.object_id AND usg.index_id = ind.index_id )
ORDER BY ObjectName,IndexName

Wednesday, June 6, 2012

Query for cleaning up the history records except the last one for each item

The query below will delete all the history records for each item, except the latest one:
DELETE FROM Table1
WHERE itemHistoryId NOT IN 
(SELECT t1.itemHistoryId
 FROM Table1 as t1
WHERE t1.historyDateTime = (SELECT MAX(historyDateTime) FROM Table1 WHERE ItemID=t1.ItemID)) 

The subquery
SELECT t1.itemHistoryId
 FROM Table1 as t1
WHERE t1.historyDateTime = (SELECT MAX(historyDateTime) FROM Table1 WHERE ItemID=t1.ItemID)
will return all the records with the latest history date, one per ItemID (i.e., the records to be kept), and the main query will pull all the records that are not included in this subquery and delete them.
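
An alternative sketch that accomplishes the same cleanup with ROW_NUMBER (available since SQL Server 2005); note that, unlike the query above, it keeps exactly one record per ItemID even when several records share the same latest historyDateTime:

;WITH Ranked As (
    SELECT itemHistoryId,
           ROW_NUMBER() OVER (PARTITION BY ItemID ORDER BY historyDateTime DESC) As rn
    FROM Table1
)
DELETE FROM Ranked WHERE rn > 1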

Friday, June 1, 2012

Cannot open user default database. Login failed. Login failed for user ‘UserName’.

If, when attempting to log in to SQL Server Management Studio, you get the error "Cannot open user default database. Login failed. Login failed for user 'UserName'. (Microsoft SQL Server, Error: 4064)", click Options, and in the Connect to database dropdown, type your database name and click Connect.
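
To fix the problem permanently, change the login's default database to one that exists and is accessible, for example:

ALTER LOGIN [UserName] WITH DEFAULT_DATABASE = [master]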