Sunday, February 19, 2012

DTC problem

I have a master package that executes a series of sub packages. The sub packages run in a series with OnSuccess precedence constraints between them. The master package is run from a SQL Agent job.

When I run the master package without setting TransactionOption = Required at any level they all work fine. However when I set TransactionOption = Required at the master package level (and Supported from there downwards) the first package in the series of sub packages executes (as reported in the Event Log), but the next package does not seem to start . . . however the SQL Agent job keeps running indefinitely.

If I disable the second package and set the precedence constraint between the first and third packages, they both execute fine. In all there are 19 packages and 4 of them have the same problem. Disable them and remove them from the precedence chain and it all works.

I tried re-enabling the 4 problem packages and setting the TransactionOption on the Execute Package Task to Not Supported for them but it still 'hangs' at the first one. If I set the TransactionOption to Required for one of the problem packages and execute it on its own, then it runs fine.

I am currently running a SQL Profile trace which isn't telling me anything except that the package isn't executing any SQL (though there are loads of SQL Transaction entries with no text data) and I've got logging to the Event Log turned on, but that just shows that the last package finished and the next one didn't start.

I'm at a loss as to how to proceed so any ideas on what to look for or what might be causing the problem would be much appreciated.

TIA . . . Ed

I just tried swapping the order of the first and second package and the same situation occurs - it runs the first but not the second. Which tells me the problem is not related to the package as such, but some resource used within the package.

The packages only access two databases residing on the same instance of SQL Server - the same machine as the packages are being executed on. They don't access the file system or any other resources.

Could there be a locking problem and if so how can I find out where it is? (As you can probably tell my d/b admin skills aren't tip-top, I'm a web developer in my day job).

Thanks . . . Ed

|||

Ed,

I'm no DBA myself but it does sound as though you have a locking problem. My first port of call in these situations is to execute sp_who and sp_lock

These return information about processes (i.e. SPIDs) that are blocked by other processes and which resources are locked respectively.

BOL has information on sp_who and sp_lock.

Note that there is an undocumented sproc called sp_who2 which returns the same info as sp_who....plus a little bit more! The 'BlkBy' column is interesting as that lists the SPID that is blocking. If BlkBy=-2 then read this article: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_sp_wa-wz_3v8v.asp

-Jamie

|||

Jamie - Thank you very much, you set us on the right path. The queries above showed that there was locking and this:

select * from sys.dm_tran_locks a
inner join sys.objects b on
a.resource_associated_entity_id = b.object_id
where a.request_session_id = <spid>

. . . showed the objects which were being locked.

The problem was due to our schema . . . not all tables had PKs (something I'd been meaning to address) so presumably there were table scans deadlocking each other.

Adding some missing PKs has fixed the problem.

Thanks again . . . Ed

No comments:

Post a Comment