Sunday, February 19, 2012

DTC error with concurrent data flows

I have a package that has several data flows that run concurrently after some initial tasks and an initial data flow. I want transactions on each of the data flows and have set the transaction option to Required on the data flows (not on the package itself). I am also using checkpoint restart on the package. A couple things are happening.

1) the first data flow is successful and that releases the several that are waiting. Some of these complete OK but inevitably one or two will fail. The failing data flows will be different from run to run, sometimes one and sometimes two will fail. The error says:

Error: 0xC0202009 at Provider_NF_Code, Delete Provider_NF_Code [130]: An OLE DB error has occurred. Error code: 0x80004005.

An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "Distributed transaction completed. Either enlist this session in a new transaction or the NULL transaction.".

My hunch is that DTC is getting the transactions mixed up. I think it is committing one just after another data flow has already started work expecting the transaction to still be active. That would explain why the failing data flows are random. Plus, if I set the MaxConcurrentExecutables to 1 the entire package is successful. BUT, why have concurrent tasks if you can't run them concurrently.

2) when the package fails with the DTC problem I restart it with the checkpoint file. I was expecting the package to restart with the failed data flows. Instead, it restarts with the initial data flow (that all the other flows wait for in the package). This data flow has always been successful. It's as if the transactions I have put on the individual data flows are actually placed on a single virtual container that all of them are in, and when down stream data flows fail the entire data flow chain is rolled back and set to restart.

How can I get multiple concurrent data flows to run with transactions?

Why are successful data flows being restarted? Can I get just the failed tasks to restart?

-Gordy

I can't help with why things are failing but regarding restarting using checkpoints you should find this to be of use: http://blogs.conchango.com/jamiethomson/archive/2005/04/20/1324.aspx

Here's a bit more info on checkponts that you might find useful: http://blogs.conchango.com/jamiethomson/archive/2006/01/03/2537.aspx

-Jamie

|||

One other thing, if each of your transactions only contain one task, what's the point in putting it in a transaction?

-Jamie

|||

Each data flow task maintains a different table. The data flows perform inserts, updates and deletes. If one data flow fails the others shouldn't be rolled back, that's why they are in seperate tasks (among other reasons) and seperate transactions. And in order to avoid partially updated tables the data flows are using transactions.

-Gordy

|||

Jamie,

Thanks for your assistance but I disagree with your blog. There is no reason to assume that parallel sequence containers should be rolled back if one of the containers has a failure. The rollback should be based on whether the sequence container(s) finished successfully or not. If tables A, B, and C are in seperate seq containers and A & B finished but C choked, why rerun A & B? However, that said, it seems to be what MS is doing and it is waisting a lot of processing time for me.

So, let's say I want to build 3 houses at the same time. If any has a defect I'll tear it down and start over, but I don't want to tear them all down if just 1 has a defect. Is there a way to use checkpoints with transactions to get this done?

As I said above, concurrent processing seems to get DTC confused and a transaction that a parallel task was going to join gets comitted leaving it without a trans.

-Gordy

|||

Gordon,

I don't think I was trying to say that would always be the case - just in that particular "for instance". Mind you, it was a long time ago so I might be wrong :)

-Jamie

No comments:

Post a Comment