Friday, February 17, 2012

DTS package handling multiple file formats

Hi,

Currently we get data from more than 200 different sources, and all of
our vendors provide data in different file formats. The problem is that
we now have more than 100 DTS packages, and the maintenance is very
difficult.
Every time a vendor changes its format, we have to change multiple
DTS packages.
Does anybody know the right way to reduce the number of DTS
packages?
The file formats we get are .xls, .txt, .dat, .csv, etc., and the .txt
and .dat files come with different delimiters. The number of columns also
varies from file to file. Is it possible to have one DTS package that
can handle the different file formats and load the data into a staging
table, so that from there, based on the source of the file, we can move
the data into the respective tables and columns?

We are using SQL Server 2000.

Thanks in advance.

Subodh

"Subodh" <sgoyal@.agline.on.ca> wrote in message
news:90104bf0.0501240846.58b2b293@.posting.google.com...

Personally, I would look at writing an external script or program in C#,
Python, Perl or whatever to manipulate the files and load the staging table.
The script could load the data directly to the staging table by dynamically
generating INSERTs, or it might transform the source files to your own
standard file format to be used with bcp.exe, BULK INSERT or the DTS Bulk
Insert task.
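
As a rough illustration of the "transform to your own standard file format" idea, here is a minimal Python sketch. It assumes the standard format is tab-delimited with the source name as the first column; the function name normalize and the source name vendor_a are made up for the example. Note that the csv module will quote any field that itself contains a tab, which bcp.exe and BULK INSERT do not interpret, so real data may need extra cleaning.

```python
import csv
import io

def normalize(src, dst, delimiter, source_name):
    """Copy rows from a delimited source into a tab-delimited stream.

    Each output row is prefixed with the (hypothetical) source name so the
    staging table can later route rows to their target tables.
    Returns the number of rows written.
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    count = 0
    for row in reader:
        if not row:  # csv.reader yields an empty list for blank lines
            continue
        writer.writerow([source_name] + [field.strip() for field in row])
        count += 1
    return count

# Usage: a pipe-delimited vendor file becomes a tagged, tab-delimited file.
src = io.StringIO("a|b|c\n\n1|2|3\n")
dst = io.StringIO()
written = normalize(src, dst, "|", "vendor_a")
```

With real files, src and dst would be opened with open(path, newline="") rather than io.StringIO.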

Your maintenance efforts would be then directed at the program, not at the
packages, which is probably a good thing - it's likely easier to modify one
module/class/object than 10 packages, and most languages have good library
support for parsing, tokenizing, regexes and so on. Or perhaps a hybrid
solution might work - an external program for proprietary file formats, and
standard DTS connections/tasks for the rest. You might also want to ask in
microsoft.public.sqlserver.dts to see if someone else has experienced a
similar situation.
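
To show what "modify one module" could look like, here is a small sketch in which each source's format lives in one table, so a vendor's format change becomes a one-line edit. Everything named here (FORMATS, vendor_a, vendor_b, the column names, the staging table) is hypothetical, and the "?" placeholders assume a database driver that uses the qmark parameter style.

```python
import csv
import io

# Hypothetical per-source settings: delimiter and column order for each vendor.
FORMATS = {
    "vendor_a": {"delimiter": "|", "columns": ["sku", "price"]},
    "vendor_b": {"delimiter": ",", "columns": ["price", "sku", "qty"]},
}

def build_insert(table, columns):
    """Build a parameterized INSERT statement for the given columns."""
    placeholders = ", ".join("?" for _ in columns)
    return "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders
    )

def rows_for(source_name, stream):
    """Yield one tuple per data row, checking the expected column count."""
    fmt = FORMATS[source_name]
    for row in csv.reader(stream, delimiter=fmt["delimiter"]):
        if not row:  # skip blank lines
            continue
        if len(row) != len(fmt["columns"]):
            raise ValueError(
                "{}: expected {} columns, got {}".format(
                    source_name, len(fmt["columns"]), len(row)
                )
            )
        yield tuple(row)

# Usage: the statement and the rows would be handed to a driver's executemany().
sql = build_insert("staging", FORMATS["vendor_b"]["columns"])
rows = list(rows_for("vendor_b", io.StringIO("9.5,X1,3\n")))
```

The column-count check is there so that an unannounced format change fails loudly at load time instead of putting values into the wrong staging columns.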

Finally, since your basic issue (as I understood it) is that you have too
many file formats, you should consider agreeing a standard file format - at
least with your larger clients/vendors - rather than looking at it just as a
technical problem. I have no idea how easy that would be in your company's
situation, of course.

Simon