Wednesday, March 21, 2012

DTS csv import fails when using double quotes as the Text Qualifier, the last column is a

I am using DTS to import csv files that I receive from a third party.
Each day I receive many file which I import into many different SQL
Server tables depending on the name of the file. I am doing this as
part of a larger automation process written in C#. Due to the csv
file names differing each day (they contain a date) and the large
number of DB Table schemas, I am programmatically/dynamically
creating/building a generic DTS package for each file and calling it
to import the csv file into the appropriate table.
The data contains commas for the column delimiters, double quotes for
the Text Qualifier and CR LF for the row delimiter. I have set the
appropriate DTSFlatFile connection properties and successfully
imported many of the csv files except files which meet the following
criteria:
1) the last column of the SQL Server table that corresponds to this
file is of type varchar or char
2) the data value for the last column of the record is always null
For these file I get the following error message when I try to execute
the package: "A DTSTransformCopy must specify no columns (signifying a
sequential 1-to-1 mapping of all columns) or the same number of source
and destination columns."
e.g.
File 1 -- imports fine:
--
"1","a, b","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800","dd"
"2","cd","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
"3","e, f","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
File 2 -- imports fine:
--
"1","c, d","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
"2","ab","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800","dd"
"3","c, d","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
File 3 -- imports fine:
--
"1","a, b","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
"2","cd","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
"3","e, f","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800","dd"
File 4 -- fails:
--
"1","a, b","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
"2","cd","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
"3","e, f","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
06:25:58","0088288574800",
Table schema:
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[Table1]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[Table1]
GO
CREATE TABLE [dbo].[Table1] (
[Field1] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
,
[Field2] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
,
[Field3] [datetime] NOT NULL ,
[Field4] [datetime] NULL ,
[Field5] [datetime] NULL ,
[Field6] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
,
[Field7] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
I am using SQL 2000 sp3a:
Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Developer Edition on Windows NT 5.0 (Build 2195: Service Pack 3)
BTW I found an 1.5 year old posting about this issue but no solution
was ever posted:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=%23XiCPwG3CHA.1624%40TK2MSFTNGP12.phx.gbl&rnum=1&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D%2523XiCPwG3CHA.1624%2540TK2MSFTNGP12.phx.gbl
Thanks,
DavidSo why you can't process this text file manually in your C# program
It will be a simple loop with output to the table
"David E Herbst" wrote:
> I am using DTS to import csv files that I receive from a third party.
> Each day I receive many file which I import into many different SQL
> Server tables depending on the name of the file. I am doing this as
> part of a larger automation process written in C#. Due to the csv
> file names differing each day (they contain a date) and the large
> number of DB Table schemas, I am programmatically/dynamically
> creating/building a generic DTS package for each file and calling it
> to import the csv file into the appropriate table.
> The data contains commas for the column delimiters, double quotes for
> the Text Qualifier and CR LF for the row delimiter. I have set the
> appropriate DTSFlatFile connection properties and successfully
> imported many of the csv files except files which meet the following
> criteria:
> 1) the last column of the SQL Server table that corresponds to this
> file is of type varchar or char
> 2) the data value for the last column of the record is always null
> For these file I get the following error message when I try to execute
> the package: "A DTSTransformCopy must specify no columns (signifying a
> sequential 1-to-1 mapping of all columns) or the same number of source
> and destination columns."
> e.g.
> File 1 -- imports fine:
> --
> "1","a, b","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800","dd"
> "2","cd","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> "3","e, f","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> File 2 -- imports fine:
> --
> "1","c, d","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> "2","ab","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800","dd"
> "3","c, d","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> File 3 -- imports fine:
> --
> "1","a, b","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> "2","cd","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> "3","e, f","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800","dd"
> File 4 -- fails:
> --
> "1","a, b","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> "2","cd","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> "3","e, f","03/03/2004 00:00:00","01/17/2038 23:59:59","03/04/2004
> 06:25:58","0088288574800",
> Table schema:
> if exists (select * from dbo.sysobjects where id => object_id(N'[dbo].[Table1]') and OBJECTPROPERTY(id, N'IsUserTable') => 1)
> drop table [dbo].[Table1]
> GO
> CREATE TABLE [dbo].[Table1] (
> [Field1] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
> ,
> [Field2] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
> ,
> [Field3] [datetime] NOT NULL ,
> [Field4] [datetime] NULL ,
> [Field5] [datetime] NULL ,
> [Field6] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
> ,
> [Field7] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> ) ON [PRIMARY]
> GO
> I am using SQL 2000 sp3a:
> Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
> Dec 17 2002 14:22:05
> Copyright (c) 1988-2003 Microsoft Corporation
> Developer Edition on Windows NT 5.0 (Build 2195: Service Pack 3)
> BTW I found an 1.5 year old posting about this issue but no solution
> was ever posted:
> http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=%23XiCPwG3CHA.1624%40TK2MSFTNGP12.phx.gbl&rnum=1&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D%2523XiCPwG3CHA.1624%2540TK2MSFTNGP12.phx.gbl
> Thanks,
> David
>|||Thank you for the suggestion.
Yes, that is an option but I am importing a many different csv files
to a large number of DB tables (each with a different number of
columns and column types) and I'm trying to avoid writing custom file
parsing code for each of the table schemas. Importing csv files seems
like a pretty standard problem that has been solved many times already
so I'm also trying to avoid spending time writing my own generic csv
file parser/loader which seems like reinventing the wheel. In
addition I've heard that large numbers of INSERTs though ADO.NET is
not that efficient and news groups postings usually suggest DTS
instead?
BTW I forgot to mention that I am taking advantage of the fact that
since the source and destination have the same number of columns in
the same order I don't have to define and add any column objects to
the DTS.DataPumpTransformCopy transformation object. This is useful
since I can use the same generic package creation code for all of the
DB tables and not embed any knowledge of the column schema in my code.
See:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dtsprog/dtspapps_45k3.asp?frame=true
(see comment in the example)
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dtsprog/dtspapps_976r.asp?frame=true
(first bullet points)sql

No comments:

Post a Comment