Sunday, February 19, 2012

Help: Bulk insert with ADO.NET from outside database

I am programmatically importing lots of rows from dBase 5 into SQL and
want to get an idea what the optimal approach would be for this
scenario.
Currently I'm doing the following:
1) Using OleDbConnection, OleDbCommand and OleDbDataReader to read the
rows from dBase tables;
2) Putting the extracted data into a DataSet;
3) Once the number of rows in the DataSet reaches, say, 1000, I'm
using SqlDataAdapter to submit the data to the database. This step
creates a transaction, issues an INSERT statement for each row in the
DataSet, then commits the transaction.
4) Go to Step 1 and repeat until all data is processed.
The above is inefficient and slow, even if both database
connections are kept open during the whole batch process. For example,
50,000 rows get processed in around 30 seconds.
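Roughly, the loop looks like this (a simplified sketch, not the exact code: the Jet/dBASE connection string is an assumption, the names are placeholders, and the buffer table really sits inside a DataSet):

using System.Data;
using System.Data.OleDb;
using System.Data.SqlClient;

static void CopyTable(string dbfFolder, string tableName, string sqlConnStr)
{
    // dBase files via the Jet OLE DB provider (connection string is an assumption)
    string dbfConnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + dbfFolder +
                        ";Extended Properties=dBASE 5.0;";
    using (OleDbConnection src = new OleDbConnection(dbfConnStr))
    using (SqlConnection dest = new SqlConnection(sqlConnStr))
    {
        src.Open();
        dest.Open();
        OleDbCommand select = new OleDbCommand("SELECT * FROM " + tableName, src);
        using (OleDbDataReader reader = select.ExecuteReader())
        {
            // Buffer table built from the reader's schema
            DataTable buffer = new DataTable(tableName);
            for (int i = 0; i < reader.FieldCount; i++)
                buffer.Columns.Add(reader.GetName(i), reader.GetFieldType(i));

            object[] values = new object[reader.FieldCount];
            while (reader.Read())
            {
                reader.GetValues(values);
                buffer.Rows.Add(values);
                if (buffer.Rows.Count == 1000)      // flush every 1000 rows
                {
                    Flush(buffer, dest, tableName);
                    buffer.Clear();
                }
            }
            if (buffer.Rows.Count > 0)
                Flush(buffer, dest, tableName);
        }
    }
}

static void Flush(DataTable buffer, SqlConnection dest, string tableName)
{
    // Assumes source and destination column names match.
    // SqlDataAdapter + SqlCommandBuilder end up sending one INSERT per row.
    using (SqlTransaction tx = dest.BeginTransaction())
    {
        SqlDataAdapter adapter = new SqlDataAdapter(
            "SELECT * FROM [" + tableName + "] WHERE 1 = 0", dest);
        adapter.SelectCommand.Transaction = tx;
        SqlCommandBuilder builder = new SqlCommandBuilder(adapter);
        adapter.Update(buffer);
        tx.Commit();
    }
}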
On the other hand, if I use Import and Export Wizard, the same task
(dBase -> SQL) takes less than 5 seconds.
Out of curiosity, I used SQL Profiler to trace the Import and Export
Wizard to see what it's doing. The last statement in the log is as
follows:
insert bulk [DatabaseName].[dbo].[TableName]
(
"Field_1" float,
"Field_2" float,
"Field_3" nvarchar(100) COLLATE SQL_Latin1_General_CP1_CI_AS,
..
(other table fields here)
..
)
with
(
check_constraints
)
As you can see, there is no FROM clause. Also, it's "insert bulk", not the
widely known "bulk insert". (?)
My goal is to achieve the same (or similar speed) with ADO.NET, if
possible. But how?
Also, DTS is not an option, because the tables I'm dealing with change
from batch to batch (different table names, not schema).
Thanks!

That performance is not achievable in ADO.NET. The problem is that the 1000
rows you are submitting are still being inserted one row at a time, so that's
1000 individual calls to the database.
The only way to increase that is to use DTS (or you could query the dBase
files, store the results as a text file, and use BCP, BULK INSERT, etc.). If
you need this to happen in .NET code, you can create the DTS package and then
call it from your .NET application via COM Interop.
HTH,
John Scragg
"Usenet User" wrote:

> I am programmatically importing lots of rows from dBase 5 into SQL and
> want to get an idea what the optimal approach would be for this
> scenario.
> Currently I'm doing the following:
> 1) Using OleDbConnection, OleDbCommand and OleDbDataReader to read the
> rows from dBase tables;
> 2) Putting the extracted data into a DataSet;
> 3) Once the number of rows in the DataSet reaches, say, 1000, I'm
> using SqlDataAdapter to submit the data to the database. This step
> creates a transaction, issues an INSERT statement for each row in the
> DataSet, then commits the transaction.
> 4) Go to Step 1 and repeat until all data is processes.
> The above is inefficient and slow even if the both database
> connections are kept open during the whole batch process. For example,
> 50,000 rows get processed in around 30 seconds.
> On the other hand, if I use Import and Export Wizard, the same task
> (dBase -> SQL) takes less than 5 seconds.
> Out of curiosity, I used SQL Profiler to trace the Import and Export
> Wizard to see what it's doing. The last statement in the log is as
> follows:
> insert bulk [DatabaseName].[dbo].[TableName]
> (
> "Field_1" float,
> "Field_2" float,
> "Field_3" nvarchar(100) COLLATE SQL_Latin1_General_CP1_CI_AS,
> ...
> (other table fields here)
> ...
> )
> with
> (
> check_constraints
> )
> As you can see there is no FROM clause. Also, it's "insert bulk", not
> widely-known "bulk insert". (?)
> My goal is to achieve the same (or similar speed) with ADO.NET, if
> possible. But how?
> Also, DTS is not an option, because the tables I'm dealing with change
> from batch to batch (different table names, not schema)
> Thanks!

John,
Thanks for responding.
Calling a saved DTS package from within .NET is not an option because, as I
mentioned, the process must be dynamic: the table names differ for every
batch. I suppose I could re-create the inner guts of the package in my code
using the DTS object model, but then the program would have to run on the SQL
Server machine, otherwise that object model won't be available. In addition,
this would mean stepping away from ADO.NET completely.
Instead, yesterday I tried the same thing you mentioned: export the data into
a temp file with OleDbDataReader and then use SqlCommand to import the file
with BULK INSERT. Worked like a charm: 3.5 seconds to export and 3.5 seconds
to import, 7 seconds total.
:)
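For anyone interested, the shape of it is roughly this (a simplified sketch, not the exact code: connection strings, table name, and temp file path are placeholders, the file path has to be readable by the SQL Server itself, and fields containing tabs or line breaks would need extra handling):

using System.Data.OleDb;
using System.Data.SqlClient;
using System.IO;
using System.Text;

static void ExportAndBulkInsert(string dbfConnStr, string tableName,
                                string sqlConnStr, string tempFile)
{
    // Step 1: dump the dBase rows into a tab-delimited text file
    using (OleDbConnection src = new OleDbConnection(dbfConnStr))
    using (StreamWriter writer = new StreamWriter(tempFile, false, Encoding.Default))
    {
        src.Open();
        OleDbCommand select = new OleDbCommand("SELECT * FROM " + tableName, src);
        using (OleDbDataReader reader = select.ExecuteReader())
        {
            StringBuilder line = new StringBuilder();
            while (reader.Read())
            {
                line.Length = 0;
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    if (i > 0) line.Append('\t');
                    if (!reader.IsDBNull(i)) line.Append(reader.GetValue(i));
                }
                writer.WriteLine(line.ToString());
            }
        }
    }

    // Step 2: load the whole file in one shot with BULK INSERT
    using (SqlConnection dest = new SqlConnection(sqlConnStr))
    {
        dest.Open();
        string sql = "BULK INSERT [dbo].[" + tableName + "] FROM '" + tempFile + "' " +
                     "WITH (FIELDTERMINATOR = '\\t', ROWTERMINATOR = '\\n')";
        SqlCommand bulk = new SqlCommand(sql, dest);
        bulk.CommandTimeout = 0;   // large loads can exceed the default timeout
        bulk.ExecuteNonQuery();
    }
}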