Showing posts with label ssis. Show all posts
Showing posts with label ssis. Show all posts

Monday, March 26, 2012

Looping through several Excel data sources in SSIS

I am attempting to use the foreach loop structure in an SSIS package to
loop through however many Excel files are placed in a directory and
then perform an import operation into a SQL table on each of these
files sequentially. The closest model for this that I was able to find
in the MS tutorial used a flat file source rather than Excel. That
involved adding a new expression to the Connection Manager that set the
connection string to the current filename, as provided by the foreach
component. That works just fine, but when I attempt to apply the same
method to an Excel source, rather than a flat file source, I cannot get
it to work. I see the following error associated with the Excel source
on the Data Flow page: "Validation error. Data Flow Task: Excel Source
[1]: The AcquireConnection method call to the connection manager "Excel
Connection Manager 1" failed with error code 0xC020200." I think that
it's just a matter of getting the right expression, and I thought that
perhaps I should be constructing an expression for ExcelFilePath rather
than the Connection String, but I have fiddled with it for hours and
haven't come up with something that will be accepted. Has anybody out
there been able to do this, or can perhaps refer me to some
documentation that contains an example of what I am trying to do?
Thanks for any help you can give.David,

The April 2006 update of SQL Server 2005 Books Online contains a
new topic titled "How to: Loop through Excel Files and Tables", at
ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/extran9/html/a5393c1a-cc37-491a-a260-7aad84dbff68.htm
or online at
http://msdn2.microsoft.com/en-us/library/ms345182.aspx

Why don't you see if that helps, and if not, let us know exactly
what point in that article something goes wrong.

The download page for BOL, for local installation, is
http://www.microsoft.com/downloads/...&DisplayLang=en

Steve Kass
Drew University

davidz wrote:
> I am attempting to use the foreach loop structure in an SSIS package to
> loop through however many Excel files are placed in a directory and
> then perform an import operation into a SQL table on each of these
> files sequentially. The closest model for this that I was able to find
> in the MS tutorial used a flat file source rather than Excel. That
> involved adding a new expression to the Connection Manager that set the
> connection string to the current filename, as provided by the foreach
> component. That works just fine, but when I attempt to apply the same
> method to an Excel source, rather than a flat file source, I cannot get
> it to work. I see the following error associated with the Excel source
> on the Data Flow page: "Validation error. Data Flow Task: Excel Source
> [1]: The AcquireConnection method call to the connection manager "Excel
> Connection Manager 1" failed with error code 0xC020200." I think that
> it's just a matter of getting the right expression, and I thought that
> perhaps I should be constructing an expression for ExcelFilePath rather
> than the Connection String, but I have fiddled with it for hours and
> haven't come up with something that will be accepted. Has anybody out
> there been able to do this, or can perhaps refer me to some
> documentation that contains an example of what I am trying to do?
> Thanks for any help you can give.

Looping through an excel spreadsheet

Being new to SSIS I wish to loop through a series of excel spreadsheets and within each workbook loop through each sheet. I am aware of the For Each container but how can the each sheet in the workbook be referenced?

Steve

Use the ForEach file enumerator.

-Jamie

|||

Thanks Jamie, my problem is for each spreadsheet I loop through how do I reference each sheet / tab within the spreadsheets returned by the For EachLoop.

Thanks

Steve

|||

Hi Steve,

Both of our problem is the same. I couldn't loop through Each Sheet in a Excel File. I tried with ForEach Loop File Enumerator, but I couldn't acheive it.

Jamie,

Need your help.

Thanks & Regards,

Prakash Srinivasan

|||

Hi Guys,

Any updates or any ideas?

Urgent Please.

Thanks & Regards,

Prakash Srinivasan

|||

You can use a Foreach Loop and the Foreach ADO.NET Schema Rowset enumerator, return a TABLES rowset, and loop through each table. Note that both worksheets (with the $ suffix) and named ranges are TABLES in Excel. The SP1 refresh of BOL will include a new topic that discusses this and another aspect or two of working with Excel files.

-Doug

|||

Hi Doug,

As you mentioned, I tried with ForEach Loop ADO.NET Schema Rowset Enumerator, but I am not able to provide the Connection for Excel Files. I tried with both Microsoft Jet 4.0 OLEDB Provider as well as ODBC for Excel, but it is giving me an error.

So if you explain this in detail it will be very much helpful to me.

Expecting your reply ASAP.

Thanks & Regards,

Prakash Srinivasan.

|||

try this:

set the delay validation to TRUE in your package properties, this may fix the error you are getting from the foreach loop going thru your sheets

|||

Hi,

I tried this setup (delay validation as true) very long back. Now my concern is like how do we create the connection for Excel when you are trying with Foreach ADO.NET Schema Rowset Enumerator.

It is not supporting for Excel Files. Please advice.

Thanks for your help.

Prakash Srinivasan

|||

Hi All,

Setting delayvalidation to true does not seem to help, I have a foreach loop for all the excel files, then a forech loop for the sheet names, how do I assign the variables to the for the filename and the sheet name, i am current generating an SQL qury varaible for the sheet ie select * from [sheetname$] , however the excel data source refuses to work, has anybody got a worked example or simple explanation.

Many thanks

Steve

|||

You need to use an ADO.NET Connection Manager, the Jet Provider, and on the All page of the editor, enter "Excel 8.0" as the value of the Extended Properties argument.

-Doug

|||

Steve,

When looping through tables, I assume that you will want to use "Table name from variable" in the Excel Source,

When looping through Excel files, you will need to concatenate the filename into the connection string by using an expression. There is a sample in the following thread:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=103273&SiteID=1

As for the validation issue, you can either put a valid file path in the ConnectionString property of the connection manager to avoid a validation error (if you've set an expression, this value will never be used), or set DelayValidation as you've done.

-Doug

|||

Hi,

I tried with "Table Name from Variable" option in Excel Source also. But I am not at all able to close that dialog box as it is giving an error message.

Also I tried giving DelayValidation as True only for DataFlow Task. Still it doesn't work. So if you can send me the process in detail, that will be really helpful to me to get this resolved.

Thanks in advance.

Regards,

Prakash Srinivasan.

|||

Yes,

I would appreciate it spelt out as I am finding this thoroughly confusing. Will keep persevering though.

Steve

|||

Here is the draft of a revised BOL topic, copied into plain text because the HTML can't be copied neatly.

How to: Loop through Excel Files and Tables

Introduction
The procedures in this topic describe how to loop through the Excel workbooks in a folder, or through the tables in an Excel workbook, by using the Foreach Loop container with the appropriate enumerator.

Procedures

To loop through Excel files by using the Foreach File enumerator
1. Create a string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile, with no initial value.)
2. Optionally, create another string variable that will hold the value for the Extended Properties argument of the Excel connection string. This argument contains a series of values that specify the Excel version and determine whether the first row contains column names, and whether import mode is used. (The sample expression shown later in this procedure uses the variable name ExtProperties, with an initial value of Excel 8.0;HDR=Yes.)
3. Add a Foreach Loop container to the Control Flow tab and configure it as described in How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach File enumerator, specify the folder in which the Excel workbooks are located, and specify the file filter (normally *.xls).
5. On the Variable Mapping page, map Index 0 to a user-defined string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile.)
6. Close the Foreach Loop Editor.
7. Add an Excel connection manager to the package.
Note To avoid validation errors later as you configure tasks and data flow components to use this connection manager, assign a default Excel workbook in the Excel Connection Manager Editor. After creating and configuring the package, you can delete this value in the Properties window. However, after you delete this value, a validation error may occur because the connection string property of the Excel connection manager is no longer valid until the Foreach Loop runs. In this case, set the DelayValidation property to True on the connection manager, on the tasks in which it is used, or on the package.
8. Select the new Excel connection manager, click the Expressions property in the Properties window, and then click the ellipsis.
9. In the Property Expressions Editor, select the ConnectionString property, and click the ellipsis.
10. In the Expression Builder, enter the following expression:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFile] + ";Extended Properties=\"" + @.[User::ExtProperties] + "\""
Note the use of the escape character "\" to escape the inner quotes required around the value of the Extended Properties argument.
11. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel workbook that matches the specified file location and pattern.

To loop through Excel tables by using the Foreach ADO.NET Schema Rowset enumerator

1. Create an ADO.NET connection manager that uses the Microsoft Jet OLE DB Provider to connect to an Excel workbook. On the All page of the Connection Manager dialog box, make sure that you enter Excel 8.0 as the value of the Extended Properties property.
2. Create a string variable that will receive the name of the current table on each iteration of the loop.
3. Add a Foreach Loop container to the Control Flow tab. For information on configuring the ForeachLoop, see How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach ADO.NET Schema Rowset enumerator.
5. As the value of Connection, select the ADO.NET connection manager that you created previously.
6. As the value of Schema, select Tables.
Note The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to write custom code in a Script task for this purpose. For more information, see Working with Excel Files with the Script Taskb8fa110a-2c9c-4f5a-8fe1-305555640e44.
7. On the Variable Mappings page, map Index 2 to the string variable created earlier to hold the name of the current table.
8. Close the Foreach Loop Editor.
9. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel table in the specified workbook.

Looping through an excel spreadsheet

Being new to SSIS I wish to loop through a series of excel spreadsheets and within each workbook loop through each sheet. I am aware of the For Each container but how can the each sheet in the workbook be referenced?

Steve

Use the ForEach file enumerator.

-Jamie

|||

Thanks Jamie, my problem is for each spreadsheet I loop through how do I reference each sheet / tab within the spreadsheets returned by the For EachLoop.

Thanks

Steve

|||

Hi Steve,

Both of our problem is the same. I couldn't loop through Each Sheet in a Excel File. I tried with ForEach Loop File Enumerator, but I couldn't acheive it.

Jamie,

Need your help.

Thanks & Regards,

Prakash Srinivasan

|||

Hi Guys,

Any updates or any ideas?

Urgent Please.

Thanks & Regards,

Prakash Srinivasan

|||

You can use a Foreach Loop and the Foreach ADO.NET Schema Rowset enumerator, return a TABLES rowset, and loop through each table. Note that both worksheets (with the $ suffix) and named ranges are TABLES in Excel. The SP1 refresh of BOL will include a new topic that discusses this and another aspect or two of working with Excel files.

-Doug

|||

Hi Doug,

As you mentioned, I tried with ForEach Loop ADO.NET Schema Rowset Enumerator, but I am not able to provide the Connection for Excel Files. I tried with both Microsoft Jet 4.0 OLEDB Provider as well as ODBC for Excel, but it is giving me an error.

So if you explain this in detail it will be very much helpful to me.

Expecting your reply ASAP.

Thanks & Regards,

Prakash Srinivasan.

|||

try this:

set the delay validation to TRUE in your package properties, this may fix the error you are getting from the foreach loop going thru your sheets

|||

Hi,

I tried this setup (delay validation as true) very long back. Now my concern is like how do we create the connection for Excel when you are trying with Foreach ADO.NET Schema Rowset Enumerator.

It is not supporting for Excel Files. Please advice.

Thanks for your help.

Prakash Srinivasan

|||

Hi All,

Setting delayvalidation to true does not seem to help, I have a foreach loop for all the excel files, then a forech loop for the sheet names, how do I assign the variables to the for the filename and the sheet name, i am current generating an SQL qury varaible for the sheet ie select * from [sheetname$] , however the excel data source refuses to work, has anybody got a worked example or simple explanation.

Many thanks

Steve

|||

You need to use an ADO.NET Connection Manager, the Jet Provider, and on the All page of the editor, enter "Excel 8.0" as the value of the Extended Properties argument.

-Doug

|||

Steve,

When looping through tables, I assume that you will want to use "Table name from variable" in the Excel Source,

When looping through Excel files, you will need to concatenate the filename into the connection string by using an expression. There is a sample in the following thread:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=103273&SiteID=1

As for the validation issue, you can either put a valid file path in the ConnectionString property of the connection manager to avoid a validation error (if you've set an expression, this value will never be used), or set DelayValidation as you've done.

-Doug

|||

Hi,

I tried with "Table Name from Variable" option in Excel Source also. But I am not at all able to close that dialog box as it is giving an error message.

Also I tried giving DelayValidation as True only for DataFlow Task. Still it doesn't work. So if you can send me the process in detail, that will be really helpful to me to get this resolved.

Thanks in advance.

Regards,

Prakash Srinivasan.

|||

Yes,

I would appreciate it spelt out as I am finding this thoroughly confusing. Will keep persevering though.

Steve

|||

Here is the draft of a revised BOL topic, copied into plain text because the HTML can't be copied neatly.

How to: Loop through Excel Files and Tables

Introduction
The procedures in this topic describe how to loop through the Excel workbooks in a folder, or through the tables in an Excel workbook, by using the Foreach Loop container with the appropriate enumerator.

Procedures

To loop through Excel files by using the Foreach File enumerator
1. Create a string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile, with no initial value.)
2. Optionally, create another string variable that will hold the value for the Extended Properties argument of the Excel connection string. This argument contains a series of values that specify the Excel version and determine whether the first row contains column names, and whether import mode is used. (The sample expression shown later in this procedure uses the variable name ExtProperties, with an initial value of Excel 8.0;HDR=Yes.)
3. Add a Foreach Loop container to the Control Flow tab and configure it as described in How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach File enumerator, specify the folder in which the Excel workbooks are located, and specify the file filter (normally *.xls).
5. On the Variable Mapping page, map Index 0 to a user-defined string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile.)
6. Close the Foreach Loop Editor.
7. Add an Excel connection manager to the package.
Note To avoid validation errors later as you configure tasks and data flow components to use this connection manager, assign a default Excel workbook in the Excel Connection Manager Editor. After creating and configuring the package, you can delete this value in the Properties window. However, after you delete this value, a validation error may occur because the connection string property of the Excel connection manager is no longer valid until the Foreach Loop runs. In this case, set the DelayValidation property to True on the connection manager, on the tasks in which it is used, or on the package.
8. Select the new Excel connection manager, click the Expressions property in the Properties window, and then click the ellipsis.
9. In the Property Expressions Editor, select the ConnectionString property, and click the ellipsis.
10. In the Expression Builder, enter the following expression:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFile] + ";Extended Properties=\"" + @.[User::ExtProperties] + "\""
Note the use of the escape character "\" to escape the inner quotes required around the value of the Extended Properties argument.
11. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel workbook that matches the specified file location and pattern.

To loop through Excel tables by using the Foreach ADO.NET Schema Rowset enumerator

1. Create an ADO.NET connection manager that uses the Microsoft Jet OLE DB Provider to connect to an Excel workbook. On the All page of the Connection Manager dialog box, make sure that you enter Excel 8.0 as the value of the Extended Properties property.
2. Create a string variable that will receive the name of the current table on each iteration of the loop.
3. Add a Foreach Loop container to the Control Flow tab. For information on configuring the ForeachLoop, see How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach ADO.NET Schema Rowset enumerator.
5. As the value of Connection, select the ADO.NET connection manager that you created previously.
6. As the value of Schema, select Tables.
Note The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to write custom code in a Script task for this purpose. For more information, see Working with Excel Files with the Script Taskb8fa110a-2c9c-4f5a-8fe1-305555640e44.
7. On the Variable Mappings page, map Index 2 to the string variable created earlier to hold the name of the current table.
8. Close the Foreach Loop Editor.
9. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel table in the specified workbook.

Looping through an excel spreadsheet

Being new to SSIS I wish to loop through a series of excel spreadsheets and within each workbook loop through each sheet. I am aware of the For Each container but how can the each sheet in the workbook be referenced?

Steve

Use the ForEach file enumerator.

-Jamie

|||

Thanks Jamie, my problem is for each spreadsheet I loop through how do I reference each sheet / tab within the spreadsheets returned by the For EachLoop.

Thanks

Steve

|||

Hi Steve,

Both of our problem is the same. I couldn't loop through Each Sheet in a Excel File. I tried with ForEach Loop File Enumerator, but I couldn't acheive it.

Jamie,

Need your help.

Thanks & Regards,

Prakash Srinivasan

|||

Hi Guys,

Any updates or any ideas?

Urgent Please.

Thanks & Regards,

Prakash Srinivasan

|||

You can use a Foreach Loop and the Foreach ADO.NET Schema Rowset enumerator, return a TABLES rowset, and loop through each table. Note that both worksheets (with the $ suffix) and named ranges are TABLES in Excel. The SP1 refresh of BOL will include a new topic that discusses this and another aspect or two of working with Excel files.

-Doug

|||

Hi Doug,

As you mentioned, I tried with ForEach Loop ADO.NET Schema Rowset Enumerator, but I am not able to provide the Connection for Excel Files. I tried with both Microsoft Jet 4.0 OLEDB Provider as well as ODBC for Excel, but it is giving me an error.

So if you explain this in detail it will be very much helpful to me.

Expecting your reply ASAP.

Thanks & Regards,

Prakash Srinivasan.

|||

try this:

set the delay validation to TRUE in your package properties, this may fix the error you are getting from the foreach loop going thru your sheets

|||

Hi,

I tried this setup (delay validation as true) very long back. Now my concern is like how do we create the connection for Excel when you are trying with Foreach ADO.NET Schema Rowset Enumerator.

It is not supporting for Excel Files. Please advice.

Thanks for your help.

Prakash Srinivasan

|||

Hi All,

Setting delayvalidation to true does not seem to help, I have a foreach loop for all the excel files, then a forech loop for the sheet names, how do I assign the variables to the for the filename and the sheet name, i am current generating an SQL qury varaible for the sheet ie select * from [sheetname$] , however the excel data source refuses to work, has anybody got a worked example or simple explanation.

Many thanks

Steve

|||

You need to use an ADO.NET Connection Manager, the Jet Provider, and on the All page of the editor, enter "Excel 8.0" as the value of the Extended Properties argument.

-Doug

|||

Steve,

When looping through tables, I assume that you will want to use "Table name from variable" in the Excel Source,

When looping through Excel files, you will need to concatenate the filename into the connection string by using an expression. There is a sample in the following thread:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=103273&SiteID=1

As for the validation issue, you can either put a valid file path in the ConnectionString property of the connection manager to avoid a validation error (if you've set an expression, this value will never be used), or set DelayValidation as you've done.

-Doug

|||

Hi,

I tried with "Table Name from Variable" option in Excel Source also. But I am not at all able to close that dialog box as it is giving an error message.

Also I tried giving DelayValidation as True only for DataFlow Task. Still it doesn't work. So if you can send me the process in detail, that will be really helpful to me to get this resolved.

Thanks in advance.

Regards,

Prakash Srinivasan.

|||

Yes,

I would appreciate it spelt out as I am finding this thoroughly confusing. Will keep persevering though.

Steve

|||

Here is the draft of a revised BOL topic, copied into plain text because the HTML can't be copied neatly.

How to: Loop through Excel Files and Tables

Introduction
The procedures in this topic describe how to loop through the Excel workbooks in a folder, or through the tables in an Excel workbook, by using the Foreach Loop container with the appropriate enumerator.

Procedures

To loop through Excel files by using the Foreach File enumerator
1. Create a string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile, with no initial value.)
2. Optionally, create another string variable that will hold the value for the Extended Properties argument of the Excel connection string. This argument contains a series of values that specify the Excel version and determine whether the first row contains column names, and whether import mode is used. (The sample expression shown later in this procedure uses the variable name ExtProperties, with an initial value of Excel 8.0;HDR=Yes.)
3. Add a Foreach Loop container to the Control Flow tab and configure it as described in How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach File enumerator, specify the folder in which the Excel workbooks are located, and specify the file filter (normally *.xls).
5. On the Variable Mapping page, map Index 0 to a user-defined string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile.)
6. Close the Foreach Loop Editor.
7. Add an Excel connection manager to the package.
Note To avoid validation errors later as you configure tasks and data flow components to use this connection manager, assign a default Excel workbook in the Excel Connection Manager Editor. After creating and configuring the package, you can delete this value in the Properties window. However, after you delete this value, a validation error may occur because the connection string property of the Excel connection manager is no longer valid until the Foreach Loop runs. In this case, set the DelayValidation property to True on the connection manager, on the tasks in which it is used, or on the package.
8. Select the new Excel connection manager, click the Expressions property in the Properties window, and then click the ellipsis.
9. In the Property Expressions Editor, select the ConnectionString property, and click the ellipsis.
10. In the Expression Builder, enter the following expression:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFile] + ";Extended Properties=\"" + @.[User::ExtProperties] + "\""
Note the use of the escape character "\" to escape the inner quotes required around the value of the Extended Properties argument.
11. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel workbook that matches the specified file location and pattern.

To loop through Excel tables by using the Foreach ADO.NET Schema Rowset enumerator

1. Create an ADO.NET connection manager that uses the Microsoft Jet OLE DB Provider to connect to an Excel workbook. On the All page of the Connection Manager dialog box, make sure that you enter Excel 8.0 as the value of the Extended Properties property.
2. Create a string variable that will receive the name of the current table on each iteration of the loop.
3. Add a Foreach Loop container to the Control Flow tab. For information on configuring the ForeachLoop, see How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach ADO.NET Schema Rowset enumerator.
5. As the value of Connection, select the ADO.NET connection manager that you created previously.
6. As the value of Schema, select Tables.
Note The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to write custom code in a Script task for this purpose. For more information, see Working with Excel Files with the Script Taskb8fa110a-2c9c-4f5a-8fe1-305555640e44.
7. On the Variable Mappings page, map Index 2 to the string variable created earlier to hold the name of the current table.
8. Close the Foreach Loop Editor.
9. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel table in the specified workbook.

Looping through an excel spreadsheet

Being new to SSIS I wish to loop through a series of excel spreadsheets and within each workbook loop through each sheet. I am aware of the For Each container but how can the each sheet in the workbook be referenced?

Steve

Use the ForEach file enumerator.

-Jamie

|||

Thanks Jamie, my problem is for each spreadsheet I loop through how do I reference each sheet / tab within the spreadsheets returned by the For EachLoop.

Thanks

Steve

|||

Hi Steve,

Both of our problem is the same. I couldn't loop through Each Sheet in a Excel File. I tried with ForEach Loop File Enumerator, but I couldn't acheive it.

Jamie,

Need your help.

Thanks & Regards,

Prakash Srinivasan

|||

Hi Guys,

Any updates or any ideas?

Urgent Please.

Thanks & Regards,

Prakash Srinivasan

|||

You can use a Foreach Loop and the Foreach ADO.NET Schema Rowset enumerator, return a TABLES rowset, and loop through each table. Note that both worksheets (with the $ suffix) and named ranges are TABLES in Excel. The SP1 refresh of BOL will include a new topic that discusses this and another aspect or two of working with Excel files.

-Doug

|||

Hi Doug,

As you mentioned, I tried with ForEach Loop ADO.NET Schema Rowset Enumerator, but I am not able to provide the Connection for Excel Files. I tried with both Microsoft Jet 4.0 OLEDB Provider as well as ODBC for Excel, but it is giving me an error.

So if you explain this in detail it will be very much helpful to me.

Expecting your reply ASAP.

Thanks & Regards,

Prakash Srinivasan.

|||

try this:

set the delay validation to TRUE in your package properties, this may fix the error you are getting from the foreach loop going thru your sheets

|||

Hi,

I tried this setup (delay validation as true) very long back. Now my concern is like how do we create the connection for Excel when you are trying with Foreach ADO.NET Schema Rowset Enumerator.

It is not supporting for Excel Files. Please advice.

Thanks for your help.

Prakash Srinivasan

|||

Hi All,

Setting delayvalidation to true does not seem to help, I have a foreach loop for all the excel files, then a forech loop for the sheet names, how do I assign the variables to the for the filename and the sheet name, i am current generating an SQL qury varaible for the sheet ie select * from [sheetname$] , however the excel data source refuses to work, has anybody got a worked example or simple explanation.

Many thanks

Steve

|||

You need to use an ADO.NET Connection Manager, the Jet Provider, and on the All page of the editor, enter "Excel 8.0" as the value of the Extended Properties argument.

-Doug

|||

Steve,

When looping through tables, I assume that you will want to use "Table name from variable" in the Excel Source,

When looping through Excel files, you will need to concatenate the filename into the connection string by using an expression. There is a sample in the following thread:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=103273&SiteID=1

As for the validation issue, you can either put a valid file path in the ConnectionString property of the connection manager to avoid a validation error (if you've set an expression, this value will never be used), or set DelayValidation as you've done.

-Doug

|||

Hi,

I tried with "Table Name from Variable" option in Excel Source also. But I am not at all able to close that dialog box as it is giving an error message.

Also I tried giving DelayValidation as True only for DataFlow Task. Still it doesn't work. So if you can send me the process in detail, that will be really helpful to me to get this resolved.

Thanks in advance.

Regards,

Prakash Srinivasan.

|||

Yes,

I would appreciate it spelt out as I am finding this thoroughly confusing. Will keep persevering though.

Steve

|||

Here is the draft of a revised BOL topic, copied into plain text because the HTML can't be copied neatly.

How to: Loop through Excel Files and Tables

Introduction
The procedures in this topic describe how to loop through the Excel workbooks in a folder, or through the tables in an Excel workbook, by using the Foreach Loop container with the appropriate enumerator.

Procedures

To loop through Excel files by using the Foreach File enumerator
1. Create a string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile, with no initial value.)
2. Optionally, create another string variable that will hold the value for the Extended Properties argument of the Excel connection string. This argument contains a series of values that specify the Excel version and determine whether the first row contains column names, and whether import mode is used. (The sample expression shown later in this procedure uses the variable name ExtProperties, with an initial value of Excel 8.0;HDR=Yes.)
3. Add a Foreach Loop container to the Control Flow tab and configure it as described in How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach File enumerator, specify the folder in which the Excel workbooks are located, and specify the file filter (normally *.xls).
5. On the Variable Mapping page, map Index 0 to a user-defined string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile.)
6. Close the Foreach Loop Editor.
7. Add an Excel connection manager to the package.
Note To avoid validation errors later as you configure tasks and data flow components to use this connection manager, assign a default Excel workbook in the Excel Connection Manager Editor. After creating and configuring the package, you can delete this value in the Properties window. However, after you delete this value, a validation error may occur because the connection string property of the Excel connection manager is no longer valid until the Foreach Loop runs. In this case, set the DelayValidation property to True on the connection manager, on the tasks in which it is used, or on the package.
8. Select the new Excel connection manager, click the Expressions property in the Properties window, and then click the ellipsis.
9. In the Property Expressions Editor, select the ConnectionString property, and click the ellipsis.
10. In the Expression Builder, enter the following expression:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFile] + ";Extended Properties=\"" + @.[User::ExtProperties] + "\""
Note the use of the escape character "\" to escape the inner quotes required around the value of the Extended Properties argument.
11. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel workbook that matches the specified file location and pattern.

To loop through Excel tables by using the Foreach ADO.NET Schema Rowset enumerator

1. Create an ADO.NET connection manager that uses the Microsoft Jet OLE DB Provider to connect to an Excel workbook. On the All page of the Connection Manager dialog box, make sure that you enter Excel 8.0 as the value of the Extended Properties property.
2. Create a string variable that will receive the name of the current table on each iteration of the loop.
3. Add a Foreach Loop container to the Control Flow tab. For information on configuring the ForeachLoop, see How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach ADO.NET Schema Rowset enumerator.
5. As the value of Connection, select the ADO.NET connection manager that you created previously.
6. As the value of Schema, select Tables.
Note The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to write custom code in a Script task for this purpose. For more information, see Working with Excel Files with the Script Taskb8fa110a-2c9c-4f5a-8fe1-305555640e44.
7. On the Variable Mappings page, map Index 2 to the string variable created earlier to hold the name of the current table.
8. Close the Foreach Loop Editor.
9. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel table in the specified workbook.

sql

Looping through an excel spreadsheet

Being new to SSIS I wish to loop through a series of excel spreadsheets and within each workbook loop through each sheet. I am aware of the For Each container but how can the each sheet in the workbook be referenced?

Steve

Use the ForEach file enumerator.

-Jamie

|||

Thanks Jamie, my problem is for each spreadsheet I loop through how do I reference each sheet / tab within the spreadsheets returned by the For EachLoop.

Thanks

Steve

|||

Hi Steve,

Both of our problem is the same. I couldn't loop through Each Sheet in a Excel File. I tried with ForEach Loop File Enumerator, but I couldn't acheive it.

Jamie,

Need your help.

Thanks & Regards,

Prakash Srinivasan

|||

Hi Guys,

Any updates or any ideas?

Urgent Please.

Thanks & Regards,

Prakash Srinivasan

|||

You can use a Foreach Loop and the Foreach ADO.NET Schema Rowset enumerator, return a TABLES rowset, and loop through each table. Note that both worksheets (with the $ suffix) and named ranges are TABLES in Excel. The SP1 refresh of BOL will include a new topic that discusses this and another aspect or two of working with Excel files.

-Doug

|||

Hi Doug,

As you mentioned, I tried with ForEach Loop ADO.NET Schema Rowset Enumerator, but I am not able to provide the Connection for Excel Files. I tried with both Microsoft Jet 4.0 OLEDB Provider as well as ODBC for Excel, but it is giving me an error.

So if you explain this in detail it will be very much helpful to me.

Expecting your reply ASAP.

Thanks & Regards,

Prakash Srinivasan.

|||

try this:

set the delay validation to TRUE in your package properties, this may fix the error you are getting from the foreach loop going thru your sheets

|||

Hi,

I tried this setup (delay validation as true) very long back. Now my concern is like how do we create the connection for Excel when you are trying with Foreach ADO.NET Schema Rowset Enumerator.

It is not supporting for Excel Files. Please advice.

Thanks for your help.

Prakash Srinivasan

|||

Hi All,

Setting delayvalidation to true does not seem to help, I have a foreach loop for all the excel files, then a forech loop for the sheet names, how do I assign the variables to the for the filename and the sheet name, i am current generating an SQL qury varaible for the sheet ie select * from [sheetname$] , however the excel data source refuses to work, has anybody got a worked example or simple explanation.

Many thanks

Steve

|||

You need to use an ADO.NET Connection Manager, the Jet Provider, and on the All page of the editor, enter "Excel 8.0" as the value of the Extended Properties argument.

-Doug

|||

Steve,

When looping through tables, I assume that you will want to use "Table name from variable" in the Excel Source,

When looping through Excel files, you will need to concatenate the filename into the connection string by using an expression. There is a sample in the following thread:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=103273&SiteID=1

As for the validation issue, you can either put a valid file path in the ConnectionString property of the connection manager to avoid a validation error (if you've set an expression, this value will never be used), or set DelayValidation as you've done.

-Doug

|||

Hi,

I tried with "Table Name from Variable" option in Excel Source also. But I am not at all able to close that dialog box as it is giving an error message.

Also I tried giving DelayValidation as True only for DataFlow Task. Still it doesn't work. So if you can send me the process in detail, that will be really helpful to me to get this resolved.

Thanks in advance.

Regards,

Prakash Srinivasan.

|||

Yes,

I would appreciate it spelt out as I am finding this thoroughly confusing. Will keep persevering though.

Steve

|||

Here is the draft of a revised BOL topic, copied into plain text because the HTML can't be copied neatly.

How to: Loop through Excel Files and Tables

Introduction
The procedures in this topic describe how to loop through the Excel workbooks in a folder, or through the tables in an Excel workbook, by using the Foreach Loop container with the appropriate enumerator.

Procedures

To loop through Excel files by using the Foreach File enumerator
1. Create a string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile, with no initial value.)
2. Optionally, create another string variable that will hold the value for the Extended Properties argument of the Excel connection string. This argument contains a series of values that specify the Excel version and determine whether the first row contains column names, and whether import mode is used. (The sample expression shown later in this procedure uses the variable name ExtProperties, with an initial value of Excel 8.0;HDR=Yes.)
3. Add a Foreach Loop container to the Control Flow tab and configure it as described in How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach File enumerator, specify the folder in which the Excel workbooks are located, and specify the file filter (normally *.xls).
5. On the Variable Mapping page, map Index 0 to a user-defined string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile.)
6. Close the Foreach Loop Editor.
7. Add an Excel connection manager to the package.
Note To avoid validation errors later as you configure tasks and data flow components to use this connection manager, assign a default Excel workbook in the Excel Connection Manager Editor. After creating and configuring the package, you can delete this value in the Properties window. However, after you delete this value, a validation error may occur because the connection string property of the Excel connection manager is no longer valid until the Foreach Loop runs. In this case, set the DelayValidation property to True on the connection manager, on the tasks in which it is used, or on the package.
8. Select the new Excel connection manager, click the Expressions property in the Properties window, and then click the ellipsis.
9. In the Property Expressions Editor, select the ConnectionString property, and click the ellipsis.
10. In the Expression Builder, enter the following expression:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFile] + ";Extended Properties=\"" + @.[User::ExtProperties] + "\""
Note the use of the escape character "\" to escape the inner quotes required around the value of the Extended Properties argument.
11. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel workbook that matches the specified file location and pattern.

To loop through Excel tables by using the Foreach ADO.NET Schema Rowset enumerator

1. Create an ADO.NET connection manager that uses the Microsoft Jet OLE DB Provider to connect to an Excel workbook. On the All page of the Connection Manager dialog box, make sure that you enter Excel 8.0 as the value of the Extended Properties property.
2. Create a string variable that will receive the name of the current table on each iteration of the loop.
3. Add a Foreach Loop container to the Control Flow tab. For information on configuring the ForeachLoop, see How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach ADO.NET Schema Rowset enumerator.
5. As the value of Connection, select the ADO.NET connection manager that you created previously.
6. As the value of Schema, select Tables.
Note The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to write custom code in a Script task for this purpose. For more information, see Working with Excel Files with the Script Taskb8fa110a-2c9c-4f5a-8fe1-305555640e44.
7. On the Variable Mappings page, map Index 2 to the string variable created earlier to hold the name of the current table.
8. Close the Foreach Loop Editor.
9. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel table in the specified workbook.

Looping through an excel spreadsheet

Being new to SSIS I wish to loop through a series of excel spreadsheets and within each workbook loop through each sheet. I am aware of the For Each container but how can the each sheet in the workbook be referenced?

Steve

Use the ForEach file enumerator.

-Jamie

|||

Thanks Jamie, my problem is for each spreadsheet I loop through how do I reference each sheet / tab within the spreadsheets returned by the For EachLoop.

Thanks

Steve

|||

Hi Steve,

Both of our problem is the same. I couldn't loop through Each Sheet in a Excel File. I tried with ForEach Loop File Enumerator, but I couldn't acheive it.

Jamie,

Need your help.

Thanks & Regards,

Prakash Srinivasan

|||

Hi Guys,

Any updates or any ideas?

Urgent Please.

Thanks & Regards,

Prakash Srinivasan

|||

You can use a Foreach Loop and the Foreach ADO.NET Schema Rowset enumerator, return a TABLES rowset, and loop through each table. Note that both worksheets (with the $ suffix) and named ranges are TABLES in Excel. The SP1 refresh of BOL will include a new topic that discusses this and another aspect or two of working with Excel files.

-Doug

|||

Hi Doug,

As you mentioned, I tried with ForEach Loop ADO.NET Schema Rowset Enumerator, but I am not able to provide the Connection for Excel Files. I tried with both Microsoft Jet 4.0 OLEDB Provider as well as ODBC for Excel, but it is giving me an error.

So if you explain this in detail it will be very much helpful to me.

Expecting your reply ASAP.

Thanks & Regards,

Prakash Srinivasan.

|||

try this:

set the delay validation to TRUE in your package properties, this may fix the error you are getting from the foreach loop going thru your sheets

|||

Hi,

I tried this setup (delay validation as true) very long back. Now my concern is like how do we create the connection for Excel when you are trying with Foreach ADO.NET Schema Rowset Enumerator.

It is not supporting for Excel Files. Please advice.

Thanks for your help.

Prakash Srinivasan

|||

Hi All,

Setting delayvalidation to true does not seem to help, I have a foreach loop for all the excel files, then a forech loop for the sheet names, how do I assign the variables to the for the filename and the sheet name, i am current generating an SQL qury varaible for the sheet ie select * from [sheetname$] , however the excel data source refuses to work, has anybody got a worked example or simple explanation.

Many thanks

Steve

|||

You need to use an ADO.NET Connection Manager, the Jet Provider, and on the All page of the editor, enter "Excel 8.0" as the value of the Extended Properties argument.

-Doug

|||

Steve,

When looping through tables, I assume that you will want to use "Table name from variable" in the Excel Source,

When looping through Excel files, you will need to concatenate the filename into the connection string by using an expression. There is a sample in the following thread:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=103273&SiteID=1

As for the validation issue, you can either put a valid file path in the ConnectionString property of the connection manager to avoid a validation error (if you've set an expression, this value will never be used), or set DelayValidation as you've done.

-Doug

|||

Hi,

I tried with "Table Name from Variable" option in Excel Source also. But I am not at all able to close that dialog box as it is giving an error message.

Also I tried giving DelayValidation as True only for DataFlow Task. Still it doesn't work. So if you can send me the process in detail, that will be really helpful to me to get this resolved.

Thanks in advance.

Regards,

Prakash Srinivasan.

|||

Yes,

I would appreciate it spelt out as I am finding this thoroughly confusing. Will keep persevering though.

Steve

|||

Here is the draft of a revised BOL topic, copied into plain text because the HTML can't be copied neatly.

How to: Loop through Excel Files and Tables

Introduction
The procedures in this topic describe how to loop through the Excel workbooks in a folder, or through the tables in an Excel workbook, by using the Foreach Loop container with the appropriate enumerator.

Procedures

To loop through Excel files by using the Foreach File enumerator
1. Create a string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile, with no initial value.)
2. Optionally, create another string variable that will hold the value for the Extended Properties argument of the Excel connection string. This argument contains a series of values that specify the Excel version and determine whether the first row contains column names, and whether import mode is used. (The sample expression shown later in this procedure uses the variable name ExtProperties, with an initial value of Excel 8.0;HDR=Yes.)
3. Add a Foreach Loop container to the Control Flow tab and configure it as described in How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach File enumerator, specify the folder in which the Excel workbooks are located, and specify the file filter (normally *.xls).
5. On the Variable Mapping page, map Index 0 to a user-defined string variable that will receive the current Excel path and filename on each iteration of the loop. (The sample expression shown later in this procedure uses the variable name ExcelFile.)
6. Close the Foreach Loop Editor.
7. Add an Excel connection manager to the package.
Note To avoid validation errors later as you configure tasks and data flow components to use this connection manager, assign a default Excel workbook in the Excel Connection Manager Editor. After creating and configuring the package, you can delete this value in the Properties window. However, after you delete this value, a validation error may occur because the connection string property of the Excel connection manager is no longer valid until the Foreach Loop runs. In this case, set the DelayValidation property to True on the connection manager, on the tasks in which it is used, or on the package.
8. Select the new Excel connection manager, click the Expressions property in the Properties window, and then click the ellipsis.
9. In the Property Expressions Editor, select the ConnectionString property, and click the ellipsis.
10. In the Expression Builder, enter the following expression:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFile] + ";Extended Properties=\"" + @.[User::ExtProperties] + "\""
Note the use of the escape character "\" to escape the inner quotes required around the value of the Extended Properties argument.
11. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel workbook that matches the specified file location and pattern.

To loop through Excel tables by using the Foreach ADO.NET Schema Rowset enumerator

1. Create an ADO.NET connection manager that uses the Microsoft Jet OLE DB Provider to connect to an Excel workbook. On the All page of the Connection Manager dialog box, make sure that you enter Excel 8.0 as the value of the Extended Properties property.
2. Create a string variable that will receive the name of the current table on each iteration of the loop.
3. Add a Foreach Loop container to the Control Flow tab. For information on configuring the ForeachLoop, see How to: Configure a Foreach Loop Container.
4. On the Collection page of the Foreach Loop Editor, select the Foreach ADO.NET Schema Rowset enumerator.
5. As the value of Connection, select the ADO.NET connection manager that you created previously.
6. As the value of Schema, select Tables.
Note The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to write custom code in a Script task for this purpose. For more information, see Working with Excel Files with the Script Taskb8fa110a-2c9c-4f5a-8fe1-305555640e44.
7. On the Variable Mappings page, map Index 2 to the string variable created earlier to hold the name of the current table.
8. Close the Foreach Loop Editor.
9. Create tasks within the Foreach Loop container that use the Excel connection manager to perform the same operations on each Excel table in the specified workbook.

Looping scripts

I have an application that manages the booking of rooms in several different buildings, and am trying to build an SSIS package to build a table containing records showing the total number of hours per day that each building is being used. Because two bookings in the same building can take place at the same time or overlap with each other, it's not a straightforward sum of the hours per booking.

I've written a VBA script that loops through booking records comparing their start and end times, building a total value for each day, and then writing the building id, date and total hours to a table. My problem is that I'm not sure how to do this

in SSIS. The script component operates on row by row basis so doesn't seem to allow looping through records, and the aggregate component doesn't allow scripts.

I'm still pretty new to SSIS so may be missing something obvious, but if anyone has help to offer it would be greatly appreciated.

Thanks,

Tim

You will need an Execute SQL Task, a few variables, and a ForEach loop.

You need to begin with the Exceute SQL Task to read your table for the appropriate records and write them to a ADO RecordSet. The task will need a variable to write the recordset to and so you will need to define a variable of data type "Object" to hold the record set. In the Execute SQL Task set the ResultSet property to "Full Result Set" as you will be retrieving multiple rows. In the ResultSet option you will keep the ResultName to 0 and in the variable you need to choose the name of the variable you would have created.

In the ForEach Loop you need to use the ForEach ADO Enumerator and set the SourceVariable to the variable that holds the recordset. If you are planning to use the values of the columns retrieved from the Execute SQL Task within the ForEach Loop then you would need to define variables to pass them into, and then define the variables in the Variables option for the ForEach Loop.

I hope this helps.

|||

A few altenative approaches:

One, you can use a script component in the data flow to accomplish this. You'd have to capture the rows in an internal data store (a DataTable or array would work), and then process them once you have all the data for a building.

Two, you could use an Execute SQL task as desibull suggests, to get a recordset. However, rather than processing it in a For Each (which is going to introduce the same single record at a time issue), process it in a Script Task.

Three, it seems (at least theoretically) possible to build a SELECT statement which would do this for you. In particular, CTEs in SQL 2005 could help with this type of problem. More detail on the table structure could help determine if that is possible.

Considering you already have the VBA code for processing the data, I'd probably go with option 2.

Friday, March 23, 2012

Looping over files not available in my ssis

I'm downloading zip'd files and would like to loop through each file that was downloaded. I'd also like to unizip each file and append all of them to one file. I have a dos batch that is fairly simple and would like to emulate it using ssis. Here is what the dos batch file looks like.

DATE /T >%TEMP%\D.txt

FOR /F "usebackq tokens=2,3,4 delims=/, " %%i IN ("%TEMP%\D.txt") DO SET fname=TAMF_162%%i%%j%%k-%%k.zip

ECHO xxx>zzzzz
ECHO xxxxx>>zzzzz
ECHO BINARY>>zzzzz
ECHO GET %fname%>>zzzzzz
ECHO QUIT>>zzzzz

FTP -s:zzzzzzz ftp.aaaaa.com

PKUNZIP -o -xxxxxxx downloadedfile_1~1.ZIP

DEL TAMF_1~1.ZIP
DEL zzzzzzzz

EXIT

I would just run my DOS batch file as an external process if the batch file does what you want it to do.|||

We could do that but then each time that this needs to be edited its two places instead of one. I'd rather convert it over to ssis.

Big problem with SSIS now. My looping container does not have the option to loop over files. Seems as if others are having the same problem http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=416437&SiteID=1&mode=1

|||Not sure why it would make for something that has to be edited in two places. You would only have the code that does the unzip in one bat file so if things needed to be edited you would only have to edit the bat file. Must be something I'm missing in what you are doing.|||Swells - I just want to convert the dos file to SSIS and schedule it in one package and I'd like to learn how to unzip and parse the files. As you can see from my other posts my ssis is somehow broken in that I do not have the option to loop through files. Seems like this may be an issue with the SP1.|||Regarding missing ForEach File enumerator - I've just had a chance to investigate a machine that has this problem. It turned out this is another symptom of the problem discussed in this KB:

http://support.microsoft.com/kb/913817|||

I'm coming up with FlashProp Class. How do I keep the dos window open. I'm not a console app programmer, sorry. And how do we know that this is the problem?

thanks

|||

Bogey1 wrote:

I'm coming up with FlashProp Class. How do I keep the dos window open. I'm not a console app programmer, sorry. And how do we know that this is the problem?

thanks

The application does not fix the registry settings, it just lists CLSIDs with incorrect permissions. Once they are identified, open RegEdit and find these CLSIDs under HKEY_LOCAL_MACHINE\Software\Classes\CLSID, and grant everyone read permissions to these keys. You may then close the application, and reboot. The problem should go away.

|||

Thats seems to not of worked for me. Is there a fix for this? I cannot have my developers run regedit and fix keys to get this working. Can MS provide a tool that will run and fix the keys if this is the problem? Anyone else running into this problem?

thanks Michael and keep me updated. We'd really like to use this on one of our projects here at the University.

|||The problem is caused by third-party installation (the cases I've seen were caused by Lexmark driver and Flash, there might be other applications incorrectly setting the permissions), so we don't know what keys these applications create and what are the appropriate permissions for them. So it would be risky to write an automated program to fix them. The program in KB article identifies the problematic keys, but you currently need to fix them in RegEdit.

Wednesday, March 21, 2012

loop through certain files

Using SSIS foreach loop, I am looping through files to get the filenames and pass them on as variable.
How is it possible to loop through the files but only pass the files that do not end with ..._Parameters.xml? or loop through only the ones which end with ..._Parameters.xml
I think this is to do with the scripting or expression in the foreachloop?
ThanksYou could use a script task to determine which path to take (i.e. is _Parameters.xml or isNot _Paremeters.xml). One way you could do this would be by using the "Fail" path if the task if it is _Parameters.xml, and otherwise use the sucess path. I'm not sure if this is the best way to go about this or not though... (You would need to make sure that this task would not cause the container to fail when it does)|||Can't you set the files property (available in the dialog) to "*_Parameters.xml" to get the ones that do match?|||

Good thinking.

Done that now.

Thanks

sql

Loop container to process all excel files

What I'm trying to achieve is a SSIS package that will pick up 1 or more excel files, process the data in them via the conditional splitter, pushing the good data into a table, and all other rows into an error table.
I'm having some issues using the ForEach container to process multiple excel spreadsheets into tables. The excel import into the tables is more or less working (imports data for good cases, but uses a null if the Excel Source when it gets an unexpected value - but that's a seperate problem).
I found something related to this when searching, but it related to CTPs (June and September) and trying to reuse the connection strings they built up there (using my own variable names, naturally) causes a 'Property Value failure':
--
The connection string format is not valid. It must consist of one or more components of the form X=Y, separated by semicolons. This error occurs when a connection string with zero components is set on database connection manager.
--
I attemtpted to use this:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::RankingFileFullPath] + ";Extended Properties=\"Excel 8.0;HDR=YES\";"
The excel importer works fine as a stand-alone component. Trying to use the process defined in 'Profession SQL Server Integration Services' pp140, I tried to use an expression to assign the variable value to the connection string. I get a validation error:
--
Error at Import TPNB Ranking Excel spreadsheets [Excel Source [1]]: The AcquireConnection method call to the connection manager "Excel Connection Manager" failed with error code 0xC0202009.
Error at Import TPNB Ranking Excel spreadsheets [DTS.Pipeline]: component "Excel Source" (1) failed validation and returned error code 0xC020801C.
Error at Import TPNB Ranking Excel spreadsheets [DTS.Pipeline]: One or more component failed validation.
Error at Import TPNB Ranking Excel spreadsheets: There were errors during task validation.
Error at Excel Importer [Connection manager "Excel Connection Manager"]: An OLE DB error has occurred. Error code: 0x80040E4D.
--
Any advice?
....
.... in addition ....
I attempted to change the package - I set the Data Flow validation to Delay Validation, and changed the expression to change from:
ConnectionString @.[User::RankingFileFullPath]
to
ExcelFilePath @.[User::RankingFileFullPath]
This allowed the package to start debugging, and gave more information in the failure:
--
SSIS package "Excel Importer.dtsx" starting.
SSIS breakpoint 'Break when the container receives the OnPreExecute event' at executable 'Excel Importer' has been hit
SSIS breakpoint 'Break when the container receives the OnPreExecute event' at executable 'Foreach Loop Container' has been hit
SSIS breakpoint 'Break when the container receives the OnQueryCancel event' at executable 'Excel Importer' has been hit
Information: 0x4004300A at Import TPNB Ranking Excel spreadsheets, DTS.Pipeline: Validation phase is beginning.
Warning: 0x802092A7 at Import TPNB Ranking Excel spreadsheets, ProductSalesRank Table [278]: Truncation may occur due to inserting data from data flow column "Rank" with a length of 1000 to database column "SalesRank" with a length of 50.
Error: 0xC0202009 at Excel Importer, Connection manager "Excel Connection Manager": An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available. Source: "Microsoft JET Database Engine" Hresult: 0x80004005 Description: "Unrecognized database format 'D:\Testing\TestRanking.xls'.".
Error: 0xC020801C at Import TPNB Ranking Excel spreadsheets, Excel Source [1]: The AcquireConnection method call to the connection manager "Excel Connection Manager" failed with error code 0xC0202009.
Error: 0xC0047017 at Import TPNB Ranking Excel spreadsheets, DTS.Pipeline: component "Excel Source" (1) failed validation and returned error code 0xC020801C.
Error: 0xC004700C at Import TPNB Ranking Excel spreadsheets, DTS.Pipeline: One or more component failed validation.
Error: 0xC0024107 at Import TPNB Ranking Excel spreadsheets: There were errors during task validation.
SSIS breakpoint 'Break when the container receives the OnQueryCancel event' at executable 'Excel Importer' has been hit
Warning: 0x80019002 at Foreach Loop Container: The Execution method succeeded, but the number of errors raised (5) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
SSIS breakpoint 'Break when the container receives the OnQueryCancel event' at executable 'Excel Importer' has been hit
SSIS breakpoint 'Break when the container receives the OnQueryCancel event' at executable 'Excel Importer' has been hit
SSIS breakpoint 'Break when the container receives the OnWarning event' at executable 'Excel Importer' has been hit
Warning: 0x80019002 at Excel Importer: The Execution method succeeded, but the number of errors raised (5) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
SSIS breakpoint 'Break when the container receives the OnPostExecute event' at executable 'Excel Importer' has been hit
SSIS package "Excel Importer.dtsx" finished: Failure.
The program '[2460] Excel Importer.dtsx: DTS' has exited with code 0 (0x0).--
Apparently, I added too much detail, and poeple are avoiding the question :-)
I've tested doing exactly the same thing, but using CSV's as the source (i.e. text files and not excel files). And it works fine. So there's nothing wrong with the environment, DB etc.
...aaarg|||

I do recall other people having similar problems with Excel but I don't know if there was a resolution.

Sorry...not much help I'm afraid.

-Jamie

|||Thanks for the response - at least I have had one, now
The only reference's I've found pointed me to (a) re-installing service pack 1 to ensure SQL and SSIS etc. are all good. I did that, but no go. The other reference seemed to say I need to write an activeX (or .Net script) component to handle the files. That's just got to be wrong - what's the point of having to continuously kludge different sections of SSIS together with .Net scripting... In that case it seems that we should rather do the whole thing in 'Net ... *sighs* ...
I guess I'm just a bit frustrated that something as simple as importing a few excel files into a SQL table doesn't work simply ... after all the forEach and loop controls was one of the NEW BIG THINGS that SSIS was bringing us.... oh well, it's getting later on a Friday, and it looks like it's going to be a long, frustrating weekend (... it might be better if I knew how to write the .Net )
Well, I hope you (and anyone else reading my ranting) has a good weekend.
|||

I was able to complete your scenario successfully following the instructions at: http://msdn2.microsoft.com/en-us/library/ms345182.aspx

Note that I paid particular attention to this paragraph:

To avoid validation errors later as you configure tasks and data flow components to use this connection manager, assign a default Excel workbook in the Excel Connection Manager Editor. After you create and configure the package, you can delete this value in the Properties window. However, after you delete this value, a validation error may occur because the connection string property of the Excel connection manager is no longer valid until the Foreach Loop runs. In this case, set the DelayValidation property to True on the connection manager, on the tasks in which the connection manager is used, or on the package.

If I did not have DelayValidation set True on the connection manager and have a valid default value in the connection manager I received the same errors as you. I also set DelayedValidation = True on the Data Flow Task for extra caution.

Donald

|||

Hi Donald,

Thanks for the link and the reponse ... I appreciate it.

I'll have a look - the article loks promising. I did already have all my 'delay validations' set to true, and I initially created the excel connection with a valid connection - I had it pointing at one of the file I want to impoty. When it is hard-coded to the excel file, my import runs fine... as soon as I delete the connection name, and replace it witht eh variable name @.[User::RankingFileFullPath], mr problems occur. I'll read through the article, and create a new package, just to ensure I don't inherit some issues I created in my attempts to appease the SSIS connections gods... :-)

thanks again

|||

I followed the article's guidelines, but am having an issue.

The sequence of actions:

Create New Project.

Create ForEach container.

Create Variables ExcelFileFullPath and ExcelConnectionProperties. Set the Container to map the ExcelFileFullPath.

Create Excel Connection, pointing to valid, existing excel file.

Alter Excel connection properties: Delay Validation = true. Expression: ConnectionString:

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFilefullPath] + ";Extended Properties=\"" + @.[User::ExcelConnectionProperties] + "\""

All good, so far.

Create a data flow control in the container. Add an excel source. Try connecting excel source to excel connection - error: Initial error was an Hresult type error (not captured). Trying to go further, I now get 'Invlaid Arguement' when trying to choose name of excel sheet - this happens on new excel source as well.

I'm going to start again, this time creating an excel import that works correctly. Once that's done, I'll try and move it into the container, and apply the properties as described in article, and see if that makes a difference (surely it shouldn't?)....

|||

Status:

I have created a Data Flow Task that opens a HardCoded (or direct) Excel Connection, does a data conversion, then a conditional split, and inserts rows into a OLE DB table connection. I have a ForEach container that correctly populates the excel file path into the ExcelFullFilePath variable. The Data flow task is OUTSIDE the container. All runs through successfully.

Next:

I intend to move the data flow task into the Container, but not change the excel connection, and confirm that it runs successfully. Assumming success, I will then apply the process from the article, specifically:

Set Delay Validation = True on the Package, Containter, Data Flow and Excel Connection.

Set the ConnectionString (via expressions) =

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFilefullPath] + ";Extended Properties=\"" + @.[User::ExcelConnectionProperties] + "\""

I'll respond on progress/results

-- so far:

Run with Data Flow in ForEach container - still runs through, USING Hard-coded connection

Change Delay Validation properties on all objects - still runs through, USING Hard-coded connection

Change ConnectionString property to use dynamic ("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFilefullPath] + ";Extended Properties=\"" + @.[User::ExcelConnectionProperties] + "\"") - SUCCESS.

Add additional .xls file to folder for processing - failure. BUT that was (due to a GOOD, DESCRIPTIVE ERROR MESSAGE) because the connection requires the same sheet name for the 2 excel files. I can live with that.

Update 2nd test file sheet name, re-run... SUCCESS. 2 Excel file successfully imported.

Many thanks for the response, especially yours, Donald.

I still don't understand why I had to go through the sequence I had to, i.e. create a fully working excel import, and then alter the properties, but it worked. Also, a useful note for those doing this in the future - make sure the sheets for the multiple files have the same sheet names.

HTH and thanks

|||

Thanks for these detailed notes. They will help us to improve this experience in the future.

Donald

|||Wow, thanks GethWho. I've been looking for this answer all day!

Everyone looking at this make sure the set DelayValidation to true on ALL of your components!

-Matt

Loop container to process all excel files

What I'm trying to achieve is a SSIS package that will pick up 1 or

more excel files, process the data in them via the conditional

splitter, pushing the good data into a table, and all other rows into

an error table.

I'm having some issues using the ForEach container to process multiple

excel spreadsheets into tables. The excel import into the tables is

more or less working (imports data for good cases, but uses a null if

the Excel Source when it gets an unexpected value - but that's a

seperate problem).

I found something related to this when searching, but it related to

CTPs (June and September) and trying to reuse the connection strings

they built up there (using my own variable names, naturally) causes a

'Property Value failure':

--

The connection string format is not valid. It must consist of one or

more components of the form X=Y, separated by semicolons. This error

occurs when a connection string with zero components is set on database

connection manager.

--

I attemtpted to use this:

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +

@.[User::RankingFileFullPath] + ";Extended Properties=\"Excel

8.0;HDR=YES\";"

The excel importer works fine as a stand-alone component. Trying to use

the process defined in 'Profession SQL Server Integration Services'

pp140, I tried to use an expression to assign the variable value to the

connection string. I get a validation error:

--

Error at Import TPNB Ranking Excel spreadsheets [Excel Source [1]]: The

AcquireConnection method call to the connection manager "Excel

Connection Manager" failed with error code 0xC0202009.

Error at Import TPNB Ranking Excel spreadsheets [DTS.Pipeline]:

component "Excel Source" (1) failed validation and returned error code

0xC020801C.

Error at Import TPNB Ranking Excel spreadsheets [DTS.Pipeline]: One or more component failed validation.

Error at Import TPNB Ranking Excel spreadsheets: There were errors during task validation.

Error at Excel Importer [Connection manager "Excel Connection Manager"]: An OLE DB error has occurred. Error code: 0x80040E4D.

--

Any advice?

....

.... in addition ....

I attempted to change the package - I set the Data Flow validation to

Delay Validation, and changed the expression to change from:

ConnectionString @.[User::RankingFileFullPath]

to

ExcelFilePath @.[User::RankingFileFullPath]

This allowed the package to start debugging, and gave more information in the failure:

--

SSIS package "Excel Importer.dtsx" starting.

SSIS breakpoint 'Break when the container receives the OnPreExecute event' at executable 'Excel Importer' has been hit

SSIS breakpoint 'Break when the container receives the OnPreExecute event' at executable 'Foreach Loop Container' has been hit

SSIS breakpoint 'Break when the container receives the OnQueryCancel event' at executable 'Excel Importer' has been hit

Information: 0x4004300A at Import TPNB Ranking Excel spreadsheets, DTS.Pipeline: Validation phase is beginning.

Warning: 0x802092A7 at Import TPNB Ranking Excel spreadsheets,

ProductSalesRank Table [278]: Truncation may occur due to inserting

data from data flow column "Rank" with a length of 1000 to database

column "SalesRank" with a length of 50.

Error: 0xC0202009 at Excel Importer, Connection manager "Excel

Connection Manager": An OLE DB error has occurred. Error code:

0x80004005.

An OLE DB record is available. Source: "Microsoft JET Database

Engine" Hresult: 0x80004005 Description: "Unrecognized

database format 'D:\Testing\TestRanking.xls'.".

Error: 0xC020801C at Import TPNB Ranking Excel spreadsheets, Excel

Source [1]: The AcquireConnection method call to the connection manager

"Excel Connection Manager" failed with error code 0xC0202009.

Error: 0xC0047017 at Import TPNB Ranking Excel spreadsheets,

DTS.Pipeline: component "Excel Source" (1) failed validation and

returned error code 0xC020801C.

Error: 0xC004700C at Import TPNB Ranking Excel spreadsheets, DTS.Pipeline: One or more component failed validation.

Error: 0xC0024107 at Import TPNB Ranking Excel spreadsheets: There were errors during task validation.

SSIS breakpoint 'Break when the container receives the OnQueryCancel event' at executable 'Excel Importer' has been hit

Warning: 0x80019002 at Foreach Loop Container: The Execution method

succeeded, but the number of errors raised (5) reached the maximum

allowed (1); resulting in failure. This occurs when the number of

errors reaches the number specified in MaximumErrorCount. Change the

MaximumErrorCount or fix the errors.

SSIS breakpoint 'Break when the container receives the OnQueryCancel event' at executable 'Excel Importer' has been hit

SSIS breakpoint 'Break when the container receives the OnQueryCancel event' at executable 'Excel Importer' has been hit

SSIS breakpoint 'Break when the container receives the OnWarning event' at executable 'Excel Importer' has been hit

Warning: 0x80019002 at Excel Importer: The Execution method succeeded,

but the number of errors raised (5) reached the maximum allowed (1);

resulting in failure. This occurs when the number of errors reaches the

number specified in MaximumErrorCount. Change the MaximumErrorCount or

fix the errors.

SSIS breakpoint 'Break when the container receives the OnPostExecute event' at executable 'Excel Importer' has been hit

SSIS package "Excel Importer.dtsx" finished: Failure.

The program '[2460] Excel Importer.dtsx: DTS' has exited with code 0

(0x0).--Apparently, I added too much detail, and poeple are avoiding the question :-)

I've tested doing exactly the same thing, but using CSV's as the source

(i.e. text files and not excel files). And it works fine. So there's

nothing wrong with the environment, DB etc.

...aaarg|||

I do recall other people having similar problems with Excel but I don't know if there was a resolution.

Sorry...not much help I'm afraid.

-Jamie

|||Thanks for the response - at least I have had one, now

The only reference's I've found pointed me to (a) re-installing service

pack 1 to ensure SQL and SSIS etc. are all good. I did that, but no go.

The other reference seemed to say I need to write an activeX (or .Net

script) component to handle the files. That's just got to be wrong -

what's the point of having to continuously kludge different sections of

SSIS together with .Net scripting... In that case it seems that we

should rather do the whole thing in 'Net ... *sighs* ...

I guess I'm just a bit frustrated that something as simple as importing

a few excel files into a SQL table doesn't work simply ... after all

the forEach and loop controls was one of the NEW BIG THINGS that SSIS

was bringing us.... oh well, it's getting later on a Friday, and it

looks like it's going to be a long, frustrating weekend (... it might

be better if I knew how to write the .Net )

Well, I hope you (and anyone else reading my ranting) has a good weekend.|||

I was able to complete your scenario successfully following the instructions at: http://msdn2.microsoft.com/en-us/library/ms345182.aspx

Note that I paid particular attention to this paragraph:

To avoid validation errors later as you configure tasks and data flow components to use this connection manager, assign a default Excel workbook in the Excel Connection Manager Editor. After you create and configure the package, you can delete this value in the Properties window. However, after you delete this value, a validation error may occur because the connection string property of the Excel connection manager is no longer valid until the Foreach Loop runs. In this case, set the DelayValidation property to True on the connection manager, on the tasks in which the connection manager is used, or on the package.

If I did not have DelayValidation set True on the connection manager and have a valid default value in the connection manager I received the same errors as you. I also set DelayedValidation = True on the Data Flow Task for extra caution.

Donald

|||

Hi Donald,

Thanks for the link and the reponse ... I appreciate it.

I'll have a look - the article loks promising. I did already have all my 'delay validations' set to true, and I initially created the excel connection with a valid connection - I had it pointing at one of the file I want to impoty. When it is hard-coded to the excel file, my import runs fine... as soon as I delete the connection name, and replace it witht eh variable name @.[User::RankingFileFullPath], mr problems occur. I'll read through the article, and create a new package, just to ensure I don't inherit some issues I created in my attempts to appease the SSIS connections gods... :-)

thanks again

|||

I followed the article's guidelines, but am having an issue.

The sequence of actions:

Create New Project.

Create ForEach container.

Create Variables ExcelFileFullPath and ExcelConnectionProperties. Set the Container to map the ExcelFileFullPath.

Create Excel Connection, pointing to valid, existing excel file.

Alter Excel connection properties: Delay Validation = true. Expression: ConnectionString:

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFilefullPath] + ";Extended Properties=\"" + @.[User::ExcelConnectionProperties] + "\""

All good, so far.

Create a data flow control in the container. Add an excel source. Try connecting excel source to excel connection - error: Initial error was an Hresult type error (not captured). Trying to go further, I now get 'Invlaid Arguement' when trying to choose name of excel sheet - this happens on new excel source as well.

I'm going to start again, this time creating an excel import that works correctly. Once that's done, I'll try and move it into the container, and apply the properties as described in article, and see if that makes a difference (surely it shouldn't?)....

|||

Status:

I have created a Data Flow Task that opens a HardCoded (or direct) Excel Connection, does a data conversion, then a conditional split, and inserts rows into a OLE DB table connection. I have a ForEach container that correctly populates the excel file path into the ExcelFullFilePath variable. The Data flow task is OUTSIDE the container. All runs through successfully.

Next:

I intend to move the data flow task into the Container, but not change the excel connection, and confirm that it runs successfully. Assumming success, I will then apply the process from the article, specifically:

Set Delay Validation = True on the Package, Containter, Data Flow and Excel Connection.

Set the ConnectionString (via expressions) =

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFilefullPath] + ";Extended Properties=\"" + @.[User::ExcelConnectionProperties] + "\""

I'll respond on progress/results

-- so far:

Run with Data Flow in ForEach container - still runs through, USING Hard-coded connection

Change Delay Validation properties on all objects - still runs through, USING Hard-coded connection

Change ConnectionString property to use dynamic ("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @.[User::ExcelFilefullPath] + ";Extended Properties=\"" + @.[User::ExcelConnectionProperties] + "\"") - SUCCESS.

Add additional .xls file to folder for processing - failure. BUT that was (due to a GOOD, DESCRIPTIVE ERROR MESSAGE) because the connection requires the same sheet name for the 2 excel files. I can live with that.

Update 2nd test file sheet name, re-run... SUCCESS. 2 Excel file successfully imported.

Many thanks for the response, especially yours, Donald.

I still don't understand why I had to go through the sequence I had to, i.e. create a fully working excel import, and then alter the properties, but it worked. Also, a useful note for those doing this in the future - make sure the sheets for the multiple files have the same sheet names.

HTH and thanks

|||

Thanks for these detailed notes. They will help us to improve this experience in the future.

Donald

|||Wow, thanks GethWho. I've been looking for this answer all day!

Everyone looking at this make sure the set DelayValidation to true on ALL of your components!

-Matt

Monday, March 19, 2012

Lookup transform with multiple matches

Please indulge my ignorance, as I have only been using SSIS for a couple of weeks.
I'm trying to create a data warehouse using two input tables.
A column needs to be added to one table by using a lookup into the second table.
SSIS seems to handle the "no matches" and "single match" cases perfectly.
I can't for the life of me figure out how to properly handle multiple matches.
SSIS defaults to the first match, but I need to compute the "best" match.

Many thanks in advance
Scott!

What's the criteria for the "best" match? Knowing the criteria would make it easier to help Smile|||Here are some specifics:
Table A contains repair records.
Table B contains inspection records.
Each piece of equipment can have multiple repairs and many more inspections.
A lookup can be performed based on the equipment number.

I need to determine which repair immediately proceeds each inspection.
This can be found by computing the smallest positive date difference.

Programmactically, I would normally do this with nested loops.
Using SSIS, I suppose it could also be done this way, using a foreach loop
and a condition...
|||

Interesting problem Smile

I haven't tested this, but could you join the rows in SQL on the equipment, calc the date difference and order by it, and use a rank function to get the first row?

Lookup Transform

I want to do something relatively simple with SSIS but can't find an easy way to do this (isint it always the case with SSIS )

I have a column lets say called iorg_id, and I want to lookup the matching rows for this col in a table.

In this table iorg_id may have several potential matching rows. In this table there is another col called 'Amount'.

I want to retrieve for each iorg_id the matching iorg_id in the other table but only the row with the largest value in the 'Amount' col.

I couldn't find a way to do this all in the Lookup Transform.

I can match the iorg_ids and retrieve the Amount column, but can't find a way just to retrieve the matching row with the largest value in the Amount col. The only way I can think to do this is then run the output from the Transform through an Aggregate function and determine the Max (although haven't tested this yet).

Seems strange to me in that the SQL in the Advanced tab gives me something like:

select * from
(select * from [dbo].[Table1]) as refTable
where [refTable].[iorg_id] = ?

where I believe the first 'select *' is retrieving all the cols that are listed in the LookupColumns list in the Columns tab.

I thought I would be able to amend this to something like:

select max(amount) from
(select * from [dbo].[Table1]) as refTable
where [refTable].[iorg_id] = ?

but I get a metadata type error.

So, questions are:

Is it possible to do this all in the Lookup Transform are do I have to use the Aggregate function as I think ?

Why is it not possible to amend the sql in the Advanced tab to manipulate the returned data ?

If I understand the problem correctly, couldn't you simply use (in the Lookup) some SQL similar to: -

SELECT iorg_id
, MAX(amounT)
FROM <YourTable>
GROUP BY iorg_id

A match on iorg_id should then give you what you want

|||

Yes, Paul you're right. Thanks.

My colleague had actually pointed that out to me in the interim.

Sometimes it's difficult to see to the wood for the trees !

Lookup task with NOLOCK

I have an SSIS package that unpivots data - each row from the source becomes 20-24 rows in the destination db. In the dataflow it looks up some type-data IDs before inserting into the destination table. The whole process flies through the data at an incredible rate, however I find that the final commit on data insertion takes a very long time, and two of the lookup tasks remain yellow throughout. It appears that there may be a contention issue because the final table has FKs to those lookup tables.

In T-SQL it is possible to do SELECT..... WITH (NOLOCK), but when I modified the SQL statement of the lookup task by adding this clause, it caused an error "Incorrect syntax near the keywork 'with'."

Is there any way around this problem? THNX!

PhilSky

You could split the data-flow into 2, using raw files to pass data between them. This would alleviate blocking problems (if indeed that is the issue).

Incidentally, I have a request for enhancements to the LOOKUP component that means this would be even easier. Feel free to vote for these enhancements here: http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=58d0100b-aa34-4191-ae03-7655543614a1 and leave a comment as well if you could.

-Jamie

|||I've tried that and it works fine what is your SQL statement you are trying to use. The other option is to change the transaction isolation level for the dataflow

Monday, March 12, 2012

Lookup on AS400

Hi everybody,

I'm new in SSIS and I'm trying to activate a lookup on a table reside on DB2/AS400. Iget the following message when I try to join the input column to the lookup column:

TITLE: Microsoft Visual Studio

The following columns cannot be mapped:

[STOREK_S, STOREK_S]

One or more columns do not have supported data types, or their data types do not match.


BUTTONS:

OK

I know both columns have the same format on the DB2 table.

What I'm doing wrong ?

P.S: I'm using the IBM DB2 UDB for iSeries IBMDASQL OLE DB Provider as connection.

If I try with the Microsoft OLE DB Provider for DB2, I was unable to get the list of my table on teh AS400.

Could you help me ?

Well, are you taking your dataflow and looking up a column against a DB2 source? In that case, the incoming data type may not match the data type in DB2. They could be different. You'll have to verify the DB2 data type and compare it against the incoming column.

Also, I've used both the IBM DB2 iSeries and Microsoft OLE DB for DB2 connectors and have them both working... In the MS provider, ensure that your package collection (library) is set to the same as the default schema. The catalog entry is for the database name. This helped me, and if that doesn't resolve your issue, please provide the error that you get when you try to select a table in the drop-down box.

I'd like to help you through both of your issues and get more DB2 documentation out on this forum!

Thanks,
Phil|||

Yes I'm using a Dataflow where I got 3 box on it: A DataSource Reader, a lookup and a OLE DB Destination.

My DataSource Reader used a ODBC connection (iSserie V5R4) comming with the IBM iSerie client. An SQL extract the data and it work fine if I connect it to the OLE DB Destination.

If I try to insert a lookup between the DataSource Reader and OLE DB destination, I got problem with the lookup.

The lookup use the Microsoft OLE DB provider for DB2(let focus on this connection first). I got correctly my list of table on the AS400. I got the correct list of the available Input Columns and available Lookup Columns. But when I try to specify the join columns I got an error.

The one mention in the initial request.

At the point of view of the ETL the column STOREK_S on the available Input Columns is in format [DT_NUMERIC], length: 0, scale: 0, Precision: 8, Source Component : DataSource Reader

and the available Lookup Columns, STOREK_S is in the format [DT_DECIMAL], length: 0, scale: 0, Precision: 8, Code page: (blank).

Like I also mention at the point of view of the DB2 AS400 the colomns STOREK_L got the same format.

What could be done to solve this problem ?

|||Right before the lookup, use a derived column transformation to cast the input column, STOREK_L to DT_DECIMAL from DT_NUMERIC.

Try this as your expression: (DT_DECIMAL,8)STOREK_L

I'm a little surprised to see precision of 8 with no scale, though, on your lookup column. Doesn't make sense.|||

I also sruprised by the precision of 8 with no scale, but it's the default mapping I got.

Could I change it in any way ?

Doing what you ask for I always get error in mapping but now the field derived is in the format of [DT_DECIMAL] with length: 0, scale:8 and precision : 0, that not map with the [DT_DECIMAL] with length: 0, scale:0 and precision : 8

|||

desilets wrote:

I also sruprised by the precision of 8 with no scale, but it's the default mapping I got.

Could I change it in any way ?

Doing what you ask for I always get error in mapping but now the field derived is in the format of [DT_DECIMAL] with length: 0, scale:8 and precision : 0, that not map with the [DT_DECIMAL] with length: 0, scale:0 and precision : 8

But do you get an error?|||Sorry, I forgot to write it. But I still get the problem when I try to join the columns for lookup|||The lookup field isn't by chance a packed decimal field, is it?|||Yes it is. But the lookup is always not working.|||

Phil,

Do you know if some special software was installl on the AS400 to allow the OLE to recognize the right structure of the table ? (somebody working with me, tell me something about a software, I verify this way).

|||

desilets wrote:

Yes it is. But the lookup is always not working.

I don't think packed decimals (and perhaps zoned decimals) are supported by the Microsoft OLEDB for DB2 driver. You can try a new iSeries driver, or you can try the "DB2 Connect" software also from IBM.

http://www-128.ibm.com/developerworks/forums/dw_thread.jsp?forum=292&thread=135790&cat=5