
Friday, March 30, 2012

Losing Temp Tables Between Tasks

I have a control flow setup with 5 tasks.

1. Create Temp Table (Execute SQL)
2. Load Temp Table (Execute SQL)
3. Data Flow Task using the temp table
4. Data Flow Task using the temp table
5. Drop Temp Table (Execute SQL)

Tasks 3 & 4 are currently set up to run in parallel - meaning they both branch directly off of task #2, and both lead to task #5. Also, my connection manager is set to retain its connection.

Here's the problem. When executing this flow, task #3 will sometimes fail and sometimes succeed - without me changing anything between runs (I simply end debugging and run it again). When it does fail, it is always because it cannot find the temp table. One thing to note: task #4 (running at the same time as #3 and using the same temp table) always finishes first because it does quite a bit less work. Last thing to note: if I set up task #4 to run after #3, it works every time, but of course the total package takes much longer to execute.

How is it that the task can find the temp table on some runs, but not on others? If this is just quirky behavior that I am going to have to live with for the time being, is there a way to force a task to re-run X number of times before giving up and moving on?

Thanks,
David Martin

Temp tables are only guaranteed to remain in existence as long as they are referenced. Is it possible that the connection is being dropped and reopened by SSIS for some reason between tasks?

Or that there is some point during the processing where SQL Server cannot detect that the temp table is still being referenced, so it drops it even though the connection is still open?

Seems like using temp tables is risky if you don't have full and explicit control over their existence. I want to use them also but not if I cannot guarantee they will be there when I need them!

|||

Make sure the "RetainSameConnection" property is set to "True" for the connection. This will allow the Execute SQL tasks to share the connection so the temp table can be accessed.
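On top of that, a low-tech safeguard is to have every task that depends on the temp table verify it exists before doing any work, so a recycled connection fails immediately with a clear message instead of partway through a data flow. A sketch (the table name #Work and the message text are placeholders):

```sql
-- Guard clause for any Execute SQL task or data flow source that
-- expects the temp table; assumes RetainSameConnection = True so
-- all tasks share one session.
IF OBJECT_ID('tempdb..#Work') IS NULL
    RAISERROR('#Work is missing - was the connection recycled?', 16, 1);
ELSE
    SELECT * FROM #Work;
```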

Frank


Losing config file between child packages and *$##@!- Passwords!

I have a parent package that uses Execute Package Task against three child packages. They are all set up to use the same configuration file, and the packages are all located in the same folder. The purpose of the configuration is to use the same connection strings for each child package.

I removed the path from the config. file specification to facilitate deployment. During debugging, the child packages cannot find the config file when they are run via the parent unless the full path is specified. If run standalone, a child package can find the config. file without the path. As I said, all dtsx files and the config file are in the same folder.

Also, the password-removal issue is killing me in development. Every time I touch the config. file, it removes the password from the connection string. I have "protection level" set to "don't save sensitive" - I'd appreciate help with this too.

thanks

Mark Challen wrote:

I have a parent package that uses Execute Package Task against three child packages. They are all set up to use the same configuration file, and the packages are all located in the same folder. The purpose of the configuration is to use the same connection strings for each child package.

I removed the path from the config. file specification to facilitate deployment. During debugging, the child packages cannot find the config file when they are run via the parent unless the full path is specified. If run standalone, a child package can find the config. file without the path. As I said, all dtsx files and the config file are in the same folder.

Also, the password-removal issue is killing me in development. Every time I touch the config. file, it removes the password from the connection string. I have "protection level" set to "don't save sensitive" - I'd appreciate help with this too.

thanks

When you have it set to "don't save sensitive" that's what's supposed to happen. It's not saving sensitive information.|||As far as your paths go, you have to use absolute paths. So, you might be better off using an environment variable or something to store the path to the config file.

Relative paths have been requested: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=253232|||

Phil Brammer wrote:


When you have it set to "don't save sensitive" that's what's supposed to happen. It's not saving sensitive information.

Do you happen to know which setting will allow saving the password, so I can work on the packages in development without having to reset it every time I touch the configuration? I can understand removing the password when building the solution for deployment, but having it auto-magically disappear while I'm still in the designer makes no sense.

|||http://msdn2.microsoft.com/en-us/library/ms141747.aspx|||

Mark Challen wrote:

I removed the path from the config. file specification to facilitate deployment. During debugging, the child packages cannot find the config file when they are run via the parent unless the full path is specified. If run standalone, a child package can find the config. file without the path. As I said, all dtsx filaes and the config file is in the same folder.

In my current project I am using that technique - removing the path of the config file from the package configuration - and it has worked fine both ways, whether called from the parent package or standalone....

The config file has to be in the same folder as the packages, though.

|||

Thanks for the link Phil. Doesn't answer the question, and I'm getting increasingly frustrated by the lack of control with SSIS. Not to get off on a rant, but I personally would rather be responsible for setting security, not having MS dictate that I encrypt everything.

Anyhow, I have found a workaround - once the config. file is established on the parent, I set its "Read-only" attribute to true in the file system. A low-tech approach, but it seems to work.

The issue of the requirement for an absolute path for the config. file is going to be a showstopper for me (no way am I setting an environment variable; I don't see the thinking behind that - how do I establish the variable in the first place?). I will probably execute my child packages independently from the command line from my own custom wrapper code, passing in my connection strings on the command line. Too bad, I wanted to use the Configuration feature and I spent a lot of time trying to make it work. This seems to be endemic to SSIS (like checkpointing) - a cool feature at first glance, but try to use it and you hit a brick wall. </Rant>

Rafael Salas wrote:

In my current project I am using that technique - removing the path of the config file from the package configuration - and it has worked fine both ways, whether called from the parent package or standalone....

The config file has to be in the same folder as the packages, though.

Are you sure the package is successfully locating the config file and overriding the connection strings, or is it defaulting back to your original connection manager settings in the child?

|||

Mark Challen wrote:

The issue of the requirement for an absolute path for the config. file is going to be a showstopper for me (no way am I setting an environment variable; I don't see the thinking behind that - how do I establish the variable in the first place?).

You set up the environment variable by way of the system properties in Windows. Once it's set up, you simply reference it, thereby making your configurations portable. This is a good way, in my opinion, to handle changes across environments that have different configurations (file paths, etc...)

For me, I use SQL Server based configurations and environment variables. I've pretty much got everything set up so that I never have to make changes to a package when promoting it from development into QA and finally into production. Even though file paths may change (commonly C: drive to D: drive, perhaps), the packages continue to work with little interaction. This helps given that we can't promote to production -- another group does that. (Thank you Sarbanes-Oxley!)|||

Mark Challen wrote:

Are you sure the package is successfully locating the config file and overriding the connection strings, or is it defaulting back to your original connection manager settings in the child?

I am 100% sure that it works; actually, it is very easy to see when it does not: each time a configuration does not take place, a warning is generated and can be seen in the progress tab of BIDS, or in the logging table/file when logging is enabled. I actually prefer using indirect configurations, but I have a client where environment variables are not an option....

Wednesday, March 21, 2012

Lookups - multiple hits

An underlying assumption of the lookup task seems to be that there will be a unique record matching the lookup criteria. However what happens when more than one record is matched? Does it return the designated field of the first record encountered, or does it error out? Is there a way to specify what SHOULD happen in such a case? Is there a way to specify secondary criteria if the primary returns multiple hits?

Related to this - suppose that my lookup returns 2 data points, but one of them is null. It seems that the lookup should appear successful, and rows can then be redirected based on which field was null via the Truncation setting.
For example, say I'm matching on "name" and returning "Id" and "description". The name is found, and there's a valid Id, but the description is null.
Would this result in a successful lookup with the "description" truncated, or would the lookup fail?

Thanks!
Phil

In full cache mode the lookup will report a warning and use one of the records. In partial or no cache mode it will use one of the records with no warning. I say "one of the records" because this is an implementation detail and is subject to change.

You can always specify more than one join column so that you will not get multiple hits. Obviously, this is not exactly what you are asking for, because it sounds like you only want the secondary match to occur if the primary is a dupe. That, the lookup does not do.

NULL does not mean truncation. In the scenario you describe, the lookup would be considered successful, with a description that happens to be NULL. Truncation is when you try to put a string of 100 characters into a column that holds fewer than 100 characters.
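The distinction is easy to demonstrate in plain T-SQL (a throwaway sketch; the table and column names are made up):

```sql
CREATE TABLE #Demo (description varchar(10));

-- NULL is a perfectly valid value, not a truncation:
INSERT INTO #Demo (description) VALUES (NULL);   -- succeeds

-- Truncation is a value wider than the column:
INSERT INTO #Demo (description)
VALUES (REPLICATE('x', 100));                    -- fails with a truncation error

DROP TABLE #Demo;
```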

Thanks,
Matt

Monday, March 19, 2012

Lookup task with NOLOCK

I have an SSIS package that unpivots data - each row from the source becomes 20-24 rows in the destination db. In the dataflow it looks up some type-data IDs before inserting into the destination table. The whole process flies through the data at an incredible rate, however I find that the final commit on data insertion takes a very long time, and two of the lookup tasks remain yellow throughout. It appears that there may be a contention issue because the final table has FKs to those lookup tables.

In T-SQL it is possible to do SELECT ... WITH (NOLOCK), but when I modified the SQL statement of the lookup task by adding this hint, it caused an error: "Incorrect syntax near the keyword 'with'."

Is there any way around this problem? THNX!

PhilSky

You could split the data-flow into 2, using raw files to pass data between them. This would alleviate blocking problems (if indeed that is the issue).

Incidentally, I have a request for enhancements to the LOOKUP component that means this would be even easier. Feel free to vote for these enhancements here: http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=58d0100b-aa34-4191-ae03-7655543614a1 and leave a comment as well if you could.

-Jamie

|||I've tried that and it works fine. What is the SQL statement you are trying to use? The other option is to change the transaction isolation level for the data flow.
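For reference, both options look something like the sketch below (dbo.TypeLookup and its columns are placeholder names). The hint form generally needs the lookup configured to use the results of an SQL query rather than a table or view:

```sql
-- Option 1: hint the lookup's reference query directly:
SELECT TypeId, TypeName
FROM dbo.TypeLookup WITH (NOLOCK);

-- Option 2: relax the isolation level for the session instead:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT TypeId, TypeName
FROM dbo.TypeLookup;
```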

lookup task running very slow when deal w/big tables

I am trying to convert a stored procedure that joins 8 tables with INNER and OUTER JOINs. My understanding is that the Lookup task is the one to use and that I should break these joins into smaller blocks, but it takes a long time to load when I do this, since each of these tables has 10-40 million rows and I have 8 tables to go through. Currently the stored procedure takes 3-4 minutes to run; after converting it to 8 Lookup tasks, it ran for 20 minutes. Has anyone run into this issue before and know a workaround?

Thanks

Are you aware of the different caching types of the Lookup transformation? By default, the lookup caches the whole data set from your lookup table/query. If you are not already using a query to limit the cached columns/rows to only the ones the lookup needs, that would be a good place to start. Also try, if possible, to re-arrange the tasks in your data flow so your lookup transforms are based on the tables with fewer rows.
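As a sketch of that advice (table and column names invented), the difference is simply how much the lookup has to cache:

```sql
-- Caches every column and every row of the reference table:
-- SELECT * FROM dbo.Customer

-- Caches only the join key, the returned column, and the rows
-- the lookup could ever actually match:
SELECT CustomerKey, CustomerId
FROM dbo.Customer
WHERE IsActive = 1;
```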

You can see more information about lookup tranformation here: http://msdn2.microsoft.com/en-us/library/ms141821.aspx

Lookup transforms may not be the best approach for all cases; if the performance you get from producing your dataset with a single query/SP is better than the lookup approach, why would you want to change it?

Thanks

Rafael Salas

|||

Yes, I am setting a 500MB cache size and selecting only 3 integer and 2 varchar(50) fields. I also tried to re-arrange the lookup tables, but since all of them are big, there is a limit to what I can do by re-arranging them. The reason I am doing this is to evaluate whether it pays to convert our current procedure; the main thing I am trying to test is performance, and the conversion only makes sense if the SSIS tasks can outperform the stored procedure during our data pull. I wish SSIS allowed input parameter mapping in the reference table tab - this would let me limit the number of lookup rows that get loaded into SSIS. I also found that the OLE DB Source task is very picky about what kind of SQL script it can run: if you have a subquery that contains a parameter, it won't take it, even though it runs perfectly well in T-SQL. This is too bad, since sometimes you get better performance by putting parameters into a subquery to limit the number of rows before you join to another huge table.

Anyway, I think I found a workaround: I have switched over to using the Merge Join task, and it seems to work better. But it still does not outperform the current stored procedure. I just hope someone out there has had a similar issue, so I can compare notes.

Thanks

|||

By using the Merge Join transformation, I think you may actually be walking away from any performance gain. By definition, Merge Join is an asynchronous transformation, so it uses more resources (memory) than the lookup transformation and is likely to have poorer performance; it also requires its inputs to be sorted. I think there are good reasons for not having a parameter-mapping ability in the reference table tab of the lookup: it would require executing a query for every row coming into the lookup input. If you want to test it, you can go to the advanced tab and edit the query to include and map your parameters; the performance will hit the floor.

BTW, 500MB of cache may not be enough for the volume of data in your lookups.

You can review an interesting paper about SSIS performance at:

http://www.microsoft.com/technet/prodtechnol/sql/2005/ssisperf.mspx

Sorry if my posts were not helpful at all.

Rafael Salas

|||

Thanks for your responses; they do help. I am just a bit disappointed, since all the seminars I attended seemed to advise people to convert their SQL scripts into SSIS tasks for better maintenance and performance, but I have yet to find an approach that can outperform the current stored procedure. Our source schema is fairly complex and requires a lot of joins to pull data out, so I guess you were right that SSIS tasks may not be the best choice for every situation in terms of performance. You were also right about the Merge Join task: it actually performs 2-3 times worse than the current stored procedure, so I guess I will stay with my stored procedures for now.

Thanks a lot for your help

Lookup Task Datatype issue

Hello:

I'm attempting to use the lookup task and I'm running into an issue with incompatible data types.

The table that I'm referencing in the lookup has a field named DateCreated, and its data type is DateTime.

But when I hook it up to the Lookup Task, the lookup task thinks it's a DB_Timestamp, so I get an incompatible data type error.

What can I do?

I'm not sure if this would resolve your issue, but have you tried using the Data Conversion component?|||

Duane:

Thanks for your reply.

My problem is not with the dataflow, but with the Lookup Task misinterpreting the Data Type of one of the columns in my lookup table.

Instead of being DateTime, as the column is defined, the Lookup task thinks the column is a DB_Timestamp.

|||

SamuelEe,

Maybe I misunderstood you. However, the lookup transformation is a data flow component.

|||

SamuelEe wrote:

Duane:

Thanks for your reply.

My problem is not with the dataflow, but with the Lookup Task misinterpreting the Data Type of one of the columns in my lookup table.

Instead of being DateTime, as the column is defined, the Lookup task thinks the column is a DB_Timestamp.

Duane's suggestion is still valid. Try changing the type of the column in the pipeline to DT_DBTIMESTAMP as well, so that it matches what the LOOKUP component has got.

The LOOKUP component is not misinterpreting anything. Datetime fields from database tables get interpreted as DT_DBTIMESTAMP in an SSIS data flow.

-Jamie

Lookup task based on a variable

This doesn't seem possible but I'll ask anyway...

Can I build a lookup task where the lookup query is based on a variable, rather than hardcoding the SQL statement?

No, there is no support for this.

The best you can do is write a query and then use values in your input data as parameters to help filter the rows. Seems obvious, but I have often got stuck thinking I wanted a parameterised query, so that results are filtered before the lookup, which cannot be done. This is probably not as efficient, but it is the closest we have.
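For what it's worth, in partial-cache or no-cache mode the statement the lookup runs against the reference table is already a parameterised query of roughly this shape (table and column names are illustrative only), which is also why per-row execution gets expensive:

```sql
-- Shape of the per-row lookup statement; the ? placeholder is
-- mapped from the input column at runtime:
SELECT Id, Description
FROM dbo.RefTable
WHERE NaturalKey = ?;
```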

|||

That's what I figured...

On the bright side, I was able to use another new SQL Server 2005 feature...

SELECT RecId, CAST(COUNT(*) AS Char(1)) AS [PVL]
FROM VT
WHERE Elect IN (

SELECT Elect
FROM (
SELECT PVLE1, PVLE2, PVLE3, PVLE4
FROM StateOptions) opt
UNPIVOT
(Elect FOR State IN (PVLE1, PVLE2, PVLE3, PVLE4)) AS UPV

)

GROUP BY RecId

Of course this would have been much simpler if the options table had been built vertically instead of horizontally!
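For comparison, with a hypothetical vertical layout (one row per State/Elect pair; the table below is assumed, not from the original schema) the UNPIVOT disappears entirely:

```sql
-- Assumed vertical table: StateOptionsVertical (State char(2), Elect int)
SELECT RecId, CAST(COUNT(*) AS char(1)) AS [PVL]
FROM VT
WHERE Elect IN (SELECT Elect FROM StateOptionsVertical)
GROUP BY RecId;
```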

Friday, March 9, 2012

lookup - error msg

hi all,

good day!

I have dataflowtask_a and dataflowtask_b. I also have a Send Mail task.

Inside dtf_a I have a very long transformation which has 10 lookup tasks. Each lookup task is configured to have an error output; should a lookup encounter an error, its error output sends a unique error message, which is appended to a string variable @errorlst. Should dtf_a encounter an error, it sets the variable @nogo = true.

In the control flow, if @nogo = false it proceeds to dtf_b; otherwise it proceeds to send mail with the consolidated error message.

Questions:

1. I only want one unique error message sent per lookup. If there are two errors from the same lookup, it should send only one message.

2. The message should look like this: "error lookup1 : error lookup2 : and so on"

How can I do this?

thanks!

joey


I assume you are using the Script component to append messages to your errorlst variable. If you do, then it should not be a problem to append only the first lookup error.

Thanks.

|||

Yeah, I'm using the script component. Can anyone please post a script for the script component showing how to append a message to a variable @errlst with package scope?

Thanks

looking up client side connection info

I am trying to create an audit trail using triggers on a particular table
for inserts, updates, and deletes.
This would be a simple task if the users were logging in under different
names but there is a single login that gets used by all connections.
Luckily the application has its own user table and the table I am creating
the audit for has a column that contains the application user that modified
the row. This is fine for inserts and updates but a problem for deletes.
So before I go looking for alternatives - a stored procedure for all access,
etc. - I wanted to know whether there is a way to find the client side
connection info by way of @@SPID somehow. A client machine name or IP at
the very least. Is it possible to retrieve a OLE DB connection string that
the client used to connect to the server in the first place? I know that
the connection string wouldn't really be of any interest to SQL server but I
was hoping that it would forward the info anyway.
Thanks in advance.
Jiho Han wrote:
> the very least. Is it possible to retrieve a OLE DB connection string that
> the client used to connect to the server in the first place? I know that
> the connection string wouldn't really be of any interest to SQL server but I
> was hoping that it would forward the info anyway.
I have searched high and low for such information and could not find any
reliable source for it.
What you can use is SET CONTEXT_INFO on the client side to have it
deposit up to 128 bytes of any info you like. Triggers can then read
this packet of info as follows:
SELECT context_info
FROM master.dbo.sysprocesses
WHERE spid = @@SPID
Steve
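A minimal sketch of the full round trip Steve describes (the payload string 'app_user=jsmith' is made up):

```sql
-- Client side, right after opening the connection:
DECLARE @info varbinary(128);
SET @info = CAST('app_user=jsmith' AS varbinary(128));
SET CONTEXT_INFO @info;

-- Inside the DELETE trigger, recover it for the audit row:
DECLARE @who varchar(128);
SELECT @who = CAST(context_info AS varchar(128))
FROM master.dbo.sysprocesses
WHERE spid = @@SPID;
```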
|||Thanks for the info.
I don't really control the client, well, not all of them anyway. But that
may come in handy some day...
Jiho
"Steve Troxell" <steve_troxell@.hotmail-nospamforme.com> wrote in message
news:emJfEFrGFHA.3876@.TK2MSFTNGP14.phx.gbl...[vbcol=seagreen]
> Jiho Han wrote:
that[vbcol=seagreen]
that[vbcol=seagreen]
but I
> I have searched high and low for such information and could not find any
> reliable source for it.
> What you can use is SET CONTEXT_INFO on the client side to have it
> deposit up to 128 bytes of any info you like. Triggers can then read
> this packet of info as follows:
> SELECT context_info
> FROM master.dbo.sysprocesses
> WHERE spid = @.@.spid
>
> Steve


Saturday, February 25, 2012

Looking for ideas...

I have a text file that I am importing in a data flow task. Not all of the rows are the same, there are "header" rows that contain a company code.

What I want to do is keep that company code and append that to the row of text that I am writing to a CSV file.

Since you cannot change variables until the post execute, what are my options?

Hope that's clear

Thanks!

BobP

Are you writing a custom script component to handle this? You talk of not being able to change variables until post execute, just handle the locking yourself and you can as illustrated here - http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=956181&SiteID=1|||From your description, I assume you're trying to denormalize a multi-formatted text file by moving data from header rows to detail rows.

Since your rows are not all the same, set your text file source component to read each line of the source file as a single column. Then write a custom script transformation component to consume that column, parse the data by row type, and output the individual columns that will be consumed by your destination (including the header columns). When the transformation encounters one of these header rows, put the parsed data (Company Code) into variable(s) in the transformation component, and discard or redirect the header row. For each detail row, create a new output row and populate the "detail" columns from the parsed input data and the "header" column(s) (i.e. Company Code) from the variable(s). This way your header data is remembered for each subsequent detail row until another header is encountered.

Hope that helps.

Monday, February 20, 2012

Looking for DTS example

Hi all,
I am trying to automate calling a SQL Copy Data task. I need to provide the source database dynamically at the least. I am using MSDE in a C# program. Does anyone have any examples or sites they can point me to? I am at a loss after much searching.
Thanks

Me too.
Apparently exporting a DTS to VB gives you the option of actually controlling the DTS batch more, but you need Visual Basic installed (dunno if it'll run in a VBA environment).
I've tried a DTS storage file & metadata services; the DTS storage file is in binary, and I can't get metadata services working just now...
+--
To e-mail me,
replace "REPLACETHIS" with buddhashortfatguy
"S" <spamaway@.hotmail.com> wrote in message
news:OG5F71TqDHA.1408@.TK2MSFTNGP11.phx.gbl...
> Hi all,
> I am trying to automate calling a SQL Copy Data task. I need to provide
the
> source database dynamically at the least. I am using MSDE in a C3
program.
> Does anyone have any examples or sites they can point me to? I am at a
loss
> after much searching.
> Thanks
>