Thursday, December 18, 2008

PerformancePoint Server 2007

We have begun the task of implementing PerformancePoint Server 2007 (PPS) on top of our existing data marts. Right now we are in the early stages of requirements gathering. However, I have had a good chance to play with PPS, and it is very slick!

I would highly recommend that anyone needing a dashboard application download the evaluation version and spend some time with it. You can download it here.

You will see some more posts about it in the very near future, because some of the things it can do are worth writing about. Some of the pitfalls we have hit are interesting as well!

peace

Wednesday, December 10, 2008

DateTime Columns in Slowly Changing Dimension Component

We had an interesting error when implementing a slowly changing dimension in SSIS this week.

We had a date column that we were passing through to the database from the source. The source was a script component, because the data came in from a multi-result set stored procedure. In other data flows, where we were using an OLE DB destination, we typed datetime columns as DT_DATE. However, when we used DT_DATE with the Slowly Changing Dimension component, it threw an error:
Error at Import Data [Slowly Changing Dimension]: The input column "input column "ALDATE (15157)" cannot be mapped to external column "external column "ALDATE (15160)" because they have different data types. The Slowly Changing Dimension transformation does not allow mapping between column of different types except for DT_STR and DT_WSTR.
After much digging and research, we determined that we had to set the output column type on the script component to DT_DBTIMESTAMP. Once we changed that, the SCD worked just fine!
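
To give an idea of what that looks like, here is a minimal sketch of a script component source with the date column typed that way. The connection string, stored procedure name, and reader plumbing are made up for illustration (and simplified to a single result set); the only real column here is ALDATE from the error above. The key point is that ALDATE is defined on the Inputs and Outputs page as DT_DBTIMESTAMP, so the generated buffer property takes a .NET DateTime directly.

Imports System.Data.SqlClient

Public Class ScriptMain
    Inherits UserComponent

    Private conn As SqlConnection
    Private reader As SqlDataReader

    Public Overrides Sub PreExecute()
        MyBase.PreExecute()
        ' Assumed connection string and procedure name, for illustration only.
        conn = New SqlConnection("Data Source=.;Initial Catalog=SourceDB;Integrated Security=SSPI")
        conn.Open()
        Dim cmd As New SqlCommand("EXEC dbo.usp_GetSourceRows", conn)
        reader = cmd.ExecuteReader()
    End Sub

    Public Overrides Sub CreateNewOutputRows()
        While reader.Read()
            Output0Buffer.AddRow()
            ' ALDATE is typed as DT_DBTIMESTAMP on the Inputs and Outputs page,
            ' so a .NET DateTime can be assigned to the buffer property directly.
            Output0Buffer.ALDATE = reader.GetDateTime(reader.GetOrdinal("ALDATE"))
        End While
    End Sub

    Public Overrides Sub PostExecute()
        reader.Close()
        conn.Close()
        MyBase.PostExecute()
    End Sub

End Class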

peace

Friday, December 5, 2008

Importing CSV Files with SSIS

One of the shortcomings of comma-separated value (CSV) file imports in SSIS is that SSIS cannot tell when a record has too few fields.

According to Microsoft, SSIS only checks for the end-of-record indicator (usually a CR/LF) when processing the last column of a record. If a CR/LF is encountered when you are not on the last column, it is ignored, and the first field of the next line is added to the last field of the previous line.

Here is an example of what you see:

SSIS is set up to accept 3 columns in a CSV file: Col1, Col2 and Col3.

File comes in like this:

Col1,Col2
A1,A2
B1,B2
C1,C2

When SSIS imports this file, it produces this result:
Record 1: A1, A2B1, B2
Record 2: C1,C2 Error - Column Delimiter Not Found
Now, when troubleshooting this error, you can end up looking in the wrong place, since the error usually only shows up at the end of the file.

So what I have been doing to gracefully catch and report on this is to quickly pre-process the file and count the fields. Since the processes that create the CSV files for my app are automated, I can assume that every record in the file has the same number of fields, so I only need to count one row. If you cannot assume this, it would be just as easy to loop through each row and check the field count (there is a sketch of that after the main code below).

I would also like to point out that my code does not yet handle commas embedded within quote-delimited fields. For example, if your record looked like this:
1,"this is um, the field", 1,1
I will be posting code for that next week sometime.

Ok, so here is what I am doing.

I set up a script task in the control flow, before I try to open the file with SSIS. This is the code I use:
Imports System
Imports System.Data
Imports System.IO
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

Public Class ScriptMain

    Public Sub Main()

        Dim sr As StreamReader
        Dim Textline As String
        Dim Fields As String()

        sr = File.OpenText("C:\FileName.csv")

        If Not sr.EndOfStream Then
            sr.ReadLine() ' Skip 1 row (header).
        End If

        If Not sr.EndOfStream Then
            Textline = sr.ReadLine() ' Read in the entire line of text.
            Fields = Textline.Split(Convert.ToChar(",")) ' Split the line of text into a string array.
        Else
            sr.Close()
            sr.Dispose()

            ' There were no records after the header. This might not be a failure in all implementations.
            Dts.TaskResult = Dts.Results.Failure
            Exit Sub
        End If

        sr.Close()
        sr.Dispose()

        If Fields.Length <> 25 Then ' If there are not 25 fields in the record, raise an error.
            Dts.TaskResult = Dts.Results.Failure
            Err.Raise(vbObjectError + 513, Nothing, "Field Count is Invalid! 25 expected, " & Fields.Length.ToString() & " received.")
            Exit Sub
        End If

        Dts.TaskResult = Dts.Results.Success

    End Sub

End Class

When this code runs, it generates an error if the field count is not correct.
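
And if you cannot assume that every record has the same number of fields, here is a rough sketch of the same idea that checks every data row and reports the first bad line. It is only a sketch, using the same placeholder path and 25-field count as above, and it would replace the body of Main():

' Validate every data row instead of just the first one.
Dim lineNumber As Integer = 1 ' The header is line 1.

Using sr As StreamReader = File.OpenText("C:\FileName.csv")

    If Not sr.EndOfStream Then
        sr.ReadLine() ' Skip 1 row (header).
    End If

    While Not sr.EndOfStream
        lineNumber += 1
        Dim Fields As String() = sr.ReadLine().Split(Convert.ToChar(","))

        If Fields.Length <> 25 Then
            Dts.TaskResult = Dts.Results.Failure
            Err.Raise(vbObjectError + 513, Nothing, "Field Count is Invalid on line " & lineNumber.ToString() & "! 25 expected, " & Fields.Length.ToString() & " received.")
            Exit Sub
        End If
    End While

End Using

Dts.TaskResult = Dts.Results.Success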

If you need to check multiple file layouts, you can put the script task into a Foreach Loop container in SSIS and use database lookups to get the expected field count based on the filename, making this code totally reusable!
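
A rough sketch of that parameterization: the hard-coded path and the literal 25 in Main() just become reads from package variables. The variable names CsvFilePath and ExpectedFieldCount are made up for this example, and they would need to be listed in the script task's ReadOnlyVariables:

' At the top of Main(), instead of the hard-coded values:
Dim filePath As String = CStr(Dts.Variables("CsvFilePath").Value)
Dim expectedFields As Integer = CInt(Dts.Variables("ExpectedFieldCount").Value)

' The rest of Main() stays the same, except that
'   sr = File.OpenText("C:\FileName.csv")   becomes   sr = File.OpenText(filePath)
'   If Fields.Length <> 25 Then             becomes   If Fields.Length <> expectedFields Then

The Foreach Loop container sets CsvFilePath for each file, and an upstream step can set ExpectedFieldCount from a database lookup against the filename.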

Also note... I actually converted this code into a C# DLL and put it in the GAC on the SSIS server, so all I have to do is call that DLL from my script task. I will blog about that soon.

Hope this helps!

peace

Tuesday, December 2, 2008

Using the GUID Data Type in the Execute SQL Task

I have now run into this problem several times, so I thought I would write it down.

A colleague just came to me with an issue where the Execute SQL Task was not returning the data it should have.

Looking at his stored proc, everything seemed fine. The proc took in a UniqueIdentifier and returned a SELECT from a table.

The Execute SQL Task in SSIS passed in a GUID and put the result set into an object variable.

However, by watching Profiler, we saw that the value being sent to the stored procedure parameter from the Execute SQL Task was not the GUID that SSIS was actually processing.

So on the Parameter Mapping tab, we changed the data type of the parameter from GUID to VARCHAR(70).

After that, the Execute SQL Task returned the rows that were expected.

I hope this helps someone out there!