According to Microsoft, they only check for the end of record indicator (Ususally a CR/LF) when processing the last column in the file. If you are not on the last column, and a CR/LF is encountered, it is ignored and the first field of the new line is added to the last column of the previous line.
Here is an example of what you see:
Now, when troubleshooting this error, you can be looking in the wrong place, since the error usually shows up only at the end of the file.
SSIS is set up to accept 3 columns in a CSV file: Col1, Col2 and Col3.
File comes in like this:
Col1,Col2
A1,A2
B1,B2
C1,C2
When ssis imports this file it produces this result:
Record 1: A1, A2B1, B2
Record 2: C1,C2 Error - Column Delimiter Not Found
So what I have been doing to gracefully catch and report on this is to quickly pre-process the file to count the columns. Since the processes that create the CSV files for my app are automated, I can assume that every record in the file has the same number of fields, and I only need to count 1 row. If you cannot assume this, it would be just as easy to loop thru each row and determine the field count.
I would also like to point out that my code does not yet handle commas embedded within quote delimited fields. For example, if your record looked like this:
1,"this is um, the field", 1,1I will be posting code for that next week sometime.
Ok, so here is what I am doing.
I set up a script task in the control flow, before I try to open the file with ssis. This is the code I use:
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.IO
Public Class ScriptMain
Public Sub Main()
Dim sr As StreamReader
Dim Textline As String
Dim Fields As String()
sr = File.OpenText("C:\FileName.csv")
If Not sr.EndOfStream Then
sr.ReadLine() 'Skip 1 row (Header)
End If
If Not sr.EndOfStream Then
Textline = sr.ReadLine() 'Read in the entire line of text.
Fields = Textline.Split(Convert.ToChar(",")) ' Split the line of text into a string array.
Else
sr.Close()
sr.Dispose()
Dts.TaskResult = Dts.Results.Failure 'There were no records after the header. This might not be a failure in all implementations.
Exit Sub
End If
sr.Close()
sr.Dispose()
If Fields.Length <> 25 Then ' If there are not 25 fields in the record, raise error.
Dts.TaskResult = Dts.Results.Failure
Err.Raise(vbObjectError + 513, Nothing, "Field Count is Invalid! 25 expected, " + Fields.Length.ToString() + " received.")
Exit Sub
End If
Dts.TaskResult = Dts.Results.Success
End Sub
End Class
When this code runs, it generates an error if the field count is not correct.
If you need to check multiple file layouts, you can put the script task into a foreach container in SSIS and use database lookups to get the field length based on the filename, making this code totally reusable!
Also note... I actually converted this code into a C# dll and put it in the GAC on the SSIS server so all I have to do is call that dll in my script task. I will blog about that soon.
Hope this helps!
peace
1 comment:
Boss thank you very much ,you saved a lot of time buy ur post.I had to make some changes in the process but the draft was good .
thanks
Post a Comment