3

I'm trying to create a custom script in SSIS 2008 that will loop over the selected input columns and concatenate them so they can be used to create a SHA1 hash. I'm aware of the available custom components but I'm not able to install them on our system at work.

Whilst the example posed here appears to work fine http://www.sqlservercentral.com/articles/Integration+Services+(SSIS)/69766/ when I've tested this selected only a few and not all columns I get odd results. The script only seems to work if columns selected are in sequential order. Even when they are in order, after so many records or perhaps the next buffer different MD5 hashes are generated despite the rows being exactly the same throughout my test data.

I've tried to adapt the code from the previous link along with these articles but have had no joy thus far. http://msdn.microsoft.com/en-us/library/ms136020.aspx
http://agilebi.com/jwelch/2007/06/03/xml-transformations-part-2/

As a starting point this works fine to display the column names that I have selected to be used as inputs

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    For Each inputColumn As IDTSInputColumn100 In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
        MsgBox(inputColumn.Name)
    Next
End Sub

Building on this I try to get the values using the code below:

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    Dim column As IDTSInputColumn100
    Dim rowType As Type = Row.GetType()
    Dim columnValue As PropertyInfo
    Dim testString As String = ""

    For Each column In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
        columnValue = rowType.GetProperty(column.Name)
        testString += columnValue.GetValue(Row, Nothing).ToString()
    Next

    MsgBox(testString)

End Sub

Unfortunately this does not work and I receive the following error:

SSIS Error Message

I'm sure what I am trying to do is easily achievable though my limited knowledge of VB.net and in particular VB.net in SSIS, I'm struggling. I could define the column names individually as shown here http://timlaqua.com/2012/02/slowly-changing-dimensions-with-md5-hashes-in-ssis/ though I'd like to try out a dynamic method.

mheptinstall
  • 1,959
  • 2
  • 22
  • 41
  • Almost all cases of NullReferenceException are the same. Please see "[What is a NullReferenceException in .NET?](http://stackoverflow.com/questions/4660142/what-is-a-nullreferenceexception-in-net)" for some hints. – John Saunders Jun 27 '12 at 20:15

2 Answers2

2

Your problem is trying to run ToString() on a NULL value from your database.

Try Convert.ToString(columnValue) instead, it just returns an empty string.

Kyle Hale
  • 7,447
  • 1
  • 32
  • 56
1

The input columns are not guaranteed to be in the same order each time. So you'll end up getting a different hash any time the metadata in the dataflow changes. I went through the same pain when writing exactly the same script.

Every answer on the net I've found states to build a custom component to be able to do this. No need. I relied on SSIS to generate the indexes to column names when it builds the base classes each time the script component is opened. The caveat is that any time the metadata of the data flow changes, the indexes may change and need to be updated by re-opening and closing the SSIS script component.

You will need to override ProcessInput() to get store a reference to PipelineBuffer, which isn't exposed in ProcessInputRow, where you actually need to use it to access the columns by their index rather than by name.

The list of names and associated indexes are stored in ComponentMetaData.InputCollection[0].InputColumnCollection, which needs to be iterated over and sorted to guarantee same HASH every time.

PS. I posted the answer last year but it vanished, probably because it was in C# rather than VB (kind of irrelevant in SSIS). You can find the code with all ugly details here https://gist.github.com/danieljarolim/e89ff5b41b12383c60c7#file-ssis_sha1-cs