This week’s question: How do I ensure fields are in a data stream?
This is a common issue when you are reading data that changes from time to time. If you use a Crosstab Tool or are reading from an XML file, the output fields can change based on the presence (or absence) of specific input data. This makes it very hard to write formulas, or really do any down stream processing on those fields. In particular, what we want to do is ensure fields are present, but NOT change them if they are already in the record.
So to start with the answer: EnsureFieldsOnDataStream.yxzp is a macro that makes this easy and adds very little overhead whether or not the fields are already present in the record. It wraps it up into a simple tool with a simple user interface. All you have to do is specify a list of fields and the type. Hopefully this leverages the macro ability in Alteryx to make a tool that is as good as it would have been if it was written in native code.
The basis of adding the fields is the Union tool. The Union tool’s main purpose is to combine 2 streams together into a single, longer, stream. It has the side effect of adding all the fields from all the inputs into the output. I utilize this side effect here with a second stream with no records, that just has the fields the user wants to add. If the fields were already present in the original stream, it has no effect.
I used a text input tool with no records in order to create the fields. Since the Text Input auto types its fields, with no records they all default to bool so it needs to be followed by a select tool to convert all the new fields to the proper type.
The macro would have been very easy if it asked for a single field. In that case it would just be simple actions to update the field name in the Text Input tool and the Select tool. Since I chose to make the tool accept any number of fields it was a little more complicated. In this case it has to use Update Raw XML actions and a slightly complex formula to manufacture the same XML that the GUI would have; had I specified multiple fields. If you are an advanced Alteryx developer, it is probably worth opening the macro to see how this works. This is a rarely used action (I hope) but it enables building macros that do all kinds of interesting things that would be impossible otherwise.
As a side note: I had a choice of connecting the ‘no record text input’ 1st or 2nd into the Union. Aesthetically, connecting it 2nd would have been nicer since it would add the new fields at the end and not change the order of existing fields. The downside is that it would cause the data in the stream to have to be cached to a temporary file (or memory if it fits) and stall processing. Definitely doesn’t meet the goal of low overhead.
Thanks for listening,
If you don’t yet have a copy of Alteryx yet, by all means go get one. Its free (for limited use)… http://www.alteryx.com/download