Rules are defined in external files. When you run PDI, the rules are parsed and executed against each row of the dataset. Rules are combined into rule groups, which lets you connect them flexibly using "and" or "or" conditions. Many of the checks you might want to run against the data are predefined: equal, greater than, less than, not null, is in list, is null, is empty, etc. Other checks use regular expressions or soundex. A check compares either two fields against each other or a field against a desired value.
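To make the idea of rule groups concrete, here is a minimal sketch in Java of how checks could be combined with "and" or "or" and run against a row. It is not the ruleengine's own API: the class names (RuleGroupSketch, Rule, RuleGroup, Connector), the helper methods and the sample field names are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from the ruleengine itself.
public class RuleGroupSketch {

    // A rule is a named check applied to one row (field name -> value).
    static final class Rule {
        final String name;
        final Predicate<Map<String, Object>> check;

        Rule(String name, Predicate<Map<String, Object>> check) {
            this.name = name;
            this.check = check;
        }
    }

    enum Connector { AND, OR }

    // A rule group connects its rules with a single "and" or "or" condition.
    static final class RuleGroup {
        final Connector connector;
        final List<Rule> rules;

        RuleGroup(Connector connector, Rule... rules) {
            this.connector = connector;
            this.rules = Arrays.asList(rules);
        }

        boolean passes(Map<String, Object> row) {
            if (connector == Connector.AND) {
                return rules.stream().allMatch(r -> r.check.test(row));
            }
            return rules.stream().anyMatch(r -> r.check.test(row));
        }
    }

    // Check a field against a desired value.
    static Rule isEqual(String field, Object expected) {
        return new Rule(field + " equals " + expected,
                row -> expected.equals(row.get(field)));
    }

    // Check that a field is not null.
    static Rule notNull(String field) {
        return new Rule(field + " not null", row -> row.get(field) != null);
    }

    // Check two fields against each other.
    static Rule fieldsEqual(String a, String b) {
        return new Rule(a + " equals " + b,
                row -> row.get(a) != null && row.get(a).equals(row.get(b)));
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("country", "DE");
        row.put("billing_zip", "10115");
        row.put("shipping_zip", "10115");
        row.put("email", null);

        RuleGroup all = new RuleGroup(Connector.AND,
                isEqual("country", "DE"), notNull("email"));
        RuleGroup any = new RuleGroup(Connector.OR,
                notNull("email"), fieldsEqual("billing_zip", "shipping_zip"));

        System.out.println("AND group passes: " + all.passes(row));
        System.out.println("OR group passes: " + any.passes(row));
    }
}
```

In this sketch the "and" group fails because the email field is null, while the "or" group passes because the two zip fields match.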
The ruleengine step can send data to one or two output steps. The main output step adds a few fields to the stream with statistics on the ruleengine results: how many rules failed, how many rule groups failed, etc. The rule results step shows the full detail of what happened when the rules were executed: one output row is created for each combination of row and rule, so you can see exactly which rules failed and which ones passed.
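The difference between the two outputs can be sketched roughly as follows, again in Java. This is an illustration of the shape of the data only, not the step's actual implementation: the class RuleOutputSketch, the Detail type and the field name "rules_failed" are assumptions made for this example, and the real step's field names may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of the two outputs; names and fields are illustrative.
public class RuleOutputSketch {

    // One detail record per (row, rule) pair, as in the rule results output.
    static final class Detail {
        final int rowNumber;
        final String ruleName;
        final boolean failed;

        Detail(int rowNumber, String ruleName, boolean failed) {
            this.rowNumber = rowNumber;
            this.ruleName = ruleName;
            this.failed = failed;
        }
    }

    // Runs every rule against the row. Returns a copy of the row with an
    // assumed "rules_failed" counter appended (the main output) and appends
    // one Detail per rule to `details` (the rule results output).
    static Map<String, Object> process(int rowNumber,
                                       Map<String, Object> row,
                                       Map<String, Predicate<Map<String, Object>>> rules,
                                       List<Detail> details) {
        int failed = 0;
        for (Map.Entry<String, Predicate<Map<String, Object>>> e : rules.entrySet()) {
            boolean ruleFailed = !e.getValue().test(row);
            if (ruleFailed) {
                failed++;
            }
            details.add(new Detail(rowNumber, e.getKey(), ruleFailed));
        }
        Map<String, Object> out = new LinkedHashMap<>(row);
        out.put("rules_failed", failed);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Predicate<Map<String, Object>>> rules = new LinkedHashMap<>();
        rules.put("name not null", r -> r.get("name") != null);
        rules.put("age not null", r -> r.get("age") != null);

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "Ada");
        row.put("age", null);

        List<Detail> details = new ArrayList<>();
        Map<String, Object> main = process(1, row, rules, details);

        System.out.println("main output row: " + main);
        for (Detail d : details) {
            System.out.println("row " + d.rowNumber + ", rule '" + d.ruleName
                    + "', failed=" + d.failed);
        }
    }
}
```

The main output keeps one row per input row with summary fields added, while the detail list grows by one entry per rule for every input row.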
Additionally, rule results can trigger actions. For example, you can concatenate fields, set field values, prepend data to fields and then update another field.
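A rough sketch of what such actions might look like in Java is shown below. It assumes, purely for illustration, that an action runs when a check passes; the text does not say whether actions fire on passed or failed rules. The class RuleActionSketch and the methods concat, setValue and applyIfPassed are hypothetical names, not part of the ruleengine.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of rule-triggered actions; not the ruleengine's API.
public class RuleActionSketch {

    // Action: concatenate two fields and store the result in a target field.
    static Consumer<Map<String, Object>> concat(String a, String b, String target) {
        return row -> row.put(target, String.valueOf(row.get(a)) + row.get(b));
    }

    // Action: set a field to a fixed value.
    static Consumer<Map<String, Object>> setValue(String field, Object value) {
        return row -> row.put(field, value);
    }

    // Runs the check and applies the action only if the check passes
    // (an assumption made for this sketch).
    static boolean applyIfPassed(Map<String, Object> row,
                                 Predicate<Map<String, Object>> check,
                                 Consumer<Map<String, Object>> action) {
        boolean passed = check.test(row);
        if (passed) {
            action.accept(row);
        }
        return passed;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("first_name", "Ada");
        row.put("last_name", "Lovelace");

        // Chain two actions: concatenate the names, then update another field.
        Consumer<Map<String, Object>> action =
                concat("first_name", "last_name", "full_name")
                        .andThen(setValue("status", "checked"));

        applyIfPassed(row, r -> r.get("first_name") != null, action);
        System.out.println(row);
    }
}
```

Chaining with andThen mirrors the idea of performing one action and then updating another field afterwards.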
The ruleengine is open source and free. It is also extensible: you can write your own actions if you need more complex ones, or define other checks. Because it is written in Java, you have all