Here are two screenshots from my Nifi flow:
- A screenshot of a basic example that processes a CSV file, runs the ruleengine and creates a file in the filesystem for each row of data that passed the business logic.
- A screenshot of a flow that does the same as above but additionally outputs the detailed results of running the ruleengine. The file generated for the detailed results shows why rules passed or failed.
I created an ExecuteRuleEngine processor for Apache Nifi. It lets you check incoming flow files against defined business rules.
The ruleengine can not only check the data but also update it. Based on the defined ruleengine actions, the values of individual fields may be changed. Actions can, for example, concatenate the values of fields, do mathematical calculations, do date calculations, trim field values, set a field to a constant value and more.
You can filter the data and route it to another processor, or you can update and filter the data and then route it.
The business rules can be defined and orchestrated using the Business Rules Maintenance Tool - a web application which is freely available. Once the logic is complete, it can be exported into a single project zip file. This file is used by the ExecuteRuleEngine processor.
Because the project zip file is external to Nifi, you do not have to change your Nifi flow when the logic changes: just update the logic in the Business Rules Maintenance Tool and export the project again. You no longer need to hardcode logic in your flow and adjust the flow every time the logic changes.
All components are available on my GitHub account. Everything is open source under the Apache License.
I have spent about a week now working with Asciidoc. It is a really, really good way of writing documentation. And much simpler than using a word processor application. AND much more consistent. You can use any text editor. Atom is a very good one to use.
So I started converting some of my documentation to Asciidoc. From there it is easy to output to presentation slides (Slidy), HTML and PDF. All out of the box and easy. I also created some scripts for Nautilus (file manager) that do the conversion from within the file manager.
Have a look - it's really good: Asciidoc
Just wanted to leave a quick note here that I have changed the license of all my software to the Apache License.
With immediate effect, all versions are covered by this license. The relevant remarks and notices can be found in the various readme files or in the source code.
Feel free to use my software in accordance with this license agreement.
I have worked on the ruleengine for several years now. It is in a mature state and at least two big companies are using it in production. So it was natural for me to look around and see where I can integrate it. I have created a plugin for the Pentaho ETL tool. I also did a proof of concept for Hadoop MapReduce - see my post on the Raspberry Pi Cluster. And now I have also created a processor for Apache Nifi for the ruleengine.
The ruleengine allows you to maintain complex business logic outside your Nifi flow. This is done through a web application that is freely available. You can check the content of flow files, and the ruleengine will return indicators showing whether the content conforms to the defined business rules/logic.
These checks could be implemented inside the Nifi flow as well, but as the logic gets more complex, the flow gets more cluttered. A cluttered flow is more difficult to understand and work on, and it gets in the way of reacting to changed requirements in an agile way. Embedding your business logic in the flow is also a quality issue: if there is no central place for keeping the business logic, it will be spread all over the flow. Over time the logic grows and it gets harder and harder to keep an overview of it and understand it.
The other argument in this discussion is that the business experts should define the business logic, not the IT experts. This is a clear division of responsibilities, and each expert does what he or she is best at in his or her domain.
Also, when you use the ruleengine, you have a central place for your business logic. That means if somebody wants to understand what logic is in place, they do not need to search for it in various tools, applications and IT code; it is there in the Business Rules Maintenance Tool (web application) to be reviewed. Again, this enhances quality and avoids errors.
Here is an example of a Nifi flow using the ruleengine. The flow consists of 6 processors:
The other component that you will need is the Business Rules Maintenance web application. It's an application to construct and orchestrate your business logic. It requires Apache Tomcat and a MySQL/MariaDB database. Download it from GitHub here. Read the instructions on the GitHub page to find out how to get started. Basically, first download the database schema and import it into the database. Then put the "war" (web archive) file into the Tomcat "webapps" folder and the web application should be ready to use.
Here is a screenprint of the application after I logged in and clicked on "Projects" in the menu on the left side:
The screenprint shows a project I created to be used with the Nifi flow shown above. Once the business rules for the project "Test Nifi 2" are completed, I will export the project. This creates a single Zip file containing all the business logic. In the RuleEngine processor in Nifi I will reference this Zip file. When the processor runs it will parse the Zip file and execute the business rules against the flow file content.
Clicking on the project name "Test Nifi 2" will reveal the groups that are inside the project. Groups bundle the logic for a certain purpose together. As there is only one group here in this example, I skip this and show you what is inside the group:
The group contains one subgroup. You can have multiple subgroups, and within one subgroup the rules are connected either by an "and" or an "or" condition. Multiple subgroups can also be connected using an "and" or an "or" condition. This way you can create logic of any complexity you can think of. It sounds difficult, but after you have used it a bit it is really very easy and straightforward.
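To picture how this nesting works, here is a small Python sketch. It is only an illustration of the and/or combination described above, not the ruleengine's actual implementation; the function names and the data layout are made up for this example.

```python
# Sketch only: illustrates the and/or nesting, not the real ruleengine code.

def combine(results, connector):
    """Combine a list of booleans with an "and" or an "or" condition."""
    if connector == "and":
        return all(results)
    if connector == "or":
        return any(results)
    raise ValueError(f"unknown connector: {connector!r}")


def group_passes(subgroups, group_connector):
    """Evaluate a group.

    subgroups: list of (connector, [rule results]) pairs, one per subgroup.
    group_connector: how the subgroup results are connected ("and" / "or").
    """
    subgroup_results = [combine(rules, conn) for conn, rules in subgroups]
    return combine(subgroup_results, group_connector)


# Two subgroups with already evaluated rule results.
subgroups = [
    ("and", [True, False]),  # fails: not all rules passed
    ("or", [False, True]),   # passes: at least one rule passed
]

print(group_passes(subgroups, "or"))   # True
print(group_passes(subgroups, "and"))  # False
```

With only these two building blocks, nesting rules inside subgroups and subgroups inside a group, fairly complex conditions can be expressed.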
Here is a sample from the CSV file I use. It contains free information about geographical placenames. You can download it from geonames.org. The individual fields are separated by a tab character. Besides other information each line has a feature code (in this case: "SLP") and also timezone information ("Europe/Andorra").
3038814 Costa de Xurius Costa de Xurius 42.50692 1.47569 T SLP AD 07 0 1839 Europe/Andorra 2015-03-08
The subgroup shown contains 2 rules. A rule checks a field against another field or against a fixed value. The first rule checks if the feature code equals "SLP" and the second rule checks if the timezone starts with "Europe". The rules are connected with an "and" condition (defined in the subgroup), so both conditions must be true for the group to pass the business logic. Otherwise, of course, the group fails.
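Expressed in Python, the two rules and their "and" connection could look roughly like the sketch below. The field names and function names are assumptions made for this illustration; in the real setup the rules are defined in the web application, not in code.

```python
# Sketch only: hypothetical field and function names.

# The two relevant fields of the sample row.
row = {"feature_code": "SLP", "timezone": "Europe/Andorra"}


def rule_feature_code_is_slp(row):
    """Rule 1: the feature code equals "SLP"."""
    return row["feature_code"] == "SLP"


def rule_timezone_starts_with_europe(row):
    """Rule 2: the timezone starts with "Europe"."""
    return row["timezone"].startswith("Europe")


rules = [rule_feature_code_is_slp, rule_timezone_starts_with_europe]

# "and" condition: every rule must pass for the group to pass.
group_passed = all(rule(row) for rule in rules)
print(group_passed)  # True
```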
And there is an action. Actions are related to the group as a whole. An action is fired either if the group of rules passes the logic or if it fails the logic. In this case if the data passes the logic, the timezone field is updated to contain "XXXX".
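As a minimal sketch of what such an action does (again with made-up names, and not the ruleengine's own code), the update can be pictured as a function that only changes the row when the group passed:

```python
# Sketch only: hypothetical names, not the ruleengine's API.

def apply_action(row, group_passed):
    """Return a copy of the row; set the timezone to "XXXX" if the group passed."""
    updated = dict(row)
    if group_passed:
        updated["timezone"] = "XXXX"
    return updated


row = {"feature_code": "SLP", "timezone": "Europe/Andorra"}

print(apply_action(row, group_passed=True)["timezone"])   # XXXX
print(apply_action(row, group_passed=False)["timezone"])  # Europe/Andorra
```

An action that fires on failure instead would simply test the opposite condition.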
The project in the web tool can be exported as a Zip file as mentioned above. It contains all the logic as discussed above. In the Nifi RuleEngine processor, reference the Zip file as shown below. Specify the names of the individual fields of the CSV row. If you do this, you can reference these field names in the rules. Otherwise you will need to use the index of the fields, which is more cumbersome. Also specify how the individual fields of the rows are separated from each other (here: a tab character).
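The difference between addressing a field by index and by name can be sketched in a few lines of Python. Note the assumptions: the field names are invented for this example, and the row is reduced to the values visible in the sample above, so the positions do not necessarily match the full geonames file.

```python
# Sketch only: invented field names; the row is reduced to the visible sample values.
field_names = [
    "id", "name", "asciiname", "latitude", "longitude",
    "feature_class", "feature_code", "country_code", "admin1",
    "population", "dem", "timezone", "modified",
]

values = [
    "3038814", "Costa de Xurius", "Costa de Xurius", "42.50692", "1.47569",
    "T", "SLP", "AD", "07", "0", "1839", "Europe/Andorra", "2015-03-08",
]
line = "\t".join(values)  # one row, fields separated by a tab character

# Split the row on the separator.
fields = line.split("\t")

# Without field names you have to know the position of each field ...
print(fields[6])                 # SLP

# ... with field names the reference is self-explanatory.
by_name = dict(zip(field_names, fields))
print(by_name["feature_code"])   # SLP
print(by_name["timezone"])       # Europe/Andorra
```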
Now you can run the Nifi flow. It will get the CSV file, split it into individual rows and pass it to the ruleengine. The ruleengine will run all rules and actions against the data and the result is a "passed" or "failed". The ruleengine will add some attributes to the flow file indicating how many rules and actions ran, how many failed, etc. One interesting attribute is: how many groups failed. As a group is a container for the logic that belongs together, you can use this attribute to decide how the flow file continues in the flow.
If you have e.g. 5 groups, then you can decide that only if zero groups failed, the flow file is routed onwards. The other way around, you can decide that in case all groups failed the flow file is routed somewhere else. You have many possibilities here.
In the flow I use the RouteOnAttribute processor after the RuleEngine processor. Here I evaluate if the number of failed groups is equal to zero, and then I pass these flow files onwards to the PutFile processor.
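The routing decision itself is simple. The following Python sketch mimics it outside of Nifi; the attribute name is a placeholder chosen for this example (check the processor documentation for the real attribute names), and the relationship names are likewise only illustrative.

```python
# Sketch only: the attribute name below is hypothetical.
FAILED_GROUPS_ATTR = "ruleengine.groupsFailed"


def route(attributes):
    """Return the name of the relationship a flow file would be routed to."""
    value = attributes.get(FAILED_GROUPS_ATTR)
    if value is None:
        return "unmatched"  # attribute missing: do not guess
    # Flow file attributes are strings, so convert before comparing.
    return "passed" if int(value) == 0 else "failed"


print(route({FAILED_GROUPS_ATTR: "0"}))  # passed
print(route({FAILED_GROUPS_ATTR: "2"}))  # failed
print(route({}))                         # unmatched
```

Other conditions, such as "all groups failed", would only change the comparison inside the function.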
I have run the Nifi flow against the CSV file from geonames.org with the business logic shown above. I only used 100 lines from the CSV file. From these 100 lines, 6 lines passed the business logic. This is the result from one file that was stored in the filesystem by the PutFile processor.
3038838 Costa Verda Costa Verda 42.48297 1.66086 T SLP AD 08 0 2572 XXXX 2015-05-06
The row from the CSV passed the business logic (zero failed groups), as the feature code field value is "SLP" and the timezone field value starts with "Europe" (this is the logic from the rules discussed above). And as the group passed the business logic, the action was fired as well: it updated the timezone field value to "XXXX".
But now comes the really interesting part: If you change the business logic, then you DO NOT have to touch your flow at all! Update the business logic in the Business Rules web application and export the project again. The RuleEngine processor in the flow will now use the new business logic definition and produce the new result.
E.g. you could add a rule checking if the name (second column) equals "Costa Verda". So only rows where the feature code is "SLP" and the timezone starts with "Europe" AND where the name is "Costa Verda" would pass the business logic.
In the screenprint below, you can see that I added this rule to check the name column. I exported the project again and used it with the Nifi flow. This time, out of the 100 input lines, the ruleengine found only 2 rows that passed the business logic.
So you have a flexible way of defining and changing your business logic. In fact, this could be done by the domain expert for this logic. Your flow is cleaner and stays clean even if the business logic changes.
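The idea of keeping the logic outside the flow can be illustrated with a small, self-contained Python sketch. Please note that everything about the file layout here is invented: the post does not describe the internal format of the exported project Zip file, so the "rules.json" file and its structure are pure assumptions for the sake of the example. The point is only that the processing code (`passes`) stays the same while the exported rules change.

```python
# Sketch only: the zip layout ("rules.json") and rule format are invented.
import io
import json
import zipfile


def build_project_zip(rules):
    """Stand-in for exporting a project: pack the rules into an in-memory zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("rules.json", json.dumps(rules))
    return buf.getvalue()


def load_rules(zip_bytes):
    """Read the rule definitions back from the zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return json.loads(zf.read("rules.json"))


# The checks the "flow" knows how to run; the rules only pick and parameterize them.
CHECKS = {
    "equals": lambda value, expected: value == expected,
    "starts_with": lambda value, expected: value.startswith(expected),
}


def passes(row, rules):
    """Run all rules with an "and" condition. This code never changes."""
    return all(CHECKS[r["check"]](row[r["field"]], r["expected"]) for r in rules)


base_rules = [
    {"field": "feature_code", "check": "equals", "expected": "SLP"},
    {"field": "timezone", "check": "starts_with", "expected": "Europe"},
]
extra_rule = {"field": "name", "check": "equals", "expected": "Costa Verda"}

export_v1 = build_project_zip(base_rules)
export_v2 = build_project_zip(base_rules + [extra_rule])

row = {"name": "Costa de Xurius", "feature_code": "SLP", "timezone": "Europe/Andorra"}

print(passes(row, load_rules(export_v1)))  # True
print(passes(row, load_rules(export_v2)))  # False: the name is not "Costa Verda"
```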
I hope you enjoyed this blog post. Next time I will blog about how to see the detailed results of what happened when the ruleengine ran. You can output these details also to flow files and it will show you exactly which rule passed or failed and why.