For a while now I have been trying to promote the idea of separating the responsibilities in application maintenance. Many applications contain a mix of IT logic and business logic: technical logic such as unzipping files, checking servers or folders, loading configuration files and accessing databases is mixed with logic from the business, such as adjusting field values, manipulating data and calculating values.

So why do we create this mixture of IT and business logic? Sometimes the simple answer is that IT has the best analytical thinking and the required tools. Another reason is that IT often sits in the middle, between a source system that cannot be changed and a target system that takes a long time or is expensive to change (meaning it effectively cannot be changed either). ETL processes are a good example of this: IT has the tools, and they are flexible, scalable and configurable. If source and target cannot be changed, IT in the middle can make the change. Often IT ends up correcting, enhancing or streamlining data, because bad data comes from source systems and, as indicated, those systems or the business processes capturing the data are hard to change.

When the responsibilities are separated, changes are applied quicker because both the IT and the business experts are professionals in their own domain. The IT application or process is not intermingled with business logic, so IT can concentrate on a cleaner design without dependencies on business logic.
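To make the idea of separated responsibilities a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the rule functions `normalize_country` and `default_currency`, the field names and the `run_pipeline` helper are all hypothetical and not taken from any real application. The point is that the technical part only knows how to apply rules, while the rules themselves live in a separate list that the business side could own.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Rule = Callable[[Record], Record]


# Business side: rules owned by the domain experts (hypothetical examples).
def normalize_country(record: Record) -> Record:
    """Trim and upper-case the country code."""
    updated = dict(record)
    updated["country"] = updated.get("country", "").strip().upper()
    return updated


def default_currency(record: Record) -> Record:
    """Fill in a currency when the source system did not deliver one."""
    updated = dict(record)
    updated.setdefault("currency", "EUR")
    return updated


BUSINESS_RULES: List[Rule] = [normalize_country, default_currency]


# IT side: a technical pipeline that applies rules without knowing what they do.
def run_pipeline(records: Iterable[Record], rules: List[Rule]) -> List[Record]:
    results = []
    for record in records:
        current = dict(record)  # never modify the source record
        for rule in rules:
            current = rule(current)
        results.append(current)
    return results
```

With this split, adding or changing a rule only touches `BUSINESS_RULES`, and `run_pipeline` stays untouched.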
So here is the second part of the starter for Apache Nifi. Apache Nifi works with processors and connections between them. That's what you see on the flow above. Processors are sort of puzzle pieces that each do a distinct task, and you connect them together to design a flow.

Now the question was how to use a common processing flow for both "GetTwitter" processors - not to duplicate things - and yet be able to divide the results into separate files later on. I used the "UpdateAttribute" processor. It allows you to assign properties, so I am tagging my different flows: they get a property named "tweettype" with a value of "privatetweet" or "worktweet".

After this, both paths flow into the "EvaluateJsonPath" processor. This processor pulls out some attributes from the tweet. Next comes a "RouteOnAttribute" processor. It evaluates whether the tweet actually has a message assigned, so tweets without a message (an empty message) are dropped. It uses the Nifi Expression Language to make the evaluation.

Twitter tweets - in the form of Json data - contain a lot of information that I don't want to store. I am only interested in the information about the user and the message itself. So I had to look for a way to eliminate the rest of the information in the flow.

I have set up my main logic by now. Now I want to store the results in two separate folders - one for private tweets and one for work related tweets. I evaluate the property "tweettype" which I assigned earlier and route the data based on its value. The property "private" results in a true or false condition, and when I connect the processor I can route the results based on this condition. If the property "private" is true, the result is routed to the "PutFile: Private" processor. If not, it is routed to the "PutFile: Work" processor.

The "PutFile" processor saves the file to a given folder. I have set up two folders, one for the work related tweets and one for the private ones. The property "Directory" defines where the file shall be stored. That's it. To summarize: I retrieve private and work related tweets, extract the information I am interested in and store it in different folders.
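The logic of the flow described above can be sketched in plain Python. This is not Nifi code and not how the processors are implemented; it only mirrors the steps so the routing is easier to follow. The function names, the dictionary used to stand in for a flow file, the folder names "private" and "work", and the Json fields `text` and `user.screen_name` are all assumptions made for this illustration.

```python
import json
from pathlib import Path
from typing import Optional


def update_attribute(content: str, tweettype: str) -> dict:
    """Tag the incoming tweet, similar to the "UpdateAttribute" step."""
    return {"content": content, "attributes": {"tweettype": tweettype}}


def evaluate_json_path(flowfile: dict) -> dict:
    """Copy selected Json fields into attributes, similar to "EvaluateJsonPath"."""
    tweet = json.loads(flowfile["content"])
    user = tweet.get("user") or {}
    flowfile["attributes"]["user"] = user.get("screen_name", "")  # assumed field
    flowfile["attributes"]["message"] = tweet.get("text") or ""   # assumed field
    return flowfile


def has_message(flowfile: dict) -> bool:
    """First routing step: only tweets with a non-empty message continue."""
    return bool(flowfile["attributes"]["message"].strip())


def is_private(flowfile: dict) -> bool:
    """Second routing step: the true/false condition on "tweettype"."""
    return flowfile["attributes"]["tweettype"] == "privatetweet"


def put_file(flowfile: dict, directory: Path, filename: str) -> Path:
    """Store only user and message in the given folder, similar to "PutFile"."""
    directory.mkdir(parents=True, exist_ok=True)
    slim = {key: flowfile["attributes"][key] for key in ("user", "message")}
    target = directory / filename
    target.write_text(json.dumps(slim), encoding="utf-8")
    return target


def process(content: str, tweettype: str, base: Path, filename: str) -> Optional[Path]:
    """Run one tweet through the whole flow; returns None if it was dropped."""
    flowfile = evaluate_json_path(update_attribute(content, tweettype))
    if not has_message(flowfile):
        return None
    folder = base / ("private" if is_private(flowfile) else "work")
    return put_file(flowfile, folder, filename)
```

Calling `process(raw_json, "privatetweet", Path("tweets"), "1.json")` would write a reduced Json file into `tweets/private`, while the same call with "worktweet" would write it into `tweets/work`.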
Two days ago I saw a tweet about Apache Nifi, got curious and had a deeper look into it. It immediately looked interesting to me and so I spent a couple of hours understanding the basics. I read through the excellent documentation and different posts. As with any new tool, at the beginning there are many open questions.
There are two distinct modes available with the Ruleengine JaRE:
Author: Uwe Geercken