Using the API to load and use a Pentaho PDI transformation takes more work than one would expect at first:
All I really want to find out is which input fields the Rule Engine step uses. That sounds easy enough, but it's not: the API loads the transformation and has to evaluate every step before the Rule Engine step to work out which fields are used.
If the transformation uses non-standard steps, then those steps also have to be installed on the web server that serves the Rule Maintenance Tool.
If the transformation uses parameters, for example in a SQL query, then those parameters have to be specified before the API can be used successfully.
If the Rule Maintenance Web App runs on a separate server where no PDI instance is available, then the web app has to ship with the core Kettle/PDI jar files and the plugins. That is a lot of work for the people setting up the web app. It also makes the web app much more complex, and whenever a new PDI version or new versions of steps/plugins come out, more work is needed to keep the web app working properly.
I wish PDI had a way to cache fields, both names and types, so that it would be easy to retrieve the field names for a step without needing half of the PDI tool to be available.
But for the moment I will stop where I am and think about whether there is a different way to implement similar functionality: all I need is the field names and types, to make it easier for the user to write rules based on these fields.
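One direction that might be worth exploring, sketched here purely as an assumption and not as anything PDI provides: export the field names and types once, on a machine where PDI is installed, into a small text file that the web app reads. The class name FieldCache, the tab-separated file format and the type strings are all hypothetical; the code uses only the Java standard library, so the web app would not need any Kettle jar files to read the cache.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper, not part of PDI: stores field names and types
 * as one "name<TAB>type" line per field, preserving field order.
 * Assumes field names do not contain tabs or line breaks.
 */
final class FieldCache {

    private FieldCache() {
        // static helpers only
    }

    /** Writes each field as "name<TAB>type" followed by a newline. */
    static void write(Map<String, String> fields, Writer out) throws IOException {
        for (Map.Entry<String, String> field : fields.entrySet()) {
            out.write(field.getKey() + "\t" + field.getValue() + "\n");
        }
        out.flush();
    }

    /** Reads the format produced by write(); blank lines are skipped. */
    static Map<String, String> read(Reader in) throws IOException {
        Map<String, String> fields = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue;
            }
            int tab = line.indexOf('\t');
            if (tab < 0) {
                throw new IOException("Malformed cache line: " + line);
            }
            fields.put(line.substring(0, tab), line.substring(tab + 1));
        }
        return fields;
    }
}
```

With something like this, only the export side would need PDI: it would fill the map from whatever the transformation reports as input fields of the Rule Engine step and call write(). The web app would call read() and offer the names and types to the user. The obvious trade-off is that the cache goes stale when the transformation changes, so the export would have to be rerun.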