Jaspersoft ETL // Pentaho Data Integration (Spoon)

Finally got the community edition of Jaspersoft installed. It was a Java VM issue: I updated to 64-bit Java and it’s now running. Previously I had a ‘PATH’ error.

There’s a MongoDB connector, or ‘adapter’ in Jaspersoft parlance. Looking good so far. Don’t forget to fire up mongod.exe to start the DB... But it turns out that Jaspersoft just white-labels Talend, and the Talend ETL download points to the wrong software (iStudio, which is a visual report design tool).
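Since forgetting to start mongod.exe is an easy mistake, here is a minimal sketch of a check that something is listening before any ETL tool is pointed at it. It assumes the MongoDB default port 27017 on localhost. The function name is my own, and it only tests that a TCP connection is accepted, not that the listener is really MongoDB.

```python
import socket


def mongod_is_listening(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    27017 is MongoDB's default port; pass another one if mongod was
    started with a different --port setting.
    """
    try:
        # create_connection raises OSError on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `mongod_is_listening()` before opening Jaspersoft or Kettle gives a quick yes/no on whether the database was actually started.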

So I’m switching to Kettle (Pentaho) for ETL.

PDI

My aim for the day is to find some open data (maybe the Guardian) and push it into MongoDB, through Kettle, as a new collection. Let’s see!
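To make the goal concrete: a MongoDB collection is a set of documents, so the job boils down to turning each row of a downloaded data file into one document. The sketch below is not the Kettle route, just a rough Python illustration of that row-to-document step, assuming the open data arrives as CSV with a header row. The function name is hypothetical.

```python
import csv
import io


def csv_to_documents(csv_text):
    """Turn CSV text with a header row into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        # Trim whitespace and skip empty or missing cells, so that blank
        # values do not end up as empty-string fields in the documents.
        doc = {
            key.strip(): value.strip()
            for key, value in row.items()
            if key is not None and value is not None and value.strip()
        }
        docs.append(doc)
    return docs
```

With the third-party pymongo package installed, the resulting list could be passed to a collection’s `insert_many` method to create the new collection. The database and collection names would be whatever you choose.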

Installed Kettle. Now I need to install a MongoDB driver and stick it in the libext folder (you can extract the Kettle zip installation anywhere; I’ve extracted it to C:\Kettle).
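A quick way to confirm the driver really landed in libext is to list the jars there. This is a small sketch under two assumptions: that the driver jar’s file name starts with "mongo", and that you pass in the libext path from your own install (the exact location under the extracted folder may differ). The function name is my own.

```python
from pathlib import Path


def find_mongo_driver_jars(libext_dir):
    """Return sorted names of .jar files in libext_dir starting with 'mongo'."""
    folder = Path(libext_dir)
    if not folder.is_dir():
        # Wrong path or Kettle not extracted here: nothing to report
        return []
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() == ".jar"
        and p.name.lower().startswith("mongo")
    )
```

An empty list means either the path is wrong or the jar has not been copied in yet.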

https://github.com/mongodb/mongo-java-driver/downloads
I’m basically following Matt Casters’ tutorial http://www.ibridge.be/?p=196, although that now seems a bit outdated and the functionality is now baked into PDI; see the updated demo http://wiki.pentaho.com/display/EAI/MongoDB+Input

So far, so good. Connected to my MongoDB, read in the collection.

Reading data out of my local MongoDB collection and into PDI for onward transformation etc.

But then I get an error in the flow. Looks like PDI can’t talk/write to MongoDB. Head-scratching…

...So near and yet so far!

Taking quite a liking to Kettle (PDI) and working my way through the 674 pages of the excellent http://www.amazon.co.uk/Pentaho-Kettle-Solutions-Integration-ebook/dp/B0042JSLWO/ref=dp_kinw_strp_1