MongoDB’s “Aggregation Framework”

In my earlier post I built up a very simple report using the MongoDB connector in Jaspersoft iReport.

Today I noticed you can natively query MongoDB from within Jaspersoft. See more here

Some examples;

In SQL:

SELECT *
  FROM users
  WHERE age > 25
   AND   age <= 50
  ORDER BY age

When using Mongo:

{
   collectionName : 'users',
   findQuery : { 
        age: { $gt: 25, $lte: 50 } 
   },
   sort : {
        'age': 1
   }
}

See: http://docs.mongodb.org/manual/reference/sql-comparison/

An examples of using MongoDB’s ‘Aggregation Framework‘ which I think as pre-computed views/aggregations/filtered results sets

In SQL:
SELECT cust_id, ord_date, SUM(price) AS total
FROM orders
GROUP BY cust_id, ord_date
HAVING total > 250
ORDER BY total DESC

When using Mongo:

{
   runCommand: {
        aggregate : 'orders',
        pipeline : [
                {
                        $group : {
                                _id : { cust_id: '$cust_id', ord_date: '$ord_date' }
                                total: { $sum : '$price' }
                        }
                },
                { 
                        $match: { 
                                total: { $gt: 250 } } 
                        }
                },
                {
                        $sort : {
                                total : -1
                        }
                }
        ]
}

This got me curious about the aggregation frameworkHere’s another couple of examples of how it might be applied.

Each document in this collection has the following form:

{
  "_id": "10280",
  "city": "NEW YORK",
  "state": "NY",
  "pop": 5574,
  "loc": [
    -74.016323,
    40.710537
  ]
}

In these documents:

  • The _id field holds the zipcode as a string.
  • The city field holds the city.
  • The state field holds the two letter state abbreviation.
  • The pop field holds the population.
  • The loc field holds the location as a latitude longitude pair.

To Calculate States with Populations Over 10 Million

db.zipcodes.aggregate( { $group :
                         { _id : "$state",
                           totalPop : { $sum : "$pop" } } },
                       { $match : {totalPop : { $gte : 10*1000*1000 } } } )

MongoDB iReport, @kchodorow books

MongoDB iReport

Yippee! I can now write data (manually at the moment!) into MongoDB and read it out into iReport!

Big thanks to @mattdahlman

Tomorrow I’m going to explore unetbootin (thanks to Pete Griffiths) and maybe reformat the old dell to be a Linux machine. I can’t help but think having 3 different O/S in the eventual cluster is going to be troublesome.

I’ve also ordered 2 more @kchodorow books ‘50 Tips and Tricks for MongoDB Developers

http://www.amazon.co.uk/Tips-Tricks-MongoDB-Developers-ebook/dp/B005011IIM/ref=dp_kinw_strp_1

& ‘Scaling MongoDB’ – which hopefully will help me get my head around clustering/sharding within MongoDB.

http://www.amazon.co.uk/Scaling-MongoDB-ebook/dp/B004LRPBD4/ref=dp_kinw_strp_1

MongoDB connection from Jaspersoft ‘iReport’

I have created a few DBs (largely copying pp 137 from the excellent ‘7 databases in 7 weeks’) and manually entered a dozen or so rows. The Mongo equivalent of select * from is dbfind()

http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&ved=0CCgQFjAB&url=http%3A%2F%2Fwww.amazon.co.uk%2FSeven-Databases-Weeks-Modern-Movement%2Fdp%2F1934356921&ei=wzGiUIGQO82W0QXwp4HAAg&usg=AFQjCNGDmifztGqAE-XoQzebWHDB3KjYVg

And I’ve managed to install Japsersoft iReport.

So far, so good.

But…When I create a datasource connection from within iReport to MongoDB I am asked for 3 things
1 MongoURI
2 Username
3 Password

The pre-filled default for 1 is ‘mongodb://HOST:27017/DB_NAME’
I assume I substitute DB_NAME for towns (my recently created database) – but this doesn’t work.
I also assume I can leave a username and password as blank?
– As show users (in Mongo shell) returns nothing (and I haven’t yet created any)

So, brick wall. Have raised a JIRA @ MongoDB, hopefully someone can help me out!