Chapter 12: Structured search

Combining Filters

SELECT product
FROM products
WHERE (price 20 OR productID = ‘XXYYZZ’)
AND (price !=30)

Bool Filter

The bool filter is comprised 3 sections

“bool” : {
“must”:          [ ],
“should”:       [ ],
“must_not”:   [ ],

-MUST: All of the clauses must match. The equivalent of AND
-SHOULD: At least ONE of the clauses must match. The equivalent of OR
-MUST_NOT: All of the clauses must NOT match. The equivalent of NOT


To replicate the preceding SQL example, we will take the two term filters that we used previously and place them inside the should clause of a bool filter, and add another clause to deal with the NOT condition:

GET /my_store/products/_search
   "query" : {
      "filtered" : { 
         "filter" : {
            "bool" : {
              "should" : [
                 { "term" : {"price" : 20}}, 
                 { "term" : {"productID" : "XHDK-A-1293-#fJ3"}} 
              "must_not" : {
                 "term" : {"price" : 30} 
Note that we still need to use a filtered query to wrap everything.
These two term filters are children of the bool filter, and since they are placed inside the should clause, at least one of them needs to match.
If a product has a price of 30, it is automatically excluded because it matches a must_notclause.

Our search results return two hits, each document satisfying a different clause in the bool filter:

"hits" : [
        "_id" :     "1",
        "_score" :  1.0,
        "_source" : {
          "price" :     10,
          "productID" : "XHDK-A-1293-#fJ3" 
        "_id" :     "2",
        "_score" :  1.0,
        "_source" : {
          "price" :     20, 
          "productID" : "KDKE-B-9947-#kL5"
Matches the term filter for productID = "XHDK-A-1293-#fJ3"
Matches the term filter for price = 20

SQL > Mongo?

I’ve previously talked about translating/moving data from Mongo to SQL, but how about the reverse of that?! 

data translator for moving your SQL data to MongoDB.

Mongify helps you move your data without worrying about the IDs or foreign IDs. It even allows you to embed your data into other documents.




What is mongoSQL?

mongoSQL is a free Mac, Windows, Linux/Unix and Web client UI application that talks directly to MongoDB by MongoDB, Inc.  It does not require any other server software other than MongoDB.  SQL scripts can be created, executed, saved and opened from the UI.  When executed, the SQL is translated behind the scenes into the appropriate MongoDB API calls.  Data can be inserted, updated, deleted and selected.  Selected documents are returned and displayed.  The user only has to work with familiar SQL syntax and not with the JSON and/or Javascript that the MongoDB shell and APIs require.  (Although for more expert MongoDB users, the SQL can be dissected and the corresponding JSON and/or Javascript will be displayed.)


Why mongoSQL?

Using SQL to query and filter unstructured data, such as that in Big Data databases like MongoDB, is sometimes debated.  Many people believe that SQL is not really the best tool.  While that is true sometimes, for many jobs it is definitely up to the task.  It carries with it some of the same limitations as MongoDB, such as no joins.  But it supports simple and complex SQL along with aggregation that can optionally use either MongoDB’s aggregation framework or map reduce.  There are no architectural changes necessary because mongoSQL is a client UI.

The most likely users of mongoSQL are people who:

– Are more comfortable with SQL than with writing JSON / Javascript when specifying conditions, aggregations, sorts, etc.

– Do not want to commit architecturally yet to additional Big Data tools.

– Are not proficient with the available Big Data tools or do not require much of the functionality of these tools.

– Want to use a friendly SQL UI.

– Need to vet and possibly modify MongoDB data directly.

– Want to translate SQL into the necessary JSON / Javascript required by the MongoDB shell or API.


“The Definitive Guide to MongoDB”

Just started reading this.

Looks like there are some useful chapters and sections, particularly:

3 The Data Model
4 Working with data
10 Optimisation
11 Replication
12 Sharding

“MongoDB is a new breed of database that has no concept of tables, schemas, SQL or rows. It doesn’t have transactions, ACID compliance, joins, FKs, or many of the other features…MongoDB is probably a very different DB to what you’re used to, especially if you’ve used an RDBMS in the past.”

“For many years, traditional relational DBs have been used for storing content of all types. It didn’t matter whether the data was a good fit for the relational model; the data was stuffed in there, anyway.”

“While using the DB as a storage bin works, developers always have to work against the flow. It’s usually obvious when we’re not using the DB the way it’s intended.” – usually results in multiple tables, multiple joins, complex and poor-nonperforming queries.

“MongoDB is not an ideal candidate for all situations…For example, its lack of transaction support means that you wouldn’t want to use MongoDB to write an accounting application..Once you’re comfortable with the idea that MongoDB may not solve all your problems, you will discover that there are certain problems that MongoDB is perfect for resolving, such as analytics (think a realtime Google Analytics for your website) and complex data structures (e.g blog posts and comments, Tweets).

“You don’t have to worry about squeezing your data into a table – just put the data together and pass it to MongoDB for handling.”

“Consider this real-world example. I needed to store a set of search results (up to 100) and I needed an easy way to associate the results with users in my DB.  Had I been using mySQL, I would have had to design a table to store the data (with potentially dozens of redundant columns), write code to insert the data, and then write code (joins) to piece it all back together again…In MongoDB I added this line of code:

request['search_results'] = search_results_array

In the preceding example,

request is my document
search_results is the key
search_results_array contains the results output

The second line saves my changes and writes to MongoDB.

Map/Reduce Queries in MongoDB – for relational dummies (like me!)

Thanks to my ex-colleague and Genius Dina Mohammad for sharing this with me.

And an example of the different query constructs in SQL vs NoSQL worlds, although the example uses Map/Reduce rather than native querying. I know which I prefer!

SELECT person, SUM(score), AVG(score), MIN(score), MAX(score), COUNT(*)
FROM demo
WHERE score > 0 AND person IN (‘bob’,’jake’)
GROUP BY person;{
“key”: {
“person”: true
“initial”: {
“sumscore”: 0,
“sumforaveragescore”: 0,
“countforaveragescore”: 0,
“minscore”: 0,
“maxscore”: 0,
“countstar”: 0
“reduce”: function(obj, prev) {
prev.sumscore += obj.score;
prev.sumforaveragescore += obj.score;
prev.minscore = Math.min(prev.minscore, obj.score);
prev.maxscore = Math.max(prev.maxscore, obj.score);
“finalize”: function(prev) {
prev.averagescore = prev.sumforaveragescore / prev.countforaveragescore;
delete prev.sumforaveragescore;
delete prev.countforaveragescore;
“cond”: {
“score”: {
“$gt”: 0
“person”: {
“$in”: [“bob”, “jake”]

MongoDB Semantics

Now, I need to begin losing my relational vocabulary and start talking Mongo !

Some useful additional background here:
The MongoDB query language is not SQL, but 10gen describes it as a simple, expressive language with a straightforward syntax for efficient querying. Examples of simple query statements include “sum,” “min,” “max,” and “average.” These sorts of operators would be familiar to any database veteran or analyst, and they’re applied in a real-time data-processing pipeline that delivers sub-second performance, according to 10gen.

Other available query statements include “project,” which is used to select desired attributes and ignore everything else. “Group” lets you combine results with desired attributes. “Match” is a filter than can be used to eliminate documents from a query. “Limit,” “skip” and “sort,” are statements used in much the same way they’re used in SQL: to limit a query to a desired number of results, to skip over a given number of results, and to sort results alphabetically, numerically or by some other value.

SQL Terms/Concepts MongoDB Terms/Concepts
database database
table collection
row document or BSON document
column field
index index
table joins embedded documents and linking
primary keySpecify any unique column or column combination as primary key. primary keyIn MongoDB, the primary key is automatically set to the _idfield.
aggregation (e.g. group by) aggregation frameworkSee the SQL to Aggregation Framework Mapping Chart.

And query operators

WHERE $match
GROUP BY $group
HAVING $match
SELECT $project
ORDER BY $sort
LIMIT $limit
SUM() $sum
COUNT() $sum
join No direct corresponding operator; however, the $unwind operator allows for somewhat similar functionality, but with fields embedded within the document.

+ Some example query statements

SQL                                                           MongoDB


FROM users
WHERE status = "A"
    { status: "A" } )
FROM users
WHERE status = "A"
OR age = 50
    { $or: [ { status: "A" } ,
             { age: 50 } ] } )
FROM users
WHERE age > 25
AND   age <= 50
   { age: { $gt: 25, $lte: 50 } } )

And some really helpful analytical scenarios here