SQL > Mongo?

I’ve previously talked about moving data from Mongo to SQL, but what about the reverse?

Mongify
A data translator for moving your SQL data to MongoDB.

Mongify helps you move your data without worrying about the IDs or foreign IDs. It even allows you to embed your data into other documents.

 


Taming Text

Just beginning ‘Taming Text – How to find, organize and manipulate it’

  1. Getting started and taming text
  2. Foundations of taming text
  3. Searching
  4. Fuzzy string matching
  5. Identifying people, places and things
  6. Clustering text
  7. Classification, categorization and tagging
  8. Building an example question answering system
  9. Untamed text: exploring the next frontier

MapReduce links

http://wiki.summercode.com/mongodb_aggregation_functions_and_ruby_map_reduce_basics

https://github.com/shimondoodkin/nodejs-mongodb-app/wiki/explanation-of-map-reduce

http://www.mongovue.com/2010/11/03/yet-another-mongodb-map-reduce-tutorial/#comment-117

http://cookbook.mongodb.org/patterns/pivot/

http://stackoverflow.com/questions/9337343/mongodb-map-reduce-tutorial

(Example) Lambda Architecture

[Figure: Lambda Architecture diagram]

Batch
   Hadoop
   Cassandra (as storage engine)
   MongoDB (as storage engine)
Serving
   ElephantDB
Speed
   Storm
   Fast retrieval K/V DB?
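
The usual idea behind these three layers is that a query combines a precomputed batch view with a small realtime view covering data that arrived since the last batch run. Below is a minimal sketch of that merge in plain JavaScript. The page-view counts are invented, and the names batchView, realtimeView, and queryPageViews are hypothetical; none of this is the API of Hadoop, ElephantDB, or Storm.

```javascript
// Batch view: precomputed by the batch layer, complete but stale.
const batchView = new Map([["/home", 1000], ["/about", 120]]);

// Realtime view: kept by the speed layer for data since the last batch run.
const realtimeView = new Map([["/home", 7], ["/pricing", 3]]);

// A query merges both views so that old and recent data are both counted.
function queryPageViews(url) {
  return (batchView.get(url) || 0) + (realtimeView.get(url) || 0);
}

console.log(queryPageViews("/home"));    // 1007
console.log(queryPageViews("/pricing")); // 3
```

The serving layer only needs fast lookups by key, which is presumably why a fast-retrieval K/V store is a natural candidate there.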

MongoImport

When running mongoimport, the options to use are:
--collection: the collection to create or import into
--jsonArray: tells mongoimport to expect multiple documents, all contained within a single array
--file: the file to be imported

An example of the mongoimport run:
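
As a hedged sketch, the options could be combined as below. The database name `test`, the collection name `zipcodes`, and the file name `zips.json` are illustrative assumptions, not values from an actual run, and the helper buildMongoimportCommand is hypothetical. The snippet only builds and prints the command string; it does not execute mongoimport.

```javascript
// Sketch only: assembles a mongoimport command line from the options above.
// The db, collection, and file values passed in below are placeholders.
function buildMongoimportCommand({ db, collection, file, jsonArray }) {
  const args = ["mongoimport", "--db", db, "--collection", collection];
  if (jsonArray) {
    // The input file holds one JSON array wrapping all documents.
    args.push("--jsonArray");
  }
  args.push("--file", file);
  return args.join(" ");
}

const command = buildMongoimportCommand({
  db: "test",
  collection: "zipcodes",
  file: "zips.json",
  jsonArray: true,
});

console.log(command);
// mongoimport --db test --collection zipcodes --jsonArray --file zips.json
```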

MongoDB Aggregation Framework

Lynn, John (2013-06-17). MongoDB Aggregation Framework Principles and Examples

What is the aggregation framework?

The aggregation framework was created to provide a “means to calculate aggregated values without having to use map-reduce”, according to the introduction in the MongoDB documentation at http://docs.mongodb.org/manual/core/aggregation/#overview. Not that there’s anything wrong with using map-reduce, but it’s often viewed as a complicated way to do a simple thing: calculate aggregate values. The aggregation framework simply offers an easier way to do this simple thing.

A simple example: US states with population over 10 million

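db.zipcodes.aggregate([
  // Stage 1: group the zip code documents by state and sum their populations.
  {$group: {_id: "$state", totalPop: {$sum: "$pop"}}},
  // Stage 2: keep only the states whose total is at least 10 million.
  {$match: {totalPop: {$gte: 10 * 1000 * 1000}}}
])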

The idea here is that the pipeline operators flow results from one operator to the next. In this example, the entire zipcodes collection is presented to the $group operator to work on, and the documents coming out of $group are then presented as input to $match.
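
To make the flow between stages concrete, here is a plain JavaScript sketch that mimics the two stages on an in-memory array, without a MongoDB server. The sample documents and their populations are made up for illustration, and the helpers groupByState and matchMinPop are hypothetical names, not MongoDB APIs.

```javascript
// Made-up sample documents shaped like the zipcodes collection.
const zipcodes = [
  { _id: "90001", state: "CA", pop: 6000000 },
  { _id: "94101", state: "CA", pop: 5000000 },
  { _id: "05001", state: "VT", pop: 600000 },
  { _id: "77001", state: "TX", pop: 12000000 },
];

// Mimics {$group: {_id: "$state", totalPop: {$sum: "$pop"}}}.
function groupByState(docs) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc.state, (totals.get(doc.state) || 0) + doc.pop);
  }
  return Array.from(totals, ([state, totalPop]) => ({ _id: state, totalPop }));
}

// Mimics {$match: {totalPop: {$gte: minPop}}}.
function matchMinPop(docs, minPop) {
  return docs.filter((doc) => doc.totalPop >= minPop);
}

// The output of the first stage is the input of the second.
const grouped = groupByState(zipcodes);
const bigStates = matchMinPop(grouped, 10 * 1000 * 1000);

console.log(grouped.length); // 3 (one document per state)
console.log(bigStates);
// [ { _id: 'CA', totalPop: 11000000 }, { _id: 'TX', totalPop: 12000000 } ]
```

Note that $match never sees the original zip code documents here, only the per-state totals that $group produced.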

$skip
// Skip 1 document, starting from the beginning of the set of documents, and output the rest.
// This outputs all but the first document, which may be a lot of output.
printjson(db.zipcodes.aggregate([{$skip: 1}]).toArray())

$unwind
In our northwind sample collection of customer orders, notice that the line items in each order are contained in an array called orderItems. Each item in this array is a sub-document with the product category, supplier, product name, and pricing information contained in unitPrice, as well as the quantity of the item ordered. A common aggregation is calculating the order total: the total for each order, based on its line items.
db.northwind.aggregate([
  // Select the single order, then emit one document per line item.
  {$match: {"orderId": 10253}},
  {$unwind: "$orderItems"}
])

Now that the document for order 10253 has been exploded into 3 documents by $unwind, a simple $group operator can be utilized to get an order total.

db.northwind.aggregate([
  {$match: {"orderId": 10253}},
  {$unwind: "$orderItems"},
  // Sum unitPrice across the unwound line items of the order.
  {$group: {_id: "$orderId", "OrderTotal": {$sum: "$orderItems.unitPrice"}}}
])
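
The effect of $unwind followed by $group can be sketched in plain JavaScript on a single in-memory document. Only the order ID 10253 and the orderItems array name come from the text above; the products, prices, and quantities are invented, and unwindOrderItems and groupOrderTotal are hypothetical helper names.

```javascript
// Made-up order document with three line items.
const order = {
  orderId: 10253,
  orderItems: [
    { productName: "Item A", unitPrice: 10, quantity: 2 },
    { productName: "Item B", unitPrice: 16, quantity: 1 },
    { productName: "Item C", unitPrice: 4, quantity: 5 },
  ],
};

// Mimics {$unwind: "$orderItems"}: one output document per array element,
// with the array field replaced by that single element.
function unwindOrderItems(doc) {
  return doc.orderItems.map((item) => ({ ...doc, orderItems: item }));
}

// Mimics {$group: {_id: "$orderId", OrderTotal: {$sum: "$orderItems.unitPrice"}}}.
function groupOrderTotal(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const sum = totals.get(doc.orderId) || 0;
    totals.set(doc.orderId, sum + doc.orderItems.unitPrice);
  }
  return Array.from(totals, ([orderId, OrderTotal]) => ({ _id: orderId, OrderTotal }));
}

const unwound = unwindOrderItems(order);
const orderTotals = groupOrderTotal(unwound);

console.log(unwound.length); // 3
console.log(orderTotals);    // [ { _id: 10253, OrderTotal: 30 } ]
```

Like the pipeline above, this sketch sums unitPrice only and does not weight each line by its quantity.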

$project
The $project operator is used to “re-shape” the documents, inserting and deleting fields or whole sub-documents; creating computed values using existing fields or constants as input; conditionally including parts of documents, and more. Using $project, you can do some very creative manipulation of the documents in the pipeline.
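
As an illustration of a computed value, the sketch below shows a $project stage that could follow $unwind in the northwind example, next to a plain JavaScript equivalent that runs without a server. The field name `quantity` is an assumption based on the description of orderItems, the sample document is invented, and projectLineTotal is a hypothetical helper.

```javascript
// A $project stage as it might be written for unwound order documents.
// Assumes each line item has unitPrice and quantity fields.
const projectStage = {
  $project: {
    _id: 0,       // drop the _id field
    orderId: 1,   // keep orderId unchanged
    lineTotal: {  // computed field built from two existing fields
      $multiply: ["$orderItems.unitPrice", "$orderItems.quantity"],
    },
  },
};

// Plain JavaScript equivalent of that stage for one unwound document.
function projectLineTotal(doc) {
  return {
    orderId: doc.orderId,
    lineTotal: doc.orderItems.unitPrice * doc.orderItems.quantity,
  };
}

// Made-up unwound document for illustration.
const unwoundDoc = {
  _id: 1,
  orderId: 10253,
  orderItems: { productName: "Item A", unitPrice: 10, quantity: 2 },
};

const projected = projectLineTotal(unwoundDoc);
console.log(projected); // { orderId: 10253, lineTotal: 20 }
```

In a real pipeline, projectStage would be placed after the $unwind stage in the array passed to aggregate.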