MongoDB Applied Design Patterns

I’ve just started reading the O’Reilly book by Rick Copeland.

I’m hoping to get some ideas for my project from Chapter 4: ‘Operational Intelligence’, Chapter 6: ‘CMSs’ and Chapter 8: ‘Social Networking’.

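By page 12 I had learned a new word, arity: the number of arguments that a function can take. Applied to data modelling, a high-arity relationship is one where a single document is related to a great many others.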

Chapter one considers the pros and cons of referencing vs embedding. There’s some further info here and here.

The chapter concludes by stating: “Schema design in MongoDB tends to be more of an art than science. The two largest benefits to embedding subdocs are data locality within a document and the ability of MongoDB to make atomic updates to a document. Weighing against these benefits is a reduction in flexibility as you have pre-joined your documents, as well as potential for problems if you have a high-arity relationship.”
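
To make that trade-off a bit more concrete, here is a minimal Python sketch that uses plain dictionaries as stand-ins for MongoDB documents. The blog-post and comment data, and the helper names (comments_embedded, comments_referenced, approx_size), are my own hypothetical choices, not from the book. It only illustrates the shape of the two models and how an embedded array grows in a high-arity relationship.

```python
import json

# Embedded model: comments live inside the post document itself.
embedded_post = {
    "_id": 1,
    "title": "Schema design",
    "comments": [{"author": "ann", "text": "Nice"}],
}

# Referenced model: comments are separate documents that point back via post_id.
referenced_post = {"_id": 1, "title": "Schema design"}
referenced_comments = [{"_id": 100, "post_id": 1, "author": "ann", "text": "Nice"}]


def comments_embedded(post):
    """One document read returns the post and all its comments (data locality)."""
    return post["comments"]


def comments_referenced(post, comments):
    """A second lookup is needed: the application-side equivalent of a join."""
    return [c for c in comments if c["post_id"] == post["_id"]]


def approx_size(doc):
    """Rough size proxy: length of the JSON text (MongoDB itself measures BSON bytes)."""
    return len(json.dumps(doc))


# High-arity relationship: every new comment makes the embedded document bigger.
size_before = approx_size(embedded_post)
for i in range(1000):
    embedded_post["comments"].append({"author": f"user{i}", "text": "..."})
size_after = approx_size(embedded_post)

print(len(comments_embedded(embedded_post)))
print(len(comments_referenced(referenced_post, referenced_comments)))
print(size_after > size_before)
```

In the embedded model, adding a comment changes only one document, which is the case MongoDB can update atomically. The cost is that the document keeps growing, and MongoDB caps a single document at 16 MB, so a very high-arity relationship is usually a hint to reference instead.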



Embedding vs. Referencing Information in Documents
From The Definitive Guide to MongoDB (Plugge, Membrey & Hawkins)

You can choose either to embed information into a document or reference that information in another document. Embedding information simply means that you place a certain type of data (e.g., an array containing more data) into the document itself.

Referencing information simply means that you create a reference back to another document that contains that specific data.

Typically, you reference information when you use a relational database.

If you used an RDBMS to model your collection of CDs, DVDs and books, you’d probably have one table for your CD collection and another table that stores the tracklists of your CDs. You’d need to join across more than one table to get a list of tracks from a specific CD.

With MongoDB, however, it would be much easier to embed such information instead. This keeps your database nice and tidy, ensures that all related information is kept in one single document, and even works much faster because the data is then co-located on the disk.

In the relational approach, your data structure might look something like this:

CD collection table [columns]: id, artist, title, genre, releasedate

Tracklist table [columns]: cd_id, songtitle, length
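
As a rough illustration of the join the relational layout requires, here is a small sketch using Python’s built-in sqlite3 module with an in-memory database. The column names come from the listing above; the table names (cds, cd_tracklists), the column types and the tracks_for helper are assumptions of mine.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cds (
        id INTEGER PRIMARY KEY,
        artist TEXT, title TEXT, genre TEXT, releasedate TEXT
    );
    CREATE TABLE cd_tracklists (
        cd_id INTEGER REFERENCES cds(id),
        songtitle TEXT, length TEXT
    );
""")

conn.execute(
    "INSERT INTO cds VALUES (1, 'One Direction', 'Nevermind', 'Kids!!!', '')"
)
conn.executemany(
    "INSERT INTO cd_tracklists VALUES (?, ?, ?)",
    [(1, "Live While We're Young", "3:20"), (1, "Kiss You", "3:03")],
)


def tracks_for(conn, cd_title):
    """Return (songtitle, length) rows for one CD; this needs a join across both tables."""
    return conn.execute(
        """
        SELECT t.songtitle, t.length
        FROM cds AS c
        JOIN cd_tracklists AS t ON t.cd_id = c.id
        WHERE c.title = ?
        ORDER BY t.rowid
        """,
        (cd_title,),
    ).fetchall()


print(tracks_for(conn, "Nevermind"))
```

The point is only that the track list cannot be read from the CD row alone: the query has to match cd_id against id every time.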

In the non-relational (NoSQL) approach, the document might look something like the following:

{
    "Type": "CD",
    "Artist": "One Direction",
    "Title": "Nevermind",
    "Genre": "Kids!!!",
    "Release date": "",
    "Tracklist": [
        {
            "Track": "1",
            "Title": "Live While We’re Young",
            "Length": "3:20"
        },
        {
            "Track": "2",
            "Title": "Kiss You",
            "Length": "3:03"
        },
        {
            "Track": "3",
            "Title": "Unknown",
            "Length": "m:ss"
        }
    ]
}
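
For comparison with the relational version, here is the same CD written as a Python dictionary, which is roughly how a driver would hand the document back to you. This is only a sketch: no database is involved, and the track_titles helper is a hypothetical name of mine.

```python
import json

# The whole CD, tracklist included, is one self-contained document.
cd = {
    "Type": "CD",
    "Artist": "One Direction",
    "Title": "Nevermind",
    "Genre": "Kids!!!",
    "Release date": "",
    "Tracklist": [
        {"Track": "1", "Title": "Live While We're Young", "Length": "3:20"},
        {"Track": "2", "Title": "Kiss You", "Length": "3:03"},
        {"Track": "3", "Title": "Unknown", "Length": "m:ss"},
    ],
}


def track_titles(doc):
    """Read the track titles straight from the embedded array; no join needed."""
    return [track["Title"] for track in doc["Tracklist"]]


print(track_titles(cd))

# The document survives a round trip through JSON text unchanged.
print(json.loads(json.dumps(cd)) == cd)
```

Once the document is in hand, the tracklist is just a key lookup away, which is the data-locality benefit described earlier. With a driver such as PyMongo, a dictionary like this could be passed to a collection’s insert_one method, though I haven’t wired that up here.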