MongoDB Applied Design Patterns

Just started reading the O’Reilly book by Rick Copeland

I’m hoping to get some ideas for my project from Chapter 4: ‘Operational Intelligence’, Chapter 6: ‘CMSs’ and Chapter 8: ‘Social Networking’.

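By page 12 I had learned a new word, arity: the number of arguments that a function can take. In the book it is applied to relationships, where it seems to mean how many items sit on the ‘many’ side.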

Chapter one considers the pros and cons of referencing vs embedding. There’s some further info here and here.

The chapter concludes by stating: “Schema design in MongoDB tends to be more of an art than science.

The two largest benefits to embedding subdocs are data locality within a document and the ability of MongoDB to make atomic updates to a document.

Weighing against these benefits is a reduction in flexibility as you have pre-joined your documents, as well as potential for problems if you have a high-arity relationship.”
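To make the trade-off a bit more concrete, here is a minimal sketch in Python using plain dictionaries rather than a live MongoDB server. The blog-post and comment documents, the field names and the `comments_for` helper are all hypothetical illustrations of mine, not examples taken from the book.

```python
# Hypothetical documents modelled as plain Python dicts (no MongoDB server needed).

# Embedded: the comments live inside the post, so a single read returns
# everything and a change to the post and its comments touches one document.
embedded_post = {
    "_id": 1,
    "title": "Schema design",
    "comments": [
        {"author": "ann", "text": "Nice post"},
        {"author": "bob", "text": "Thanks"},
    ],
}

# Referenced: comments are separate documents that point back at the post.
referenced_post = {"_id": 1, "title": "Schema design"}
comments = [
    {"_id": 10, "post_id": 1, "author": "ann", "text": "Nice post"},
    {"_id": 11, "post_id": 1, "author": "bob", "text": "Thanks"},
]


def comments_for(post_id, all_comments):
    """Application-side 'join': a second lookup that the embedded form avoids."""
    return [c for c in all_comments if c["post_id"] == post_id]


# With embedding, the parent document grows with every related item, which is
# where a high-arity relationship (say, thousands of comments) starts to hurt.
embedded_count = len(embedded_post["comments"])
referenced_count = len(comments_for(referenced_post["_id"], comments))
```

Both shapes hold the same information; the difference is where the cost lands. MongoDB also caps a single document at 16 MB, so an embedded array cannot grow without bound.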


MongoDB – embedded documents

Suppose we have a document representing a person and are later given his or her address information. We have a design choice: create two separate documents, or link the two by nesting the secondary data (address) inside the primary document (person).

{
    "name": "Stuart Hayes",
    "address": {
        "street": "Addison Road",
        "county": "Kent",
        "country": "UK"
    }
}
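For comparison, the other option (two separate documents) might look something like the sketch below, again using plain Python dicts. The `_id` values, the `person_id` reference field and the `embed` helper are my own assumptions for illustration.

```python
# Alternative to embedding: two separate documents linked by a reference.
# The "_id" values and the "person_id" field are assumptions for illustration.
person = {"_id": 1, "name": "Stuart Hayes"}
address = {
    "_id": 100,
    "person_id": 1,  # hypothetical reference back to the person document
    "street": "Addison Road",
    "county": "Kent",
    "country": "UK",
}


def embed(person_doc, address_doc):
    """Fold the address into the person, giving the nested shape shown above."""
    nested = {
        key: value
        for key, value in address_doc.items()
        if key not in ("_id", "person_id")  # drop the linking fields
    }
    return {"name": person_doc["name"], "address": nested}


combined = embed(person, address)
```

Here `combined` has the same shape as the nested document, which is the sense in which embedding ‘pre-joins’ the data.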

We could even use an array for ‘horizontal’ data, such as several phone numbers:

{
    "name": "Stuart Hayes",
    "address": {
        "street": "Addison Road",
        "county": "Kent",
        "country": "UK",
        "phone numbers": ["07813 241 390", "home not provided", null]
    }
}
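MongoDB reaches into embedded documents with dot notation such as `address.county`. The sketch below mimics that idea on a plain Python dict; `get_path` is a hypothetical helper of mine, not a MongoDB or driver function, and it only walks nested dictionaries.

```python
def get_path(doc, dotted_path):
    """Walk a nested dict using a dot-separated path, e.g. 'address.county'."""
    current = doc
    for key in dotted_path.split("."):
        current = current[key]
    return current


person = {
    "name": "Stuart Hayes",
    "address": {
        "street": "Addison Road",
        "county": "Kent",
        "country": "UK",
        # JSON null becomes None in Python
        "phone numbers": ["07813 241 390", "home not provided", None],
    },
}

county = get_path(person, "address.county")
phones = get_path(person, "address.phone numbers")
first_phone = phones[0]
```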

You can begin to feel how working with embedded documents and denormalising may provide a more natural representation of the data. In an RDBMS, these would probably be modelled as two separate rows in two separate tables (tbl_people, tbl_addresses), although in the eventual presentation layer they may be flattened into a star-schema. I’m wondering whether this sort of embedded model wouldn’t work well as that final presentation layer, as a sort of flattened structure.

The flip-side of this is more data repetition within MongoDB. In the normalised world, if an address changed and we processed the change in the 3NF database, then joining tbl_people and tbl_addresses would return the updated address for everyone living at that address. In this embedded example, we’d have to update the address in each person’s document.
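A rough sketch of that difference, once more with plain Python dicts standing in for tables and collections. The second resident, the new street name and the write counters are made up for illustration.

```python
base_address = {"street": "Addison Road", "county": "Kent", "country": "UK"}

# Normalised: people point at one shared address record by id.
addresses = {100: dict(base_address)}
people_normalised = [
    {"name": "Stuart Hayes", "address_id": 100},
    {"name": "Jane Hayes", "address_id": 100},  # hypothetical second resident
]

# Embedded: every person carries their own copy of the address.
people_embedded = [
    {"name": "Stuart Hayes", "address": dict(base_address)},
    {"name": "Jane Hayes", "address": dict(base_address)},
]

# Normalised update: change the single shared record.
addresses[100]["street"] = "Station Road"
normalised_writes = 1

# Embedded update: every matching person document has to be rewritten.
embedded_writes = 0
for person in people_embedded:
    if person["address"]["street"] == "Addison Road":
        person["address"]["street"] = "Station Road"
        embedded_writes += 1

# The 'join' in the normalised model picks up the change for everyone.
joined_streets = [
    addresses[p["address_id"]]["street"] for p in people_normalised
]
```

Against a real server the embedded case would probably be issued as one multi-document update (for example PyMongo’s `update_many`), but it would still modify every matching document.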

A post on querying within arrays is here