(Some) Sharding options

  1. Shard by time
  2. Shard by a semi-random key
  3. Shard by an evenly distributed key in the data set
  4. Shard by combining a natural and synthetic key

Each option has pros and cons. With option #1, for example, all inserts arriving at about the same time carry nearly the same timestamp and so flow to the same shard, which becomes a write hotspot. Most reads will tend to cluster on that same shard too, assuming users query recent data more frequently.
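To make the hotspot concrete, here is a minimal sketch comparing option #1 with a hashed key. Everything in it is an assumption for illustration: the four-shard cluster, the one-shard-per-day rule, the `user-N` keys and the function names are all hypothetical, and no particular database works exactly this way.

```python
import hashlib
from collections import Counter
from datetime import datetime, timedelta

NUM_SHARDS = 4  # hypothetical cluster size


def shard_by_time(ts: datetime) -> int:
    # Option 1 (assumed scheme): each calendar day maps to one shard.
    return ts.toordinal() % NUM_SHARDS


def shard_by_hash(key: str) -> int:
    # Hash the key so that neighbouring keys land on unrelated shards.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# 1000 inserts, one per second, all on the same day (made-up workload).
start = datetime(2024, 1, 1, 12, 0, 0)
events = [(f"user-{i}", start + timedelta(seconds=i)) for i in range(1000)]

time_counts = Counter(shard_by_time(ts) for _, ts in events)
hash_counts = Counter(shard_by_hash(user) for user, _ in events)

print("by time:", dict(time_counts))  # every insert lands on a single shard
print("by hash:", dict(hash_counts))  # inserts are spread across the shards
```

The trade-off runs the other way for range queries: with the time-based rule a query for "today" touches one shard, while with the hashed key it has to ask every shard.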
