Ch3: Data In, Data Out (inverted indexes)

In the real world, not all entities of the same type looks the same. One person might just have a home telephone, another a cell # and another both, another none of these. In the RDBMS world, each entities demands its own column, or to be modelled as a KV pair. This leads to waste and redundancy. Brazillian people may have 10+ family names, a Brit one.

The problem comes when we need to store these entities. Traditionally, this as been accomplished through a RDBMS with columns and rows.

Of course, we don’t only need to store the data, we need to query, use the data. While noSQL solutons (eg MongoDB) exist that allow us to store objects as documents, they still require us to think about how we want to query our data, and which fields require an index to speed up retrieval.

In ES, all data in every field is indexed by default. Every field has a dedicated inverted index and unlike most other DBs, it can use all of those inverted indices in the same query

What is an inverted index?

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s