Ch3: Data In, Data Out (inverted indexes)

In the real world, not all entities of the same type looks the same. One person might just have a home telephone, another a cell # and another both, another none of these. In the RDBMS world, each entities demands its own column, or to be modelled as a KV pair. This leads to waste and redundancy. Brazillian people may have 10+ family names, a Brit one.

The problem comes when we need to store these entities. Traditionally, this as been accomplished through a RDBMS with columns and rows.

Of course, we don’t only need to store the data, we need to query, use the data. While noSQL solutons (eg MongoDB) exist that allow us to store objects as documents, they still require us to think about how we want to query our data, and which fields require an index to speed up retrieval.

In ES, all data in every field is indexed by default. Every field has a dedicated inverted index and unlike most other DBs, it can use all of those inverted indices in the same query

What is an inverted index?