Routing a Document to a Shard
When you index a document, it is stored on a single primary shard. How does Elasticsearch know which shard a document belongs to? When we create a new document, how does it know whether it should store that document on shard 1 or shard 2?
The process can't be random, since we may need to retrieve the document in the future. In fact, it is determined by a very simple formula:
shard = hash(routing) % number_of_primary_shards
The routing value is an arbitrary string, which defaults to the document's _id but can also be set to a custom value. This routing string is passed through a hashing function to generate a number, which is divided by the number of primary shards in the index to return the remainder. The remainder will always be in the range 0 to number_of_primary_shards - 1, and gives us the number of the shard where a particular document lives.
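To make the formula concrete, here is a minimal sketch in Python. The helper name shard_for is hypothetical, and zlib.crc32 is only a stand-in for the hashing function, chosen because it is deterministic across runs; it is not claimed to be the hash that Elasticsearch uses internally.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    """Return the shard number for a routing string.

    Assumption: zlib.crc32 stands in for the real hashing function.
    """
    hashed = zlib.crc32(routing.encode("utf-8"))
    # The remainder always falls in 0 .. number_of_primary_shards - 1.
    return hashed % number_of_primary_shards


# By default the routing value is the document's _id.
for doc_id in ["1", "2", "user-42"]:
    print(doc_id, "-> shard", shard_for(doc_id, 5))
```

Because the hash is deterministic, the same routing value always yields the same shard number, which is what allows the document to be found again later.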
This explains why the number of primary shards can only be set when an index is created and never changed: if the number of primary shards ever changed in the future, all previous routing values would be invalid and documents would never be found.
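The effect of changing the shard count can be illustrated with a small experiment. This is only a sketch: shard_for is a hypothetical helper, zlib.crc32 is an assumed stand-in for the real hashing function, and the shard counts 5 and 6 are arbitrary.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    # Assumption: crc32 stands in for the real hashing function.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


# Pretend 1,000 documents were indexed with 5 primary shards.
doc_ids = [str(i) for i in range(1000)]

# Recompute each document's shard as if the index now had 6 primary shards.
moved = [d for d in doc_ids if shard_for(d, 5) != shard_for(d, 6)]

print(f"{len(moved)} of {len(doc_ids)} documents would be looked up on the wrong shard")
```

Any document in the moved list is still stored on its original shard, but a lookup using the new shard count would go to a different shard and fail to find it.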
All document APIs (get, index, delete, bulk, update, and mget) accept a routing parameter that can be used to customize the document-to-shard mapping. A custom routing value could be used to ensure that all related documents (for instance, all the documents belonging to the same user) are stored on the same shard. We discuss in detail why you may want to do this in <>.