=== Mapping
As explained in <>, each document in an index has a type. Every type has its own mapping or schema definition. A mapping defines the fields within a type, the datatype for each field, and how the field should be handled by Elasticsearch. A mapping is also used to configure metadata associated with the type.

We discuss mappings in detail in <>. In this section we're going to look at just enough to get you started.
[[core-fields]]
==== Core simple field types
Elasticsearch supports the following simple field types:
[horizontal]
String::         `string`
Whole number::   `byte`, `short`, `integer`, `long`
Floating point:: `float`, `double`
Boolean::        `boolean`
Date::           `date`
When you index a document that contains a new field (one not previously seen), Elasticsearch will use <> to try to guess the field type from the basic datatypes available in JSON, using the following rules:
[horizontal]
JSON type::                          Field type
Boolean: `true` or `false`::         `"boolean"`
Whole number: `123`::                `"long"`
Floating point: `123.45`::           `"double"`
String, valid date: `"2014-09-15"`:: `"date"`
String: `"foo bar"`::                `"string"`
NOTE: This means that, if you index a number in quotes (`"123"`), it will be mapped as type `"string"`, not type `"long"`. However, if the field is already mapped as type `"long"`, then Elasticsearch will try to convert the string into a long, and throw an exception if it can't.
==== Viewing the mapping
We can view the mapping that Elasticsearch has for one or more types in one or more indices using the `/_mapping` endpoint. At the <> we already retrieved the mapping for type `tweet` in index `gb`:
[source,js]
GET /gb/_mapping/tweet
This shows us the mapping for the fields (called _properties_) that Elasticsearch generated dynamically from the documents that we indexed:
[source,js]
{
   "gb": {
      "mappings": {
         "tweet": {
            "properties": {
               "date": {
                  "type": "date",
                  "format": "dateOptionalTime"
               },
               "name": {
                  "type": "string"
               },
               "tweet": {
                  "type": "string"
               },
               "user_id": {
                  "type": "long"
               }
            }
         }
      }
   }
}
[TIP]
Incorrect mappings, such as having an `age` field mapped as type `string` instead of `integer`, can produce confusing results for your queries.
Instead of assuming that your mapping is correct, check it!
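
One way to make that check routine is to flatten the mapping response into a simple field-to-type table. The sketch below assumes the response has exactly the shape shown above (index, then `mappings`, then type, then `properties`); the helper name `field_types` is hypothetical, and the response is pasted in as a string rather than fetched from a cluster.

```python
import json

# The mapping response from above, embedded as text so the example is self-contained.
response = json.loads("""
{
  "gb": {
    "mappings": {
      "tweet": {
        "properties": {
          "date":    { "type": "date", "format": "dateOptionalTime" },
          "name":    { "type": "string" },
          "tweet":   { "type": "string" },
          "user_id": { "type": "long" }
        }
      }
    }
  }
}
""")


def field_types(mapping_response, index, doc_type):
    """Return {field name: mapped type} for one type in one index."""
    properties = mapping_response[index]["mappings"][doc_type]["properties"]
    return {name: definition.get("type") for name, definition in properties.items()}


types = field_types(response, "gb", "tweet")
for name in sorted(types):
    print(name, "->", types[name])
```

A table like this makes it easy to spot a field such as `age` that was mapped as `string` when you expected `integer`.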
[[custom-field-mappings]]
==== Customizing field mappings
The most important attribute of a field is the `type`. For fields other than `string` fields, you will seldom need to map anything other than `type`:
[source,js]
{
    "number_of_clicks": {
        "type": "integer"
    }
}
Fields of type `"string"` are, by default, considered to contain full text. That is, their value will be passed through an analyzer before being indexed, and a full text query on the field will pass the query string through an analyzer before searching.
The two most important mapping attributes for `string` fields are `index` and `analyzer`.
===== index
The `index` attribute controls how the string will be indexed. It can contain one of three values:
[horizontal]
`analyzed`::     First analyze the string, then index it. In other words, index this field as full text.
`not_analyzed`:: Index this field, so it is searchable, but index the value exactly as specified. Do not analyze it.
`no`::           Don't index this field at all. This field will not be searchable.
The default value of `index` for a `string` field is `analyzed`. If we want to map the field as an exact value, then we need to set it to `not_analyzed`:
[source,js]
{
    "tag": {
        "type":     "string",
        "index":    "not_analyzed"
    }
}
The other simple types (such as `long`, `double`, and `date`) also accept the `index` parameter, but the only relevant values are `no` and `not_analyzed`, as their values are never analyzed.
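
To make the three `index` values concrete, here is a toy model of which terms end up in the index for a string value. It is a sketch only: `index_terms` is a hypothetical helper, and the stand-in "analyzer" (lowercase, then split on anything that is not an ASCII letter or digit) is an assumption that is much cruder than any real Elasticsearch analyzer.

```python
import re


def index_terms(value, index="analyzed"):
    """Return the terms a toy index would store for `value` under each `index` setting."""
    if index == "analyzed":
        # Stand-in analyzer (assumption): lowercase, then keep runs of ASCII letters and digits.
        return re.findall(r"[a-z0-9]+", value.lower())
    if index == "not_analyzed":
        # The exact value is stored as a single term.
        return [value]
    if index == "no":
        # Nothing is indexed, so nothing can match a search.
        return []
    raise ValueError("unknown index setting: %r" % (index,))


for setting in ("analyzed", "not_analyzed", "no"):
    print(setting, index_terms("Quick Brown-Fox", setting))
```

The point of the sketch is the contrast: an `analyzed` field is searchable by its individual words, a `not_analyzed` field only by its exact value, and a field with `index` set to `no` not at all.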
===== analyzer
For `analyzed` string fields, use the `analyzer` attribute to specify which analyzer to apply both at search time and at index time. By default, Elasticsearch uses the `standard` analyzer, but you can change this by specifying one of the built-in analyzers, such as `whitespace`, `simple`, or `english`:
[source,js]
{
    "tweet": {
        "type":     "string",
        "analyzer": "english"
    }
}
In <> we will show you how to define and use custom analyzers as well.
==== Updating a mapping
You can specify the mapping for a type when you first create an index. Alternatively, you can add the mapping for a new type (or update the mapping for an existing type) later, using the `/_mapping` endpoint.
[IMPORTANT]
While you can _add_ to an existing mapping, you can't _change_ it. If a field already exists in the mapping, then it probably means that data from that field has already been indexed. If you were to change the field mapping, then the already indexed data would no longer match the mapping and would not be properly searchable.
We can update a mapping to add a new field, but we can't change an existing field from `analyzed` to `not_analyzed`.
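
The add-but-don't-change rule can be modeled as a small merge function. This is a simplified illustration of the rule as stated here, not the logic Elasticsearch runs internally; `merge_properties` is a hypothetical name, and the model treats any difference in an existing field's definition as a conflict.

```python
def merge_properties(existing, update):
    """Return a new properties dict with `update` merged in, refusing to alter existing fields."""
    merged = dict(existing)  # shallow copy, so the caller's mapping is left untouched
    for field, definition in update.items():
        if field in merged and merged[field] != definition:
            raise ValueError("cannot change the mapping of existing field %r" % (field,))
        merged[field] = definition
    return merged


existing = {"tweet": {"type": "string", "analyzer": "english"}}

# Adding a new field is accepted.
merged = merge_properties(existing, {"tag": {"type": "string", "index": "not_analyzed"}})
print(sorted(merged))

# Changing an existing field from analyzed to not_analyzed is rejected.
try:
    merge_properties(existing, {"tweet": {"type": "string", "index": "not_analyzed"}})
except ValueError as error:
    print("rejected:", error)
```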
To demonstrate both ways of specifying mappings, let's first delete the `gb` index:
[source,sh]
DELETE /gb
// SENSE: 052_Mapping_Analysis/45_Mapping.json
Then create a new index, specifying that the `tweet` field should use the `english` analyzer:
[source,js]
PUT /gb <1>
{
  "mappings": {
    "tweet" : {
      "properties" : {
        "tweet" : {
          "type" :    "string",
          "analyzer": "english"
        },
        "date" : {
          "type" :   "date"
        },
        "name" : {
          "type" :   "string"
        },
        "user_id" : {
          "type" :   "long"
        }
      }
    }
  }
}
// SENSE: 052_Mapping_Analysis/45_Mapping.json
<1> This creates the index with the `mappings` specified in the body.

Later on, we decide to add a new `not_analyzed` text field called `tag` to the `tweet` mapping, using the `_mapping` endpoint:

[source,js]
PUT /gb/_mapping/tweet
{
  "properties" : {
    "tag" : {
      "type" :    "string",
      "index":    "not_analyzed"
    }
  }
}
// SENSE: 052_Mapping_Analysis/45_Mapping.json

Note that we didn't need to list all of the existing fields again, as we can't change them anyway. Our new field has been merged into the existing mapping.

==== Testing the mapping

You can use the `analyze` API to test the mapping for string fields by name. Compare the output of these two requests:

[source,js]
GET /gb/_analyze?field=tweet
Black-cats <1>

GET /gb/_analyze?field=tag
Black-cats <1>
// SENSE: 052_Mapping_Analysis/45_Mapping.json
<1> The text we want to analyze is passed in the body.

The `tweet` field produces the two terms `"black"` and `"cat"`, while the `tag` field produces the single term `"Black-cats"`. In other words, our mapping is working correctly.