# 管理单元

ES 的管理单元从上到下依次分为：

index
type（已弃用）
document

# index

ES 中可以创建多个索引（index）。
index 名称可以包含 [0-9a-z._-] 字符，只能以字母或数字开头。

# 增

相关 API ：

PUT   /<index>  [request_body]    # 创建索引，可以在 request_body 中加上配置信息，如果不加则使用索引模板的默认配置

例：直接创建索引，这会采用默认配置

[root@CentOS ~]# curl -X PUT 127.0.0.1:9200/test_log?pretty

如果创建成功，则会收到以下响应：

{
  "acknowledged" : true,          # 表示操作成功
  "shards_acknowledged" : true,   # 表示已经启动分片
  "index" : "test_log"            # 索引名
}

如果请求创建的索引已存在，则会返回 HTTP 400 。

例：在创建索引时附带配置信息

PUT /test_log
{
  "aliases": {},
  "mappings": {
    "properties": {
      "pid": {
        "type": "long"
      },
      "level": {
        "type": "keyword",
        "ignore_above": 256
      },
      "message": {
        "type": "text"
      }
    }
  },
  "settings": {
    "index": {
      "refresh_interval": "1s",
      "number_of_shards": "1",
      "number_of_replicas": "0"
    }
  }
}

index 的配置主要分为 aliases、mappings、settings 三种。
创建 index 时，未指定的配置参数，会继承 index template 中的默认配置。

# 查

相关 API ：

GET   /<index>                  # 查询索引的配置信息，包括 alias、mappings、settings
GET   /<index>/_alias
GET   /<index>/_mapping
GET   /<index>/_settings
GET   /<index>/_stats           # 查询索引的统计信息

HEAD  /<index>                  # 检查索引是否存在。如果查询的所有索引都存在，则返回 HTTP 200 ，否则返回 HTTP 404
GET   /_resolve/index/<name>    # 解析一个名称，返回与之匹配的所有的 indices、aliases、data_streams

<index> 可以填一个或多个索引名，语法参考 index pattern 。

# 改

相关 API ：

PUT   /<index>/_alias     <request_body>  # 修改索引的别名
PUT   /<index>/_mapping   <request_body>
PUT   /<index>/_settings  <request_body>  # 修改索引的配置。这是增量修改，不会覆盖未指定的配置参数

POST  /<index>/_close   # 关闭索引。该索引不会载入内存，禁止读、写操作，可以删除索引
POST  /<index>/_open    # 重新打开已关闭的索引

# 删

相关 API ：

DELETE /<index>         # 删除索引。如果该索引不存在，则会返回 HTTP 404

# type

官方计划逐步取消 type 功能：

旧版本的 ES ，创建索引时，允许在 mappings 中定义多个 type ，用于写入多种数据结构的文档，相当于 SQL 数据库的多张 table 。格式如下：

PUT /my-index-1
{
  "mappings": {
    "_default_": {      # 该 type 名为 _default_ ，会用作新增文档时的默认类型
      "dynamic_templates": [...],
      "properties": {...}
    },
    "type-1": {
      "dynamic_templates": [...],
      "properties": {...}
    },
    "type-2": {
      "dynamic_templates": [...],
      "properties": {...}
    }
  }
}

ES v6.0 开始，创建索引时，只能定义一个 type ，默认命名为 _doc 。格式如下：

PUT /my-index-1
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [...],
      "properties": {...}
    }
  }
}

ES v7.0 开始，创建索引时，建议不包含 type 。格式如下：

PUT /my-index-1
{
  "mappings": {
    "dynamic_templates": [...],
    "properties": {...}
  }
}

ES v8.0 开始，创建索引时，禁用 type 。

# document

一个 index 中可以写入多个 JSON 格式的文档（document）。

# 增

相关 API ：

POST  /<index>/_doc   <request_body>  # 新增文档

新增文档时，可以附加以下 URL 参数：
- timeout
  - 默认为 1m 。
- op_type
  - ：操作类型。如果在 URL 中指定了文档 _id ，则默认为 index 类型，否则为 create 类型。
  - 若为 create ，则不存在相同 _id 的文档时才能写入，否则报错：version conflict, document already exists
  - 若为 index ，则写入的文档会覆盖相同 _id 的文档。
- refresh
  - 默认为 false 。
  - 若为 true ，则立即执行一次 refresh ，刷新完之后才返回响应。注意只会刷新当前修改的 Shards ，不会影响所有 Shards 。
  - 若为 wait_for ，则等到下一次定时刷新之后，才返回响应。
  - 建议尽量不要主动 refresh ，而是由 refresh_interval 自动刷新索引。因为 refresh 的开销较大，每分钟几千次 refresh 会大幅增加 ES 的 CPU、内存负载。
- wait_for_active_shards
  - ：等待写操作同步到多少个 shard ，才返回响应。
  - 默认为 1 ，即只需写入 1 个主分片，不需要同步到副分片。
- routing
  - ES 会计算 hash(routing) % number_of_shards = shard_id ，决定将 HTTP 请求路由到当前索引的哪个分片。
  - routing 参数默认等于文档的 _id ，也可以指定其它值。
  - 因此，ES 创建索引之后不支持修改 number_of_shards ，否则路由关系变化，不能定位已有的文档。

例：新增文档

curl -X POST 127.0.0.1:9200/test_log/_doc?pretty -H 'content-Type:application/json' -d '
> {
>   "pid": 1,
>   "level": "INFO",
>   "message": "process started."
> }'

响应如下：

{
  "_index" : "test_log",            # 该文档所属的索引名
  "_id" : "ZhzMiXABzduhqPWX7mUX",   # 同一 index 中，每个文档都有一个取值唯一的 _id
  "_version" : 1,         # 一个从 1 开始递增的数字，表示该 _id 的文档经过了几次写操作（包括 POST、PUT、DELETE）
  "result" : "created",   # 表示成功创建了该文档
  "_shards" : {
    "total" : 1,          # 总共尝试写入多少个分片（包括主分片、副分片）
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,          # 一个从 0 开始递增的数字，表示这是对该 index 的第几次写操作
  "_primary_term" : 1     # 一个从 0 开始递增的数字，每次副分片变成主分片，就会将该值加 1
}

新增文档时，如果目标 index 不存在，则 ES 会自动创建该 index ，采用默认的索引配置。
新增文档时，如果不指定 _id 字段的值，则 ES 会自动为 _id 字段赋一个随机值。

# 查

相关 API ：

GET   /<index>/_doc/<_id>   # 查询指定 _id 的文档的内容
HEAD  /<index>/_doc/<_id>   # 查询指定 _id 的文档是否存在

GET   /_count               # 查询所有索引包含的文档数
GET   /<index>/_count       # 查询指定索引包含的文档数

例：查询指定 _id 的文档

GET /test_log/_doc/ZhzMiXABzduhqPWX7mUX

如果没找到，则返回响应：

{
  "_index": "test_log",
  "_id": "ZhzMiXABzduhqPWX7mUX",
  "found": false
}

如果找到了，则返回响应：

{
  "_index" : "test_log",
  "_id" : "ZhzMiXABzduhqPWX7mUX",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,       # 表示找到了该文档
  "_source" : {         # _source 字段记录了该文档的原始 JSON 内容，即新增该文档时的 POST body
    "pid" : 1,
    "level" : "INFO",
    "message": "process started."
  }
}

# 改

相关 API ：

POST  /<index>/_doc/<_id>   <request_body>      # 写入文档，覆盖指定 _id 的文档

POST  /<index>/_update_by_query  [request_body] # 修改与 DSL 查询条件匹配的文档

# 删

相关 API ：

DELETE /<index>/_doc/<_id>  # 删除指定 _id 的文档

POST   /<index>/_delete_by_query  [request_body]  # 删除与 DSL 查询条件匹配的文档

# _bulk

该 API 用于批量处理多个文档，比一次请求只处理一个文档更高效。

用法示例：

准备一组 JSON 文本，描述需要处理的多个文档：

{ "index" : { "_index" : "test_log", "_id" : "1" } }
{ "pid" : 1, "level" : "INFO"}

{ "create" : { "_index" : "test_log", "_id" : "1" } }
{ "pid" : 1, "level" : "INFO"}

{ "update" : {"_index" : "test_log", "_id" : "1"} }
{ "pid" : 1, "level" : "WARN"}

{ "delete" : { "_index" : "test_log", "_id" : "1" } }

每个文档需要描述两行信息，格式如下：

{ action: { metadata }}
{ request_body        }

action 表示操作类型，分为多种：

create    # 新增文档。如果不指定 id ，则自动生成。如果指定了 id ，且已存在相同 id 的文档，则不新增
index     # 写入文档。如果不指定 id ，则自动生成。如果指定了 id ，且已存在相同 id 的文档，则覆盖它
update    # 更新文档。如果不指定 id ，则自动生成。如果指定了 id ，且已存在相同 id 的文档，则更新它的部分内容
delete    # 删除文档，必须指定 id ，不需要带上 request_body

向 ES 的 _bulk API 发送 POST 请求：

curl -X POST 127.0.0.1:9200/_bulk?pretty -H 'content-Type:application/json' --data-binary "@test_log.json"