Learning Elasticsearch

Category: Servers · Apache · Published: 5 years ago



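Elasticsearch (ES for short) is an open-source search engine built on Apache Lucene(TM) and is currently a first choice among full-text search engines. It can store, search, and analyze massive amounts of data quickly; GitHub and Stack Overflow both use it.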

I. ES components

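A quick way to understand the basic building blocks of ES is to compare them with a relational database (RDBMS). In ES 5.x, the version used in this article, a cluster can contain multiple indices (databases), each index can contain multiple types (tables), each type contains multiple documents (rows), and each document contains multiple fields (columns). In short:

Index -> Database

Type -> Table

Document -> Row

Field -> Column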

II. Common query commands

1. Listing the _cat commands

GET /_cat/

Result:

➜  ~ curl -i -XGET http://192.168.11.119:9200/_cat/
HTTP/1.1 200 OK
content-type: text/plain; charset=UTF-8
content-length: 493

=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates

2. Checking cluster health

GET /_cat/health?v

Result:

➜  ~ curl -XGET http://192.168.11.119:9200/_cat/health\?v
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1533717572 08:39:32  elasticsearch yellow          1         1    315 315    0    0      315             0                  -                 50.0%

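green: every primary shard and every replica shard of every index is active.

yellow: every primary shard of every index is active, but some replica shards are not active and are therefore unavailable.

red: not all primary shards are active, so part of the data in some indices is unavailable.

Why is the cluster yellow right now?

We have a single server running a single ES process, so the cluster has exactly one node. By default each index is given 5 primary shards and one replica of each primary (5 replica shards), and a replica shard is never allocated to the same node as its primary (for fault tolerance). With only one node, every primary shard is allocated and started, but the replica shards have no second machine to start on. The output above shows exactly this: 315 active primary shards, 315 unassigned replica shards, and an active_shards_percent of 50.0%. The same holds for the index that Kibana creates for itself, which has 1 primary shard and 1 replica shard: the primary is allocated, the replica is not.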
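The three health colours can be expressed as a small decision rule. The sketch below is a simplified model written for this article, not ES's own implementation; the function names and the way the counts are passed in are assumptions made for illustration.

```python
def cluster_status(primaries_total, primaries_active, replicas_total, replicas_active):
    """Return green, yellow, or red from shard counts (simplified model)."""
    if primaries_active < primaries_total:
        return "red"      # at least one primary shard is not active
    if replicas_active < replicas_total:
        return "yellow"   # all primaries active, some replicas are not
    return "green"        # every primary and every replica is active


def active_shards_percent(primaries_total, primaries_active, replicas_total, replicas_active):
    """Share of active shards among all shards, as a percentage."""
    total = primaries_total + replicas_total
    return 100.0 * (primaries_active + replicas_active) / total


# Shard counts read from the _cat/health output above:
# 315 active primaries, 315 unassigned replicas.
counts = dict(primaries_total=315, primaries_active=315,
              replicas_total=315, replicas_active=0)
print(cluster_status(**counts))
print(active_shards_percent(**counts))
```

With the counts from the single-node cluster this yields yellow and 50.0, matching the status and active_shards_percent columns.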

3. Listing the indices in the cluster

GET /_cat/indices?v

Result:

➜  ~ curl -i -XGET http://192.168.11.119:9200/_cat/indices\?v
HTTP/1.1 200 OK
content-type: text/plain; charset=UTF-8
content-length: 8840

health status index                                    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   es_es_category_products                  u2TdPYcXS5yyFF8P3a3jYQ   5   1      95311         7103      156mb          156mb
yellow open   web_product_ar_new                       8qhhh9C7QvuwEEu-YYrIgA   5   1      37610           77     55.6mb         55.6mb
yellow open   en_27_category_product                   VtVXVTuHQ3-xyNw4txpEXg   5   1      41206           20       68mb           68mb
yellow open   ar_27_category_product                   Id43cmuDQnKYkhaCepxrIg   5   1      41206           17     67.9mb         67.9mb
yellow open   it_28_category_product                   Gltx9R80Qn6PI22i6-Mflg   5   1      12659           25     22.5mb         22.5mb
yellow open   db_search                                WKYGbjjLSZmh0s_LyuT2tQ   5   1     230133            0     28.7mb         28.7mb
yellow open   de_28_category_product                   IUCYcmTIR6K4AzUpAWJmHg   5   1      12659           27     22.5mb         22.5mb

4. Creating an index

PUT /test_index?pretty

Result:

➜  ~ curl -i -XPUT http://192.168.11.119:9200/test_index\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 60
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}

5. Deleting an index

DELETE /test_index?pretty

6. Adding a document and indexing it

Syntax:

PUT /index/type/id
{
    "JSON data"
}

index is the index name, type is the type name, and id is the ID of the document.
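The /index/type/id layout can be sketched as a small helper that assembles the request path. This is only an illustration: document_path is a hypothetical name, and percent-encoding each segment is an assumption added here so that unusual IDs cannot break the path.

```python
from urllib.parse import quote


def document_path(index, doc_type, doc_id):
    """Return the REST path /index/type/id with each segment percent-encoded."""
    parts = (index, doc_type, str(doc_id))
    return "/" + "/".join(quote(part, safe="") for part in parts)


# Path of the document used in the examples of this article.
print(document_path("test_index", "user", 1))
```

The same path serves PUT (create or replace), GET (retrieve), and DELETE (remove) for one document.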

PUT /test_index/user/1
{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳"]
}

The result is:

➜  ~ curl -i -XPUT http://192.168.11.119:9200/test_index/user/1 -d '{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳"]
}'

HTTP/1.1 201 Created
Location: /test_index/user/1
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 08:58:29 GMT"
content-type: application/json; charset=UTF-8
content-length: 143

{"_index":"test_index","_type":"user","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}

6. Retrieving the new document

GET /index/type/id

For example:

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/1\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 232
{
"_index" : "test_index",
"_type" : "user",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "小明",
"email" : "[email protected]",
"tags" : [
"篮球",
"游泳"
]
}
}

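7. Updating a document

A document can be updated in full or in part. A full update simply replaces the stored document, so the request must carry every field. For example: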

➜  ~  curl -i -XPUT http://192.168.11.119:9200/test_index/user/1 -d '{
"name": "小明",
"email": "[email protected]",
"tags": ["篮球","游泳","足球"]
}'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 09:15:45 GMT"
content-type: application/json; charset=UTF-8
content-length: 144
{"_index":"test_index","_type":"user","_id":"1","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}

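Note that a full update uses the PUT method.

A partial update changes only the given fields. It uses the POST method on the _update endpoint and wraps the changed fields in a doc key. For example: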

➜  ~ curl -i -XPOST http://192.168.11.119:9200/test_index/user/1/_update -d '{
"doc":{
"email": "[email protected]"
}
}'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 09:18:26 GMT"
content-type: application/json; charset=UTF-8
content-length: 128
{"_index":"test_index","_type":"user","_id":"1","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0}}
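The difference between the two kinds of update can be modelled on plain dictionaries. This is a simplified in-memory sketch, not ES code: the function names and the example.com addresses are made up for illustration, and only top-level fields are merged here.

```python
def full_replace(stored, new_doc):
    """PUT semantics: the stored document is replaced as a whole."""
    return dict(new_doc)


def partial_update(stored, doc):
    """POST _update semantics (top-level fields only): merge doc into the stored document."""
    merged = dict(stored)
    merged.update(doc)
    return merged


stored = {"name": "小明", "email": "old@example.com", "tags": ["篮球", "游泳"]}

# A full update that leaves out a field loses that field.
replaced = full_replace(stored, {"name": "小明", "tags": ["篮球", "游泳", "足球"]})
print("email" in replaced)

# A partial update keeps every field that is not mentioned.
patched = partial_update(stored, {"email": "new@example.com"})
print(patched["tags"])
```

This is why the PUT request above repeats name and email even though only tags changes, while the POST request sends email alone.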

8. Deleting a document

DELETE /test_index/user/1

For example:

➜  ~ curl -i -XDELETE http://192.168.11.119:9200/test_index/user/2
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 141
{"found":true,"_index":"test_index","_type":"user","_id":"2","_version":2,"result":"deleted","_shards":{"total":2,"successful":1,"failed":0}}

9. Query-string search

GET /test_index/user/_search

For example:

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 793
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "小王",
"email" : "[email protected]",
"tags" : [
"游泳"
]
}
},
{
"_index" : "test_index",
"_type" : "user",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "小明",
"email" : "[email protected]",
"tags" : [
"篮球",
"游泳",
"足球"
]
}
}
]
}
}

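Fields in the search response

took: how many milliseconds the search took
timed_out: whether the search timed out; here it did not
_shards: the index is split into 5 shards, so a search request is sent to every primary shard (or to one of its replica shards instead)
hits.total: the number of matching documents, 2 in this case
hits.max_score: the highest score among the hits; a score measures how relevant a document is to the search, and the more relevant the document, the higher its score
hits.hits: the detailed data of the documents that matched the search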
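Reading these fields from code might look like the sketch below. The sample is a trimmed copy of the response shown above (the _source bodies are shortened), and summarize is a hypothetical helper; it assumes the ES 5.x response shape, in which hits.total is a plain integer.

```python
import json

# Trimmed copy of the search response shown above.
SAMPLE = """
{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {"_index": "test_index", "_type": "user", "_id": "2", "_score": 1.0,
       "_source": {"name": "小王", "tags": ["游泳"]}},
      {"_index": "test_index", "_type": "user", "_id": "1", "_score": 1.0,
       "_source": {"name": "小明", "tags": ["篮球", "游泳", "足球"]}}
    ]
  }
}
"""


def summarize(response):
    """Pick out the fields described above from a parsed search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "shards_ok": response["_shards"]["successful"] == response["_shards"]["total"],
        "total": hits["total"],
        "max_score": hits["max_score"],
        "names": [hit["_source"]["name"] for hit in hits["hits"]],
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```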

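Search for users named bruce, sorted by email in descending order. Note that the & before sort is not escaped in the command below, so the shell runs curl in the background (hence the [1] 26574 job-control lines) and the sort parameter never reaches ES; escape it as \& or quote the whole URL to apply the sort.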

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty\&q=name:'bruce'&sort=email:desc
[1] 26574
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 479
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.1727304,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "4",
"_score" : 1.1727304,
"_source" : {
"name" : "Bruce",
"email" : "[email protected]",
"tags" : [
"Hello"
]
}
}
]
}
}
[1] + 26574 done curl -i -XGET

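This example shows that such a search is case-insensitive: the query bruce matched the name Bruce, because the text field is analyzed and lowercased. Query-string search is handy for ad hoc use from the command line with a tool such as curl, when you want to fire off a request quickly and look something up. A complex query, however, is hard to build this way, so query strings are rarely used in production.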
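One way to avoid shell quoting problems with query strings is to build the URL in code. The sketch below uses only the Python standard library; search_url is a hypothetical helper, and the host is the one used throughout this article.

```python
from urllib.parse import urlencode


def search_url(base, index, doc_type, **params):
    """Build a query-string search URL with every parameter percent-encoded."""
    return f"{base}/{index}/{doc_type}/_search?{urlencode(params)}"


url = search_url(
    "http://192.168.11.119:9200", "test_index", "user",
    q="name:bruce", sort="email:desc", pretty="true",
)
print(url)
```

Passing the resulting URL to curl inside single quotes keeps the shell from interpreting & and ?.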

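11. Inspecting the tables and field definitions of an index

Show the tables (types) and field definitions of every index in ES

GET /_mapping

Show the table definitions of one index

GET /test_index/_mapping

Show the field definitions of one table in an index

GET /test_index/user/_mapping

For example: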

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/_mapping\?pretty
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 1267
{
  "test_index" : {
    "mappings" : {
      "user" : {
        "properties" : {
          "email" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "tags" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      },
      "role" : {
        "properties" : {
          "flag" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
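A mapping response like this one can be flattened into a table-to-fields overview. The sketch below works on a trimmed copy of the response above; fields_by_type is a hypothetical helper, and falling back to "object" for a field without a type key is an assumption.

```python
# Trimmed copy of the mapping response shown above.
MAPPING = {
    "test_index": {
        "mappings": {
            "user": {"properties": {
                "email": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
                "name": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
            }},
            "role": {"properties": {
                "flag": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
            }},
        }
    }
}


def fields_by_type(mapping, index):
    """Map each type (table) of an index to its fields (columns) and their data types."""
    types = mapping[index]["mappings"]
    return {
        type_name: {field: spec.get("type", "object")
                    for field, spec in body["properties"].items()}
        for type_name, body in types.items()
    }


overview = fields_by_type(MAPPING, "test_index")
print(overview)
```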

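12. Query DSL (Domain-Specific Language)

HTTP request body: the query is written as JSON in the request body. This is convenient, can express all kinds of complex queries, and is far more powerful than a query string.

  • 12.1 Matching all documents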
➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty -d '
{
"query": {
"match_all": {
}
}
}
'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 12:58:15 GMT"
content-type: application/json; charset=UTF-8
content-length: 1895
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 6,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test_index",
"_type" : "user",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"name" : "bruce",
"email" : "[email protected]",
"tags" : [
"游泳1"
]
}
},
{
"_index" : "test_index",
"_type" : "user",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "Alex",
"email" : "[email protected]",
"tags" : [
"吃饭"
]
}
}
]
}
}

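Note that match_all is nested inside the query object, and query sits at the root of the request body.

  • 12.2 Matching documents that contain the input text

query still sits at the root; a sort key is added at the same level as query. For example: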

➜  ~ curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty -d '
{
  "query": {
         "match": {
            "name" : "br"
          }
  },
  "sort": [
           {
             "email" : "desc"
           }
  ]
}
'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:03:30 GMT"
content-type: application/json; charset=UTF-8
content-length: 193
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

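This query looks for documents (rows) whose name matches br and sorts the result by email in descending order. It returns no hits because a match query compares whole analyzed terms, and br is not a term of any indexed name (Bruce is indexed as the single term bruce). The first time the statement above was run it failed with the error "Fielddata is disabled on text fields by default. Set fielddata=true on [email] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory." According to the official documentation, from 5.x onward sorting and aggregations on a text field rely on a separate in-memory data structure called fielddata, which is disabled by default and has to be enabled through the mapping API for each field that needs it. The following request enables it: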

➜  ~ curl -i -XPUT http://192.168.11.119:9200/test_index/_mapping/user\?pretty -d '
{
  "properties": {
    "email": {
      "type": "text",
      "fielddata": true
    }
  }
}'
HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:11:11 GMT"
content-type: application/json; charset=UTF-8
content-length: 28
{
  "acknowledged" : true
}
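An alternative that avoids loading fielddata into memory is to sort on the keyword sub-field. This is a sketch under the assumption that the index uses the default dynamic mapping shown in section 11, where every text field has a keyword sub-field; sorted_match_body is a hypothetical helper that only builds the request body.

```python
import json


def sorted_match_body(field, text, sort_field, order="desc"):
    """Build a match query whose result is sorted on sort_field."""
    return {
        "query": {"match": {field: text}},
        "sort": [{sort_field: order}],
    }


# email.keyword is the untokenized sub-field from the mapping in section 11.
body = sorted_match_body("name", "bruce", "email.keyword")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same _search request. Values longer than the ignore_above limit (256 in the mapping shown) are not indexed in the sub-field, which matters for very long strings.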

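Many queries return a large result set that has to be paged. With the DSL this is simple: add from and size keys at the same level as query, giving the offset of the first hit to return and the number of hits per page. For example:

curl -i -XGET http://192.168.11.119:9200/test_index/user/_search\?pretty -d '
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 2,
  "_source": ["email"],
  "sort": [
    {
      "email": "asc"
    }
  ]
}
'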
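Because from is a zero-based offset (from: 1 skips the first hit), page numbers have to be converted before they go into the request. The sketch below shows one way to do that; page_body and its 1-based page parameter are assumptions made for illustration.

```python
def page_body(query, page, page_size, source=None, sort=None):
    """Build a search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    body = {
        "query": query,
        "from": (page - 1) * page_size,  # zero-based offset of the first hit
        "size": page_size,               # number of hits per page
    }
    if source is not None:
        body["_source"] = source
    if sort is not None:
        body["sort"] = sort
    return body


second_page = page_body({"match_all": {}}, page=2, page_size=2,
                        source=["email"], sort=[{"email": "asc"}])
print(second_page["from"], second_page["size"])
```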
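  • 12.3 Query filters

Search for products whose name contains Rhinestone and whose selling price is at least 1 and less than 3, sorted by selling price in ascending order. The DSL statement is: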

curl -i -XGET http://192.168.11.119:9200/en_es_category_products/product/_search\?pretty -d '
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "product_name": "Rhinestone"
        }
      },
      "filter": {
        "range": {
          "store_price": {
            "gte": 1,
            "lt": 3
          }
        }
      }
    }
  },
  "_source": [
    "product_id",
    "product_name",
    "store_price",
    "icon"
  ],
  "sort": [
    {
      "store_price": "asc"
    }
  ]
}
'

The range operators are:

* gt :: greater than
* gte:: greater than or equal to
* lt :: less than
* lte:: less than or equal to
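The meaning of the four operators can be checked locally with a small sketch. It mirrors the semantics listed above on plain numbers and is not how ES evaluates a range filter internally; in_range and RANGE_OPS are names chosen for this illustration.

```python
import operator

# Each range operator and the comparison it stands for.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """Return True if value satisfies every bound, e.g. {"gte": 1, "lt": 3}."""
    return all(RANGE_OPS[op](value, limit) for op, limit in bounds.items())


price_filter = {"gte": 1, "lt": 3}
print(in_range(2.99, price_filter))
print(in_range(3, price_filter))
```

A price of 2.99 passes the filter, while exactly 3 does not because lt excludes the upper bound.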

Query result:

HTTP/1.1 200 OK
Warning: 299 Elasticsearch-5.5.2-b2f0c09 "Content type detection for rest requests is deprecated. Specify the content type using the [Content-Type] header." "Wed, 08 Aug 2018 13:15:52 GMT"
content-type: application/json; charset=UTF-8
content-length: 1141
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "en_es_category_products",
        "_type" : "product",
        "_id" : "22100",
        "_score" : null,
        "_source" : {
          "product_id" : 22100,
          "icon" : "http://patpatdev.s3.amazonaws.com/Product/22100/1688I-SL-003-00008-001.jpg/1464845443.jpg",
          "store_price" : "2.99",
          "product_name" : "U-shape Silver Faux Perarl & Rhinestone Clip"
        },
        "sort" : [
          2.99
        ]
      },
      {
        "_index" : "en_es_category_products",
        "_type" : "product",
        "_id" : "354460",
        "_score" : null,
        "_source" : {
          "product_id" : 354460,
          "icon" : "http://patpatdev.s3.us-west-1.amazonaws.com/product/000766000119/5b0e5b0e49e8f.jpg",
          "store_price" : "2.99",
          "product_name" : "Pretty Star Decor Rhinestone Stud Hairband for Women"
        },
        "sort" : [
          2.99
        ]
      }
    ]
  }
}

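Note that the parameters are nested several levels deep, which makes them easy to get wrong: query, _source, and sort all sit at the root level, while must and filter sit under query/bool.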
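Building the body in code instead of by hand is one way to keep the nesting straight. The sketch below assembles the same structure as the request in 12.3; filtered_search is a hypothetical helper, and only the request body is produced, nothing is sent to ES.

```python
import json


def filtered_search(match_field, text, range_field, bounds, source, sort_field):
    """Build a bool query: must/match for relevance, filter/range for the bounds."""
    return {
        "query": {
            "bool": {
                "must": {"match": {match_field: text}},
                "filter": {"range": {range_field: bounds}},
            }
        },
        "_source": source,
        "sort": [{sort_field: "asc"}],
    }


body = filtered_search(
    "product_name", "Rhinestone",
    "store_price", {"gte": 1, "lt": 3},
    ["product_id", "product_name", "store_price", "icon"],
    "store_price",
)
print(json.dumps(body, indent=2))
```

Serializing with json.dumps also rules out syntax slips such as a missing comma between gte and lt.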

