elasticsearch学习笔记高级篇(一)——在案例中实战使用term filter来搜索数据

栏目: 后端 · 发布时间: 6年前

内容简介:下面以IT技术论坛为案例背景输出:注意:初步来讲,先搞4个字段,因为整个es是支持json document格式的,所以说扩展性和灵活性非常之好。如果后续随着业务需求的增加,要在document中增加更多的field,那么我们可以很方便的随时添加field。但是如果是在关系型数据库中,比如mysql,我们建立了一个表,现在要给表中新增一些column,那么就很坑爹了,必须用复杂的修改表结构的语法去执行。而且可能对系统代码还有一定的影响。

下面以IT技术论坛为案例背景

1、根据用户ID、是否隐藏、帖子ID、发帖日期来搜索帖子

(1)插入一些测试帖子的数据

POST /forum/_bulk
{"index": {"_id": 1}}
{ "articleID" : "XHDK-A-1293-#fJ3", "userID" : 1, "hidden": false, "postDate": "2017-01-01" }
{ "index": { "_id": 2 }}
{ "articleID" : "KDKE-B-9947-#kL5", "userID" : 1, "hidden": false, "postDate": "2017-01-02" }
{"index": {"_id": 3}}
{ "articleID" : "JODL-X-1937-#pV7", "userID" : 2, "hidden": false, "postDate": "2017-01-01" }
{ "index": { "_id": 4 }}
{ "articleID" : "QQPX-R-3956-#aD8", "userID" : 2, "hidden": true, "postDate": "2017-01-02" }

输出:

{
  "took" : 269,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "4",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 3,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

注意:初步来讲,先搞4个字段,因为整个es是支持json document格式的,所以说扩展性和灵活性非常之好。如果后续随着业务需求的增加,要在document中增加更多的field,那么我们可以很方便的随时添加field。但是如果是在关系型数据库中,比如mysql,我们建立了一个表,现在要给表中新增一些column,那么就很坑爹了,必须用复杂的修改表结构的语法去执行。而且可能对系统代码还有一定的影响。

(2)根据用户ID搜索帖子

GET /forum/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "userID": 1
        }
      }
    }
  }
}

输出:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "XHDK-A-1293-#fJ3",
          "userID" : 1,
          "hidden" : false,
          "postDate" : "2017-01-01"
        }
      },
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "KDKE-B-9947-#kL5",
          "userID" : 1,
          "hidden" : false,
          "postDate" : "2017-01-02"
        }
      }
    ]
  }
}

term filter/query : 对搜索文本不分词,直接拿去倒排索引中去匹配,你输入的是什么,就去匹配什么

(3)搜索没有隐藏的帖子

GET /forum/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "hidden": false
        }
      }
    }
  }
}

输出:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "XHDK-A-1293-#fJ3",
          "userID" : 1,
          "hidden" : false,
          "postDate" : "2017-01-01"
        }
      },
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "KDKE-B-9947-#kL5",
          "userID" : 1,
          "hidden" : false,
          "postDate" : "2017-01-02"
        }
      },
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "JODL-X-1937-#pV7",
          "userID" : 2,
          "hidden" : false,
          "postDate" : "2017-01-01"
        }
      }
    ]
  }
}

(4)根据发帖日期搜索帖子

GET /forum/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "postDate": "2017-01-01"
        }
      }
    }
  }
}

输出:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "XHDK-A-1293-#fJ3",
          "userID" : 1,
          "hidden" : false,
          "postDate" : "2017-01-01"
        }
      },
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "JODL-X-1937-#pV7",
          "userID" : 2,
          "hidden" : false,
          "postDate" : "2017-01-01"
        }
      }
    ]
  }
}

(5)根据帖子ID搜索帖子

GET /forum/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "articleID.keyword": "XHDK-A-1293-#fJ3"
        }
      }
    }
  }
}

输出:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "XHDK-A-1293-#fJ3",
          "userID" : 1,
          "hidden" : false,
          "postDate" : "2017-01-01"
        }
      }
    ]
  }
}

如果不使用keyword

GET /forum/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "articleID": "XHDK-A-1293-#fJ3"
        }
      }
    }
  }
}

输出:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

articleID.keyword是ES最新版本内置建立的field,就是不分词的,所以一个articleID过来的时候,会建立两次索引,一次是自己本省,是要分词的,分词后放入倒排索引,另外一次是基于articleID.keyword不分词的,保留最多256个字符,直接一个字符串放入倒排索引中。在实际使用的时候如果有必要可以手动指定type=keyword,那样就可以避免自动创建默认保留256个字符

(6)查看分词

GET /forum/_analyze
{
  "field": "articleID",
  "text": ["XHDK-A-1293-#fJ3"]
}

输出:

{
  "tokens" : [
    {
      "token" : "xhdk",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "a",
      "start_offset" : 5,
      "end_offset" : 6,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "1293",
      "start_offset" : 7,
      "end_offset" : 11,
      "type" : "<NUM>",
      "position" : 2
    },
    {
      "token" : "fj3",
      "start_offset" : 13,
      "end_offset" : 16,
      "type" : "<ALPHANUM>",
      "position" : 3
    }
  ]
}

默认的分词的text类型的field,建立倒排索引的时候,就会对所有的articleID分词,分词之后,原本的articleID就没有了,之后分词后的各个word存在于倒排索引中。

(7)重建索引

DELETE /forum
PUT /forum
{
  "mappings": {
    "properties": {
      "articleID": {
        "type": "keyword"
      }
    }
  }
}

POST /forum/_bulk
{ "index": { "_id": 1 }}
{ "articleID" : "XHDK-A-1293-#fJ3", "userID" : 1, "hidden": false, "postDate": "2017-01-01" }
{ "index": { "_id": 2 }}
{ "articleID" : "KDKE-B-9947-#kL5", "userID" : 1, "hidden": false, "postDate": "2017-01-02" }
{ "index": { "_id": 3 }}
{ "articleID" : "JODL-X-1937-#pV7", "userID" : 2, "hidden": false, "postDate": "2017-01-01" }
{ "index": { "_id": 4 }}
{ "articleID" : "QQPX-R-3956-#aD8", "userID" : 2, "hidden": true, "postDate": "2017-01-02" }

(8)重新根据帖子ID和发帖日期进行搜索

GET /forum/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "articleID": "XHDK-A-1293-#fJ3"
        }
      }
    }
  }
}

输出:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "forum",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "articleID" : "XHDK-A-1293-#fJ3",
          "userID" : 1,
          "hidden" : false,
          "postDate" : "2017-01-01"
        }
      }
    ]
  }
}

2、总结

(1)term filter:根据exact value进行搜索,像数字、boolean、date是天然支持

(2)text需要建立索引时指定为not_analyzed,才能使用term

(3)相当于 SQL 中的单个where条件

select * from forum where articleID = 'XHDK-A-1293-#fJ3'

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

Practical JavaScript, DOM Scripting and Ajax Projects

Practical JavaScript, DOM Scripting and Ajax Projects

Frank Zammetti / Apress / April 16, 2007 / $44.99

http://www.amazon.com/exec/obidos/tg/detail/-/1590598164/ Book Description Practical JavaScript, DOM, and Ajax Projects is ideal for web developers already experienced in JavaScript who want to ......一起来看看 《Practical JavaScript, DOM Scripting and Ajax Projects》 这本书的介绍吧!

CSS 压缩/解压工具
CSS 压缩/解压工具

在线压缩/解压 CSS 代码

html转js在线工具
html转js在线工具

html转js在线工具

HSV CMYK 转换工具
HSV CMYK 转换工具

HSV CMYK互换工具