Elasticsearch nested type

Category: Backend · Published: 6 years ago

1. Background

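When using Elasticsearch as a search engine, we may run into queries that cross domains. Take a student course management system: we search for a student by name and want to know which courses that student is enrolled in.

There are many ways to solve this. One is to search for the student and then look up the student's enrolled courses in the database, which yields all of the courses. But sometimes the data volume is small and the index has only one dimension, the course. In that case the nested type is what solves the problem. This article uses Elasticsearch and Kibana for the examples; because the examples are in Chinese, it also uses the IK analyzer. For details, see: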

Installing and using Elasticsearch

Using the IK analyzer in Elasticsearch

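2. Object type

Elasticsearch can store object types: an array of objects can be kept in a single field of a document. For example, if each course is a document, the course can have a student field that holds the array of student objects enrolled in it.

In Elasticsearch, create the class_test index below, in which student is an array of objects.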

PUT /class_test
{
  "mappings": {
    "class_test": {
      "properties": {
        "id": {
          "type": "keyword"
        },
        "name": {
          "analyzer": "ik_max_word",
          "type": "text"
        },
        "type": {
          "type": "keyword"
        },
        "student": {
          "properties": {
            "name": {
              "analyzer": "ik_max_word",
              "type": "text"
            },
            "id": {
              "type": "keyword"
            }
          }
        }
      }
    }
  },
  "settings": {
    "index": {
      "refresh_interval": "1s",
      "number_of_shards": 5,
      "max_result_window": "10000000",
      "mapper": {
        "dynamic": "false"
      },
      "number_of_replicas": 0
    }
  }
}

Index the following data into class_test. The index now holds two documents, as a search over it shows:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class_test",
        "_type" : "class_test",
        "_id" : "ijfJ5GoBJeNZPNCWykLR",
        "_score" : 1.0,
        "_source" : {
          "id" : "1",
          "name" : "数学课",
          "student" : [
            {
              "id" : "1",
              "name" : "张三"
            },
            {
              "id" : "2",
              "name" : "李四"
            }
          ]
        }
      },
      {
        "_index" : "class_test",
        "_type" : "class_test",
        "_id" : "Q9NxGGsBa-TqHCWqAaM4",
        "_score" : 1.0,
        "_source" : {
          "id" : "2",
          "name" : "语文",
          "student" : [
            {
              "id" : "3",
              "name" : "杰克"
            },
            {
              "id" : "4",
              "name" : "玛丽"
            }
          ]
        }
      }
    ]
  }
}
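The article does not show the request used to load these two documents. One possible way, sketched here in Python, is to build a body for the `_bulk` API. This is only an illustration under assumptions: it assumes an Elasticsearch 6.x cluster (where the action line still carries `_type`), the helper name `build_bulk_body` is made up for this example, and the body is only built and printed, not sent.

```python
import json

# The two course documents shown in the search response above.
COURSES = [
    {"id": "1", "name": "数学课",
     "student": [{"id": "1", "name": "张三"}, {"id": "2", "name": "李四"}]},
    {"id": "2", "name": "语文",
     "student": [{"id": "3", "name": "杰克"}, {"id": "4", "name": "玛丽"}]},
]


def build_bulk_body(index, doc_type, docs):
    """Build an NDJSON body: one action line followed by one source line per document.

    Hypothetical helper for illustration; it does not talk to a cluster.
    """
    lines = []
    for doc in docs:
        # No "_id" is supplied, so Elasticsearch would generate one.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # A bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("class_test", "class_test", COURSES)
print(body)
```

The resulting text could be pasted after `POST /_bulk` in the Kibana console. The same body works for the nested index in section 3 once the index and type names are changed.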

Next, we can run queries against the index. Querying for the courses taken by the student with id 1 returns the math course (数学课).

GET /class_test/class_test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "student.id": "1"
          }
        }
      ]
    }
  }
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "class_test",
        "_type" : "class_test",
        "_id" : "ijfJ5GoBJeNZPNCWykLR",
        "_score" : 0.2876821,
        "_source" : {
          "id" : "1",
          "name" : "数学课",
          "student" : [
            {
              "id" : "1",
              "name" : "张三"
            },
            {
              "id" : "2",
              "name" : "李四"
            }
          ]
        }
      }
    ]
  }
}

Querying for the courses taken by the student named 张三 also returns the math course.

GET /class_test/class_test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "student.name": "张三"
          }
        }
      ]
    }
  }
}

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "class_test",
        "_type" : "class_test",
        "_id" : "ijfJ5GoBJeNZPNCWykLR",
        "_score" : 0.5753642,
        "_source" : {
          "id" : "1",
          "name" : "数学课",
          "student" : [
            {
              "id" : "1",
              "name" : "张三"
            },
            {
              "id" : "2",
              "name" : "李四"
            }
          ]
        }
      }
    ]
  }
}

But now query for the courses taken by a student whose id is 1 and whose name is 李四:

GET /class_test/class_test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "student.name": "李四"
          }
        },
        {
          "match": {
            "student.id": "1"
          }
        }
      ]
    }
  }
}

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.8630463,
    "hits" : [
      {
        "_index" : "class_test",
        "_type" : "class_test",
        "_id" : "ijfJ5GoBJeNZPNCWykLR",
        "_score" : 0.8630463,
        "_source" : {
          "id" : "1",
          "name" : "数学课",
          "student" : [
            {
              "id" : "1",
              "name" : "张三"
            },
            {
              "id" : "2",
              "name" : "李四"
            }
          ]
        }
      }
    ]
  }
}

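The result is again the math course, which is odd: no student has both id 1 and the name 李四, so this course should not be returned. What happened? Internally, Elasticsearch flattens arrays of objects. Put simply, the array we indexed is actually stored as:

"student.id": ["1", "2"],
"student.name": ["张三", "李四"]

The inverted index is built on this flattened form, so the link between each student's id and name is lost. The nested type in Elasticsearch solves this problem.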
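To make the flattening concrete, here is a small Python simulation. It is a simplified model, not Elasticsearch's implementation: it compares whole values and ignores text analysis (the IK analyzer), and the function names `flatten_object_array` and `matches_flattened` are invented for this sketch.

```python
from collections import defaultdict


def flatten_object_array(field, objects):
    """Collapse an array of objects into one multi-valued list per sub-field."""
    flat = defaultdict(list)
    for obj in objects:
        for key, value in obj.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


def matches_flattened(flat, conditions):
    """Each condition only needs to match some value of its own field.

    Which object a value came from is no longer known at this point.
    """
    return all(value in flat.get(field, []) for field, value in conditions.items())


students = [{"id": "1", "name": "张三"}, {"id": "2", "name": "李四"}]
flat = flatten_object_array("student", students)
print(flat)

# id "1" belongs to 张三 and the name 李四 belongs to id "2",
# yet both conditions are satisfied on the flattened fields.
print(matches_flattened(flat, {"student.id": "1", "student.name": "李四"}))
```

Because each condition is checked against its own value list, the combined query matches even though the two values come from different students.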

3. Nested type

As in section 2, we create a test index, this time named class. The difference is that student now declares a type: "type": "nested".

PUT /class
{
  "mappings": {
    "class": {
      "properties": {
        "id": {
          "type": "keyword"
        },
        "name": {
          "analyzer": "ik_max_word",
          "type": "text"
        },
        "type": {
          "type": "keyword"
        },
        "student": {
          "type": "nested",
          "properties": {
            "name": {
              "analyzer": "ik_max_word",
              "type": "text"
            },
            "id": {
              "type": "keyword"
            }
          }
        }
      }
    }
  },
  "settings": {
    "index": {
      "refresh_interval": "1s",
      "number_of_shards": 5,
      "max_result_window": "10000000",
      "mapper": {
        "dynamic": "false"
      },
      "number_of_replicas": 0
    }
  }
}

We import the same data and search for courses with a student whose id is 1 and whose name is 李四. This time the result is empty:

GET /class/class/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "student",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "student.name": "李四"
                    }
                  },
                  {
                    "match": {
                      "student.id": "1"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
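The empty result follows from how a nested query evaluates its conditions: each object in the array is checked on its own, and all conditions must hold for the same object. A minimal Python model of that rule is sketched below. It uses exact value comparison rather than analyzed text matching, and `matches_nested` is a name made up for this illustration.

```python
def matches_nested(objects, conditions):
    """Return True only if a single object satisfies every condition at once."""
    return any(
        all(obj.get(field) == value for field, value in conditions.items())
        for obj in objects
    )


students = [{"id": "1", "name": "张三"}, {"id": "2", "name": "李四"}]

# No single student has id "1" and the name 李四.
print(matches_nested(students, {"id": "1", "name": "李四"}))

# The student with id "2" is named 李四, so this combination matches.
print(matches_nested(students, {"id": "2", "name": "李四"}))
```

Under this model the first check fails and the second succeeds, which is the behavior the nested mapping gives the query above.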

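4. Other approaches

There are other ways to handle this kind of cross-domain search. Nested types are costly for Elasticsearch performance, so we can instead flatten the values of the fields to be searched into a single field, or build a separate index for students and then look up classes through a student-class mapping table. These approaches may be covered in a later article.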
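The article does not spell out how the flattened single field would look. One possible reading, sketched in Python, is to store a composite value per student so that id and name stay paired. Everything specific here is an assumption: the field name `student_key`, the `|` separator, and the idea of mapping the field as `keyword` and querying it with a `term` query. This variant supports only exact matches on the full name, not analyzed text search.

```python
def add_student_keys(course):
    """Return a copy of the course with a hypothetical flat student_key field.

    Each entry pairs one student's id and name, so the pairing survives flattening.
    """
    doc = dict(course)
    doc["student_key"] = [f'{s["id"]}|{s["name"]}' for s in course.get("student", [])]
    return doc


def student_key_query(student_id, student_name):
    """Build a term query body against the hypothetical student_key keyword field."""
    return {"query": {"term": {"student_key": f"{student_id}|{student_name}"}}}


course = {
    "id": "1",
    "name": "数学课",
    "student": [{"id": "1", "name": "张三"}, {"id": "2", "name": "李四"}],
}
doc = add_student_keys(course)
print(doc["student_key"])
print(student_key_query("1", "李四"))
```

With this layout the value for id 1 with name 李四 never appears in the document, so the false match from section 2 cannot occur, at the cost of maintaining an extra field at index time.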
