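1 Overview
After a terms aggregation, a top_hits sub-aggregation returns only the first N documents of each bucket, in the sort order you specify.
2 Usage examples
(1) Scenario:
An Elasticsearch index stores member records. Each member has an ID, the ID of the team it belongs to, a personal score, and other fields: id, team_id, score, age…
Given a list of team IDs: team_id IN (1, 5, 7)
Find the IDs of the 2 highest-scoring members in each team.
(2) Elasticsearch query: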
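GET .../_search?routing=xxx // If the data is known to live in one or a few routing partitions, setting routing improves performance.
{
  "size": 0, // Return aggregation results only, not the matching hits.
  "query": {
    "bool": {
      "filter": [ // Filter conditions, applied before the aggregation runs.
        {
          "terms": {
            "team_id": [1, 5, 7]
          }
        }
      ]
    }
  },
  "aggs": {
    "group_aggs": { // First level: group the documents into buckets by team_id.
      "terms": {
        "field": "team_id",
        "execution_hint": "map" // If this level is known to produce few results, "map" can improve performance.
      },
      "aggs": {
        "top_score_member": { // Second level: run top_hits inside each first-level bucket.
          "top_hits": {
            "size": 2, // Return only the top 2 documents.
            "sort": [ // Sort by score, descending.
              {
                "score": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}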
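To make the semantics of the query above concrete, here is a minimal in-memory sketch of what "filter, then terms, then top_hits" computes. It does not talk to Elasticsearch; the function name top_n_per_group and the sample members are made up for illustration.

```python
from collections import defaultdict


def top_n_per_group(docs, group_field, sort_field, n):
    """Group docs by group_field and keep the n docs with the highest sort_field per group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc)
    return {
        key: sorted(members, key=lambda d: d[sort_field], reverse=True)[:n]
        for key, members in groups.items()
    }


# Hypothetical sample data, not taken from a real index.
members = [
    {"id": 1, "team_id": 1, "score": 90},
    {"id": 2, "team_id": 1, "score": 75},
    {"id": 3, "team_id": 1, "score": 82},
    {"id": 4, "team_id": 5, "score": 60},
    {"id": 5, "team_id": 7, "score": 88},
    {"id": 6, "team_id": 7, "score": 95},
    {"id": 7, "team_id": 9, "score": 99},  # dropped by the filter: team 9 was not requested
]

wanted = [m for m in members if m["team_id"] in (1, 5, 7)]  # the bool/filter step
top2 = top_n_per_group(wanted, "team_id", "score", 2)        # terms + top_hits(size=2)
top_ids = {team: [m["id"] for m in hits] for team, hits in top2.items()}
print(top_ids)
```

Note that a bucket with fewer than N documents (team 5 here) simply returns fewer hits, which matters when parsing the response.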
(3) Java queries:
TransportClient version:
// Filter conditions (same team IDs as the scenario above)
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.filter(QueryBuilders.termsQuery("team_id", Lists.newArrayList(1, 5, 7)));
// Aggregations
AggregationBuilder groupAggBuilder = AggregationBuilders.terms("group_aggs")
        .field("team_id")
        .executionHint("map"); // If this level is known to produce few results, "map" can improve performance.
AggregationBuilder topScoreAggBuilder = AggregationBuilders.topHits("top_score_member")
        .sort("score", SortOrder.DESC)
        .size(2);
groupAggBuilder.subAggregation(topScoreAggBuilder);
// Run the query
SearchResponse response = transportClient.prepareSearch("index_name").setTypes("type_name")
        .setRouting("xxx") // If the data is known to live in one or a few routing partitions, setting routing improves performance.
        .setSize(0)
        .setQuery(boolQueryBuilder)
        .addAggregation(groupAggBuilder) // was groupGoodsAggBuilder, which is not defined
        .get();
RestHighLevelClient version:
// Filter conditions (same team IDs as the scenario above)
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.filter(QueryBuilders.termsQuery("team_id", Lists.newArrayList(1, 5, 7)));
// Aggregations
AggregationBuilder groupAggBuilder = AggregationBuilders.terms("group_aggs")
        .field("team_id")
        .executionHint("map"); // If this level is known to produce few results, "map" can improve performance.
AggregationBuilder topScoreAggBuilder = AggregationBuilders.topHits("top_score_member")
        .sort("score", SortOrder.DESC)
        .size(2);
groupAggBuilder.subAggregation(topScoreAggBuilder);
// Build the search source
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(boolQueryBuilder);
searchSourceBuilder.size(0);
searchSourceBuilder.aggregation(groupAggBuilder); // was groupGoodsAggBuilder, which is not defined
// Watch how the production endpoint responds and pick a sensible timeout (300 ms here).
searchSourceBuilder.timeout(new TimeValue(300));
SearchRequest request = new SearchRequest("index_name");
request.source(searchSourceBuilder);
request.routing("xxx"); // If the data is known to live in one or a few routing partitions, setting routing improves performance.
// Send the request
SearchResponse searchResponse = restHighLevelClient.search(request, RequestOptions.DEFAULT);
Parsing the SearchResponse:
if (Objects.nonNull(response) && Objects.equals(response.status(), RestStatus.OK)) {
    Terms groupResult = response.getAggregations().get("group_aggs");
    if (Objects.nonNull(groupResult)) {
        for (Terms.Bucket groupBucket : groupResult.getBuckets()) {
            TopHits topScoreResult = groupBucket.getAggregations().get("top_score_member");
            if (Objects.isNull(topScoreResult)) {
                continue;
            }
            // A team may have fewer than 2 members, so iterate over the hits actually returned
            // instead of calling getAt(0) and getAt(1) unconditionally.
            List<MemberDTO> topMembers = new ArrayList<>();
            for (SearchHit searchHit : topScoreResult.getHits().getHits()) {
                topMembers.add(JSON.parseObject(searchHit.getSourceAsString(), MemberDTO.class));
            }
            // Other logic
        }
    }
}
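The same parsing can be sketched against the raw JSON response body, which has the shape aggregations → group_aggs → buckets → top_score_member → hits → hits → _source. This is only an illustration: the helper name top_hits_per_bucket and the sample response are hypothetical, not output captured from a real cluster.

```python
def top_hits_per_bucket(response, terms_name, top_hits_name):
    """Return {bucket key: [_source, ...]} from a terms + top_hits search response body."""
    buckets = response.get("aggregations", {}).get(terms_name, {}).get("buckets", [])
    return {
        bucket["key"]: [hit["_source"] for hit in bucket[top_hits_name]["hits"]["hits"]]
        for bucket in buckets
    }


# Hypothetical response body, trimmed to the fields the parser reads.
sample_response = {
    "aggregations": {
        "group_aggs": {
            "buckets": [
                {"key": 1, "doc_count": 3, "top_score_member": {"hits": {"hits": [
                    {"_id": "1", "_source": {"id": 1, "team_id": 1, "score": 90}},
                    {"_id": "3", "_source": {"id": 3, "team_id": 1, "score": 82}},
                ]}}},
                {"key": 5, "doc_count": 1, "top_score_member": {"hits": {"hits": [
                    {"_id": "4", "_source": {"id": 4, "team_id": 5, "score": 60}},
                ]}}},
            ]
        }
    }
}

parsed = top_hits_per_bucket(sample_response, "group_aggs", "top_score_member")
parsed_ids = {team: [src["id"] for src in sources] for team, sources in parsed.items()}
print(parsed_ids)
```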
Reference:
https://blog.csdn.net/cuixianlong/article/details/104426160
Another query:
Taking the first document of each group: Elasticsearch and Java versions
In this example, documents are grouped by trace_id; within each group they are sorted by log_time ascending and the first one is taken.
Elasticsearch version:
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "log_level:ERROR",
"fields": [],
"type": "best_fields",
"default_operator": "or",
"max_determinized_states": 10000,
"enable_position_increments": true,
"fuzziness": "AUTO",
"fuzzy_prefix_length": 0,
"fuzzy_max_expansions": 50,
"phrase_slop": 0,
"escape": false,
"auto_generate_synonyms_phrase_query": true,
"fuzzy_transpositions": true,
"boost": 1
}
},
{
"range": {
"log_time": {
"from": "2021-06-02 18:00:44.727",
"to": "2021-06-02 18:05:44.727",
"include_lower": true,
"include_upper": false,
"format": "yyyy-MM-dd HH:mm:ss.SSS",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"aggs": {
"group_by_trace_id": {
"terms": {
"field": "trace_id",
"order": {
"top_hit": "asc"
}
},
"aggs": {
"min_trace": {
"min": {
"field": "log_time"
}
},
"top_test": {
"top_hits": {
"sort": {
"log_time": "asc"
},
"size":1
}
},
"top_hit": {
"min": {
"script": "_score"
}
}
}
}
}
}
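The request above is a dump of a generated query and carries many default parameters. As a rough sketch of the core idea only (terms on the group field, buckets ordered by their earliest timestamp, top_hits of size 1 sorted ascending), the aggregation part can be built like this. The aggregation names by_group, earliest and first_doc and the max_groups parameter are assumptions for illustration; they are not the names used in the original query.

```python
import json


def first_per_group_body(group_field, time_field, max_groups=10):
    """Build a search body that returns the earliest document of each group."""
    return {
        "size": 0,  # aggregation results only
        "aggs": {
            "by_group": {
                "terms": {
                    "field": group_field,
                    "size": max_groups,
                    "order": {"earliest": "asc"},  # order buckets by the min sub-aggregation
                },
                "aggs": {
                    "earliest": {"min": {"field": time_field}},
                    "first_doc": {
                        "top_hits": {"size": 1, "sort": [{time_field: {"order": "asc"}}]}
                    },
                },
            }
        },
    }


body = first_per_group_body("trace_id", "log_time")
print(json.dumps(body, indent=2))
```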
Java version:
MetricData elasticsearchMetric = new MetricData();
ElasticsearchInfo elasticsearchInfo = new ElasticsearchInfo(metricContract.getDataSourceContract());
EsRestClientContainer esRestClientContainer = elasticsearchSourceManager.findEsRestClientContainer(elasticsearchInfo);
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
.must(new QueryStringQueryBuilder(metricContract.getQueryString()))
.must(QueryBuilders.rangeQuery(metricContract.getDataNameContract().getTimestampField())
.from(start.toDateTimeISO().toString(dateTimeFormatter))
.to(end.toDateTimeISO().toString(dateTimeFormatter))
.includeLower(true)
.includeUpper(false)
.format(dateTimeFormatter));
Map<String, String> dataNameProperties = metricContract.getDataNameContract().getSettings();
String indexPrefix = dataNameProperties.get("indexPrefix");
String datePattern = dataNameProperties.get("timePattern");
String[] indices = esRestClientContainer.buildIndices(start, end, indexPrefix, datePattern);
Long count = null;
try {
count = esRestClientContainer.totalCount(boolQueryBuilder, indices);
} catch (Exception ex) {
log.error("queryElasticsearchMetricValue failed:", ex);
throw new RuntimeException("error when totalCount", ex);
}
if (metricContract.getAggregationType().equalsIgnoreCase(SymbolExpr.COUNT)) {
elasticsearchMetric.setMetricValue(count);
}
if (count == 0) {
elasticsearchMetric.setMetricValue(0);
return elasticsearchMetric;
}
SearchRequest searchRequest = new SearchRequest(indices);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.trackScores(false);
searchSourceBuilder.trackTotalHits(true);
searchSourceBuilder.query(boolQueryBuilder).from(0).size(10)
.sort(metricContract.getDataNameContract().getTimestampField(), SortOrder.DESC);
attachAggregation(metricContract, searchSourceBuilder);
// Aggregations
TermsAggregationBuilder termsBuilder = AggregationBuilders.terms("group_by_trace_id").field("trace_id");
MinAggregationBuilder minAggregationBuilder = AggregationBuilders.min("min_trace").field("log_time");
TopHitsAggregationBuilder topHitsAggregationBuilder = AggregationBuilders.topHits("top_detail").sort("log_time", SortOrder.ASC).size(1);
MinAggregationBuilder minAggregationBuilderTopHit = AggregationBuilders.min("top_hit").field("_score");
// TopHitsAggregationBuilder topHitsAggregationBuilder = AggregationBuilders.topHits("min_trace").("trace_id", SortOrder.ASC).sort("log_time", SortOrder.ASC).size(10);
termsBuilder.subAggregation(minAggregationBuilder);
termsBuilder.subAggregation(topHitsAggregationBuilder);
termsBuilder.subAggregation(minAggregationBuilderTopHit);
searchSourceBuilder.aggregation(termsBuilder);
// Run the query
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = esRestClientContainer.fetchHighLevelClient().search(searchRequest, RequestOptions.DEFAULT);
ParsedStringTerms stringTerms = searchResponse.getAggregations().get("group_by_trace_id");
List<? extends Terms.Bucket> buckets = stringTerms.getBuckets();
if (metricContract.getAggregationType().equalsIgnoreCase(SymbolExpr.COUNT)) {
if (buckets.size() > 0) {
elasticsearchMetric.setMetricValue(buckets.size());
}
} else {
Double numericValue = findAggregationValue(metricContract, searchResponse);
elasticsearchMetric.setMetricValue(numericValue);
}
List<Map<String, Object>> latestDocumentList = new ArrayList<>();
for (Terms.Bucket bucket : buckets) {
ParsedTopHits topDetail = bucket.getAggregations().get("top_detail");
SearchHit[] hits = topDetail.getHits().getHits();
for (SearchHit hit : hits) {
Map<String, Object> latestDocument = hit.getSourceAsMap();
latestDocument.put("esDataId", latestDocument.get("id"));
latestDocumentList.add(latestDocument);
elasticsearchMetric.setLatestDocumentList(latestDocumentList);
}
}
return elasticsearchMetric;
Deleting indices in Elasticsearch:
PASSWORD=jH9q52s82u5F33kyt74zxqwy
curl -u "elastic:$PASSWORD" -X DELETE 'https://localhost:9200/kibana*' -k
Deleting data in Elasticsearch:
PASSWORD=jH9q52s82u5F33kyt74zxqwy
curl -u "elastic:$PASSWORD" -X POST "https://localhost:9200/remote_statistics/_delete_by_query" -k -H 'Content-Type: application/json' -d'
{
"query": {
"term": {
"beginTime": {
"value": 0
}
}
}
}
'
curl -u "elastic:$PASSWORD" -X DELETE
PASSWORD=123456
curl -u "elastic:$PASSWORD" -XDELETE "http://10.7.11.80:9201/remote_statistics" -k
curl -u "elastic:$PASSWORD" -X GET "https://localhost:9200/_cluster/health" -k
curl -u "elastic:$PASSWORD" -XDELETE "http://10.7.11.80:9201/remote_statistics" -k
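curl -u "elastic:$PASSWORD" -X GET "https://localhost:9200/_cat/indices?v&pretty" -k
Fixing a yellow health status on a single-node Elasticsearch cluster (replica shards cannot be allocated on the same node as their primary, so set number_of_replicas to 0):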
PASSWORD=jH9q52s82u5F33kyt74zxqwy
curl -u "elastic:$PASSWORD" -X PUT "https://localhost:9200/_settings" -k -H 'Content-Type: application/json' -d'
{
"number_of_replicas":0
}
'
curl -u "elastic:$PASSWORD" -X PUT "https://localhost:9200/_cluster/settings?pretty" -k -H 'Content-Type: application/json' -d'
{
"transient" : {
"cluster.routing.allocation.enable" : "all"
}
}'
DELETE /remote_statistics/_doc/1537410295226564608
DELETE /remote_statistics/_doc/1537410404274274304
DELETE /remote_statistics/_doc/1537410394350551040
DELETE /remote_statistics/_doc/1537413001215344640
DELETE /remote_statistics/_doc/1537412942583169024
DELETE /remote_statistics/_doc/1537413009532649472
DELETE /remote_statistics/_doc/1537413006911209472
DELETE /remote_statistics/_doc/1537413984825769984
DELETE /remote_statistics/_doc/1537414138886750208
DELETE /remote_statistics/_doc/1537414142061838336
GET /remote_statistics/
GET /_cluster/health
GET /_cat/indices
Learning the Elasticsearch REST API
2.1 Check cluster health
curl -X GET "http://10.49.196.11:9200/_cat/health?v=true"
2.2 List cluster nodes
curl -X GET "http://10.49.196.11:9200/_cat/nodes?v=true"
3. Index APIs
3.1 Create an index
The request sets both settings and mappings: settings holds the shard and replica counts, and mappings holds the detailed field definitions. The ik_max_word analyzer used below comes from the IK analysis plugin, which must be installed on the cluster.
curl -X PUT -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index' -d '
{
"settings": {
"index": {
"number_of_shards": 2,
"number_of_replicas": 1
}
},
"mappings": {
"properties": {
"age": {
"type": "integer"
},
"name": {
"type": "keyword"
},
"poems": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
},
"about": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
},
"success": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
}
}
}'
3.2 Update the mapping
New fields can be added. For an existing field the type cannot be changed; only a few parameters such as search_analyzer can be updated.
curl -X PUT -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_mapping' -d '
{
"properties": {
"name": {
"type": "keyword"
},
"age": {
"type": "integer"
},
"desc": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
}
}
}'
3.3 Delete an index
curl -X DELETE 'http://10.49.196.11:9200/poet-index'
3.4 List indices
curl -X GET "http://10.49.196.11:9200/*"
or
curl -X GET "http://10.49.196.11:9200/_all"
3.5 Get index details
curl -X GET 'http://10.49.196.11:9200/poet-index'
4. Document APIs
4.1 Add a document
A. With an explicit id of 1
curl -X POST -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_create/1' -d '
{
"age": 30,
"name": "李白",
"poems": "静夜思",
"about": "字太白",
"success": "创造了古代浪漫主义文学高峰、歌行体和七绝达到后人难及的高度"
}'
B. Without an id (one is generated automatically)
curl -X POST -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_doc' -d '
{
"age": 31,
"name": "杜甫",
"poems": "登高",
"about": "字子美",
"success": "唐代伟大的现实主义文学作家,唐诗思想艺术的集大成者"
}'
C. Add documents in bulk
curl -X POST -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_bulk' -d '
{"index":{"_id":"11"}}
{"age": 30,"name": "杜甫11","poems": "登高","about": "字子美","success": "唐代伟大的现实主义文学作家,唐诗思想艺术的集大成者"}
{"index":{"_id":"12"}}
{"age": 30,"name": "杜甫12","poems": "登高","about": "字子美","success": "唐代伟大的现实主义文学作家,唐诗思想艺术的集大成者"}
'
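Note: the request body must end with a newline (the empty last line above); otherwise Elasticsearch rejects the request.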
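When the bulk body is generated by a program rather than typed by hand, the trailing newline is easy to forget. A minimal sketch of a builder that always appends it; the helper name build_bulk_body is hypothetical and the documents are shortened versions of the ones above.

```python
import json


def build_bulk_body(docs_by_id):
    """Build an NDJSON _bulk body of index actions, one action line plus one source line per doc."""
    lines = []
    for doc_id, doc in docs_by_id.items():
        lines.append(json.dumps({"index": {"_id": doc_id}}, ensure_ascii=False))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # Every line, including the last one, must be terminated by a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body({
    "11": {"age": 30, "name": "杜甫11", "poems": "登高"},
    "12": {"age": 30, "name": "杜甫12", "poems": "登高"},
})
print(bulk_body, end="")
```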
4.2 Delete a document
curl -X DELETE 'http://10.49.196.11:9200/poet-index/_doc/1'
4.3 Update a document
Only the fields listed under doc are updated.
curl -X POST -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_update/1' -d '
{
"doc": {
"age": 32,
"poems": "望庐山瀑布"
}
}'
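4.4 Create or overwrite a document
If no document with the given id exists, it is created; otherwise all of its fields are overwritten (equivalent to deleting it and adding it again).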
curl -X PUT -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_doc/1' -d '
{
"age": 31,
"name": "李白",
"poems": "静夜思",
"about": "字太白"
}'
5. Search APIs
5.1 Search all documents in an index
Without from and size, a search returns the first 10 hits by default.
curl -X GET 'http://10.49.196.11:9200/poet-index/_search'
5.2 Get a document by id
curl -X GET 'http://10.49.196.11:9200/poet-index/_doc/1'
5.3 term query
A term query does not analyze (tokenize) its input; the input is matched as one whole term.
A. A single term
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"term": {
"name": {
"value": "李白"
}
}
}
}'
B. Multiple terms
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"terms": {
"name": ["李白", "杜甫"]
}
}
}'
5.4 range query
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"range": {
"age": {
"gte": 20,
"lte": 35
}
}
}
}'
5.5 Full-text queries
5.5.1 match
The input is analyzed (tokenized) first, and the resulting terms are then searched.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"match": {
"success": "理想主义"
}
},
"from": 0,
"size": 10,
"sort": [{
"name": {
"order": "asc"
}
}]
}'
5.5.2 multi_match
Matches against several fields.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"multi_match": {
"query": "太白",
"fields": ["about", "success"]
}
}
}'
5.5.3 match_phrase
Matches the query string as a phrase: all of its terms must appear, in the same order.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"match_phrase": {
"success": "文学作家"
}
}
}'
5.5.4 match_all
Returns all documents.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"match_all": {
}
}
}'
5.5.5 query_string
query_string can express all of the query types above.
A. Like match
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"query_string": {
"default_field": "success",
"query": "古典文学"
}
}
}'
B. Like multi_match
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"query_string": {
"query": "古典文学",
"fields": ["about", "success"]
}
}
}'
C. Like match_phrase
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"query_string": {
"default_field": "success",
"query": "\"古典文学\""
}
}
}'
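D. Queries with operators: the query string is split at the operators, and the text on each side is handled as its own unit
1. Find documents that contain both "文学" and "伟大"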
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"query_string": {
"default_field": "success",
"query": "文学 AND 伟大"
}
}
}'
or
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"query_string": {
"fields": ["success"],
"query": "文学 伟大",
"default_operator": "AND"
}
}
}'
2. Find documents whose name or success field contains both "文学" and "伟大", or contains "李白".
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"query_string": {
"query": "(文学 AND 伟大) OR 李白",
"fields": ["name", "success"]
}
}
}'
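5.5.6 simple_query_string
Similar to query_string; the main differences are:
1. AND, OR, and NOT are not supported as operators and are treated as plain text; use + for AND, | for OR, and - for NOT.
2. Invalid syntax is ignored instead of raising an error.
Find documents that contain both "文学" and "伟大":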
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"simple_query_string": {
"fields": ["success"],
"query": "文学 + 伟大"
}
}
}'
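5.6 Fuzzy queries
Parameters used by fuzzy queries:
Parameter | Description
fuzziness | Maximum edit distance allowed. Fuzzy matching is off by default, which is equivalent to fuzziness=0. Supported formats: (1) a number (0, 1, or 2) giving a fixed maximum edit distance; (2) automatic mode, AUTO:[low],[high] (plain AUTO means AUTO:3,6): terms shorter than low must match exactly (edit distance 0), terms whose length is in [low, high) allow 1 edit, and terms of length high or more allow 2 edits.
prefix_length | Number of leading characters that must match exactly, i.e. the first n characters may not be edited. Defaults to 0; a value above 0 can noticeably improve query performance.
max_expansions | Maximum number of fuzzy variations the query expands to.
transpositions | Whether swapping two adjacent characters counts as a single edit. Full-text queries such as match do not accept this parameter (they use fuzzy_transpositions instead).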
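The AUTO rule and prefix_length in the table can be sketched in a few lines. This is a simplified model, not Elasticsearch's implementation: it uses plain Levenshtein distance (so it behaves like transpositions=false), and the function names auto_fuzziness, levenshtein and fuzzy_match are made up for illustration.

```python
def auto_fuzziness(term_length, low=3, high=6):
    """Allowed edits for a term under AUTO:low,high (defaults 3 and 6)."""
    if term_length < low:
        return 0
    if term_length < high:
        return 1
    return 2


def levenshtein(a, b):
    """Classic edit distance: insertions, deletions, substitutions (no transpositions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def fuzzy_match(query_term, indexed_term, prefix_length=0):
    """True if indexed_term is within the AUTO edit budget and shares the required prefix."""
    if query_term[:prefix_length] != indexed_term[:prefix_length]:
        return False
    return levenshtein(query_term, indexed_term) <= auto_fuzziness(len(query_term))


print(auto_fuzziness(2), auto_fuzziness(5), auto_fuzziness(6))
print(fuzzy_match("elastic", "elastik"))
print(fuzzy_match("elastic", "blastic", prefix_length=1))
```

The last call shows why prefix_length narrows the candidate set: a term that differs in the protected prefix is rejected before any edit distance is computed.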
A. Fuzzy parameters in a full-text query
The input is analyzed first, and fuzzy variations are then computed for each term.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"match": {
"success": {
"query": "古典文化",
"fuzziness": 1,
"prefix_length": 0,
"max_expansions": 5
}
}
}
}'
B. The fuzzy query
The input is not analyzed; fuzzy variations are computed on it directly.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"fuzzy": {
"success": {
"value": "理想",
"fuzziness": 1,
"prefix_length": 0,
"transpositions": true
}
}
}
}'
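5.7 Compound queries
A compound query uses bool to combine several query clauses.
Clause | Description
must | All clauses must match
should | At least one of the clauses matches
must_not | None of the clauses may match
filter | Filters the results without computing a score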
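When bool queries are assembled in application code, a small helper keeps the four clause types straight and leaves out the empty ones. A minimal sketch; the helper name bool_query is an assumption for illustration, not part of any client library.

```python
BOOL_CLAUSES = ("must", "should", "must_not", "filter")


def bool_query(**clauses):
    """Build a bool query dict, keeping only the non-empty clause lists."""
    unknown = set(clauses) - set(BOOL_CLAUSES)
    if unknown:
        raise ValueError(f"unsupported bool clauses: {sorted(unknown)}")
    return {
        "bool": {name: list(clauses[name]) for name in BOOL_CLAUSES if clauses.get(name)}
    }


scored = bool_query(
    must=[{"match": {"success": "思想"}}],                  # must match, affects the score
    filter=[{"range": {"age": {"gte": 20, "lte": 40}}}],    # must match, no scoring
)
print(scored)
```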
A. Find documents whose success field contains "思想" and whose age is in [20, 40]:
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"bool": {
"must": [{
"simple_query_string": {
"query": "思想",
"fields": ["success"]
}
}, {
"range": {
"age": {
"gte": 20,
"lte": 40
}
}
}]
}
}
}'
B. Filter documents whose success field contains "思想" and whose age is in [20, 40], without computing scores:
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"bool": {
"filter": [{
"simple_query_string": {
"query": "思想",
"fields": ["success"]
}
}, {
"range": {
"age": {
"gte": 20,
"lte": 40
}
}
}]
}
}
}'
5.8 Aggregations
A. Sum
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"aggs": {
"age_sum": {
"sum": {
"field": "age"
}
}
}
}'
B. Similar to select count(distinct age) from poet-index (the cardinality aggregation returns an approximate count)
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"aggs": {
"age_count": {
"cardinality": {
"field": "age"
}
}
}
}'
C. Count, min, max, average, and sum in one request (stats)
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"aggs": {
"age_stats": {
"stats": {
"field": "age"
}
}
},
"size": 0
}'
D. Similar to select name, count(*) from poet-index group by name
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"aggs": {
"name_terms": {
"terms": {
"field": "name"
}
}
},
"size": 0
}'
E. Similar to select name, age, count(*) from poet-index group by name, age
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"aggs": {
"name_terms": {
"terms": {
"field": "name"
},
"aggs": {
"age_terms": {
"terms": {
"field": "age"
}
}
}
}
},
"size": 0
}'
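The nested terms response comes back as buckets inside buckets rather than as flat rows. A minimal sketch of flattening it into (name, age, count) tuples, as a SQL GROUP BY would return them; the helper name flatten_group_by and the sample response are hypothetical.

```python
def flatten_group_by(response, outer, inner):
    """Turn nested terms buckets into (outer key, inner key, doc_count) rows."""
    rows = []
    for outer_bucket in response["aggregations"][outer]["buckets"]:
        for inner_bucket in outer_bucket[inner]["buckets"]:
            rows.append((outer_bucket["key"], inner_bucket["key"], inner_bucket["doc_count"]))
    return rows


# Hypothetical response body, trimmed to the fields the function reads.
sample_response = {
    "aggregations": {
        "name_terms": {
            "buckets": [
                {"key": "杜甫", "doc_count": 2, "age_terms": {"buckets": [
                    {"key": 30, "doc_count": 1},
                    {"key": 31, "doc_count": 1},
                ]}},
                {"key": "李白", "doc_count": 1, "age_terms": {"buckets": [
                    {"key": 30, "doc_count": 1},
                ]}},
            ]
        }
    }
}

rows = flatten_group_by(sample_response, "name_terms", "age_terms")
for row in rows:
    print(row)
```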
F. Similar to select avg(age) from poet-index where name='李白'
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"bool": {
"filter": {
"term": {
"name": "李白"
}
}
}
},
"aggs": {
"age_avg": {
"avg": {
"field": "age"
}
}
},
"size": 0
}'
5.9 Suggestions
If you want Elasticsearch to propose alternative search terms based on what was searched, use a suggester.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"suggest": {
"success_suggest": {
"text": "思考",
"term": {
"field": "success",
"analyzer": "ik_max_word",
"suggest_mode": "always",
"min_word_length":2
}
}
}
}'
Suggest modes (suggest_mode):
Mode | Description
popular | Suggest terms that occur more frequently
missing | Suggest only when the searched term has no results
always | Always suggest, whatever the case
5.10 Highlighting
Highlights the matched keywords in the search results.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/poet-index/_search' -d '
{
"query": {
"match": {
"success": "思想"
}
},
"highlight": {
"pre_tags": "<span style=\"color:red\">",
"post_tags": "</span>",
"fields": {
"success": {}
}
}
}'
5.11 SQL queries
Elasticsearch can also be queried with SQL.
curl -X GET -H 'Content-Type:application/json' 'http://10.49.196.11:9200/_sql' -d '
{
"query": "SELECT * FROM \"poet-index\" limit 3"
}'
For details on the Elasticsearch REST APIs, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/rest-apis.html
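1. Check cluster status
curl '10.18.37.223:9200/_cat/health?v'
Green means everything is fine; yellow means all data is available but some replicas are not yet allocated; red means some data is unavailable for some reason.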
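The three colors follow directly from shard allocation. The sketch below is a deliberately simplified model of that rule, not Elasticsearch's actual health computation; the function name cluster_health and the dictionary keys are assumptions for illustration.

```python
def cluster_health(shards):
    """shards: list of dicts with 'primary_assigned' (bool) and 'unassigned_replicas' (int)."""
    if any(not shard["primary_assigned"] for shard in shards):
        return "red"      # some data cannot be served at all
    if any(shard["unassigned_replicas"] > 0 for shard in shards):
        return "yellow"   # all data is available, but some copies are missing
    return "green"


# A single node cannot hold a replica next to its primary, so one replica stays unassigned.
single_node_default = [{"primary_assigned": True, "unassigned_replicas": 1}]
single_node_no_replicas = [{"primary_assigned": True, "unassigned_replicas": 0}]
lost_primary = [{"primary_assigned": False, "unassigned_replicas": 1}]

print(cluster_health(single_node_default))
print(cluster_health(single_node_no_replicas))
print(cluster_health(lost_primary))
```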
2. List the cluster nodes
curl '10.18.37.223:9200/_cat/nodes?v'
3. List all indices
curl -X GET 'http://10.18.37.223:9200/_cat/indices?v'
4. Show the mappings of all indices, including all of their types
curl '10.18.37.223:9200/_mapping?pretty=true'
5. Show all types under one index
curl '10.18.37.223:9200/test/_mapping?pretty=true'
This lists all types under the test index.
6. Search all data in an index
curl '10.18.37.223:9200/test/_search?pretty=true'
7. Search the data of one type in an index
curl '10.18.37.223:9200/test/test_topic/_search?pretty=true'
Here index=test and type=test_topic. According to Elastic's roadmap, 6.x allows only one type per index and 7.x removes types altogether, so check which version you are running.
8. Get the document with a given id under a type in an index
curl '10.18.37.223:9200/test/test_topic/3525?pretty=true'
index=test, type=test_topic, id=3525
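9. Query data in a SQL-like way
curl "10.18.37.223:9200/test/_search" -d'
{
"query": { "match_all": {} },
"_source": ["account_number", "balance"],
"sort": { "balance": { "order": "desc" } },
"from": 10,
"size": 10
}
'
Note: enter the body after -d across several lines by pressing Enter; do not type newline escape characters, which Elasticsearch cannot parse.
query holds the search conditions (match_all here, i.e. no restriction); _source lists the fields to return.
sort gives the sort field; from: 10 skips the first 10 hits, and size: 10 returns the next 10.
Other query types include bool, OR, containment, and range queries. For more, see the official site:
Query API reference on the official site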
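Because from is an offset rather than a page number, page-based callers need a small conversion. A minimal sketch; the helper name page_window is hypothetical.

```python
def page_window(page, page_size):
    """Convert a 1-based page number into the from/size pair used by _search."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {"from": (page - 1) * page_size, "size": page_size}


second_page = page_window(2, 10)
print(second_page)
```

The second page of 10 therefore yields from: 10, size: 10, which is exactly the request shown above.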
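10. Create an index
curl -X PUT '10.18.37.223:9200/test?pretty'
OR
curl -X PUT '10.18.37.223:9200/test'
This creates an index named test.
Note: index names must be lowercase, must not start with an underscore, and must not contain commas.
If you do not specify a document ID explicitly, Elasticsearch generates a random one; in that case use the POST method.
11. Insert data into an index
curl -X PUT '10.18.37.223:9200/test/test_zhang/1?pretty' -d '
{"name":"tom","age":18}'
This inserts the document {"name":"tom","age":18} with index=test, type=test_zhang, id=1.
-X POST works as well.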
12. Replace a document
curl -X PUT '10.18.37.223:9200/test/test_zhang/1?pretty' -d '{"name":"pete","age":20}'
Note: this changes the document at index=test, type=test_zhang, id=1 from {"name":"tom","age":18} to {"name":"pete","age":20}. After it succeeds, fetching the document shows the new data, and its version is incremented by one.
13. Update fields and add new ones, within the same index and type
curl -X POST '10.18.37.223:9200/test/test_zhang/1/_update?pretty' -d '{"doc":{"name":"Alice","age":18,"addr":"beijing"}}'
Note: this changes the name and age and adds a new field, addr=beijing.
14. Update data with a script
curl -X POST '10.18.37.223:9200/test/test_zhang/1/_update?pretty' -d '{"script": "ctx._source.age += 5"}'
Note: this adds 5 to age.
Since ES 1.4.3, inline scripts are disabled by default.
To enable them, add the following to config/elasticsearch.yml:
script.inline: true
script.indexed: true
Then restart (in cluster mode, add the settings on every node and restart each one).
15. Delete a document
curl -X DELETE '10.18.37.223:9200/test/test_zhang/1'
Note: this deletes the document at index=test, type=test_zhang, id=1.
16. Delete an index
curl -X DELETE '10.18.37.223:9200/test'
This deletes the index test together with its data.
17. Bulk operations
curl -X POST '10.18.37.223:9200/test/test_zhang/_bulk?pretty' -d '
{"index":{"_id":"2"}}
{"name":"zhangsan","age":12}
{"index":{"_id":"3"}}
{"name":"lisi"}
'
Note: this adds two documents, id=2 and id=3, under index=test, type=test_zhang.
curl -X POST '10.18.37.223:9200/test/test_zhang/_bulk?pretty' -d '
{"update":{"_id":"2"}}
{"doc":{"name":"wangwu"}}
{"delete":{"_id":"3"}}
'
Note: this updates the document with id=2 and deletes the document with id=3 in the same request, under index=test, type=test_zhang.
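18. Delete by query
curl -X POST "10.18.37.223:9200/test/_delete_by_query" -d'
{
"query": {
"match": {
"name": "pete"
}
}
}'
This uses Elasticsearch's _delete_by_query. Delete-by-query was removed from core as of ES 2.0, so on those versions you have to install the delete-by-query plugin yourself before this command works (from 5.0 on, _delete_by_query is built in again):
In the bin directory of the ES installation, run:
./plugin install delete-by-query
In cluster mode, install the plugin on every node and then restart.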