Install Java and Elasticsearch first; see the earlier articles for those steps.
1. Install Maven
# wget -c http://mirror.cc.columbia.edu/pub/software/apache/maven/maven-3/3.3.3/binaries/apache-maven-3.3.3-bin.tar.gz
# tar zxvf apache-maven-3.3.3-bin.tar.gz -C /usr/local/
# cd /usr/local/
# ln -s apache-maven-3.3.3 maven
# vi /etc/profile
export M2_HOME=/usr/local/maven
export PATH=${M2_HOME}/bin:${PATH}
# . /etc/profile
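Before building the plugin it helps to confirm that the new PATH entry took effect. The following is a minimal sketch; check_maven is a hypothetical helper name, not something Maven ships.

```shell
# Sketch: confirm the Maven launcher is reachable on PATH.
# check_maven is a hypothetical helper, not part of Maven itself.
check_maven() {
    if command -v mvn >/dev/null 2>&1; then
        echo "mvn found at $(command -v mvn)"
        return 0
    fi
    echo "mvn not found on PATH; check M2_HOME and /etc/profile" >&2
    return 1
}
```

A typical call would be `check_maven && mvn -version`, which prints the Maven and Java versions only when the launcher was found.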
2. Install elasticsearch-analysis-ik
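The commands below assume the working directory is /alidata/download, which is the path the final plugin command points at. If mvn package produces a different version number, adjust the zip file name accordingly.
# wget -c https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
# unzip master.zip
# cd elasticsearch-analysis-ik-master/
# cp -Rf config/ik /usr/local/elasticsearch/config/
# mvn package
# /usr/local/elasticsearch/bin/plugin --install analysis-ik --url file:///alidata/download/elasticsearch-analysis-ik-master/target/releases/elasticsearch-analysis-ik-1.4.1.zip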
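Because the version in the zip file name depends on what was checked out, it can be safer to look the archive up than to hard-code it. This is a small sketch under that assumption; find_ik_zip is a hypothetical helper name.

```shell
# Sketch: print the path of the plugin archive produced by "mvn package".
# find_ik_zip is a hypothetical helper; $1 is the target/releases directory.
find_ik_zip() {
    releases_dir="$1"
    for f in "$releases_dir"/elasticsearch-analysis-ik-*.zip; do
        # The glob stays literal when nothing matches, so test for a real file.
        if [ -f "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    echo "no elasticsearch-analysis-ik zip under $releases_dir" >&2
    return 1
}
```

With an absolute directory such as /alidata/download/elasticsearch-analysis-ik-master/target/releases, the printed path can be passed to the plugin command as `--url "file://$(find_ik_zip DIR)"`.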
3. Edit elasticsearch.yml
Append the following to the end of the file:
index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true
index.analysis.analyzer.default.type: ik
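A quick check that the default-analyzer line actually made it into the file can save a restart cycle. The sketch below assumes the configuration lives at /usr/local/elasticsearch/config/elasticsearch.yml; has_ik_default is a hypothetical helper name.

```shell
# Sketch: succeed only if the config sets ik as the default analyzer.
# has_ik_default is a hypothetical helper; the default path is an assumption.
has_ik_default() {
    conf="${1:-/usr/local/elasticsearch/config/elasticsearch.yml}"
    grep -Eq '^[[:space:]]*index\.analysis\.analyzer\.default\.type:[[:space:]]*ik[[:space:]]*$' "$conf"
}
```

For example, `has_ik_default || echo "default analyzer is not ik"` prints a warning when the line is missing.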
4. Restart Elasticsearch
# /usr/local/elasticsearch/bin/elasticsearch -d
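Since -d starts Elasticsearch in the background, the HTTP port may not answer immediately. Here is a minimal polling sketch, assuming the default address http://localhost:9200; wait_for_es is a hypothetical helper name.

```shell
# Sketch: poll the HTTP endpoint until it answers or the tries run out.
# wait_for_es is a hypothetical helper; $1 is the URL, $2 the number of tries.
wait_for_es() {
    url="${1:-http://localhost:9200}"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Any HTTP response counts as "up"; connection errors make curl fail.
        if curl -s -o /dev/null --max-time 2 "$url"; then
            echo "elasticsearch is answering at $url"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "gave up waiting for $url" >&2
    return 1
}
```

Running `wait_for_es` right after the start command blocks for up to roughly 30 seconds with the defaults, and returns non-zero if the node never answers.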
5. Test ik tokenization
Create an index named index:
# curl -XPUT http://localhost:9200/index
Create a mapping for the index:
# curl -XPOST http://localhost:9200/index/fulltext/_mapping -d'
{
    "fulltext": {
        "_all": {
            "analyzer": "ik"
        },
        "properties": {
            "content": {
                "type": "string",
                "boost": 8.0,
                "term_vector": "with_positions_offsets",
                "analyzer": "ik",
                "include_in_all": true
            }
        }
    }
}'
Run a test:
# curl 'http://localhost:9200/index/_analyze?analyzer=ik&pretty=true' -d '
{
"text":"www.ttlsa.com 教程太好了受益匪浅,望ttlsa越来越好"
}'
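The result is as follows. The first token, text, is the JSON key from the request body, which suggests that this Elasticsearch version analyzes the whole body as raw text; the offsets of the other tokens are counted from the start of that body.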
{
  "tokens" : [
    { "token" : "text", "start_offset" : 4, "end_offset" : 8, "type" : "ENGLISH", "position" : 1 },
    { "token" : "www.ttlsa.com", "start_offset" : 11, "end_offset" : 24, "type" : "LETTER", "position" : 2 },
    { "token" : "www", "start_offset" : 11, "end_offset" : 14, "type" : "ENGLISH", "position" : 3 },
    { "token" : "ttlsa", "start_offset" : 15, "end_offset" : 20, "type" : "ENGLISH", "position" : 4 },
    { "token" : "com", "start_offset" : 21, "end_offset" : 24, "type" : "ENGLISH", "position" : 5 },
    { "token" : "教程", "start_offset" : 25, "end_offset" : 27, "type" : "CN_WORD", "position" : 6 },
    { "token" : "太好了", "start_offset" : 27, "end_offset" : 30, "type" : "CN_WORD", "position" : 7 },
    { "token" : "太好", "start_offset" : 27, "end_offset" : 29, "type" : "CN_WORD", "position" : 8 },
    { "token" : "好了", "start_offset" : 28, "end_offset" : 30, "type" : "CN_WORD", "position" : 9 },
    { "token" : "受益匪浅", "start_offset" : 30, "end_offset" : 34, "type" : "CN_WORD", "position" : 10 },
    { "token" : "受益", "start_offset" : 30, "end_offset" : 32, "type" : "CN_WORD", "position" : 11 },
    { "token" : "匪", "start_offset" : 32, "end_offset" : 33, "type" : "CN_CHAR", "position" : 12 },
    { "token" : "浅", "start_offset" : 33, "end_offset" : 34, "type" : "CN_CHAR", "position" : 13 },
    { "token" : "望", "start_offset" : 35, "end_offset" : 36, "type" : "CN_CHAR", "position" : 14 },
    { "token" : "ttlsa", "start_offset" : 36, "end_offset" : 41, "type" : "ENGLISH", "position" : 15 },
    { "token" : "越来越好", "start_offset" : 41, "end_offset" : 45, "type" : "CN_WORD", "position" : 16 },
    { "token" : "越来越", "start_offset" : 41, "end_offset" : 44, "type" : "CN_WORD", "position" : 17 },
    { "token" : "越来", "start_offset" : 41, "end_offset" : 43, "type" : "CN_WORD", "position" : 18 },
    { "token" : "越好", "start_offset" : 43, "end_offset" : 45, "type" : "CN_WORD", "position" : 19 }
  ]
}
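When only the token strings matter, the response can be reduced to one token per line. This is a rough sketch that relies on grep and sed pattern matching rather than a real JSON parser, so it assumes tokens contain no double quotes; extract_tokens is a hypothetical helper name.

```shell
# Sketch: read an _analyze response on stdin and print one token per line.
# extract_tokens is a hypothetical helper; it is pattern-based, not a JSON parser.
extract_tokens() {
    grep -o '"token" *: *"[^"]*"' | sed 's/^"token" *: *"//; s/"$//'
}
```

It is meant to be used at the end of a pipe, for instance by appending `| extract_tokens` to the curl command from the test step.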

1F
How do I turn tokenization off? I don't want the text to be tokenized.