Installing Sphinx + Coreseek (Chinese Word Segmentation) on Linux

Installing Sphinx

wget http://www.sphinxsearch.com/downloads/sphinx-0.9.9.tar.gz
tar xzvf sphinx-0.9.9.tar.gz
cd sphinx-0.9.9
./configure --prefix=/usr/local/sphinx/ --with-mysql --enable-id64
make
make install

Installing Coreseek

# Install mmseg (the dictionary that coreseek uses)
wget http://www.wapm.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz
tar xzvf coreseek-3.2.14.tar.gz
cd coreseek-3.2.14/mmseg-3.2.14
./bootstrap # warnings in the output can be ignored; any error must be resolved
./configure --prefix=/usr/local/mmseg3
make && make install
cd ..

Installing coreseek (sphinx)

cd csft-3.2.14
sh buildconf.sh # warnings in the output can be ignored; any error must be resolved
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql
make && make install
cd ..
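
The ./configure call above points at the mmseg headers and libraries under /usr/local/mmseg3, so it may help to confirm those directories exist before configuring csft. A minimal sketch, assuming the prefix used in this guide; the function name check_mmseg_prefix is hypothetical and not part of coreseek:

```shell
#!/bin/sh
# Hypothetical helper: confirm that the mmseg include and lib directories
# exist under a given prefix before running csft's ./configure.
check_mmseg_prefix() {
    prefix="$1"
    status=0
    for dir in "$prefix/include/mmseg" "$prefix/lib"; do
        if [ -d "$dir" ]; then
            echo "found:   $dir"
        else
            echo "missing: $dir"
            status=1
        fi
    done
    return "$status"
}

# Example call; /usr/local/mmseg3 is the prefix used in this guide.
check_mmseg_prefix /usr/local/mmseg3 || echo "install mmseg before configuring csft"
```

If either directory is reported missing, the mmseg make install step most likely did not complete.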

Testing mmseg segmentation and coreseek search

cd testpack
cat var/test/test.xml # the Chinese text should display correctly here
/usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc var/test/test.xml
/usr/local/coreseek/bin/indexer -c etc/csft.conf --all # build the indexes according to the configuration
/usr/local/coreseek/bin/search -c etc/csft.conf 网络搜索
A correct run should return:
words:
1. '网络': 1 documents, 1 hits
2. '搜索': 2 documents, 5 hits
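
The expected output above shows the query split into the two words '网络' and '搜索'. A minimal sketch of an automated check for that, assuming the search output is piped in on stdin; the function name check_segmentation is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read search output on stdin and succeed only if
# both segmented words from this guide's test query appear in it.
check_segmentation() {
    output=$(cat)
    for word in "'网络'" "'搜索'"; do
        case "$output" in
            *"$word"*) ;;
            *) echo "missing word: $word"; return 1 ;;
        esac
    done
    echo "segmentation looks correct"
}

# Intended use, from the testpack directory:
#   /usr/local/coreseek/bin/search -c etc/csft.conf 网络搜索 | check_segmentation
```

The check only looks for the two quoted words; it does not compare document or hit counts, which depend on the test data.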
  • Possible problems:
    When installing mmseg, ./configure fails with: config.status: error: cannot find input file: src/Makefile.in
    Check the installed automake version (automake --version)
    WARNING: source 'index1': xmlpipe2 support NOT compiled in. To use xmlpipe2, install missing XML libraries xmlpipe2 support NOT compiled
    yum install expat-devel*     # then rebuild and reinstall
  • Dependencies
    yum -y install m4 autoconf automake libtool
    yum -y install gcc gcc-c++ wget
    yum -y install mysql-devel
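
To see which of these dependencies are still missing before starting a build, one option is to probe for the commands the packages provide. A minimal sketch, assuming the usual command names (libtoolize from libtool, g++ from gcc-c++, mysql_config from mysql-devel); the function name missing_tools is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print each command from the argument list that is
# not on PATH, and return non-zero if any is missing.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Commands assumed to come from the yum packages listed above.
missing_tools m4 autoconf automake libtoolize gcc g++ wget mysql_config \
    || echo "install the packages that provide the tools listed above"
```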
  • Compilation failures
    make fails with this error:
    sphinxexpr.cpp:1013:43: error: ‘ExprEval’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
  • Fix: open the file with vi src/sphinxexpr.cpp, search with /ExprEval, and press n to jump to the next match
    T val = ExprEval ( this->m_pArg, tMatch ); // 'this' fixes gcc braindamage
    change it to
    T val = this->ExprEval ( this->m_pArg, tMatch ); // 'this' fixes gcc braindamage
    Three places need this change in total.
  • After making the change, run make && make install again
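
The same ExprEval edit can be applied without opening an editor. A minimal sketch using sed, assuming every affected line has exactly the form shown in the fix above ("T val = ExprEval ("); the function name patch_expreval is hypothetical, and the printed count should be compared against the three places mentioned in the text:

```shell
#!/bin/sh
# Hypothetical helper: prefix unqualified ExprEval calls with this->,
# keeping a .bak copy of the original file.
patch_expreval() {
    file="$1"
    sed -i.bak 's/T val = ExprEval (/T val = this->ExprEval (/' "$file"
    # Report how many lines now carry the qualified call.
    grep -c 'T val = this->ExprEval (' "$file"
}

# Run from the source directory before repeating make, e.g.:
#   patch_expreval src/sphinxexpr.cpp
```

Running it twice is harmless, because a line that already reads this->ExprEval no longer matches the search pattern.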
  • Error message: error: cannot find input file: src/Makefile.in
    yum -y install libtool
    aclocal
    libtoolize --force
    automake --add-missing
    autoconf
    autoheader
    make clean
  • Error message: ERROR: cannot find MySQL include files.
    ### Pass the MySQL path after --with-mysql=
    ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql=/usr/mysql/include/mysql
    ### If the path cannot be found
    yum install mysql-devel ### install the MySQL development files
    mysql_config --include ## print the include path, then rerun the installation
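
mysql_config --include prints the path as a compiler flag (for example -I/usr/include/mysql), so the -I prefix has to be removed before the path is used with --with-mysql= as in the text. A minimal sketch of that step; the function name mysql_include_dir is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: turn the flags printed by `mysql_config --include`
# (for example "-I/usr/include/mysql") into a bare directory path.
mysql_include_dir() {
    flags="$1"
    # Intentionally unquoted so that several flags split into words.
    for flag in $flags; do
        case "$flag" in
            -I*) echo "${flag#-I}"; return 0 ;;
        esac
    done
    return 1
}

# Only query mysql_config when it is installed.
if command -v mysql_config >/dev/null 2>&1; then
    mysql_include_dir "$(mysql_config --include)" || echo "no -I flag found"
fi
```

Only the first -I entry is returned; if mysql_config reports several include directories, the others are ignored in this sketch.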