Orace Text Index-part4-contains query

通配符

全文索引的contains query是查找整个单词的方式,当查询单词的一部分字符串的时候会获取不到结果。当需要进行部分单词匹配的时候可以使用通配符,%匹配任意多个未知字符,_统配单个未知字符。

SQL> select score(99),id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA from T_LEXER where contains(DATA_ECHO_AREA,'SMITH',99) > 0 
order by score(99) desc;

 SCORE(99)         ID CREATE_DATE               AREA
---------- ---------- ------------------------- ------------------
         5          9 2016-12-29 20:43:35       John Smith
         5         12 2016-12-29 20:43:35       Under_Smith
         5         11 2016-12-29 20:43:35       Smith Blue

SQL> select score(99),id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA from T_LEXER where contains(DATA_ECHO_AREA,'SMI',99) > 0 
SQL> order by score(99) desc;

no rows selected

SQL> select score(99),id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA from T_LEXER where contains(DATA_ECHO_AREA,'SM%',99) > 0 
SQL> order by score(99) desc;

 SCORE(99)         ID CREATE_DATE               AREA
---------- ---------- ------------------------- ------------------
         5          9 2016-12-29 20:43:35       John Smith
         5         12 2016-12-29 20:43:35       Under_Smith
         5         11 2016-12-29 20:43:35       Smith Blue

SQL> select score(99),id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA from T_LEXER 
SQL> where contains(DATA_ECHO_AREA,'SMI_H BLUE',99) > 0 order by score(99) desc;

 SCORE(99)         ID CREATE_DATE               AREA
---------- ---------- ------------------------- ------------------
         6         11 2016-12-29 20:43:35       Smith Blue

AND和OR查询

Oracle Text Index是很精确的查询,所以对于模糊查询,比如位置颠倒的场景,全文索引就无法像GOOGLE那样。例如我想查询Blue Smith,Oracle Text index就无法返回Smith Blue这种记录。
对于这种时候,我们可能希望全文索引能返回给我们Smith Blue或者Blue A Smith。这种类似的场景我们可以使用AND或者OR查询,下面是AND和OR操作符的用法。

SQL> col area for a120
SQL> SELECT ID,DBMS_LOB.SUBSTR(DATA_ECHO_AREA) AREA FROM T_LEXER;

        ID AREA
---------- --------------
         1 故事发生在19世纪末的欧洲,善良可爱的苏菲被恶毒的女巫施下魔咒,从18岁的女孩变成90岁的婆婆,孤单无助的她无意中走入镇外的移
           动城堡,据说它的主人哈尔以吸取
           女孩的灵魂为乐,但是事情并没有人们传说的那么可怕,性情古怪的
           哈尔居然收留了苏菲,两个人在四脚的移动城堡中开始了奇妙的共同生活,一段交织了爱与痛、乐与悲的爱情故事在战火中悄悄展开

         2 这部由俄罗斯导演提莫贝克曼贝托夫执导的影片自6月末在北美上映以来,已经在全球取得了超过3亿美元的票房收入。在亚洲上映后也先
           后拿下日本、韩国等地的票房冠军宝座。
           虽然不少网友在此之前也相继通过各种渠道接触到本片,但相信影片凭着在大银幕上呈现出的超酷的视听效果,依然能够吸引大量影迷前
           往影院捧场

         3 Shanghai is a city in the CHINA
         4 运维:/indexpage/indexpagesActionGetTopPage.action?pageType=0,1,2,3,5,10
         5 steven king
         6 515.124.4169
         7 14895956868
         9 John Smith
        10 650.121.2019
        11 Smith Blue
        12 Under_Smith

11 rows selected.

SQL> select score(99),id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA from T_LEXER 
SQL> where contains(DATA_ECHO_AREA,'Blue OR Smith',99) > 0 order by score(99) desc;

 SCORE(99)         ID CREATE_DATE         AREA
---------- ---------- ------------------- -----------------
         6         11 2016-12-29 20:43:35 Smith Blue
         5         12 2016-12-29 20:43:35 Under_Smith
         5          9 2016-12-29 20:43:35 John Smith

SQL> 
SQL> 
SQL> select score(99),id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA from T_LEXER 
SQL> where contains(DATA_ECHO_AREA,'Blue AND Smith',99) > 0 order by score(99) desc;

 SCORE(99)         ID CREATE_DATE         AREA
---------- ---------- ------------------- -------------------
         5         11 2016-12-29 20:43:35 Smith Blue

NOT 操作符

SQL> insert into t_lexer
  2   (id, create_date, DATA_ECHO_AREA)
  3   values
  4   (seq_t_lexer.nextval,
  5     sysdate,
  6     'Berry Blue');

已创建 1 行。

SQL> commit;

提交完成。

SQL> execute ctx_ddl.sync_index('IDX1_T_LEXER');

PL/SQL 过程已成功完成。

SQL> select score(99), id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA
  2    from T_LEXER
  3   where contains(DATA_ECHO_AREA, 'blue', 99) > 0
  4   order by score(99) desc;

 SCORE(99)         ID CREATE_DATE         AREA
---------- ---------- ------------------- ------------------------------
         5         41 2017-01-10 12:53:30 Berry Blue
         5         11 2016-12-28 10:41:41 Smith Blue

SQL> select score(99), id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA
  2    from T_LEXER
  3   where contains(DATA_ECHO_AREA, 'blue not smith', 99) > 0
  4   order by score(99) desc;

 SCORE(99)         ID CREATE_DATE         AREA
---------- ---------- ------------------- ------------------------------
         5         41 2017-01-10 12:53:30 Berry Blue

ACCUM操作符

accum = accumulate

SQL> select score(99), id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA
  2    from T_LEXER
  3   where contains(DATA_ECHO_AREA, 'berry accum steven', 99) > 0
  4   order by score(99) desc;

 SCORE(99)         ID CREATE_DATE         AREA
---------- ---------- ------------------- ------------------------------
         3         41 2017-01-10 12:53:30 Berry Blue
         3          5 2016-12-28 10:32:03 steven king

FUZZY

对拼写和读音相似的单词不精确匹配,但是测试过程中对中文支持不好,数据库字符集为AL32UTF8

SQL> select id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA
  2    from T_LEXER
  3   where contains(DATA_ECHO_AREA, 'fuzzy((kine))', 99) > 0
  4   order by score(99) desc;

        ID CREATE_DATE         AREA
---------- ------------------- ------------------------------
         5 2016-12-28 10:32:03 steven king

对操作符组合使用

多种操作符组合使用进行过滤查询,比如下面的例子使用ACCUM和NOT进行查询过滤

SQL> select score(99), id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA
  2    from T_LEXER
  3   where contains(DATA_ECHO_AREA, '((berry accum steven) not Blue)', 99) > 0
  4   order by score(99) desc;

 SCORE(99)         ID CREATE_DATE         AREA
---------- ---------- ------------------- ------------------------------
         3          5 2016-12-28 10:32:03 steven king

对保留字的查询

Oracle Text Index里有很多保留字,比如上面文章中使用的AND/OR/FUZZY/NOT等,如果需要对这种关键字进行查询,需要使用{}进行使用。

SQL> insert into t_lexer
  2   (id, create_date, DATA_ECHO_AREA)
  3   values
  4   (seq_t_lexer.nextval,
  5     sysdate,
  6     'and or fuzzy not');

已创建 1 行。

SQL> commit;

提交完成。

SQL>  execute ctx_ddl.sync_index('IDX1_T_LEXER');

PL/SQL 过程已成功完成。

SQL> select score(99), id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA
  2    from T_LEXER
  3   where contains(DATA_ECHO_AREA, 'fuzzy', 99) > 0
  4   order by score(99) desc;
select score(99), id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA
*
第 1 行出现错误:
ORA-29902: 执行 ODCIIndexStart() 例行程序中出错
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 6


SQL>
SQL>
SQL> select score(99), id, create_date, dbms_lob.substr(DATA_ECHO_AREA) AREA
  2    from T_LEXER
  3   where contains(DATA_ECHO_AREA, '{fuzzy}', 99) > 0
  4   order by score(99) desc;

 SCORE(99)         ID CREATE_DATE         AREA
---------- ---------- ------------------- ------------------------------
         7         63 2017-01-10 13:16:28 and or fuzzy not

good luck!

此条目发表在index, Index分类目录。将固定链接加入收藏夹。

2 Responses to Orace Text Index-part4-contains query

  1. Pingback引用通告: Oracle Text Index-part5-sync index | LEO Notes

  2. Pingback引用通告: Oracle Text Index-part6-preference and stoplist | LEO Notes

发表评论

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / 更改 )

Twitter picture

You are commenting using your Twitter account. Log Out / 更改 )

Facebook photo

You are commenting using your Facebook account. Log Out / 更改 )

Google+ photo

You are commenting using your Google+ account. Log Out / 更改 )

Connecting to %s