lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sujit Pal <sujit....@comcast.net>
Subject Re: Is there any "Query" in Lucene can search the term, which is similar as "SQL-LIKE"?
Date Mon, 17 Oct 2011 18:42:57 GMT
Hi Mead,

You may want to check out the permuterm index idea.
http://www-nlp.stanford.edu/IR-book/html/htmledition/permuterm-indexes-1.html 

Basically you write a custom filter that takes a term and generates all
word permutations off it. On the query side, you convert your query so
its always a prefix query by rotating the characters so * is always at
the end and match against the permuterm indexed field.

I have a simple (and currently incomplete) working implementation (works
with queries such as *keyword, keyword*, key*rd, *keyword*, but only a
single * and no ? unlike the Wildcard query. But because its always a
prefix query internally, it does not have the performance penalty of
leading * in WildcardQuery. Maybe it will give you some ideas...
http://sujitpal.blogspot.com/2011/10/lucene-wildcard-query-and-permuterm.html 

-sujit

On Thu, 2011-10-13 at 10:10 +0800, Mead Lai wrote:
> Thank you very much,
> With your helps, that, finally, I use "WildcardQuery" to find right result:
>     BooleanQuery resultQuery = new BooleanQuery();
>     resultQuery.add(WildcardQuery(new Term("content", "*keyword*"));
>     TopDocs topDocs = searcher.search(resultQuery,*1000*);
> 
> But there is also a problem puzzle me, the result only can get 1000 items,
> which is not enough.
> I want to have entire/whole items, which match that condition(*keyword*).
> 
> OR, may I put a date condtion to query,
> e.g: select * from table where start_date *>=* 2011-10-12
> 
> 
> Regards,
> Mead
> 
> 
> On Tue, Oct 11, 2011 at 11:39 PM, Chris Lu <chris.lu@gmail.com> wrote:
> 
> > You need to analyze the search keyword with the same analyzer that's
> > applied
> > on the "content" field.
> >
> > --
> > Chris Lu
> > -------------------------
> > Instant Scalable Full-Text Search On Any Database/Application
> > site: http://www.dbsight.net
> > demo: http://search.dbsight.com
> > Lucene Database Search in 3 minutes:
> >
> > http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
> >
> > On Tue, Oct 11, 2011 at 12:11 AM, Mead Lai <laiqinyi@gmail.com> wrote:
> >
> > > Hello all,
> > > *Background:
> > > *There are *ONE MILLION* data in a table, and this table has 100 columns
> > > inside.
> > > The application need to search the data in EVERY column with one
> > 'keyword'.
> > > so, I try it in a clumsy way, using a database view, then search the
> > view.
> > > Just like the following SQL:
> > > *=Step1*: create a view.
> > >
> > > CREATE OR REPLACE VIEW V_MY_VIEW(id,title,content)
> > > as
> > > SELECT
> > >
> > >
> > mv.l_instanceid,mv.c_param1,mv.c_param2||';'||mv.c_param3||';'||mv.c_param4||';'||mv.c_param5||';'||mv.c_param6||';'||mv.c_param7||';'||mv.c_param8||';'||mv.c_param9||';'||mv.c_param10||';'||mv.c_param11||';'||mv.c_param12||';'||mv.c_param13||';'||mv.c_param14||';'||mv.c_param15||';'||mv.c_param16||';'||mv.c_param17||';'||mv.c_param18||';'||mv.c_param19||';'||mv.c_param20||';'||mv.c_param21||';'||mv.c_param22||';'||mv.c_param23||';'||mv.c_param24||';'||mv.c_param25||';'||mv.c_param26||';'||mv.c_param27||';'||mv.c_param28||';'||mv.c_param29||';'||mv.c_param30||';'||mv.c_param31||';'||mv.c_param32||';'||mv.c_param33||';'||mv.c_param34||';'||mv.c_param35||';'||mv.c_param36||';'||mv.c_param37||';'||mv.c_param38||';'||mv.c_param39||';'||mv.c_param40||';'||mv.c_param41||';'||mv.c_param42||';'||mv.c_param43||';'||mv.c_param44||';'||mv.c_param45||';'||mv.c_param46||';'||mv.c_param47||';'||mv.c_param48||';'||mv.c_param49||';'||mv.c_param50||';'||mv.c_param51||';'||mv.c_param52||';'||mv.c_param53||';'||mv.c_param54||';'||mv.c_param55||';'||mv.c_param56||';'||mv.c_param57||';'||mv.c_param58||';'||mv.c_param59||';'||mv.c_param60||';'||mv.c_param61||';'||mv.c_param62||';'||mv.c_param63||';'||mv.c_param64||';'||mv.c_param65||';'||mv.c_param66||';'||mv.c_param67||';'||mv.c_param68||';'||mv.c_param69||';'||mv.c_param70||';'||mv.c_param71||';'||mv.c_param72||';'||mv.c_param73||';'||mv.c_param74||';'||mv.c_param75||';'||mv.c_param76||';'||mv.c_param77||';'||mv.c_param78||';'||mv.c_param79||';'||mv.c_param80||';'||mv.c_param81||';'||mv.c_param82||';'||mv.c_param83||';'||mv.c_param84||';'||mv.c_param85||';'||mv.c_param86||';'||mv.c_param87||';'||mv.c_param88||';'||mv.c_param89||';'||mv.c_param90||';'||mv.c_param91||';'||mv.c_param92||';'||mv.c_param93||';'||mv.c_param94||';'||mv.c_param95||';'||mv.c_param96||';'||mv.c_param97||';'||mv.c_param98||';'||mv.c_param99||';'||mv.c_param100||';'
> > > FROM MyTable mv
> > >
> > > *=Step2*: search the view with LIKE '%keyword%'
> > >
> > > SELECT *
> > > FROM V_MY_VIEW wcv
> > > WHERE wcv.content LIKE '%keyword%'
> > >
> > > Finally, it works nice, but inefficiency, almost cost 5~7 seconds. cos
> > ONE
> > > MILLION rows are tooo huge.
> > >
> > > *Lucene way:*
> > >    So, I use the Lucene to store these ONE MILLION data,
> > > code:document.add(new Field("content", content, Store.YES,
> > > Index.ANALYZED));//variable content, is the strings which jointed from
> > the
> > > 100 columns
> > >     The problem is that: if some keyword is not a word or a term, the
> > > search will return nothing.
> > > Usually, the keyword would be a person's name or some jargon, like
> > > 'catstiger.amber','amin.ahmad','fund-in-trust'.
> > > and 'catstiger.amber' can't be split into a term to save in the index
> > store
> > >    Cos, the Index.ANALYZED would fail to recognition the keyword as a
> > term
> > > , so there is no such index at all.
> > > So, In short, is there any "Query" in Lucene can search the term, which
> > is
> > > similar as "SQL-LIKE"?
> > >
> > >    This SQL-function will meet the purpose:
> > >
> > > SELECT * FROM luceneDB ldb
> > > WHERE ldb.content *LIKE* '%keyword%'
> >  >
> > > Thank you very much.
> > >
> > > Regards,
> > > Mead
> > >
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message