lucene-java-user mailing list archives

From Deve <developer.in...@gmail.com>
Subject Re: solution for parent-child relationship
Date Wed, 23 Dec 2009 16:41:02 GMT
@Erick, you are right, I will have to stop thinking in terms of databases;
that's why I wanted to discuss this.

I don't understand how to use getPositionIncrementGap. Could you provide a
little more detail?

thanks,

On Wed, Dec 23, 2009 at 8:45 PM, Erick Erickson <erickerickson@gmail.com> wrote:

> Ya just gotta stop thinking like a database guy, man <G>. Lucene searches
> lots and lots of text very well. It doesn't do joins worth a darn. The
> moment you start thinking in terms of sub-queries, you're probably starting
> down the wrong track.
>
> Here's a possibility. Index each post and all associated comments as a
> single document, with the post and all comments put in the same field.
> Something like
> doc.add("text", <original post text>);
> doc.add("text", <first comment>);
> doc.add("text", <second comment>);
> add the document to the index.....
>
> Now, all you need to do is search the "text" field and you get everything.
> If you set the position increment gap to something other than 1 (see the
> docs), you can distinguish between the individual calls to doc.add(); see
> getPositionIncrementGap.
>
> In general, when indexing anything database-related, the advice is to
> flatten the data as you put it into a Lucene index. Think about making all
> the data available as the result of a single query. This leads to data
> replication, which goes against all your database instincts, but....
>
> HTH
> Erick
>
>
> On Wed, Dec 23, 2009 at 8:35 AM, Shahid Faiz <developer.inbox@gmail.com> wrote:
>
> > Hi,
> >
> > Following are the details of my problem and the possible solutions I can
> > think of. Please suggest which one I should choose, or whether there is a
> > better approach than these.
> >
> > I want to index blog posts and their comments. In my database, posts and
> > comments are stored in two different tables. Currently I am indexing posts
> > only and search works perfectly fine, but now I want to index comments as
> > well. From now on, when a user searches for any word, all posts that meet
> > the search criteria should be displayed, together with only those comments
> > that meet the criteria. All paging is done on posts; comments are not
> > considered when calculating page contents.
> >
> > e.g.
> >
> > POST =>          Google launched Chrome Browser
> > COMMENT =>  Microsoft IE also has tab browsing.
> >
> > Now searching for (microsoft AND IE) should also display this post along
> > with the above comment.
> >
> > One solution is to index both posts and comments in one Lucene index and
> > search that index. But as I have to display 10 posts (not including
> > comments) per page, it gets complicated to find the records to be
> > displayed on any given page. I think I can solve this problem by creating
> > two queries, one to search posts and one to search comments only, and then
> > joining them with the OR operator.
> >
> > A second solution could be to store comments in a different index. On
> > search, I would first execute the query on the comments index and then use
> > the result of the comments search to search blog posts.
> >
> > Thanks in advance.
> >
> > - deve
> >
>
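
A rough illustration of the position increment gap idea raised above, since
the thread asks for more detail on it. In Lucene the gap is supplied by the
analyzer through getPositionIncrementGap(String fieldName), which a custom
Analyzer can override. The sketch below does not call Lucene at all: it is a
simplified model of how token positions could be laid out when several values
are added under the same field name. The class PositionGapSketch, its methods,
the whitespace tokenizer and the exact position arithmetic are all assumptions
made up for illustration, not Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, Lucene-free model of a multi-valued "text" field.
// Each string in `values` stands for one doc.add("text", ...) call.
class PositionGapSketch {

    // Flattens all values into one token list (naive whitespace split).
    static List<String> tokens(List<String> values) {
        List<String> out = new ArrayList<>();
        for (String value : values) {
            out.addAll(Arrays.asList(value.trim().split("\\s+")));
        }
        return out;
    }

    // Assigns a position to every token. Within a value, positions grow by 1.
    // Between two values, `gap` extra positions are skipped (assumed model).
    static List<Integer> positions(List<String> values, int gap) {
        List<Integer> out = new ArrayList<>();
        int pos = -1;
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                pos += gap; // jump over the gap before the next value starts
            }
            int count = values.get(i).trim().split("\\s+").length;
            for (int t = 0; t < count; t++) {
                pos++;
                out.add(pos);
            }
        }
        return out;
    }

    // Exact two-word phrase match: `second` must sit at the position
    // directly after `first`.
    static boolean phraseMatches(List<String> values, int gap,
                                 String first, String second) {
        List<String> toks = tokens(values);
        List<Integer> pos = positions(values, gap);
        for (int i = 0; i + 1 < toks.size(); i++) {
            boolean wordsMatch = toks.get(i).equals(first)
                    && toks.get(i + 1).equals(second);
            if (wordsMatch && pos.get(i + 1) - pos.get(i) == 1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The post and the comment from the example in the original question.
        List<String> values = Arrays.asList(
                "google launched chrome browser",
                "microsoft ie also has tab browsing");

        System.out.println(positions(values, 0));
        System.out.println(positions(values, 100));
        System.out.println(phraseMatches(values, 0, "browser", "microsoft"));
        System.out.println(phraseMatches(values, 100, "browser", "microsoft"));
        System.out.println(phraseMatches(values, 100, "microsoft", "ie"));
    }
}
```

In this model, a gap of 0 lets the phrase "browser microsoft" match across the
boundary between the post and the comment, while a gap of 100 prevents that
and still lets "microsoft ie" match inside the comment. In Lucene the
comparable effect would be expected for phrase or span queries whose slop is
smaller than the gap, which is what makes the separate doc.add() calls
distinguishable.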
