lucene-java-user mailing list archives

From Jimi Hullegård <jimi.hulleg...@mogul.com>
Subject RE: Using separate index for each user
Date Fri, 19 Sep 2008 12:14:49 GMT
Well, if the total size of the relevant data is too big to fit in a single index, then simply
adding the username as a field would not solve the original problem.

But what if you combine the two alternatives? Let's say you can find a way to divide the
users evenly based on their usernames. For example, every user whose username starts with
"A" gets one index, those whose username starts with "B" get another index, and so on. Or
all users whose username starts with any of the letters "A" to "D" share one index, etc.
That way each index stays at a reasonable size, and you still don't end up with thousands
of separate indexes. And since you know which index to search based on the username, you
don't need any kind of distributed search.
Then you should of course also add the username as a field, so that each search is filtered
to only that user's data.
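
A rough sketch of the routing part, in plain Java with no Lucene calls. The class name,
the letter ranges and the "users-..." index names are made up for illustration; the hash
variant is an alternative I am assuming here for cases where first letters are unevenly
distributed, it is not something from the thread.

```java
import java.util.Locale;

public final class ShardRouter {
    // Hypothetical letter ranges; each range would map to one index directory.
    private static final String[] RANGES = {"a-d", "e-h", "i-l", "m-p", "q-t", "u-z"};

    private ShardRouter() {}

    /** Picks an index name from the first letter of the username. */
    public static String shardFor(String username) {
        if (username == null || username.isEmpty()) {
            throw new IllegalArgumentException("username must not be empty");
        }
        char first = username.toLowerCase(Locale.ROOT).charAt(0);
        for (String range : RANGES) {
            // range has the form "x-y": charAt(0) is the lower bound, charAt(2) the upper
            if (first >= range.charAt(0) && first <= range.charAt(2)) {
                return "users-" + range;
            }
        }
        // Digits, punctuation and non-ASCII letters end up in a catch-all index.
        return "users-other";
    }

    /** Assumed alternative: spread users over a fixed number of indexes by hash. */
    public static String shardByHash(String username, int shardCount) {
        int bucket = Math.floorMod(username.toLowerCase(Locale.ROOT).hashCode(), shardCount);
        return "users-" + bucket;
    }
}
```

Both indexing and searching would call the same method to decide which index to open, and
the query would still be restricted on the username field within that index.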

/Jimi

mogul | jimi hullegård | system developer | hudiksvallsgatan 4, 113 30 stockholm sweden |
+46 8 506 66 172 | +46 765 27 19 55 | jimi.hullegard@mogul.com | www.mogul.com


> -----Original Message-----
> From: Alexander Aristov [mailto:alexander.aristov@gmail.com]
> Sent: den 19 september 2008 13:43
> To: java-user@lucene.apache.org
> Subject: Re: Using separate index for each user
>
> If you create a field in the index which holds the username, then you can
> create search queries that reject entries which don't belong to the user?
>
> It's much more efficient.
>
> Alexander
>
>
> 2008/9/16 Tobias Larsson Hult <tobias.larsson.hult@findwise.se>
>
> > Hi,
> >
> > We're thinking of using Lucene to integrate search in a backup service
> > application. The background is that we have a bunch of users using a backup
> > service, and we want them to be able to search their own, and only their
> > own, backups.
> >
> > The total amount of data that's being backed up is very large (terabytes
> > in size). Even though the index will probably be smaller due to only
> > indexing relevant fields, it is still too much to incorporate in one index.
> > But since a user will only search in his/her own files we're thinking of
> > creating one index for each user. There will be a lot of indexes of course
> > but each index will not grow to more than a couple of gigabytes at the most.
> >
> > So when a user searches or adds new content to the backup we will open up
> > his/her index and do a search/update in that particular index. That way,
> > each query/update should not be so performance intensive.
> >
> > Does this sound like a reasonable solution?  Of course this means creating
> > a lot of IndexReaders/Writers but I prefer that to searching in a huge index
> > every time when a user only wants to search in a slice of the total index.
> >
> > Best Regards,
> > Tobias Larsson Hult
> >
> >
>
>
> --
> Best Regards
> Alexander Aristov
>
