lucene-java-user mailing list archives

From Yaniv Ben Yosef <yani...@gmail.com>
Subject Implementing filtering based on multiple fields
Date Thu, 07 Jan 2010 20:54:22 GMT
Hi,

I'm very new to Lucene. In fact, I'm at the beginning of an evaluation
phase, trying to figure out whether Lucene is the right fit for my needs.
The project I'm involved in requires something similar to the Google Custom
Search Engine <http://www.google.com/cse/> (CSE). In CSE, each user can
define a set (could be a large set) of websites, and limit the search to
only those websites. So for example, I can create a CSE that searches all
web pages on cnn.com, msnbc.com and nytimes.com only.
I am trying to understand whether and how I can do something similar in
Lucene.
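
To make the requirement concrete, here is a minimal sketch of the behaviour
I'm after, in plain Java with no Lucene calls. The class, the "site" field
and the sample pages are all made up for illustration; it only shows the
semantics (match the query AND belong to one of the user's sites), not how
it should be implemented on a real index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SiteRestrictedSearch {

    // Hypothetical document model: the site a page came from, plus its text.
    public static final class Page {
        public final String site;
        public final String text;

        public Page(String site, String text) {
            this.site = site;
            this.text = text;
        }
    }

    // Returns the pages whose text contains the term AND whose site is in
    // the user's allowed set. A linear scan, for illustration only.
    public static List<Page> search(List<Page> pages, String term, Set<String> allowedSites) {
        List<Page> hits = new ArrayList<Page>();
        String needle = term.toLowerCase();
        for (Page p : pages) {
            if (allowedSites.contains(p.site) && p.text.toLowerCase().contains(needle)) {
                hits.add(p);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Page> pages = Arrays.asList(
            new Page("cnn.com", "Election results are in"),
            new Page("example.org", "Election commentary"),
            new Page("nytimes.com", "Weather report"));

        // One user's custom search engine: only these sites are searched.
        Set<String> cse = new HashSet<String>(
            Arrays.asList("cnn.com", "msnbc.com", "nytimes.com"));

        for (Page hit : search(pages, "election", cse)) {
            System.out.println(hit.site + ": " + hit.text);
        }
    }
}
```

In the real system the allowed set could hold a large number of sites, and
each user would have a different one.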

The FAQ hints at this possibility here
<http://wiki.apache.org/lucene-java/LuceneFAQ#How_can_I_search_over_multiple_fields.3F>,
but it mentions a class that no longer exists in 3.0 (QueryFilter), and says
very little about the suggested options. Also, I'm not sure how well it will
perform in my use case (or even whether it fits at all).
I thought about creating a separate index for each user or CSE. However, my
system should be able to handle tens of thousands of concurrent users. I
haven't done any analysis yet on how this will affect CPU, RAM, I/O and
storage size, but was wondering if any of you experienced Lucene
users/developers think it's a good direction.
If that's not a good idea, what would be a good strategy here?

Any help will be much appreciated,
Yaniv
