Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
Received-SPF: pass (athena.apache.org: local policy)
From: Daniel Noll <daniel@nuix.com>
Organization: Nuix Pty Ltd
To: java-user@lucene.apache.org
Subject: Tricky (maybe) query question
Date: Thu, 6 Dec 2007 11:42:32 +1100
User-Agent: KMail/1.9.6 (enterprise 0.20070907.709405)
MIME-Version: 1.0
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200712061142.32839.daniel@nuix.com>

Hi all.

Suppose you have a text index with a field used for deduplication, and then 
you later add a second field with further information that might also be used 
for deduplication.  We'll call them A and B for the sake of brevity.

If I have only a current text index, then I can use (a:foo AND b:bar) to 
deduplicate.  However, I still want to deduplicate between the older ones 
which don't have B and the new ones which do.

Is there a way I can do a query which will:
  - Match a document if both a:foo and b:bar are matched
  - Match a document if a:foo matches and b is absent, or vice versa.
  - Not match a document if both a:foo and b:foo are absent
  - Not match a document if either a:foo or b:foo are present and do not
    match ?

If not, I suppose I'll have to go with the lowest common denominator approach 
and find out which fields are present in every index.

Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org