Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
Received-SPF: pass (asf.osuosl.org: domain of ananth.t.sarathy@gmail.com
 designates 64.233.184.226 as permitted sender)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws;
        s=beta; d=gmail.com;
        h=received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:references;
        b=VVL0f9eVvLpoJU/NjRSZwHWLSPhuwI3dwBT+tX8eoSg/Ijua0vE0Suy3njtYcqgYfK4IgdKoJAAN5F2ye3hLB+ObDDNLoOee5Cl2gXAWJBjI6g6Mkq2y93nIa7VYd5lD2vxJOTrIlvqcKmSyFtY2GbS1dCpenZB8gHA/icfNwKk=
Message-ID: <ad681e7f0604130833i1cec5119s443b1aca9b476930@mail.gmail.com>
Date: Thu, 13 Apr 2006 11:33:44 -0400
From: "Ananth T. Sarathy" <ananth.t.sarathy@gmail.com>
To: java-user@lucene.apache.org
Subject: Re: Lucene Seaches VS. Relational database Queries
In-Reply-To: 
 <OF1F126187.FB480636-ONC225714E.002440B0-C225714E.00270FD3@il.ibm.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_Part_20721_31536729.1144942424020"
References: <Pine.LNX.4.58.0604111517560.9140@hal.rescomp.berkeley.edu>
	 <OF1F126187.FB480636-ONC225714E.002440B0-C225714E.00270FD3@il.ibm.com>

------=_Part_20721_31536729.1144942424020
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Ok,
 Some of the stuff makes  some sense. I was a little loopy from lack of
sleep and some of these solutions don't really cover my concerns....


Let's take this movie example. If each member of a production Crew can have
multiple titles that come from a lookup table of Distinct Jobs

Titles
Assistant Producer
Producer
Executive Producer
Director
Director Trainee
Stunt Director

In the Database there would be a Assocation Table Linking each Crew member
the titles they had

Crew_Titles
Crew_ID   Title
1             Producer
1

On 4/12/06, Nadav Har'El <NYH@il.ibm.com> wrote:
>
> Chris Hostetter <hossman_lucene@fucit.org> wrote on 12/04/2006 01:41:37
> AM:
> > : them in one field).  One of the problems I see would be with values
> that
> > : over lap (Example, name where one name is Jason Bateman, and one is
> Jason
> > : Bateman Black, and it would be hard to replicate the Discrete Search
> for
> >
> > they way field values are "analyzed" is extremely configurable -- down
> to
> > the individual field level.  Which means that while you can have an
> actor
> > field where you can do loose text searching for "bateman" and get back
> > movies staring "Jason Bateman" and "Jason Bateman Black" (and even Guid=
o
> > Batemans" if you use stemming) you can also have another field using a
> > KeywordAnalyzer such that a record with teh values "Jason Bateman" and
> > "Jack Black" will only be matched if hte user searches for "Jason
> Bateman"
> > or "Jack Black" ... searching for "Bateman Jack" or "Black Jason" will
> not
> > work.
>
> Another possible trick is to have one field, but mark its end with specia=
l
> tokens, say "^" and "$", so that "Jason Bateman" gets indexed as four
> tokens:
>      ^ Jason Bateman $
> Then, if you want to search for the name Jason Bateman and that name only=
,
> just search for the phrase "^ Jason Bateman $" - and only this entry will
> match. (you can also continue to search this field normally)
>
> If you'll think about this, you'll notice that you don't actually need
> the beginning-of-field marker ("^") because it's easy to recognize the
> beginning of a field because the position there is 0. Unfortunately,
> I don't know how to match position 0 using the standard QueryParser,
> but you can do it with the SpanFirstQuery: for example if we index
> Jason Bateman as the three tokens
>      Jason Bateman $
> then we can search for it using something like
>      SpanQuery[] terms =3D {
>            new SpanTermQuery(new Term("actor", "Jason")),
>            new SpanTermQuery(new Term("actor", "Bateman")),
>            new SpanTermQuery(new Term("actor", "$")) };
>      new SpanFirstQuery(new SpanNearQuery(terms, 0, true), 3);
> (or something like that... I didn't test this)
>
>
> --
> Nadav Har'El
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


--
Ananth T Sarathy

------=_Part_20721_31536729.1144942424020--