lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jan <fajer...@informatik.hu-berlin.de>
Subject Re: uncorrect results
Date Wed, 17 Nov 2010 18:19:48 GMT
thats what i figured...i can't find out what i'm doing wrong though ;)

so the query is "experiment" (i know not really a phrase...but the
assignment requested precisely so). The program constructs the following
query

+(AbstractText:"experiment" ArticleTitle:"experiment")

which looks good to me. the results look like this:

Found 95 hits.
1. 19810 phrase occurs 3 times 
2. 587340 phrase occurs once 
...
80. 896889 phrase occurs 0 times 
...
95. 900325 phrase occurs once

so here is the document 896889
PMID
	896889
ArticleTitle
	Estrogen-induced sexual receptivity and localization of 3H-estradiol in
brains of female mice: effects of 5 alpha-reduced androgens, progestins
and cyproterone acetate.
AbstractText
	Sexual receptivity induced in ovariectomized CD-1 mice with chronic
daily administration of estradiol benzoate (E2 B) was blocked by
concurrent administration of the 5 alpha-reduced androgen,
dihydrotestosterone (DHT). Receptivity was restored in these females
with progesterone-, but not with dihydroprogesterone-priming 6 hr prior
to testing. Delaying the DHT injections until 12 hr after the E2 B
injections greatly reduced its inhibitory properties. Receptivity in E2
B-primed females was also blocked by concurrent treatment with
cyproterone acetate and 3 alpha-, but not 3 beta-adrostanediol.
Pretreatment with DHT, or 3 alpha- or 3 beta-androstanediol failed to
consistently affects 3H-estradiol accumulation in crude nuclear and
supernatant fractions from brain and pituitary

so apart from doing something wrong while indexing/analyzing (the text
above is from the xml, but i double checked...it is put in teh index
with these textfragments) or so, the token "experiment" does not even
occur. thats what baffles me.

thanks for the very quick reaction
jan

Am Mittwoch, den 17.11.2010, 12:57 -0500 schrieb Donna L Gresh:
> As it is probably more likely that you're doing something incorrect than 
> that Lucene is reporting incorrect results :), it might help if you 
> reported the exact query that is being submitted to the IndexSearcher, and 
> then showing us the document that was incorrectly returned. My guess is 
> that either looking at the query itself will immediately reveal the 
> problem to you, or that the query in combination with the document and 
> knowledge of which analyzers you are using will reveal the problem-
> 
> Donna 
> 
> 
> Jan <fajerski@informatik.hu-berlin.de> wrote on 11/17/2010 11:47:49 AM:
> 
> > [image removed] 
> > 
> > uncorrect results
> > 
> > Jan 
> > 
> > to:
> > 
> > java-user
> > 
> > 11/17/2010 11:51 AM
> > 
> > Please respond to java-user
> > 
> > Hi,
> > i have an assignment in my Text Analytics class. I am supposed to create
> > an index and search it. The corpus is a PubMed-like XML file. it is
> > possible to query terms (programcall a few terms) and phrases
> > (programcall "a phrase"). 
> > When a phrase is queried the program should answer how often the phrase
> > occured.
> > The problem is, on certain queries the IndexSearcher returns some
> > documents that do not have that particular query in its fields.
> > I'd be delighted if someone could tell me what i am doing wrong.
> > See the source code at my github repo
> > 
> https://github.com/jangingnicht/TextAnalytics2/tree/master/src/textanalytics2/
> 
> > 
> > Thanks in advance
> > jan
> > 
> > PS: I use Lucene 3.0.2 and the OpenJDK Runtime Environment (IcedTea6
> > 1.8.2) on an 64 bit Linux machine.
> > [attachment "signature.asc" deleted by Donna L Gresh/Watson/IBM] 


Mime
View raw message