lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
Subject Re: highlight problem
Date Thu, 05 May 2005 18:50:59 GMT
Hi, Mark,

Please ignore my previous posting. I sent it by accident. 

Sorry for the confusing. The complete code is here:
      Analyzer analyzer = new StandardAnalyzer();
      BufferedReader in = new BufferedReader(new InputStreamReader(;
      String line = in.readLine();

      if (line.length() == -1)

      Query query = QueryParser.parse(line, "contents", analyzer);
      Hits hits =;
      Highlighter highlighter =new Highlighter(new QueryScorer(query));

      String ids = "";
      for (int i = 0; i < hits.length(); i++) {
        Document doc = hits.doc(i);
        String text = doc.get("contents");
        String result = "";

        if(text != null ){
          TokenStream tokenStream = analyzer.tokenStream("contents",
              new StringReader(text));
          // Get 3 best fragments and seperate with a "..."                    
          result = highlighter.getBestFragments(tokenStream,
              text, 3, "...");

The result is as following. I only get three fragments, but the 3rd one contains
too much text and I don't copy all of them here. Just for example:
 market, TunisiaThe uncertainty about resource availability plays a large role
in water <B>management</B>..., and at all different scales of water
<B>management</B>. First, at the irrigation scheme level... <B>management</B>
the collective water resource is of ma jor imp ortance for these irrigation
schemes. The aim is here to understand the global risk taken at the scheme
level, by comparing the actual cropping choices and allocation rules to other
cropping possibilities and potential rules. Given an allocation rule, does a
collective cropping choice correspond to a high or a low risk taking ? In our
opinion, it would have been very difficult to estimate the individual risk
tolerances and, with them, the collective risk tolerance because the farms
diversity appeared to be very high and there are complex informal risk sharing
networks. That is why we did not try to determine the optimal risk taking The
study remains at the scale of the scheme: we consider, for example in El Melalsa
scheme, three global fields put under crops, one with wheat, another with
pepper-bean and the last one with melon. A model calculates the different yields
taking into account the water stresses impact, and hence the global profit, for
a given rainfall and a given allocation rule (see appendix).195.0.1A big risk
taking on El Melalsa irrigation schemeThe El Melalsa irrigation scheme is one of
the two studied schemes. It irrigates 160 ha owned by 54 farmers, using a pump
that delivers a flow of 24 l/s. The cropping pattern is based on the following
rotation: wheat then pepper associated with bean and finally melon. There is no
control over the area put under crops. Besides, there is a water turn but when a
farmer gets his turn, he can irrigate as long as he wants: the water turn length
lasts up to three weeks during spring, when both beans and melon have to be
irrigated. These two facts lead also to an over-cropping situation that can be
viewed as a Nash equilibrium (Faysse, &#65533; 2001). Two allocation rules were
simulated on this scheme (a brief description of the model is given in
appendix): first, the actual one, with a free individual irrigation length, and
then an ex ante allocation rule where the water turn length is fixed and each
crop receives during its water turn in proportion to its needs. By definition, a
"safe" cropping choice is one for which crops are sufficiently irrigated during
median rainfall year and with an irrigation set up when the soil reservoir
becomes lower than 0.85 than the Usable Water Reserve. The calculated cropping
choice is made of 40 ha of wheat, 20 of pepper-bean and 15 of melon. For a risky
cropping choice, we used the actual one on year 98-99, i.e. 60 ha of wheat, 21
of pepper-bean and 27 of melon. The choice between two cropping repartitions and
two allocation rules leads to four scenarios. On figure 9, the left table
presents the scenario description while the right table gives the total
valorization made on the whole scheme for different rainfalls: quinquennial
drought year, quinquennial wet year, the median year and the year 98-99 (which
stands between a median year and a quinquennial dry one). Moreover, the study
made on the field for the 98-99 campaign showed that the total profit made had
not exceeded 50 000 DT7 .water turnyear 98-99 quinquennial dry median
quinquennial wet50000 60000 70000 80000 90000cropping
patternscenariopepperbeanmelonwheatProduction on the schem e (Tunisan
Dinar)unlimited irrigation 0 (r&#65533;el) risky 61 21 27 period unlimited irrigation
period risky 2 61 21 27 daily repartition100000 110000120000
<Much more text>
My search criteria is "joint management", but I get three markups on management. 

Is there anything wrong with the text? 


> As much as you have shown of the example output is
> roughly what I would expect - using the default
> simpleFragmenter you get roughly 100 character sized
> fragments and you have shown 3 fragments sized 97, 100
> and 105 chars long separated by "...".
> > Of course the result is far more than this.
> So are you saying you had even more fragments in the
> "getBestFragments" return string separated by your
> choice of "..." separator?
> I also notice the text contains no markup - have you
> removed that from the example? 
> Cheers
> Mark
> ___________________________________________________________ 
> How much free photo storage do you get? Store your holiday 
> snaps for FREE with Yahoo! Photos
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message