lucene-java-user mailing list archives

From Victor Podberezski <vpodberez...@cms-medios.com>
Subject Re: Help with a fieldcomparator!
Date Sun, 18 Jan 2015 04:26:10 GMT
Erick:

(Sorry, I misspelled your name in my last email.)

I tried a number of solutions, but none of them sorts the documents by the
pattern as I expect.

This is my simplified code:

public class PatternFieldComparatorSource
extends FieldComparatorSource {


private String pattern;
private boolean ascending = false;

public PatternFieldComparatorSource(String pattern, boolean ascending){

this.ascending = ascending;
this.pattern = pattern;
 }

public FieldComparator newComparator(String fieldname, int numHits,
        int sortPos, boolean reversed) throws IOException {

return new PatternFieldComparator(numHits, fieldname);
}



class PatternFieldComparator extends FieldComparator {

    private final int[] values;

    private int[] currentReaderValues;

    private final String field;
    private int bottom;                           // Value of bottom of queue

    PatternFieldComparator(int numHits, String field) {
      values = new int[numHits];
      this.field = field;
    }

    public int compare(int slot1, int slot2) {
      // TODO: there are sneaky non-branch ways to compute
      // -1/+1/0 sign
      // Cannot return values[slot1] - values[slot2] because that
      // may overflow
      final int v1 = values[slot1];
      final int v2 = values[slot2];

      if (v1 > v2) {
        return 1;
      } else if (v1 < v2) {
        return -1;
      } else {
        return 0;
      }
    }

    public int compareBottom(int doc) {
      // TODO: there are sneaky non-branch ways to compute
      // -1/+1/0 sign
      // Cannot return bottom - values[slot2] because that
      // may overflow
      final int v2 = currentReaderValues[doc];
      if (bottom > v2) {
        return 1;
      } else if (bottom < v2) {
        return -1;
      } else {
        return 0;
      }
    }

    public void copy(int slot, int doc) {
      values[slot] = currentReaderValues[doc];
    }

    public void setNextReader(IndexReader reader, int docBase) throws IOException {
      currentReaderValues = FieldCache.DEFAULT.getInts(reader, field,
          new FieldCache.IntParser() {
            public final int parseInt(final String val) {
              return getValueByPattern(val);
            }
          });
    }

    public void setBottom(final int bottom) {
      this.bottom = values[bottom];

    }

    public Comparable value(int slot) {
      return values[slot];
    }

}

private Integer getValueByPattern(String text) {
  // If the pattern is not present, return the max or min possible value
  // (depending on whether the sort is ascending or descending).
  int value = !ascending ? Integer.MAX_VALUE : Integer.MIN_VALUE;
  // If the pattern is present...
  if (text.contains(pattern)) {
    value = Integer.parseInt(...); // extract the value and return it
  }
  return Integer.valueOf(value);
}

}
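
The comments in compare and compareBottom note that subtracting two ints may overflow. That matters for this code because getValueByPattern uses Integer.MAX_VALUE and Integer.MIN_VALUE as sentinels. A minimal standalone sketch of the point follows; it does not use Lucene, and the class and method names (SafeCompare, bySubtraction, safe) are hypothetical, chosen only for illustration.

```java
public class SafeCompare {

    // Comparison by subtraction: the result wraps around when the true
    // difference does not fit in an int, so the sign can be wrong.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Overflow-safe comparison; gives the same sign as the if/else chain
    // used in compare() and compareBottom() above.
    static int safe(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int sentinel = Integer.MAX_VALUE; // value used when the pattern is absent
        int other = -1;

        // MAX_VALUE - (-1) wraps to Integer.MIN_VALUE, a negative number,
        // which would wrongly claim that sentinel < other.
        System.out.println(bySubtraction(sentinel, other));

        // Integer.compare reports a positive value: sentinel > other.
        System.out.println(safe(sentinel, other));
    }
}
```

Integer.compare requires Java 7 or later; on older runtimes the explicit if/else chain in the posted code does the same job.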

My code does not sort correctly, and I can't find an explanation why.

Thanks
Víctor

On Sat, Jan 17, 2015 at 9:04 PM, Erick Erickson <erickerickson@gmail.com>
wrote:

> Ah, OK. H.L. Mencken wrote something like:
> "For every complex problem there is a solution
> that is simple, elegant, and wrong". I specialize in these...
>
> I don't have a good answer for your question then. How
> is what you're trying failing?
>
> Best,
> Erick
>
> On Fri, Jan 16, 2015 at 4:59 PM, Victor Podberezski
> <vpodberezski@cms-medios.com> wrote:
> > Erik, Thanks for your reply.
> >
> > I wrote a simplification of the problem. The values in the sortable field
> > are not only "val1, val2,..."; they can also be "patternX1,
> > patternX2", etc.
> >
> > In that case I need to sort according to different criteria. There are
> > a lot of different patterns, but not too many documents in the result of
> > the query filter.
> > For that reason I think the best way is a custom FieldComparator.
> >
> > Thanks
> > Víctor Podberezski
> >
> > On Fri, Jan 16, 2015 at 9:31 PM, Erick Erickson <erickerickson@gmail.com>
> > wrote:
> >
> >> Personally I would do this on the ingestion side with a new field.
> >> That is, analyze the input field when you were indexing the doc,
> >> extract the min value from any numbers, and put that in a
> >> new field. Then it's simply sorting by the new field. This is likely
> >> to be much more performant than reprocessing this at query
> >> time in a comparator.
> >>
> >> FWIW,
> >> Erick
> >>
> >> On Fri, Jan 16, 2015 at 4:00 PM, Victor Podberezski
> >> <vpodberezski@cms-medios.com> wrote:
> >> > I need a hand with a custom comparator.
> >> >
> >> > I have a field filled with words separated by spaces. Some words have
> >> > numbers inside.
> >> >
> >> > I need to extract those numbers and sort the documents by this
> >> > number. If there is more than one number, I need the lowest.
> >> >
> >> > For example:
> >> >
> >> > doc1 "val2 aaaa val3" --> 2, 3 --> 2
> >> > doc2 "val5 aaaa val1" --> 5, 1 --> 1
> >> > doc3 "val7 bbbbb val5" --> 7, 5 --> 5
> >> >
> >> > the sorted results have to be:
> >> >
> >> > doc2
> >> > doc1
> >> > doc3
> >> >
> >> > how can I achieve this?
> >> >
> >> > I have trouble migrating a working solution from Lucene 2.4 to Lucene
> >> > 3.9 or higher (migration from ScoreDocComparator to FieldComparator).
> >> >
> >> > I tried this:
> >> >
> >> >     public void setNextReader(IndexReader reader, int docBase) throws IOException {
> >> >       currentReaderValues = FieldCache.DEFAULT.getInts(reader, field,
> >> >           new FieldCache.IntParser() {
> >> >             public final int parseInt(final String val) {
> >> >               return extractNumber(val);
> >> >             }
> >> >           });
> >> >
> >> > and the rest is the same as IntComparator,
> >> > but this is not working.
> >> >
> >> > Does anybody have an idea of how to resolve this problem?
> >> > Thanks,
> >> >
> >> > Víctor Podberezski
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
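
The extraction step described in the original question (take the lowest number embedded in the field text, then sort by it) is needed by both approaches discussed in this thread, whether it runs at index time or inside a comparator. A minimal sketch in plain Java, without Lucene, might look like the following. The names MinNumberSketch, minNumber, and sortIds are hypothetical. The Integer.MAX_VALUE fallback for text without digits is an assumption. The sketch also assumes it sees the whole field text, which may not match what a FieldCache parser receives if the field is tokenized; the thread does not say how the field is indexed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MinNumberSketch {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the smallest number embedded in the text.
    // Assumption: text without any digits gets Integer.MAX_VALUE so that it
    // sorts last in ascending order. Digit runs are assumed to fit in an int.
    static int minNumber(String text) {
        int min = Integer.MAX_VALUE;
        Matcher m = DIGITS.matcher(text);
        while (m.find()) {
            min = Math.min(min, Integer.parseInt(m.group()));
        }
        return min;
    }

    // Returns the document ids ordered by the lowest number in their text.
    static List<String> sortIds(Map<String, String> docs) {
        List<String> ids = new ArrayList<>(docs.keySet());
        ids.sort(Comparator.comparingInt((String id) -> minNumber(docs.get(id))));
        return ids;
    }

    public static void main(String[] args) {
        // The three example documents from the original question.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1", "val2 aaaa val3");
        docs.put("doc2", "val5 aaaa val1");
        docs.put("doc3", "val7 bbbbb val5");

        System.out.println(sortIds(docs)); // prints [doc2, doc1, doc3]
    }
}
```

With the index-time approach suggested earlier in the thread, the result of minNumber would be stored in a separate numeric field and the query would sort on that field directly.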
