From: Victor Podberezski
To: java-user@lucene.apache.org
Subject: Re: Help with a fieldcomparator!
Date: Sun, 18 Jan 2015 01:26:10 -0300

Erick:
(sorry, I misspelled your name in my last email)

I tried a bunch of solutions, but none worked as I expected. Basically, none
of them sorts the documents by the pattern the way I expect.

This is my simplified code:

    import java.io.IOException;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.FieldCache;
    import org.apache.lucene.search.FieldComparator;
    import org.apache.lucene.search.FieldComparatorSource;

    public class PatternFieldComparatorSource extends FieldComparatorSource {

        private String pattern;
        private boolean ascending = false;

        public PatternFieldComparatorSource(String pattern, boolean ascending) {
            this.ascending = ascending;
            this.pattern = pattern;
        }

        public FieldComparator newComparator(String fieldname, int numHits,
                int sortPos, boolean reversed) throws IOException {
            return new PatternFieldComparator(numHits, fieldname);
        }

        class PatternFieldComparator extends FieldComparator {
            private final int[] values;        // one value per queue slot
            private int[] currentReaderValues; // values for the current segment reader
            private final String field;
            private int bottom;                // value of bottom of queue

            PatternFieldComparator(int numHits, String field) {
                values = new int[numHits];
                this.field = field;
            }

            public int compare(int slot1, int slot2) {
                // Cannot return values[slot1] - values[slot2] because that
                // may overflow
                final int v1 = values[slot1];
                final int v2 = values[slot2];
                if (v1 > v2) {
                    return 1;
                } else if (v1 < v2) {
                    return -1;
                } else {
                    return 0;
                }
            }

            public int compareBottom(int doc) {
                // Cannot return bottom - currentReaderValues[doc] because that
                // may overflow
                final int v2 = currentReaderValues[doc];
                if (bottom > v2) {
                    return 1;
                } else if (bottom < v2) {
                    return -1;
                } else {
                    return 0;
                }
            }

            public void copy(int slot, int doc) {
                values[slot] = currentReaderValues[doc];
            }

            public void setNextReader(IndexReader reader, int docBase)
                    throws IOException {
                currentReaderValues = FieldCache.DEFAULT.getInts(reader, field,
                        new FieldCache.IntParser() {
                            public final int parseInt(final String val) {
                                return getValueByPattern(val);
                            }
                        });
            }

            public void setBottom(final int bottom) {
                this.bottom = values[bottom];
            }

            public Comparable value(int slot) {
                return values[slot];
            }
        }

        private Integer getValueByPattern(String text) {
            // If the pattern is not present, I return the max or min value
            // possible (depending on whether the sort is ascending or descending).
            int value = !ascending ? Integer.MAX_VALUE : Integer.MIN_VALUE;

            // If the pattern is present...
            if (text.contains(pattern)) {
                value = Integer.parseInt(...); // extract the value and return
            }
            return Integer.valueOf(value);
        }
    }

My code does not sort correctly, and I can't find an explanation why.

Thanks
Víctor

On Sat, Jan 17, 2015 at 9:04 PM, Erick Erickson wrote:

> Ah, OK. H.L. Mencken wrote something like:
> "For every complex problem there is a solution
> that is simple, elegant, and wrong". I specialize in these...
>
> I don't have a good answer for your question then. How
> is what you're trying failing?
>
> Best,
> Erick
>
> On Fri, Jan 16, 2015 at 4:59 PM, Victor Podberezski wrote:
> > Erik, Thanks for your reply.
> >
> > I wrote a simplification of the problem. The values in the field that
> > can be sorted are not only "val1, val2, ..."; they can also be
> > "patternX1, patternX2", etc.
> >
> > In that case I need to sort according to different criteria. There are
> > a lot of different patterns, but not too many documents result from the
> > query filter.
> > For that reason I think the best way is a custom FieldComparator.
> >
> > Thanks
> > Víctor Podberezski
> >
> > On Fri, Jan 16, 2015 at 9:31 PM, Erick Erickson wrote:
> >
> >> Personally I would do this on the ingestion side with a new field.
> >> That is, analyze the input field when you were indexing the doc,
> >> extract the min value from any numbers, and put that in a
> >> new field. Then it's simply sorting by the new field. This is likely
> >> to be much more performant than reprocessing this at query
> >> time in a comparator.
> >>
> >> FWIW,
> >> Erick
> >>
> >> On Fri, Jan 16, 2015 at 4:00 PM, Victor Podberezski wrote:
> >> > I need a hand with a custom comparator.
> >> >
> >> > I have a field filled with words separated by spaces. Some words have
> >> > numbers inside.
> >> >
> >> > I need to extract those numbers and sort the documents by this number.
> >> > I need the lowest one if there is more than one number.
> >> >
> >> > For example:
> >> >
> >> > doc1 "val2 aaaa val3"  --> 2, 3 --> 2
> >> > doc2 "val5 aaaa val1"  --> 5, 1 --> 1
> >> > doc3 "val7 bbbbb val5" --> 7, 5 --> 5
> >> >
> >> > The sorted results have to be:
> >> >
> >> > doc2
> >> > doc1
> >> > doc3
> >> >
> >> > How can I achieve this?
> >> >
> >> > I am having trouble migrating a working solution from Lucene 2.4 to
> >> > Lucene 3.9 or higher (migration from ScoreDocComparator to
> >> > FieldComparator).
> >> >
> >> > I tried this:
> >> >
> >> >     public void setNextReader(IndexReader reader, int docBase)
> >> >             throws IOException {
> >> >         currentReaderValues = FieldCache.DEFAULT.getInts(reader, field,
> >> >                 new FieldCache.IntParser() {
> >> >                     public final int parseInt(final String val) {
> >> >                         return extractNumber(val);
> >> >                     }
> >> >                 });
> >> >     }
> >> >
> >> > The rest is the same as IntComparator, but this is not working.
> >> >
> >> > Does anybody have an idea of how to resolve this problem?
> >> > Thanks,
> >> >
> >> > Víctor Podberezski
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
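
For reference, the extraction rule from the original question (take every number embedded in the field's words and keep the lowest) can be sketched in plain Java, independent of any Lucene API. This is only a minimal sketch under stated assumptions: the class `MinNumberSketch` and the methods `minNumber` and `sortByMinNumber` are hypothetical names that do not appear in the thread, and "number" is assumed to mean a run of ASCII digits. With the index-time approach suggested in the thread, a value like this would presumably be computed once per document and stored in a separate field to sort on.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Lucene): lowest number embedded in a field value. */
final class MinNumberSketch {

    // Assumption: a "number" is one or more ASCII digits inside a word, e.g. the 2 in "val2".
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    /** Returns the smallest number found in text, or missingValue when there is none. */
    static int minNumber(String text, int missingValue) {
        Matcher m = DIGITS.matcher(text);
        boolean found = false;
        int min = Integer.MAX_VALUE;
        while (m.find()) {
            try {
                min = Math.min(min, Integer.parseInt(m.group()));
                found = true;
            } catch (NumberFormatException tooLarge) {
                // Digit runs that do not fit in an int are skipped in this sketch.
            }
        }
        return found ? min : missingValue;
    }

    /** Returns the document ids ordered by ascending minimum number. */
    static List<String> sortByMinNumber(Map<String, String> docs) {
        List<String> ids = new ArrayList<>(docs.keySet());
        // Documents without any number get Integer.MAX_VALUE and therefore sort last.
        ids.sort(Comparator.comparingInt(
                (String id) -> minNumber(docs.get(id), Integer.MAX_VALUE)));
        return ids;
    }

    public static void main(String[] args) {
        // The three example documents from the original question.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1", "val2 aaaa val3");
        docs.put("doc2", "val5 aaaa val1");
        docs.put("doc3", "val7 bbbbb val5");
        System.out.println(sortByMinNumber(docs)); // prints [doc2, doc1, doc3]
    }
}
```

The `missingValue` parameter is a design choice of this sketch: the caller decides whether documents without a number sort first or last, instead of hard-coding that decision in the extraction step.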