From: Vadim Gindin
Date: Fri, 24 Aug 2018 13:12:34 +0500
Subject: Re: Question about BytesRef and BinaryDocValues
To: java-user@lucene.apache.org

Kevin, the sequence is as follows: get the terms for the field, get the postings for a term, and then get the payload from the postings.
Read a little about the inverted index structure and it will become clearer to you.

Your Query creates a Weight, which must create a scorer in the method scorer(context). The scheme could be the following:

    // field, t (the Term being looked up) and CustomFieldScorer are your own members/classes
    public Scorer scorer(LeafReaderContext context) throws IOException {
        Terms fieldTerms = context.reader().terms(field);
        if (fieldTerms == null) {
            return null;  // this segment has no terms for the field
        }
        TermsEnum te = fieldTerms.iterator();
        if (te.seekExact(t.bytes())) {
            // PostingsEnum.ALL requests positions, offsets and payloads
            PostingsEnum postingsEnum = te.postings(null, PostingsEnum.ALL);
            return new CustomFieldScorer(postingsEnum);
        }
        return null;  // the term does not occur in this segment
    }

After that you get the payload in CustomFieldScorer.score(), once the postings enum is positioned on the current document, in the following way:

    postingsEnum.nextPosition();
    BytesRef payload = postingsEnum.getPayload();

Regards,
Vadim Gindin

On Fri, Aug 24, 2018 at 10:16 AM Kevin Manuel wrote:

> Hi Vadim,
>
> Thank you so much for your reply. I think you were right.
>
> So if a field is 'analyzed' how can I get both terms 'hey' and 'tom'?
>
> Thanks,
> Kevin
>
> On Thu, Aug 23, 2018, 20:26 Vadim Gindin wrote:
>
> > Hi Kevin!
> >
> > I think that your field is "analyzed" and so your field value is divided
> > into 2 terms, "hey" and "tom". So a docvalue is written for each of them.
> >
> > Regards
> > Vadim Gindin
> >
> > On Fri, Aug 24, 2018, 5:19 Kevin Manuel wrote:
> >
> > > Hi,
> > >
> > > I'm using lucene version 4.3.1 and I've implemented a custom score
> > > query. I'm trying to read the value for a field from the field cache.
> > > It's a text field so I'm using getTerms which returns a binarydocvalues
> > > object.
> > >
> > > However on trying to get the bytes ref object for a document and
> > > converting it to a string using utf8ToString I think characters after a
> > > whitespace are not being returned in the string. For instance if the
> > > field has 'hey tom', the string only returns 'hey'.
> > >
> > > I tried this with version 4.10.0 too and I see the same thing. I was
> > > wondering if there's something wrong with the way I'm accessing it or
> > > it was an issue in these versions.
> > >
> > > Thanks,
> > > Kevin
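
As a rough illustration of the lookup order described in the reply (terms for a field, postings for a term, payload from a posting), and of why an analyzed value such as 'hey tom' ends up as two separate terms, here is a toy model in plain Java. It is only a sketch: ToyInvertedIndex, Posting, addDocument, postings and payloadOf are hypothetical names, the whitespace "analyzer" and the "pos=N" payload are assumptions made for the example, and none of this is Lucene's API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical classes, NOT Lucene's API.
class ToyInvertedIndex {

    // One occurrence of a term: the document, the position, and a payload.
    static final class Posting {
        final int docId;
        final int position;
        final byte[] payload;

        Posting(int docId, int position, byte[] payload) {
            this.docId = docId;
            this.position = position;
            this.payload = payload;
        }
    }

    // Terms dictionary for a single field: term text -> postings, sorted by term.
    private final Map<String, List<Posting>> terms = new TreeMap<>();

    // "Analyzes" the text by lowercasing and splitting on whitespace,
    // so every token becomes its own term with its own postings.
    void addDocument(int docId, String text) {
        String[] tokens = text.trim().toLowerCase(Locale.ROOT).split("\\s+");
        int position = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // empty input produces one empty token; skip it
            }
            // Assumed payload for the example: the token position as text.
            byte[] payload = ("pos=" + position).getBytes(StandardCharsets.UTF_8);
            terms.computeIfAbsent(token, k -> new ArrayList<>())
                 .add(new Posting(docId, position, payload));
            position++;
        }
    }

    // Step 1: the terms of the field, in sorted order.
    List<String> termsInOrder() {
        return new ArrayList<>(terms.keySet());
    }

    // Step 2: the postings for one term (empty if the term is absent).
    List<Posting> postings(String term) {
        return terms.getOrDefault(term, new ArrayList<>());
    }

    // Step 3: the payload of the first occurrence of the term in the document,
    // decoded as UTF-8, or null if the term does not occur there.
    String payloadOf(String term, int docId) {
        for (Posting p : postings(term)) {
            if (p.docId == docId) {
                return new String(p.payload, StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument(0, "hey tom");
        System.out.println(index.termsInOrder());      // the two separate terms
        System.out.println(index.payloadOf("tom", 0)); // payload stored with 'tom'
    }
}
```

In this model the original string 'hey tom' is never stored as a whole; only the per-term entries exist, which mirrors why reading a single term back yields 'hey' rather than the full field value.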