From: 小鱼儿
Date: Mon, 6 Jan 2020 12:33:12 +0800
Subject: Needs advice on auto-keyword-correction mode custom query
To: java-user@lucene.apache.org

Hi everybody,

I want to implement a custom query with an "auto-keyword-correction" mode. Suppose a user enters a keyword query A, but because of a typo or some other reason A should have been B: A is not a valid term in Lucene's index, while B is. (I'm not considering NLP in a high-dimensional semantic space, which is out of scope here.)

I could run two queries to do this, but that is too costly. What I need is an "early-termination" mode:

(1) keyword A hits a non-empty DocIdSet, so B is never queried; or
(2) keyword A's DocIdSet is empty, and B is then matched instead.

That is, "A OR B" with short-circuit evaluation, as in C/C++. However, I notice that the SHOULD relationship in Lucene's BooleanQuery does not solve this, since it evaluates both clauses. Perhaps I need to implement another custom query class?

By the way, how can Lucene's Query API be made higher-order composable? Lucene's "LeafContext" concept really confuses me...
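
To make the semantics of (1) and (2) concrete, here is a minimal sketch in plain Java over a toy in-memory index. It is not Lucene's API: the class FallbackSearch and its methods add, hasTerm, search and searchWithFallback are hypothetical names invented for illustration. The idea it shows is to do a cheap term-existence check first, so that only one of the two terms ever has its postings read. In Lucene the corresponding check would presumably be a term-dictionary lookup such as IndexReader.docFreq(Term), but that mapping is an assumption and is not shown here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy inverted index (term -> doc IDs). Illustration only, not Lucene's API. */
final class FallbackSearch {
    private final Map<String, int[]> postings = new HashMap<>();

    /** Counts how many postings lists were actually read. */
    int postingsReads = 0;

    /** Registers a term together with the IDs of the documents containing it. */
    void add(String term, int... docIds) {
        postings.put(term, docIds);
    }

    /** Cheap existence check: looks only at the term dictionary. */
    boolean hasTerm(String term) {
        return postings.containsKey(term);
    }

    /** Reads the postings list for one term; empty array if the term is unknown. */
    int[] search(String term) {
        postingsReads++;
        int[] docs = postings.get(term);
        return docs == null ? new int[0] : docs.clone();
    }

    /**
     * Short-circuit "A, else B": if the primary term exists, only it is
     * searched; otherwise only the fallback term is searched.
     */
    int[] searchWithFallback(String primary, String fallback) {
        if (hasTerm(primary)) {
            return search(primary);
        }
        return search(fallback);
    }

    public static void main(String[] args) {
        FallbackSearch index = new FallbackSearch();
        index.add("lucene", 1, 4, 7);
        index.add("search", 2, 4);

        // "lucen" is a typo and not in the index, so the fallback is used.
        System.out.println(Arrays.toString(index.searchWithFallback("lucen", "lucene")));
        // "search" exists, so the fallback term is never touched.
        System.out.println(Arrays.toString(index.searchWithFallback("search", "lucene")));
    }
}
```

In this toy model each call reads exactly one postings list, which is the difference from a SHOULD-style disjunction that would consult both terms.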