Date: Wed, 14 Apr 2010 13:10:22 -0300
Subject: Re: Removing terms in the Index
From: Railan Xisto
To: java-user@lucene.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0016e6d77dac807ebd0484349d61

--0016e6d77dac807ebd0484349d61
Content-Type: text/plain; charset=ISO-8859-1

Actually, doc1, which holds the terms to be searched, contains two entries:
"Lucene in Action" and "Lucene". When I pass "Lucene in Action", I want the
search to return the matching result and then remove that match, so that it is
not found again when I pass only the term "Lucene". In short, the term
"Lucene" should not find the phrase "Lucene in Action", because the whole
phrase was already matched earlier. This is the idea of n-grams (complete
phrases) versus unigrams (isolated words).

Does that make sense?

2010/4/13 Shai Erera

> I ran your code. Since I don't have the queries file (Docs/documento.txt), I
> set this line instead:
>
> String termos = "\"Lucene in Action\"";
>
> When I set it to \"Lucene\", both documents are found. When I set it to
> \"Lucene in Action\" only the first document is found. Seems correct to me.
>
> Can you please explain this:
> "I pass the word "Lucene in Action", it find and
> remove that term of phrase in the Index"
>
> --> what do you mean "find and remove"?
>
> Shai
>
> On Mon, Apr 12, 2010 at 8:49 PM, Railan Xisto wrote:
>
> > And the main objective: when I pass the phrase "Lucene in Action", it
> > should find and remove that phrase in the index, so that when I pass the
> > 2nd term ("Lucene"), it does not find that phrase anymore, since it has
> > already been found by "Lucene in Action".
> >
> > 2010/4/12 Railan Xisto
> >
> > > OK. There is a piece of code attached. As I already said, when I pass
> > > the term "Lucene in Action", I want it to find only the 1st sentence.
> > >
> > > 2010/4/10 Shai Erera
> > >
> > >> Hi. I'm not sure I understand what you searched for. When you search
> > >> for "Lucene in action", do you search it with the quotes or not? If
> > >> with the quotes, then I don't understand how the 2nd doc is found.
> > >>
> > >> Do you perhaps have a test code you can share w/ us? It can be a short
> > >> and simple main which creates an index w/ some documents and then
> > >> searches them.
> > >>
> > >> Shai
> > >>
> > >> On Saturday, April 10, 2010, Fotos fotos wrote:
> > >> > Hello!
> > >> > I am a beginner with Lucene. I need to do the following:
> > >> >
> > >> > I have a text file with the following terms:
> > >> >
> > >> > "Lucene in action"
> > >> > "Lucene"
> > >> >
> > >> > and a file with the following sentences:
> > >> >
> > >> > 1 - "Lucene in action now."
> > >> > 2 - "Lucene for Dummies"
> > >> > 3 - "Managing Gigabytes"
> > >> >
> > >> > I need to search the sentences of doc2 for the terms of doc1.
> > >> >
> > >> > But when searching for the n-gram "Lucene in Action", it also finds
> > >> > the 2nd sentence.
> > >> >
> > >> > In this case, with term 1 ("Lucene in Action") I want to match only
> > >> > the first sentence and then remove the term from the index, so that
> > >> > it is not found when I pass term 2 ("Lucene").
> > >> >
> > >> > Railan Xisto
> > >> > Web Developer

--0016e6d77dac807ebd0484349d61--
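
For illustration only, the behavior asked for in this thread (try the longer phrase first, then keep shorter terms from matching text that a longer phrase already consumed) can be sketched in plain Java. This is not Lucene code and does not touch an index; the class name LongestTermFirst and the method match are hypothetical, and "removing" a match here just means blanking it in an in-memory copy of the sentence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: plain string matching only.
class LongestTermFirst {

    // For each term, returns the 1-based numbers of the sentences it matched.
    // Terms with more words are tried first; matched text is blanked out so
    // that a shorter term cannot match inside an already-matched phrase.
    static Map<String, List<Integer>> match(List<String> terms, List<String> sentences) {
        List<String> ordered = new ArrayList<>(terms);
        ordered.sort(Comparator
                .comparingInt((String t) -> t.trim().split("\\s+").length)
                .reversed());

        // Working copies; the caller's sentences are left untouched.
        List<String> remaining = new ArrayList<>(sentences);
        Map<String, List<Integer>> hits = new LinkedHashMap<>();

        for (String term : ordered) {
            // Whole-word, case-insensitive match of the literal term.
            Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                    Pattern.CASE_INSENSITIVE);
            List<Integer> found = new ArrayList<>();
            for (int i = 0; i < remaining.size(); i++) {
                Matcher m = p.matcher(remaining.get(i));
                if (m.find()) {
                    found.add(i + 1);
                    // Consume the matched text for later, shorter terms.
                    remaining.set(i, m.replaceAll(" "));
                }
            }
            hits.put(term, found);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("Lucene in action", "Lucene");
        List<String> sentences = Arrays.asList(
                "Lucene in action now.", "Lucene for Dummies", "Managing Gigabytes");
        // Prints {Lucene in action=[1], Lucene=[2]}
        System.out.println(match(terms, sentences));
    }
}
```

With the terms and sentences from the original post, "Lucene in action" matches only sentence 1, and "Lucene" then matches only sentence 2. In an actual Lucene setup the same ordering idea would have to be applied around the phrase and term queries, which this sketch does not attempt.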