From: "Jack Krupansky"
Date: Sat, 3 Aug 2013 09:55:33 -0400
Subject: Re: How to Index each file and then each Line for Complete Phrase Match. Sample Data shown.

Why not start with something simple? For example, index each log line as a tokenized text field and then run a PhraseQuery against that text field. Is there something else you need beyond that?

-- Jack Krupansky

-----Original Message-----
From: Ankit Murarka
Sent: Saturday, August 03, 2013 3:22 AM
To: java-user@lucene.apache.org
Subject: How to Index each file and then each Line for Complete Phrase Match. Sample Data shown.

Hello All,

I have this mentioned in the log file.
Till now I am indexing the complete directory containing files which contain data like this:

7/20/2013 7:45 | *package execution happening-1* | FATAL | *check request has been sent for instance* | Ip:Port | *EXCEPTION*
7/20/2013 7:45 | *This is not working perfectly* | DEBUG | *check request for instance being received is status=200* | Ip:Port | *EXCEPTION*
7/20/2013 7:45 | *Encountering a constant error.* | DEBUG | *response is not proper.Expecting some more information on this detail.* | Ip:Port | *EXCEPTION*
7/20/2013 7:45 | *This needs urgent attention* | FATAL | *I am still trying to ensure it is running perfectly. Encountering some issues.* | Ip:Port | *EXCEPTION*
7/20/2013 8:01 | *Job is running fine.* | INFO | *************************************************************************\
*Exception Occured in ClassFactory* | *Function() java.nullPointerException: Value is null* | *Should not be null*

Now I need to index each line of the file to implement complete phrase search. I intend to store the phrases in the index and then use the SpellChecker API to suggest similar phrases.

To implement complete phrase search, I reckon I need to index each line and store the phrases. Phrases in the table above are highlighted in bold (shown here between asterisks).

So, if I am able to index these lines and store these phrases in the index, then when a user searches for "package executing", Lucene would be able to provide "package execution happening-1" as a valid suggestion.

These columns do not have names, and hence I cannot index based on column name. Also, as shown in the table above, the first column may contain a date/time or a phrase in itself (shown in the last row).

Please suggest how this is possible using Lucene and its API. The Javadoc does not seem to guide me anywhere for this case.
--
Regards
Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with what lies within us"
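
To make the suggestion in the reply above concrete (tokenize each log line, then match a phrase against it), here is a minimal sketch in plain Java. It is a toy stand-in and not Lucene's API: the class name LinePhraseSearch and its methods are hypothetical, the tokenizer is a simple assumption (lower-case, split on anything that is not a letter or digit), and the sample lines are abridged from the data in the original message. In Lucene itself, the analyzer and PhraseQuery would do this work against an index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Toy illustration of per-line tokenization plus phrase matching; not Lucene. */
final class LinePhraseSearch {

    /** Lower-cases a line and splits it on anything that is not a letter or digit. */
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        for (String t : line.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** True if the phrase tokens occur consecutively and in order within the line tokens. */
    static boolean containsPhrase(List<String> lineTokens, List<String> phraseTokens) {
        if (phraseTokens.isEmpty()) {
            return false;
        }
        int n = phraseTokens.size();
        for (int start = 0; start + n <= lineTokens.size(); start++) {
            if (lineTokens.subList(start, start + n).equals(phraseTokens)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the 0-based positions of the lines that contain the phrase. */
    static List<Integer> search(List<String> lines, String phrase) {
        List<String> phraseTokens = tokenize(phrase);
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (containsPhrase(tokenize(lines.get(i)), phraseTokens)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample lines abridged from the log data quoted in the message.
        List<String> lines = Arrays.asList(
            "7/20/2013 7:45 package execution happening-1 FATAL check request has been sent for instance",
            "7/20/2013 7:45 This is not working perfectly DEBUG check request for instance being received is status=200",
            "7/20/2013 8:01 Job is running fine. INFO");

        System.out.println(search(lines, "package execution"));  // [0]
        System.out.println(search(lines, "check request"));      // [0, 1]
        System.out.println(search(lines, "package executing"));  // []
    }
}
```

Note that the last query finds nothing: an exact phrase match does not cover "package executing" versus "package execution happening-1", which is the gap the original question hopes to close with suggestions.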