Date: Thu, 24 Jan 2013 22:06:45 +0900
Subject: How do I best store my IRC log data in lucene indexes?
From: crocket
To: java-user@lucene.apache.org

I have three kinds of records that I want to store, search, and retrieve.
They are logged IRC messages.

NICK
  time=the number of seconds since the epoch (1970-01-01 00:00:00 UTC)
  network=
  me=0 or 1
  old=
  new=

KICKED
  time=the number of seconds since the epoch (1970-01-01 00:00:00 UTC)
  network=
  chan=
  msg=
  kicker=
  mynick=

MSG
  time=the number of seconds since the epoch (1970-01-01 00:00:00 UTC)
  network=
  chan=
  msg=
  me=0 or 1
  nick=

Below are my ideas for the IRC log search web UI.

[] Main UI : network("", freenode, ...) | channel("", ...) | nick | message
  1) network and channel are dropdown boxes. nick and message are text boxes.
  2) duration, network, and nick can be applied to every record type.
  3) channel and message apply only to KICKED and MSG.

[] Facets
  1) duration(1 day, 1 week, 1 month, 1 year, all) <-- just like google search tools
  2) ...

[] Category search(categories registered as facets)
  1) network
  2) channel

Is it better to store NICK, KICKED, and MSG in one index directory, or to
store them in separate index directories?

Are there other things that I should know or consider?
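
To make the three record layouts above concrete, here is a minimal sketch in plain Java of how each record might be flattened into name/value fields before indexing. It uses no Lucene calls. The class name IrcLogFields, its method names, and the extra "type" field are assumptions made for illustration only; the "type" field is there so the three kinds of record could be told apart if they ended up in one index.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: flattens each IRC log record into name/value fields.
// The "type" field is an assumption, added so that NICK, KICKED, and MSG
// records can be distinguished if they share one index.
final class IrcLogFields {
    private IrcLogFields() {}

    // Fields common to every record type: type, time (epoch seconds), network.
    private static Map<String, String> base(String type, long time, String network) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("type", type);
        fields.put("time", Long.toString(time));
        fields.put("network", network);
        return fields;
    }

    // NICK: a nickname change; "me" is stored as "0" or "1".
    static Map<String, String> nick(long time, String network, boolean me,
                                    String oldNick, String newNick) {
        Map<String, String> fields = base("NICK", time, network);
        fields.put("me", me ? "1" : "0");
        fields.put("old", oldNick);
        fields.put("new", newNick);
        return fields;
    }

    // KICKED: the logging user was kicked from a channel.
    static Map<String, String> kicked(long time, String network, String chan,
                                      String msg, String kicker, String myNick) {
        Map<String, String> fields = base("KICKED", time, network);
        fields.put("chan", chan);
        fields.put("msg", msg);
        fields.put("kicker", kicker);
        fields.put("mynick", myNick);
        return fields;
    }

    // MSG: a channel message; "me" is stored as "0" or "1".
    static Map<String, String> msg(long time, String network, String chan,
                                   String msg, boolean me, String nick) {
        Map<String, String> fields = base("MSG", time, network);
        fields.put("chan", chan);
        fields.put("msg", msg);
        fields.put("me", me ? "1" : "0");
        fields.put("nick", nick);
        return fields;
    }
}
```

Each returned map corresponds to one log entry. Under this layout a search restricted to KICKED and MSG would filter on the assumed "type" field, while time and network are present on every record.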