From: "Thomas J. Buhr"
To: java-user@lucene.apache.org
Subject: Dynamic Indexing?
Date: Thu, 12 Mar 2009 01:17:37 -0500

Lucene,

From what I have read on your website, indexing does seem like a useful thing. I'm considering the possible use of Lucene in a company project and have a few research questions.

What I'm considering is using Lucene as a backend data store for a graphic editor. The typical usage examples given on your website usually involve indexing data into an index from some file and then searching the index.

What I would like to do is create the data in a text editor in the UI and then store the text data in an index as it is being created/typed. Later, I would search for the data and display it as text characters again, back in the text editor, and edit it further.

The kind of text I'm creating and editing is unique. Each letter of a word can have multiple character versions. Below is a sample of this, where there are three possible letters for the second letter position: an "e", an "a", or an "i".

    1 2 3 4 5
    H e l l o
      a
      i

I need to store all these characters for the five positions they occupy as string values of one field in an index document. Then I also need two more document fields for storing style (bold, italic...) data and also color data for each of the five positions. All three fields (text chars, style info, color info) need to be positionally aligned.

Some questions on data formatting and positional selection in my scenario:

1 - Is it possible to store multiple chars for one position? How do I specify the format? Can I use a delimited string list like x, (x, x, x), x, x, x ?

2 - My text editor has a method that tells me which letter position in a string of words I clicked on with the mouse. Can I then create a query and select the data for all three fields at the given letter position? In my view of indexing, fields are like rows of values, so this would be like picking out a complete column across the three fields (the values intersecting the three rows at the given position). Is this easy or difficult to do?

More questions based on my simple data creation and editing scenario:

3 - If I type a new letter into a given word in the editor, can I add/insert its data at the right position in all three fields of data used to describe/tag my letters? Do I have to create a whole new document and re-index it, or is the document flexible enough to allow in-place dynamic editing? What if I remove/delete a letter?

4 - If I change a letter or its style or color tags, can these edits be dynamically updated at the specific row and column intersection/cell? Or do I need to re-index the whole document again?

5 - What does the term "local alignment" mean in search engine parlance? Is this referring to data value positions across fields (rows) of data?

6 - Has anyone ever used an index as a "Local History" system for undo/redo operations?
Would this be feasible?

7 - Is there a standard way to export/write documents out to a file format like JSON or XML?

8 - The Jackpot Plugin for the NetBeans IDE is a useful refactoring tool. If Lucene were used as the model for Java code or other data, could a refactoring system like Jackpot be built to operate on data in a Lucene index? Would the performance of Lucene be good enough to be used for in-place dynamic editing/indexing?

Thanks, hope this can work...

Thom
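
P.S. To make the layout behind questions 1 and 2 concrete, here is a minimal sketch in plain Java of what I mean by three positionally aligned fields and by picking out a "column" at a clicked letter position. It makes no Lucene calls at all. The class name AlignedText, the space and | delimiters, and the sample style and color values are assumptions made up for illustration, not an existing format.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory model of one document with three aligned fields. */
final class AlignedText {
    // One entry per letter position. Within the chars field, alternative
    // letters for the same position are separated by '|' (an assumed format).
    final String[] chars;
    final String[] styles;
    final String[] colors;

    AlignedText(String chars, String styles, String colors) {
        this.chars = chars.split(" ");
        this.styles = styles.split(" ");
        this.colors = colors.split(" ");
        // All three fields must describe the same number of positions.
        if (this.chars.length != this.styles.length
                || this.chars.length != this.colors.length) {
            throw new IllegalArgumentException("fields are not positionally aligned");
        }
    }

    /** All letter alternatives stored at a 1-based letter position. */
    List<String> alternativesAt(int position) {
        return Arrays.asList(chars[position - 1].split("\\|"));
    }

    /** The "column" across the three fields at a 1-based letter position. */
    String[] columnAt(int position) {
        int i = position - 1;
        return new String[] { chars[i], styles[i], colors[i] };
    }

    public static void main(String[] args) {
        // Sample values are invented; only the "Hello" letters come from the table above.
        AlignedText hello = new AlignedText(
                "H e|a|i l l o",
                "bold plain plain italic plain",
                "black red black black blue");
        System.out.println(hello.alternativesAt(2));
        System.out.println(Arrays.toString(hello.columnAt(2)));
    }
}
```

What I am asking is whether an index document can hold and query something equivalent to this, so that a click on position 2 returns the letters, style, and color stored for that position.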