lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Created: (LUCENE-2664) Add SimpleText codec
Date Thu, 23 Sep 2010 19:33:32 GMT
Add SimpleText codec
--------------------

                 Key: LUCENE-2664
                 URL: https://issues.apache.org/jira/browse/LUCENE-2664
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Index
            Reporter: Michael McCandless
            Assignee: Michael McCandless
             Fix For: 4.0


Inspired by Sahin Buyrukbilen's question here:

  http://www.lucidimagination.com/search/document/b68846e383824653/how_to_export_lucene_index_to_a_simple_text_file#b68846e383824653

I made a simple read/write codec that stores all postings data into a
single text file (_X.pst), looking like this:

{noformat}
field contents
  term file
    doc 0
      pos 5
  term is
    doc 0
      pos 1
  term second
    doc 0
      pos 3
  term test
    doc 0
      pos 4
  term the
    doc 0
      pos 2
  term this
    doc 0
      pos 0
END
{noformat}

The codec is fully funtional -- all Lucene & Solr tests pass with
-Dtests.codec=SimpleText -- but, its performance is obviously poor.

However, it should be useful for debugging, transparency,
understanding just what Lucene stores in its index, etc.  And it's a
quick way to gain some understanding on how a codec works...


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message