lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Updated: (LUCENE-2923) cleanup contrib/demo
Date Thu, 17 Feb 2011 10:52:24 GMT


Michael McCandless updated LUCENE-2923:

    Attachment: LUCENE-2923.patch

OK new patch, fixing a number of things:

  * I close the Reader (thanks Mark).

  * I cut over to NumericField (and stopped using DateTools) for the
    "modified" field.

  * I added a -create option to IndexFiles, so you can see how to

  * I left commented-out optional things -- calling optimize,
    increasing IW's RAM buffer.

  * Don't use Version.LUCENE_CURRENT.

  * I sucked in test files from Lucene in Action 2E's tests (open
    source licenses).

  * I use addDocument or updateDocument depending on -create.

  * I made the "demo html parser" private to modules/benchmark, which
    had a dependency on it.  Can someone look over my changes to the
    build xml files?  (Especially the Maven part, where I completely

  * IndexHTML is gone, and the webapp (src/jsp/*) is gone too.
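
As a rough illustration of the -create and addDocument/updateDocument items above, here is a minimal sketch of how such a flag might select the write path. It uses only the JDK; the class name IndexModeSketch, the Mode enum, and the helper methods are hypothetical and are not taken from the patch. The comments note which IndexWriter call each branch would plausibly correspond to.

```java
public class IndexModeSketch {
    /** Hypothetical stand-in for the two ways the demo could write documents. */
    public enum Mode { CREATE, UPDATE }

    /** Returns CREATE when "-create" appears among the arguments, otherwise UPDATE. */
    public static Mode parseMode(String[] args) {
        for (String arg : args) {
            if ("-create".equals(arg)) {
                return Mode.CREATE;
            }
        }
        return Mode.UPDATE;
    }

    /** Names the IndexWriter method each mode would plausibly map to. */
    public static String writerCall(Mode mode) {
        if (mode == Mode.CREATE) {
            // Fresh index: no older copy of a file can exist, so a plain add suffices.
            return "addDocument";
        }
        // Existing index: replace any older copy of the same file,
        // e.g. keyed on a Term holding the file path (an assumption here).
        return "updateDocument";
    }

    public static void main(String[] args) {
        System.out.println(writerCall(parseMode(args)));
    }
}
```

Running it with -create selects the add path; running it with no flag selects the update path, so re-indexing a directory would not duplicate documents.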

To apply the patch you first have to do this:

svn mv lucene/contrib/benchmark/src/java/org/apache/lucene/demo/html modules/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/demohtml
svn mv lucene/contrib/demo/src/test/org/apache/lucene/demo/html modules/benchmark/src/test/org/apache/lucene/benchmark/byTask/feeds/demohtml

> cleanup contrib/demo
> --------------------
>                 Key: LUCENE-2923
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.1, 4.0
>         Attachments: LUCENE-2923.patch, LUCENE-2923.patch
>
> I don't think we should include optimize in the demo; many people start from the demo
> and may think you must optimize to do searching, and that's clearly not the case.
> I think we should also use a buffered reader in FileDocument?
> And... I'm tempted to remove IndexHTML (and the html parser) entirely.  It's ancient,
> and we now have Tika to extract text from many doc formats.
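
The buffered-reader suggestion, together with closing the Reader, might look roughly like the sketch below. It is JDK-only and is not the code in the patch: the class name BufferedReadSketch, the readAll helper, and the choice of UTF-8 are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class BufferedReadSketch {
    /** Reads a whole file through a BufferedReader and always closes it. */
    public static String readAll(File file) throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream)
        // even if read() throws. UTF-8 is an assumption, not from the patch.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new File(args[0])));
    }
}
```

An indexer could instead hand the open reader to a field and close it after the document is added; the point of the sketch is only that the close happens on every path.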

This message is automatically generated by JIRA.