lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Commented: (LUCENE-2384) Reset zzBuffer in StandardTokenizerImpl* when lexer is reset.
Date Thu, 08 Apr 2010 11:38:37 GMT


Uwe Schindler commented on LUCENE-2384:

patch to reset the zzBuffer when the input is reset. The code is really taken from
so I can't really grant license to use it but I think the guy released it as public domain
by posting it to the mailing list.
I tested it and it seems to work for me. Just including it here in case somebody wants to apply
the patch directly to 3.0.1 (although it's better to wait for 3.1)

Your fix adds additional complexity. On reset, just set the buffer back to the default ZZ_BUFFERSIZE
if it has grown. Your patch always reallocates a new buffer.

Use this:
public final void reset(Reader r) {
  // reset to default buffer size, if buffer has grown
  if (zzBuffer.length > ZZ_BUFFERSIZE) {
    zzBuffer = new char[ZZ_BUFFERSIZE];
  }
  // hand the new reader to the JFlex-generated scanner
  yyreset(r);
}
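
To see the effect in isolation, here is a minimal self-contained sketch. SketchLexer, consumeAll() and bufferLength() are hypothetical names made up for illustration, and the 16-char default is an assumption chosen so the growth is easy to observe; this is not the generated StandardTokenizerImpl code, only the same shrink-on-reset idea.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical stand-in for a JFlex-generated scanner; this is not Lucene code. */
final class SketchLexer {
  // Deliberately tiny default so the growth is easy to observe (assumption).
  static final int ZZ_BUFFERSIZE = 16;

  private char[] zzBuffer = new char[ZZ_BUFFERSIZE];
  private Reader zzReader;
  private int zzEndRead; // number of chars currently held in zzBuffer

  SketchLexer(Reader r) {
    zzReader = r;
  }

  /** Reads the whole input, doubling the buffer each time it fills up. */
  void consumeAll() {
    try {
      while (true) {
        if (zzEndRead == zzBuffer.length) {
          zzBuffer = Arrays.copyOf(zzBuffer, zzBuffer.length * 2);
        }
        int n = zzReader.read(zzBuffer, zzEndRead, zzBuffer.length - zzEndRead);
        if (n < 0) {
          return; // end of input
        }
        zzEndRead += n;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Switches to a new input and shrinks the buffer only if it has grown. */
  void reset(Reader r) {
    if (zzBuffer.length > ZZ_BUFFERSIZE) {
      zzBuffer = new char[ZZ_BUFFERSIZE];
    }
    zzReader = r;
    zzEndRead = 0;
  }

  int bufferLength() {
    return zzBuffer.length;
  }

  public static void main(String[] args) {
    char[] big = new char[100];
    Arrays.fill(big, 'x');
    SketchLexer lexer = new SketchLexer(new StringReader(new String(big)));
    lexer.consumeAll();
    System.out.println(lexer.bufferLength()); // 128
    lexer.reset(new StringReader("short"));
    System.out.println(lexer.bufferLength()); // 16
  }
}
```

When the buffer never grew past the default, reset() keeps the existing array, so small documents cause no extra allocation. Only a lexer that has seen a large document pays for one new default-sized array.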

> Reset zzBuffer in StandardTokenizerImpl* when lexer is reset.
> -------------------------------------------------------------
>                 Key: LUCENE-2384
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: Analysis
>    Affects Versions: 3.0.1
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.1
>         Attachments: reset.diff
> When indexing large documents, the lexer buffer may stay large forever. This sub-issue
> resets the lexer buffer back to the default on reset(Reader).
> This is done on the enclosing issue.

