lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
Date Mon, 14 Mar 2011 17:56:34 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006533#comment-13006533 ]

Michael McCandless commented on LUCENE-2960:
--------------------------------------------

{quote}
Er.. While I have never personally witnessed unsynchronized long/double tearing,
I've seen the consequence of unsafely publishing a HashMap - an endless loop on get().
{quote}

I've seen the JMM strike too -- it caused one of our unit tests to
spin forever, because a "volatile" was missing.

But this will never impact rarely used fields (infoStream,
termIndexInterval, segmentWarmer, etc.), in practice.
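
The missing-"volatile" failure mentioned above can be sketched roughly as
follows. This is only an illustration, not the actual unit test:
VolatileStopFlag and runAndStop are hypothetical names, and the 50 ms
sleep is an arbitrary choice. With "volatile" on the flag the worker is
guaranteed to see the write; without it, the JMM permits the worker to
keep reading a stale value and spin forever.

```java
// Hypothetical demo class; not part of Lucene.
class VolatileStopFlag {
    // Remove "volatile" and the JMM no longer guarantees that the worker
    // ever observes the write below (the JIT may hoist the read out of the loop).
    private volatile boolean stop = false;

    // Starts a spinning worker, sets the flag, and reports whether the
    // worker terminated within the timeout.
    boolean runAndStop(long timeoutMillis) {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait; each iteration re-reads the volatile field
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if it never stops
        worker.start();
        try {
            Thread.sleep(50);   // give the worker time to enter its loop
            stop = true;        // volatile write, visible to the worker
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            stop = true;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + new VolatileStopFlag().runAndStop(5000));
    }
}
```

Whether the non-volatile variant actually hangs depends on the JVM and
the JIT, which is the point: it is allowed to, but usually doesn't.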

Really we need a pedantic Java impl. (or, maybe, CPU) that randomly
asserts its "rights" under JMM, to hold a cached copy of any field
that's not volatile for unusual/random lengths of time (basically an
"adversary" yet still playing by the JMM rules).  Such an impl would
find TONS of JMM bugs in Lucene (and I imagine any other Java
app/library tested).

Yet, no "real" Java impl out there will ever do this since doing so
will simply make that Java impl appear buggy.  (And it'd be bad for
perf. too -- obviously the Java impl and the CPU cache levels should
cache only frequently used things.)

It's the same reason all web browsers today are tolerant of a missing
</html> tag: no browser could afford to suddenly refuse to render
because you're missing the </html> tag.

I'm not saying we shouldn't put in our </html> tags in Lucene; we
definitely should... we have no choice.  But, in practice, these
missing </html> tags all throughout Lucene are not a problem.

bq. I ask to make IWC immutable at the very least

IWC cannot be made immutable -- you build it up incrementally (new
IWC(...).setThis(...).setThat(...)).  Its fields cannot be
final. (Well, one field can and is: analyzer).

How about this as a compromise: IW continues cloning the incoming IWC
on init, as it does today.  This means any changes to the IWC instance
you passed to IW will have no effect on IW.

But, if you want to change something live, you can call
IW.getConfig().setFoo(...).  The config instance is a clone that is
private to that IW.
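
A rough sketch of that compromise, under stated assumptions: Config and
Writer below are hypothetical stand-ins for IWC and IW (not Lucene's
actual classes), the default values are arbitrary, and marking the
mutable fields volatile is just one way to make live changes visible to
other threads.

```java
// Hypothetical stand-ins for IWC and IW; not Lucene's actual classes.
class LiveConfigSketch {

    static final class Config implements Cloneable {
        private final String analyzer;               // the one field that can be final
        private volatile double ramBufferSizeMB = 16.0;  // arbitrary illustrative default
        private volatile int termIndexInterval = 128;    // arbitrary illustrative default

        Config(String analyzer) { this.analyzer = analyzer; }

        // Chained setters: the config is built up incrementally, so these
        // fields cannot be final.
        Config setRamBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; return this; }
        Config setTermIndexInterval(int interval) { this.termIndexInterval = interval; return this; }

        String getAnalyzer() { return analyzer; }
        double getRamBufferSizeMB() { return ramBufferSizeMB; }
        int getTermIndexInterval() { return termIndexInterval; }

        @Override
        public Config clone() {
            try {
                return (Config) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: we implement Cloneable
            }
        }
    }

    static final class Writer {
        private final Config config;

        // Clone on init: later changes to the caller's instance have no effect here.
        Writer(Config incoming) { this.config = incoming.clone(); }

        // Live changes go through the writer's private clone.
        Config getConfig() { return config; }
    }

    public static void main(String[] args) {
        Config cfg = new Config("standard").setRamBufferSizeMB(32.0).setTermIndexInterval(64);
        Writer writer = new Writer(cfg);

        cfg.setRamBufferSizeMB(64.0);                     // ignored by the writer
        writer.getConfig().setRamBufferSizeMB(48.0);      // takes effect live

        System.out.println("caller's config: " + cfg.getRamBufferSizeMB());
        System.out.println("writer's config: " + writer.getConfig().getRamBufferSizeMB());
    }
}
```

The caller's instance and the writer's clone diverge after construction,
so the only way to affect an open writer is through getConfig().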


> Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-2960
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2960
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shay Banon
>            Priority: Blocker
>             Fix For: 3.1, 4.0
>
>
> In 3.1 the ability to setRAMBufferSizeMB is deprecated, and removed in trunk. It would
> be great to be able to control that on a live IndexWriter. Other possible two methods that
> would be great to bring back are setTermIndexInterval and setReaderTermsIndexDivisor. Most
> of the other setters can actually be set on the MergePolicy itself, so no need for setters
> for those (I think).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

