commons-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Smiley (@MITRE.org)" <DSMI...@mitre.org>
Subject Re: Thread-safety of Commons-Codec Encoder
Date Wed, 02 May 2012 21:23:23 GMT

On May 2, 2012, at 3:43 PM, sebb-2-2 [via Apache Commons] wrote:

On 2 May 2012 20:16, David Smiley (@MITRE.org) <[hidden email]<x-msg://487/user/SendEmail.jtp?type=node&node=4604253&i=0>>
wrote:
> The "kind of thread-safety" I would expect and want is for Encoder.encode()
> to be thread-safe.  I should be able to instantiate and configure an
> Encoder, then share it for multiple concurrent threads to use to call
> encode() as many times as they please at the same time.  I will assume it is
> up to me (the client/user of the API) to arrange for a happens-before
> relationship between instantiation/configuration and the encode() call, by
> using any one of a variety of means of doing so.  This is kind of a given
> but I want to be clear.

That is necessary for safe publication, but not sufficient if the
class changes the values of any instance variables as part of the
encode() processing.

Yes, obviously an encode() method that changes state is not thread-safe.  I'm establishing
that encode() should be thread-safe and then elaborating on other matters aside from the thread-safety
of encode(), namely Encoder configuration.

In the case of Metaphone#setLength(), the variable is not volatile, so
is not inherently thread-safe.
Provided that you set the length once initially, and used suitable
means to publish the instance to the different threads, it would then
be safe to use from multiple threads, so long as setLength() was not
called again. So you would need to create separate instances for each
length.

I'm not sure it would make sense to share an instance between multiple
threads and then change the length in one thread, so the fact that the
length variable is not volatile really only affects how the instance
needs to be shared initially.

It makes no sense to me either, to try and make initialization/configuration methods of class
thread-safe (speaking generally here, not just for Encoder), and setLength and setMaxCodeLength
are such methods.

So I think you can say that the Metaphone class is conditionally thread-safe.

Awesome.  Judging from Thomas's table in this thread every Encoder is already thread-safe
by our definition of it.  I looked at BeiderMorseEncoder & PhoneticEngine and I think
it's thread-safe.  I saw many cases where methods return new altered copies of objects instead
of modifying internal object state.

> I haven't looked at any of the source behind the Encoders, but one area that
> may be problematic is any lazy-instantiation/initialization of state to
> instance variables in a non-thread-safe manner by the encode() method.  If
> Encoder had an init() method as part of its api, it would be easy to avoid
> this problem but alas, it doesn't.

A public init() method is but one way to do this, and would not help
where there are setters.
[Well, I suppose the init() commmand could "freeze" any mutable values]

Yes, that's what I implied with such a name -- it might freeze it or not and and users are
expected to not change the state following initialization which I think of as a given in the
absence of documentation otherwise.

So if we agree that the Encoders that exist are thread-safe (as we've defined it) and that
they should be in general, then I plan to treat them that way in Lucene.

~ David


-----
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: http://apache-commons.680414.n4.nabble.com/Thread-safety-of-Commons-Codec-Encoder-tp4600956p4604487.html
Sent from the Commons - User mailing list archive at Nabble.com.
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message