lucene-java-user mailing list archives

From lukai <lukai1...@gmail.com>
Subject Re: Questions about lucene TokenStream
Date Sun, 04 Nov 2012 22:59:36 GMT
Hi, thanks for the reply. Could you elaborate on "The AttributeFactory creates
a new one for every new TokenStream instance."? I ask because I can only find
an implementation like this:

  private static Class<? extends AttributeImpl> getClassForInterface(
      Class<? extends Attribute> attClass) {
    final WeakReference<Class<? extends AttributeImpl>> ref =
        attClassImplMap.get(attClass);
    Class<? extends AttributeImpl> clazz = (ref == null) ? null : ref.get();
    if (clazz == null) {
      // we have the slight chance that another thread may do the same,
      // but who cares?
      try {
        attClassImplMap.put(attClass,
          new WeakReference<Class<? extends AttributeImpl>>(
            clazz = Class.forName(attClass.getName() + "Impl", true,
                attClass.getClassLoader())
              .asSubclass(AttributeImpl.class)
          )
        );
      } catch (ClassNotFoundException e) {
        throw new IllegalArgumentException(
            "Could not find implementing class for " + attClass.getName());
      }
    }
    return clazz;
  }


It seems the key is the Class of the attribute, and the impl instance is cached.
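
One way to check that reading is a small standalone sketch with no Lucene
dependency. Everything in it (Attr, TermAttr, TermAttrImpl,
SketchAttributeFactory) is a hypothetical stand-in, not the real Lucene
classes, and it uses computeIfAbsent on a ConcurrentHashMap instead of the
WeakReference handling in the snippet above, purely for brevity. It assumes
the same "interface name + Impl" naming convention. The point it illustrates:
the map value is a Class, and an object is only created when newInstance() is
called on that Class.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-ins for Lucene's Attribute / AttributeImpl types.
interface Attr {}

interface TermAttr extends Attr {
    char[] buffer();
}

// Found by name: "TermAttr" + "Impl".
class TermAttrImpl implements TermAttr {
    private final char[] buf = new char[16]; // one buffer per instance

    public TermAttrImpl() {}

    @Override
    public char[] buffer() {
        return buf;
    }
}

final class SketchAttributeFactory {
    // The cache maps an attribute interface to its implementing Class.
    // It never stores an instance.
    private static final ConcurrentMap<Class<? extends Attr>, Class<? extends Attr>> IMPL_CLASSES =
        new ConcurrentHashMap<>();

    static Class<? extends Attr> implClassFor(Class<? extends Attr> attClass) {
        return IMPL_CLASSES.computeIfAbsent(attClass, c -> {
            try {
                return Class.forName(c.getName() + "Impl", true, c.getClassLoader())
                    .asSubclass(Attr.class);
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException(
                    "Could not find implementing class for " + c.getName());
            }
        });
    }

    // Every call instantiates a new object from the cached Class.
    static <A extends Attr> A createInstance(Class<A> attClass) {
        try {
            Object impl = implClassFor(attClass).getDeclaredConstructor().newInstance();
            return attClass.cast(impl);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Could not instantiate implementing class for " + attClass.getName(), e);
        }
    }
}
```

In this sketch, calling createInstance(TermAttr.class) twice returns two
distinct objects with distinct buffers, while implClassFor(TermAttr.class)
returns the same Class both times.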


Thanks,

On Sun, Nov 4, 2012 at 2:35 PM, Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi,
>
> > Hmmm, the reason I asked this question has to do with the implementation
> > of CharTermAttribute.
> >
> > It seems the tokenizer sets the token read from the reader into it, and the
> > following TokenStream can also get this instance. My concern is a
> > multi-threaded environment: another thread can also change the content of
> > CharTermAttributeImpl.
> > The token is obtained with this API:
> > char[] buffer = termAtt.buffer();
> >
> > But the buffer can be changed by another thread, right? Because it's the
> > same object. I'm not concerned about the create/get object phase.
>
> No, it cannot. TokenStream instances can only be consumed by one thread
> (iterator pattern). In addition, every TokenStream has its private
> CharTermAttribute, there is no singleton. The AttributeFactory creates a
> new one for every new TokenStream instance.
>
> Uwe
>
> > Thanks,
> >
> > On Sun, Nov 4, 2012 at 2:07 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> >
> > > Hi,
> > >
> > > >  I have two questions that confuse me about the Lucene
> > > > implementation; I hope someone can give me some clues.
> > > >
> > > >   1. It's about the AttributeSource/AttributeSourceImpl implementation.
> > > > It seems the default instance is kept as "static"
> > > > in DefaultAttributeFactory, but we get these instances in the analyzer
> > > > directly. From this point of view, the analyzer implementation is not
> > > > thread safe, right? Because each attributesourceimpl object will refer
> > > > to the same instance but without synchronization.
> > >
> > > This is not a problem. Analyzers are threadsafe; every TokenStream
> > > instance is used only in one thread.
> > > The DefaultAttributeFactory singleton is not a problem, as the
> > > instance is only used read-only and all members are final and the
> > > internal state of DefaultAttributeFactory instance does not change.
> > > The static cache inside DefaultAttributeFactory#getClassForInterface
> > > is thread safe, as it uses a concurrent map. The members of the map
> > > are Class<? extends AttributeImpl>; each AttributeImpl is created with
> > > newInstance() from the Class<? extends AttributeImpl> instance.
> > >
> > > Uwe
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
>
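
To illustrate the point made in the thread that a TokenStream is consumed by
a single thread and owns its term buffer, here is a minimal sketch under
stated assumptions. SketchStream and PerThreadStreams are hypothetical names,
not Lucene classes; the whitespace splitting and the 8-char buffer are
arbitrary choices for the example. Each worker thread creates and consumes its
own stream, so no buffer is ever shared between threads.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a TokenStream that owns a private term buffer.
final class SketchStream {
    private final char[] termBuffer = new char[8]; // private to this instance
    private final String[] tokens;
    private int pos = 0;
    private int length = 0;

    SketchStream(String text) {
        this.tokens = text.split(" ");
    }

    // Copies the next token into the reused buffer; false when exhausted.
    boolean incrementToken() {
        if (pos >= tokens.length) {
            return false;
        }
        String t = tokens[pos++];
        length = Math.min(t.length(), termBuffer.length);
        t.getChars(0, length, termBuffer, 0);
        return true;
    }

    String term() {
        return new String(termBuffer, 0, length);
    }
}

final class PerThreadStreams {
    // The stream is created and fully consumed by the calling thread.
    static String consume(String text) {
        SketchStream stream = new SketchStream(text);
        StringBuilder out = new StringBuilder();
        while (stream.incrementToken()) {
            out.append('[').append(stream.term()).append(']');
        }
        return out.toString();
    }

    // Runs two independent streams on two threads and collects both results.
    static String[] runInParallel(String a, String b) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> fa = pool.submit(() -> consume(a));
            Future<String> fb = pool.submit(() -> consume(b));
            return new String[] { fa.get(), fb.get() };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the buffer is reused across incrementToken() calls, the only way to
corrupt it would be to hand the same stream object to a second thread, which
is exactly what the single-consumer rule forbids.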
