accumulo-user mailing list archives

From Rob Povey <>
Subject Re: Accumulo Iterator painful development because TS don't pick up changes to Jars
Date Fri, 30 Oct 2015 21:13:27 GMT
Thanks, we’ll give this a try next week, and see if the issue is in fact the VFS Jar version.
We certainly commonly change several iterator Jars simultaneously.

Rob Povey

From: "<>" <<>>
Reply-To: "<>" <<>>
Date: Friday, October 30, 2015 at 9:33 AM
To: "<>" <<>>
Subject: Re: Accumulo Iterator painful development because TS don't pick up changes to Jars

Also, turn the logging on the tservers up to DEBUG for org.apache.accumulo.start.classloader.*.
You should see a line in the log that starts with "monitoring <file>"

Sent: Friday, October 30, 2015 12:22:53 PM
Subject: Re: Accumulo Iterator painful development because TS don't pick up changes to Jars

Try replacing the vfs jar in lib with a 2.1-SNAPSHOT. Several issues have been fixed, but
one of them is that if more than one monitored resource changed then it would miss some of

From: "Rob Povey" <<>>
Sent: Friday, October 30, 2015 11:57:27 AM
Subject: Re: Accumulo Iterator painful development because TS don't pick up changes to Jars

Thanks for the help with this,

To be clear I believe we are using the context class loader for each of the sets of tables,
and we don’t see the jar reloaded reliably when they are changed. This behavior is consistent
whether running a single-node instance on my local machine for development or on the cluster. Copying
the jars into the lib/ext directory however always seems to pick up the change.

These are from my dev box, but the clusters look the same just with many more contexts

default    | general.vfs.classpaths ................................. |
site       | general.vfs.context.classpath.rob_maanaNgram ........... |
system     |    @override ........................................... | hdfs://localhost/user/maana/rob/iterators/maanaNgram/maana-iterators-plugins-core_2.11-assembly.jar
site       | general.vfs.context.classpath.rob_maanaSearch .......... |
system     |    @override ........................................... | hdfs://localhost/user/maana/rob/iterators/maanaSearch/maana-iterators-core_2.11-1.0-SNAPSHOT-assembly.jar

And then the context is set on the table like this

default    | table.classpath.context ................................ |
table      |    @override ........................................... | rob_maanaNgram
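
As a rough sketch of how the two listings above relate: the context name appears both as the suffix of a general.vfs.context.classpath.* property and as the value of table.classpath.context. The helper below is hypothetical and not part of Accumulo; it only builds the property names and values as plain strings, assuming a "<prefix>_<app>" naming scheme like the rob_maanaNgram example. Applying them (through the shell or the client API) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Accumulo: it only builds property names and values.
class ContextClasspathSketch {

    // Context name in the style shown above: a per-deployment prefix plus the application name.
    static String contextName(String prefix, String app) {
        return prefix + "_" + app;
    }

    // System-level property that maps a context name to its jar location.
    static String contextProperty(String contextName) {
        return "general.vfs.context.classpath." + contextName;
    }

    // The two properties involved: one system-level entry and one per-table entry.
    static Map<String, String> propertiesFor(String prefix, String app, String jarUri) {
        String name = contextName(prefix, app);
        Map<String, String> props = new LinkedHashMap<>();
        props.put(contextProperty(name), jarUri);   // set at the system level
        props.put("table.classpath.context", name); // set on each table using the context
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = propertiesFor("rob", "maanaNgram",
            "hdfs://localhost/user/maana/rob/iterators/maanaNgram/maana-iterators-plugins-core_2.11-assembly.jar");
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The point of the sketch is only that the two properties must agree on the context name; a mismatch between them would leave the table without the intended jars.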

And below is most of the accumulo-site.xml, minus the secrets and ZooKeeper sections, but none
of the iterators are in the config classpaths.

    <description>Classpaths that accumulo checks for updates and class files.</description>


On 10/29/15, 5:27 PM, "<>" <<>>

>So, without seeing your configuration, I would suggest trying something before upgrading
to 1.7. In 1.5 we changed the classloader so that it could load from different locations.
At the same time, we added the concept of classloader contexts which are basically names for
locations for jars. Table(s) can be configured to use a classloader context allowing you to
deploy server side code for different applications in different locations. This new classloader
does "reload" jars on the classpath when they change, the same behavior as the older classloader
reading from lib/ext. You can read more about this feature at [1].
>We currently depend on Commons VFS 2.0 for this feature. Some bugs have been fixed and
you will have a better experience if you replace the VFS jar in the lib directory with a snapshot
of the 2.1 release[2].
>> -----Original Message-----
>> From:<> []
>> Sent: Thursday, October 29, 2015 8:04 PM
>> To:<>
>> Subject: RE: Accumulo Iterator painful development because TS don't pick up
>> changes to Jars
>> Can you provide the relevant classpath sections of your accumulo-site.xml
>> file?
>> > -----Original Message-----
>> > From: Rob Povey []
>> > Sent: Thursday, October 29, 2015 8:01 PM
>> > To:<>
>> > Subject: Accumulo Iterator painful development because TS don't pick
>> > up changes to Jars
>> >
>> > Caveat: I’m still running 1.6.2 internally here, and things may have
>> > changed and I could simply “be doing it wrong”, or have missed the
>> > solution in the docs. It’s also probably not a typical use case.
>> >
>> > This is not really an issue for most day-to-day development, but our
>> > internal testing process makes changing iterators a nightmare.
>> > Before I start, I am aware of general.dynamic.classpaths, but it
>> > appears that wildcards are only respected at the file level, which
>> > is insufficient for our use case, as you’ll see later.
>> >
>> > I’ll try and explain our internal test environment to help understand
>> > the issue.
>> > We run daily (or more frequent) drops of our codebase against two
>> > internal clusters across a variety of data sources (most of them
>> > aren’t particularly large).
>> > To give some idea, I count 462 tables on one of the clusters and
>> > each instance of the application is using 11 or so tables of which 4
>> > or so have a variety of iterators we’ve written.
>> > To resolve the conflicts since our application predates namespaces we
>> > prefix the tables and the table contexts and upload the iterators to
>> > subdirectories with matching names.
>> > To complicate matters further many of the tables are dropped and new
>> > tables added at a pretty frightening rate, so having to change the
>> > configuration, and restart servers to add a new path to the
>> > dynamic.classpath property is something of a non-starter.
>> >
>> > It all works fine until a build has a change in an iterator and is
>> > targeted against an existing table; the app correctly identifies and
>> > uploads the new jars, but accumulo obviously doesn’t pick up the
>> > change. In many cases I could live with it if simply dropping the
>> > tables and reingesting was sufficient, but short of ingesting into a
>> > new table name even that doesn’t always pick up the new Iterators.
>> > We have currently resorted to manually tracking every iterator change
>> > (the rate of which has at least slowed down recently) and doing
>> > rolling restarts of tablet servers on off hours, but we end up often
>> > not knowing if a bug is real or an issue in a TS having an old iterator loaded.
>> >
>> > Is there a way to get the TS to watch an entire subtree for Jar changes?
>> >
>> > Assuming there isn’t, when I get a few days without a looming
>> > deliverable, I was going to migrate to 1.7 and if that has the same
>> > issue look at making and submitting a fix.
>> >
>> >
>> > Rob Povey
>> >
>> >
>> >
>> >
>> >
>> >
>> > On 10/28/15, 2:25 PM, "Josh Elser" <<>>
>> >
>> > >Rob Povey wrote:
>> > >> However, I’m pretty reticent right now to add any more iterators to
>> > >> our project, they’ve been a test nightmare for us internally.
>> > >
>> > >Off-topic, I'd like to hear more about what is painful. Do you have
>> > >the time to fork off a thread and let us know how it hurts?
