accumulo-user mailing list archives

From "Sukant Hajra"
Subject Re: sanity checking application WALogs make sense
Date Sat, 15 Sep 2012 18:14:49 GMT
Excerpts from William Slacum's message of 2012-09-15 08:46:17 -0500:
> I'm a bit confused as to what you mean "if an iterator goes down
> mid-processing." If it goes down at all, then whatever scope it's running in-
> minor compaction, major compaction and scan- will most likely go down as well
> (unless your iterator eats an exception and ignores errors). A WALog
> shouldn't be deleted if whatever you were trying to do failed.

I believe I've answered my own question after thinking about iterators more and
looking at the code for some of the implementations.

I was thinking about iterators "writing" changes to Accumulo using something
like a BatchWriter.  Now I'm coming to the conclusion that even if that were
possible, it is not how iterators were designed, and very likely bad for data
integrity.  I don't feel that iterators should have any side-effects beyond
scanning data through the source provided by the init() method.  In this way,
I'm beginning to think about iterators more purely functionally.  Does that
sound right?  Or have people come up with iterator implementations with more

For instance, in one of my algorithms, authors might write conflicting data to
a row that needs to be resolved.  I feel I could install iterators at scan,
minor compaction, and major compaction to perform this resolution (which
happens to be a very simple idempotent operation).
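
To make the idea concrete, here is a minimal sketch of the kind of
resolution I have in mind, written as plain Java with no Accumulo
dependency.  The class name ConflictResolver and the "keep the
lexicographically largest value" rule are hypothetical stand-ins for my
real logic.  In Accumulo itself I would expect this to live in something
like a Combiner's reduce() method, but I have not wired that up here.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the resolution logic; not an Accumulo class.
final class ConflictResolver {

    // Resolves conflicting values for one cell by keeping the
    // lexicographically largest one. This rule is an assumption, chosen
    // because it is idempotent, commutative, and associative. The method
    // reads only from the supplied iterator and writes nothing elsewhere.
    static String resolve(Iterator<String> values) {
        String winner = null;
        while (values.hasNext()) {
            String candidate = values.next();
            if (winner == null || candidate.compareTo(winner) > 0) {
                winner = candidate;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        List<String> conflicting = Arrays.asList("draft-2", "draft-3", "draft-1");

        // Resolve everything at once, as a scan over unresolved data would.
        String atScan = resolve(conflicting.iterator());

        // Resolve part of the data first (as a minor compaction might), then
        // resolve that partial result together with the remaining value.
        String partial = resolve(conflicting.subList(0, 2).iterator());
        String afterCompaction = resolve(Arrays.asList(partial, "draft-1").iterator());

        System.out.println(atScan);
        System.out.println(afterCompaction);
        System.out.println(atScan.equals(afterCompaction));
    }
}
```

Because the rule gives the same answer whether it sees all the values at
once or a previously resolved subset plus the rest, it should not matter
which scope runs first or how many times the iterator is applied.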

Sorry if none of this sounds like a concrete question.  Some of what I'm
looking for is conversation and validation in light of some limited local
Accumulo expertise on my team.

Has anyone thought about building up a small IRC community, say on #accumulo on
Freenode?  There's a nice #hbase channel there, but I think I'm past the point
of asking Bigtable-general questions.

