accumulo-user mailing list archives

From Christopher <>
Subject Re: Should I store Long values as String or Long?
Date Tue, 14 May 2013 00:47:54 GMT
Well, encoding it might save space, but strings are nice and
human-readable, especially in the shell, and in the overall scheme of
things, a string probably isn't really that much larger on disk,
especially after compression.
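
A rough way to compare the two representations before compression, using
only the JDK. This is a sketch, not Accumulo's own encoder code: it assumes
the fixed-length form is 8 big-endian bytes and the string form is the
decimal digits as UTF-8. The class and method names are made up for
illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LongEncodingSketch {
    // Fixed-length form: always 8 bytes, big-endian (assumed layout, not printable text).
    static byte[] fixedLen(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    static long fromFixedLen(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    // String form: decimal digits as UTF-8, one byte per character.
    static byte[] asString(long v) {
        return Long.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    static long fromString(byte[] b) {
        return Long.parseLong(new String(b, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        long[] samples = {7L, 12345L, 1234567890123L, Long.MAX_VALUE};
        for (long v : samples) {
            // Print the uncompressed size of each representation.
            System.out.printf("%d: fixed=%d bytes, string=%d bytes%n",
                    v, fixedLen(v).length, asString(v).length);
        }
    }
}
```

Under these assumptions, a count with fewer than eight digits is actually
smaller as a string than as a fixed 8-byte value, and even the largest long
needs only 19 bytes as text.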

Christopher L Tubbs II

On Mon, May 13, 2013 at 6:09 PM, Mike Hugo <> wrote:
> I've been playing around with the LongCombiner on a table that's summing up
> the counts of output of a MapReduce job, very similar to the WordCount
> example from the user manual.
> I started out encoding the values using LongCombiner.FIXED_LEN_ENCODER, but
> have noticed that this can lead to some confusion later on downstream.  For
> example, a co-worker was scanning using the shell and was caught off guard
> by the encoded values.  Also, out of the box, the StatsCombiner example
> works using String values, not Long values, so we built a custom piece to
> essentially do the same thing with Long values instead.
> It looks to me like most of the examples I've seen just store things as
> String values, rather than encoding them.  What are the tradeoffs?  We're at
> a point where we could pretty easily switch things to just use strings - it
> seems like that might make things more convenient from a maintenance
> perspective (human readable values) and would allow us to re-use some
> existing components (e.g. StatsCombiner).  Any thoughts?
> Thanks,
> Mike
