accumulo-dev mailing list archives

From Billie J Rinaldi <billie.j.rina...@ugov.gov>
Subject Re: Could combiners be coded using groovy?
Date Sun, 20 May 2012 14:07:29 GMT
I wouldn't recommend a custom comparator, as this will cause your test implementation to be
farther away from what Accumulo is actually doing.  The key point to understand is that when
an entry is inserted into Accumulo it is assigned a timestamp unless it already has one. 
Thus, the three key/value pairs you've put into a TreeMap would have been assigned different
timestamps if they had been put into Accumulo, making their Keys different instead of identical.
 To simulate Accumulo's versioning / combining behavior using a TreeMap, you need to explicitly
set the timestamps of the Keys you are putting into the TreeMap.

For example:

    Key key = new Key("row", "cf", "cq", 1L);
    key = new Key("row", "cf", "cq", 2L);
    key = new Key("row", "cf", "cq", 3L);
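
The effect of the timestamps can be sketched in plain Java without Accumulo on the classpath. This is only an illustration under stated assumptions: `SimpleKey` is a hypothetical stand-in for Accumulo's Key (a name plus a timestamp, newest first), and `sum` is a hand-written stand-in for what a summing combiner would compute. Neither is Accumulo API.

```java
import java.util.TreeMap;

public class TimestampedKeyDemo {

    // Hypothetical stand-in for Accumulo's Key: a row/column name plus a timestamp.
    public static final class SimpleKey implements Comparable<SimpleKey> {
        final String name;
        final long timestamp;

        SimpleKey(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }

        @Override
        public int compareTo(SimpleKey other) {
            int byName = name.compareTo(other.name);
            if (byName != 0) {
                return byName;
            }
            // Newest timestamp first, mirroring Accumulo's descending timestamp order.
            return Long.compare(other.timestamp, timestamp);
        }
    }

    // Inserts the values 13, 14, 15 under the same name, using either one shared
    // timestamp or three distinct ones, and returns the resulting map.
    public static TreeMap<SimpleKey, Long> build(boolean distinctTimestamps) {
        TreeMap<SimpleKey, Long> tm = new TreeMap<>();
        long[] values = {13L, 14L, 15L};
        for (int i = 0; i < values.length; i++) {
            long ts = distinctTimestamps ? i + 1 : Long.MAX_VALUE;
            tm.put(new SimpleKey("row cf:cq", ts), values[i]);
        }
        return tm;
    }

    // Plain-Java stand-in for a summing combiner over all entries in the map.
    public static long sum(TreeMap<SimpleKey, Long> tm) {
        long total = 0;
        for (long v : tm.values()) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        TreeMap<SimpleKey, Long> same = build(false);
        TreeMap<SimpleKey, Long> distinct = build(true);
        System.out.println("same timestamp: size " + same.size() + ", sum " + sum(same));
        System.out.println("distinct timestamps: size " + distinct.size() + ", sum " + sum(distinct));
    }
}
```

With a shared timestamp the three puts collapse into a single entry holding the last value, which matches the single record seen in the quoted output. With distinct timestamps all three entries survive, so there is something left to combine.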

Billie


On Saturday, May 19, 2012 6:12:28 PM, "Adam Fuchs" <adam.p.fuchs@ugov.gov> wrote:
> The base semantics of Accumulo are actually more of a multimap, and the
> VersioningIterator is what turns a table into a map. You could try
> using a
> custom comparator when you construct your TreeMap to effectively turn
> it
> into a multimap. Something like this ought to do the trick:
> 
> // warning -- totally untested code -- might not even compile
> class NeverEqual implements Comparator<Key> {
>   public int compare(Key a, Key b) {
>     int result = a.compareTo(b);
>     if (result == 0)
>       return 1;
>     return result;
>   }
> }
> 
> Cheers,
> Adam
> 
> 
> On Sat, May 19, 2012 at 5:57 PM, David Medinets
> <david.medinets@gmail.com> wrote:
> 
> > I finally got a chance to try your suggestion. But I'm confused
> > because the semantics of a TreeMap seem different from those of
> > Accumulo. For example, here I insert some data into the TreeMap:
> >
> >     TreeMap<Key, Value> tm = new TreeMap<Key, Value>();
> >     Key key = new Key(new Text("row"), new Text("cf"), new Text("cq"), new Text(""));
> >     Value value = new Value("13".getBytes());
> >     tm.put(key, value);
> >
> >     key = new Key(new Text("row"), new Text("cf"), new Text("cq"), new Text(""));
> >     value = new Value("14".getBytes());
> >     tm.put(key, value);
> >
> >     key = new Key(new Text("row"), new Text("cf"), new Text("cq"), new Text(""));
> >     value = new Value("15".getBytes());
> >     tm.put(key, value);
> >
> > And then I try to use a SummingCombiner which I have used
> > successfully
> > against Accumulo. Here is that code:
> >
> >     Map<String,String> options = new HashMap<String,String>();
> >     options.put("type", "STRING");
> >
> >     SummingCombiner iter = new SummingCombiner();
> >
> >     IteratorSetting is = new IteratorSetting(1, SummingCombiner.class, options);
> >     Combiner.setCombineAllColumns(is, true);
> >
> >     iter.init(new SortedMapIterator(tm), is.getOptions(), null);
> >     iter.seek(new Range(), new ArrayList<ByteSequence>(), false);
> >
> >     while (iter.hasTop()) {
> >         Key k = iter.getTopKey();
> >         Value v = iter.getTopValue();
> >         System.out.println("K: " + k + " V: " + v);
> >         iter.next();
> >     }
> >     System.out.println("END");
> >
> > Here is the output:
> >
> > START
> > K: row cf:cq [] 9223372036854775807 false V: 15
> > END
> >
> > The SummingCombiner is only seeing one record which makes sense
> > since
> > the keys overwrite each other in the TreeMap. Am I missing
> > something?
> >
> > On Tue, Apr 10, 2012 at 3:57 PM, Billie J Rinaldi
> > <billie.j.rinaldi@ugov.gov> wrote:
> > > I'm not familiar with Groovy, but it sounds interesting. I could
> > > recommend some ways to test your iterator before you push it out to
> > > Accumulo. You can make some fake data for a unit test by creating a
> > > TreeMap<Key,Value> and then using a SortedMapIterator to turn that
> > > into a source for your iterator. A lot of our unit tests look like the
> > > following.
> > >
> > >  TreeMap<Key,Value> tm = new TreeMap<Key,Value>();
> > >  // put some data into the tree map
> > >
> > >  MyIterator iter = new MyIterator();
> > >
> > >  IteratorSetting is = new IteratorSetting(1, MyIterator.class);
> > >  MyIterator.setSomeOption(is, option);
> > >
> > >  iter.init(new SortedMapIterator(tm), is.getOptions(), null);
> > >  iter.seek(new Range(), new ArrayList<ByteSequence>(), false);
> > >
> > >  while (iter.hasTop()) {
> > >    Key k = iter.getTopKey();
> > >    Value v = iter.getTopValue();
> > >    // check that k and v are what you expected
> > >    iter.next();
> > >  }
> > >
> > > Another option is to use the ClientSideIteratorScanner to test your
> > > iterator in your local JVM before running it on a tserver.
> > >
> > > Billie
> > >
> > >
> > > On Sunday, April 8, 2012 11:08:05 PM, "David Medinets"
> > > <david.medinets@gmail.com> wrote:
> > >> I was working with combiners and seeing the jar file loaded and
> > >> reloaded, and seeing Accumulo crash because I coded the combiner
> > >> incorrectly. I started to wonder how much easier it might be to use a
> > >> dynamically compiled language like Groovy to develop combiners.
> > >>
> > >> How hard would it be to integrate Groovy? Have any of the core
> > >> Accumulo developers used Groovy?
> > >>
> > >> Is there a better language than groovy now? I last worked with
> > >> groovy
> > >> #$%$ years ago. It worked very well for me.
> >
