accumulo-user mailing list archives

From Roshan Punnoose <rosh...@gmail.com>
Subject Re: Teardown and deepCopy
Date Wed, 04 Jan 2017 16:57:49 GMT
Keith, I would just like to ignore it. Basically, I am just doing a distinct
operation on the column qualifiers.

Dylan, the hard part is that we are trying not to materialize the complete
results on the client side just to do the distinct there. This piece is just
a smaller part of a larger query.

Thanks guys for the help. I feel like I'm trying to do something way out of
bounds of what Accumulo is really built to do. Just testing the bounds :)

Roshan

On Wed, Jan 4, 2017 at 11:54 AM Keith Turner <keith@deenlo.com> wrote:

> On Wed, Jan 4, 2017 at 11:42 AM, Roshan Punnoose <roshanp@gmail.com> wrote:
> > I have a tablet with an unsorted list of IDs in the Column Qualifier, these
> > IDs can repeat sporadically. So I was hoping to keep a set of these IDs
> > around in memory to check if I have seen an ID or not. There is some other
>
> When you see an ID again, what action do you want to take?
>
> > logic to ensure that the set does not grow unbounded, but just trying to
> > figure out if I can keep this ID set around. With the teardown, even though
> > I know which was the last Key to return from the new seek Range, I don't
> > know if I have seen the upcoming IDs. Not sure if that makes sense...
> >
> > Was thinking that on teardown, we could use either the deepCopy or init
> > method to roll over state from the torn down iterator to the new iterator.
> >
> > On Wed, Jan 4, 2017 at 11:14 AM Keith Turner <keith@deenlo.com> wrote:
> >>
> >> On Wed, Jan 4, 2017 at 10:44 AM, Roshan Punnoose <roshanp@gmail.com> wrote:
> >> > Keith,
> >> >
> >> > If an iterator has state that it is maintaining, what is the best way to
> >> > transfer that state to the new iterator after a tear down?  For example,
> >> > MyIterator might have a Boolean flag of some sort. After tear down, is
> >> > there a way to copy that state to the new iterator before it starts
> >> > seeking again?
> >>
> >> There is nothing currently built in to help with this.
> >>
> >> What are you trying to accomplish?  Are you interested in maintaining
> >> this state for a scan or batch scan?
> >>
> >>
> >> >
> >> > Roshan
> >> >
> >> > On Wed, Jan 4, 2017 at 10:33 AM Keith Turner <keith@deenlo.com> wrote:
> >> >>
> >> >> Josh,
> >> >>
> >> >> deepCopy is not called when an iterator is torn down.  It has an
> >> >> entirely different use. deepCopy allows cloning of an iterator during
> >> >> init().  The clones allow you to have multiple pointers into a tablet's
> >> >> data, which allows things like server-side joins.
> >> >>
> >> >> Keith
> >> >>
> >> >> On Wed, Dec 28, 2016 at 12:50 PM, Josh Clum <joshclum@gmail.com> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > I have a question about iterator teardown. It seems from
> >> >> >
> >> >> > https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/iterator_design.txt#L383-L390
> >> >> >
> >> >> > that deepCopy should be called when an iterator is torn down. I'm not
> >> >> > seeing that behavior. Below is a test that sets table.scan.max.memory
> >> >> > to 1 which should force a tear down for each kv returned. I should see
> >> >> > deepCopy being called 3 times but when I tail the Tserver logs I'm not
> >> >> > seeing it being called. Below is the test and the Tserver output.
> >> >> >
> >> >> > What am I missing here?
> >> >> >
> >> >> > Josh
> >> >> >
> >> >> > ➜  tail -f -n200 ...../accumulo/logs/TabletServer_*.out | grep MyIterator
> >> >> > MyIterator: init
> >> >> > MyIterator: seek
> >> >> > MyIterator: hasTop
> >> >> > MyIterator: getTopKey
> >> >> > MyIterator: getTopValue
> >> >> > MyIterator: init
> >> >> > MyIterator: seek
> >> >> > MyIterator: hasTop
> >> >> > MyIterator: getTopKey
> >> >> > MyIterator: getTopValue
> >> >> > MyIterator: init
> >> >> > MyIterator: seek
> >> >> > MyIterator: hasTop
> >> >> > MyIterator: getTopKey
> >> >> > MyIterator: getTopValue
> >> >> > MyIterator: init
> >> >> > MyIterator: seek
> >> >> > MyIterator: hasTop
> >> >> >
> >> >> > public static class MyIterator implements SortedKeyValueIterator<Key, Value> {
> >> >> >
> >> >> >     private SortedKeyValueIterator<Key, Value> source;
> >> >> >
> >> >> >     public MyIterator() { }
> >> >> >
> >> >> >     @Override
> >> >> >     public void init(SortedKeyValueIterator<Key, Value> source,
> >> >> >                      Map<String, String> options,
> >> >> >                      IteratorEnvironment env) throws IOException {
> >> >> >         System.out.println("MyIterator: init");
> >> >> >         this.source = source;
> >> >> >     }
> >> >> >
> >> >> >     @Override
> >> >> >     public boolean hasTop() {
> >> >> >         System.out.println("MyIterator: hasTop");
> >> >> >         return source.hasTop();
> >> >> >     }
> >> >> >
> >> >> >     @Override
> >> >> >     public void next() throws IOException {
> >> >> >         System.out.println("MyIterator: next");
> >> >> >         source.next();
> >> >> >     }
> >> >> >
> >> >> >     @Override
> >> >> >     public void seek(Range range, Collection<ByteSequence> columnFamilies,
> >> >> >                      boolean inclusive) throws IOException {
> >> >> >         System.out.println("MyIterator: seek");
> >> >> >         source.seek(range, columnFamilies, inclusive);
> >> >> >     }
> >> >> >
> >> >> >     @Override
> >> >> >     public Key getTopKey() {
> >> >> >         System.out.println("MyIterator: getTopKey");
> >> >> >         return source.getTopKey();
> >> >> >     }
> >> >> >
> >> >> >     @Override
> >> >> >     public Value getTopValue() {
> >> >> >         System.out.println("MyIterator: getTopValue");
> >> >> >         return source.getTopValue();
> >> >> >     }
> >> >> >
> >> >> >     @Override
> >> >> >     public SortedKeyValueIterator<Key, Value> deepCopy(IteratorEnvironment env) {
> >> >> >         System.out.println("MyIterator: deepCopy");
> >> >> >         return source.deepCopy(env);
> >> >> >     }
> >> >> > }
> >> >> >
> >> >> > @Test
> >> >> > public void testTearDown() throws Exception {
> >> >> >     String table = "test";
> >> >> >     Connector conn = cluster.getConnector("root", "secret");
> >> >> >     conn.tableOperations().create(table);
> >> >> >     conn.tableOperations().attachIterator(table, new IteratorSetting(25, MyIterator.class));
> >> >> >     conn.tableOperations().setProperty(table, "table.scan.max.memory", "1");
> >> >> >
> >> >> >     BatchWriter writer = conn.createBatchWriter(table, new BatchWriterConfig());
> >> >> >
> >> >> >     Mutation m1 = new Mutation("row");
> >> >> >     m1.put("f1", "q1", 1, "val1");
> >> >> >     writer.addMutation(m1);
> >> >> >
> >> >> >     Mutation m2 = new Mutation("row");
> >> >> >     m2.put("f2", "q2", 1, "val2");
> >> >> >     writer.addMutation(m2);
> >> >> >
> >> >> >     Mutation m3 = new Mutation("row");
> >> >> >     m3.put("f3", "q3", 1, "val3");
> >> >> >     writer.addMutation(m3);
> >> >> >
> >> >> >     writer.flush();
> >> >> >     writer.close();
> >> >> >
> >> >> >     BatchScanner scanner = conn.createBatchScanner(table, new Authorizations(), 3);
> >> >> >     scanner.setRanges(Collections.singletonList(new Range()));
> >> >> >     for(Map.Entry<Key, Value> entry : scanner) {
> >> >> >         System.out.println(entry.getKey() + " : " + entry.getValue());
> >> >> >     }
> >> >> >     System.out.println("Results complete!");
> >> >> > }
>
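
For reference, the "distinct with a bounded set of seen IDs" idea described in the thread might look roughly like the sketch below. It is plain Java with no Accumulo dependency: BoundedSeenSet, firstSighting, distinct and maxTracked are hypothetical names made up for illustration, and the least-recently-seen eviction policy is an assumption, since the thread only says there is "some other logic" to keep the set from growing unbounded. Inside an iterator the IDs would presumably come from getTopKey().getColumnQualifier(). As the thread notes, nothing built in carries this state across a teardown, so a re-created iterator would start with an empty set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Accumulo: remembers at most maxTracked recently seen IDs. */
class BoundedSeenSet {
    private final Map<String, Boolean> seen;

    BoundedSeenSet(final int maxTracked) {
        // Access-ordered LinkedHashMap: the eldest entry is the least recently seen ID,
        // and it is evicted as soon as the map grows past maxTracked entries.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxTracked;
            }
        };
    }

    /** Returns true the first time an ID is offered, or again after it has been evicted. */
    boolean firstSighting(String id) {
        return seen.put(id, Boolean.TRUE) == null;
    }

    /** Keeps only the first sighting of each qualifier, preserving input order. */
    static List<String> distinct(List<String> qualifiers, int maxTracked) {
        BoundedSeenSet tracker = new BoundedSeenSet(maxTracked);
        List<String> out = new ArrayList<>();
        for (String q : qualifiers) {
            if (tracker.firstSighting(q)) {
                out.add(q);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [q1, q2, q3]
        System.out.println(distinct(Arrays.asList("q1", "q2", "q1", "q3", "q2"), 100));
    }
}
```

The cap trades correctness for memory: with maxTracked smaller than the number of distinct IDs in flight, an evicted ID that shows up again is reported as new, so the result is only approximately distinct.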
