accumulo-notifications mailing list archives

From "Dave Marion (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4062) Change MutationSet.mutations to use HashSet
Date Fri, 20 Nov 2015 15:29:11 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018153#comment-15018153 ]

Dave Marion commented on ACCUMULO-4062:
---------------------------------------

Looking into this further, I traced the server-side code path:
{noformat}
TabletServer.flush() -> CommitSession.commit() -> Tablet.commit() -> TabletMemory.mutate()
-> CommitSession.mutate() -> InMemoryMap.mutate()
{noformat}

At this point it calls one of the SimpleMap.mutate() implementations, passing a list of mutations
and a counter that is incremented each time the SimpleMap.mutate() method is called. Looking
at DefaultMap.mutate(), it creates a MemKey and adds it to a map that uses the MemKeyComparator.
The MemKeyComparator falls back to the counter when two keys are otherwise identical.
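As a rough sketch of the tie-breaking described above (this is not the Accumulo code; `CountedKeySketch`, `CountedKey`, and `apply` are hypothetical names, and sorting the later call first for identical keys is an assumption of the sketch), a sorted map whose comparator consults the counter only for identical keys might look like this:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class CountedKeySketch {
    /** Hypothetical stand-in for a key tagged with the counter of the mutate() call that wrote it. */
    public static final class CountedKey {
        final String key;
        final int count;

        CountedKey(String key, int count) {
            this.key = key;
            this.count = count;
        }
    }

    /** Compare by key; only for identical keys fall back to the counter (later call first: an assumption). */
    static final Comparator<CountedKey> COMPARATOR = (a, b) -> {
        int cmp = a.key.compareTo(b.key);
        return cmp != 0 ? cmp : Integer.compare(b.count, a.count);
    };

    /** Applies each batch under an incrementing counter; returns sorted entries as "key#count=value". */
    public static List<String> apply(List<List<Map.Entry<String, String>>> batches) {
        ConcurrentSkipListMap<CountedKey, String> map = new ConcurrentSkipListMap<>(COMPARATOR);
        int counter = 0;
        for (List<Map.Entry<String, String>> batch : batches) {
            counter++; // one increment per simulated mutate() call
            for (Map.Entry<String, String> kv : batch) {
                map.put(new CountedKey(kv.getKey(), counter), kv.getValue());
            }
        }
        List<String> out = new ArrayList<>();
        map.forEach((k, v) -> out.add(k.key + "#" + k.count + "=" + v));
        return out;
    }

    public static void main(String[] args) {
        // Two simulated mutate() calls; both write "row1".
        System.out.println(apply(List.of(
            List.of(Map.entry("row1", "old")),
            List.of(Map.entry("row1", "new"), Map.entry("row0", "x")))));
    }
}
```

Because the counter participates in the comparison, two writes to the same key from different calls are kept as distinct entries rather than one overwriting the other, and their relative order reflects the order of the calls.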

Having said all of that, the order of the mutations does appear to be preserved as you indicate.
However, this would only hold true if a single client is writing to that key space. If more
than one client were writing to that key space, then I think the tablet server would apply
the mutations in the order it received them.

Maybe some clients are counting on this behavior, but I don't think it has been
explicitly stated as guaranteed. I don't want to break any clients that are counting
on this working, but I would like to see if there is a way to dedupe on the client side.
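One possible shape for client-side dedupe that still keeps the order in which mutations were first added is an insertion-ordered set per table. The sketch below is illustrative only: `DedupingMutationSetSketch` and `SimpleMutation` are hypothetical stand-ins, not the real `TabletServerBatchWriter.MutationSet` or `Mutation` classes, and it assumes mutations have value-based `equals()` and `hashCode()`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class DedupingMutationSetSketch {
    /** Hypothetical simplified mutation with value-based equality so duplicates can be detected. */
    public static final class SimpleMutation {
        final String row;
        final String column;
        final String value;

        public SimpleMutation(String row, String column, String value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SimpleMutation)) return false;
            SimpleMutation m = (SimpleMutation) o;
            return row.equals(m.row) && column.equals(m.column) && value.equals(m.value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(row, column, value);
        }
    }

    // Table name -> queued mutations; LinkedHashSet drops duplicates but keeps first-insertion order.
    private final Map<String, Set<SimpleMutation>> mutations = new HashMap<>();

    /** Queues a mutation for a table; returns false if an equal mutation was already queued. */
    public boolean addMutation(String table, SimpleMutation m) {
        return mutations.computeIfAbsent(table, t -> new LinkedHashSet<>()).add(m);
    }

    /** Returns the queued mutations for a table in the order they were first added. */
    public List<SimpleMutation> getMutations(String table) {
        return new ArrayList<>(mutations.getOrDefault(table, Collections.emptySet()));
    }
}
```

Note that a sequence A, B, A collapses to A, B here. If A and B touch the same key, dropping the second A could change which write ends up being applied last, so this does not by itself answer the ordering concern.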


> Change MutationSet.mutations to use HashSet
> -------------------------------------------
>
>                 Key: ACCUMULO-4062
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4062
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client
>            Reporter: Dave Marion
>
> Change TabletServerBatchWriter.MutationSet.mutations from a
> {code}
>   HashMap<String,List<Mutation>>
> {code}
> to
> {code}
>   HashMap<String,HashSet<Mutation>>
> {code}
> so that duplicate mutations added by a client are not sent to the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
