accumulo-user mailing list archives

From "shweta.agrawal" <shweta.agra...@orkash.com>
Subject Re: how to maintain versioning in D4M schema?
Date Mon, 30 Nov 2015 04:09:59 GMT
The example I am working with is:

rowid        colf          colq          value
   id                        field|value1      1
   id                        field|value2      1
   id                        field|value3      1
   id                        field|value4      1
   id                        field|value5      1
   id                        field|value6      1

This is my schema in D4M style. Here one field has multiple values. I
want to keep only the latest 3 values and have the other values deleted
automatically, as the versioning iterator does.

So after versioning my table should look like this:

rowid        colf          colq          value
   id                        field|value1      1
   id                        field|value2      1
   id                        field|value3      1

Thanks
Shweta
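
A minimal sketch of the behaviour requested above, in plain Python rather
than Accumulo API code. The VersioningIterator only collapses entries whose
(row, colfam, colqual, colvis) key is identical, so with field|value in the
column qualifier every value is a distinct key. The sketch instead groups
entries by row and by the field prefix of the qualifier and keeps the newest
N per group. The Entry type, the keep_latest function, and the timestamps are
hypothetical; the timestamps are made up so that value1 to value3 are the
newest, matching the expected table above.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """One table entry in D4M layout (hypothetical helper type)."""
    row: str
    colq: str        # D4M style: "field|value"
    timestamp: int   # larger means newer (assumed)
    value: str = "1"


def keep_latest(entries, max_versions=3, sep="|"):
    """Keep the newest max_versions entries for each (row, field) pair."""
    groups = defaultdict(list)
    for e in entries:
        # The field name is everything before the first separator.
        field = e.colq.split(sep, 1)[0]
        groups[(e.row, field)].append(e)

    kept = []
    for group in groups.values():
        # Newest first, then truncate to the allowed number of versions.
        group.sort(key=lambda e: e.timestamp, reverse=True)
        kept.extend(group[:max_versions])

    # Return in sorted key order (row, then column qualifier).
    return sorted(kept, key=lambda e: (e.row, e.colq))


# Made-up data: value1 is the newest entry, value6 the oldest.
entries = [Entry("id", f"field|value{i}", timestamp=7 - i) for i in range(1, 7)]
latest = keep_latest(entries, max_versions=3)
for e in latest:
    print(e.row, e.colq, e.value)
```

In a real table the same grouping would have to run server-side in a custom
iterator or in a periodic client-side cleanup; which of those fits is not
settled in this thread.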

On Friday 27 November 2015 07:15 PM, Jeremy Kepner wrote:
> Can you provide a made up specific example?  I think that will
> make the discussion easier.
>
>
> On Fri, Nov 27, 2015 at 02:46:33PM +0530, shweta.agrawal wrote:
>> Thanks for the answer.
>> But I am asking about versioning in D4M style. How can I use the
>> versioning iterator with a D4M schema? In D4M style the id is
>> stored in the row ID and field|value is stored in the column qualifier.
>> Because the value is part of the column qualifier, I cannot maintain
>> versions through the versioning iterator. So how do I maintain
>> versioning in D4M style?
>>
>> Thanks
>> Shweta
>>
>> On Friday 27 November 2015 12:45 PM, Dylan Hutchison wrote:
>>> In order to store five versions of a key but return only one of
>>> them during a scan, set the minc and majc VersioningIterator to 5
>>> and set the scan VersioningIterator to 1.  You can set scanning
>>> iterators on a per-scan basis if this helps.
>>>
>>> It is not necessary to put the timestamp in the column family if
>>> you are going with the VersioningIterator approach.
>>>
>>> There are many ways to achieve versioning in Accumulo. As the
>>> designer/programmer, you must choose one that fits your
>>> application, of which we do not know the full details. It sounds
>>> like you have narrowed your choice to (1) putting the timestamp in
>>> the column family, or (2) not putting the timestamp anywhere else
>>> but instead changing the VersioningIterator such that Accumulo
>>> stores more versions than the latest version of a
>>> (row,colfam,colqual,colvis) key.
>>>
>>>
>>>
>>> On Thu, Nov 26, 2015 at 8:45 PM, mohit.kaushik
>>> <mohit.kaushik@orkash.com <mailto:mohit.kaushik@orkash.com>>
>>> wrote:
>>>
>>>     David,
>>>
>>>     But this is the case when we store versions based on the timestamp
>>>     field. The point is, in the D4M schema we cannot achieve it this
>>>     way. In this case we are considering using the CF to store the
>>>     timestamp in reverse order, as described by Dylan. Then how can we
>>>     configure Accumulo to return only the latest version and store
>>>     only 5 versions?
>>>
>>>     Thanks
>>>     Mohit Kaushik
>>>
>>>     On 11/27/2015 09:54 AM, David Medinets wrote:
>>>>     From the user manual:
>>>>
>>>>     user@myinstance mytable> config -t mytable -s table.iterator.scan.vers.opt.maxVersions=5
>>>>     user@myinstance mytable> config -t mytable -s table.iterator.minc.vers.opt.maxVersions=5
>>>>     user@myinstance mytable> config -t mytable -s table.iterator.majc.vers.opt.maxVersions=5
>>>>
>>>>     On Thu, Nov 26, 2015 at 11:10 PM, shweta.agrawal
>>>>     <shweta.agrawal@orkash.com <mailto:shweta.agrawal@orkash.com>> wrote:
>>>>
>>>>         I want to maintain only 5 versions. The user can enter any
>>>>         number of versions, but I want to keep only the 5 latest.
>>>>
>>>>
>>>>         On Friday 27 November 2015 09:38 AM, David Medinets wrote:
>>>>>         Do you want five versions of every entry or will the number
>>>>>         of versions vary?
>>>>>
>>>>>         On Thu, Nov 26, 2015 at 10:53 PM, shweta.agrawal
>>>>>         <shweta.agrawal@orkash.com
>>>>>         <mailto:shweta.agrawal@orkash.com>> wrote:
>>>>>
>>>>>             Thanks Dylan and David.
>>>>>             I can store version information in the column family. But
>>>>>             my problem is how to manage many versions of the same key.
>>>>>             With Accumulo versioning I can specify how many versions I
>>>>>             want to keep.
>>>>>
>>>>>             Suppose I have 10 versions and I want to store only 5 of
>>>>>             them. How do I manage this in a big table?
>>>>>
>>>>>             Thanks
>>>>>             Shweta
>>>>>
>>>>>             On Thursday 26 November 2015 10:22 PM, David Medinets wrote:
>>>>>>             What are the query patterns? If you are versioning for
>>>>>>             auditing then changing the VersioningIterator seems the
>>>>>>             easiest approach. You could also store
>>>>>>             application-specific version information in the column
>>>>>>             family. One of the reasons that D4M does not use it is
>>>>>>             to allow application-specific uses. Using the CF means
>>>>>>             that any applications that understand D4M would not
>>>>>>             need to change their queries to adjust for the version
>>>>>>             information.
>>>>>>
>>>>>>             On Thu, Nov 26, 2015 at 4:26 AM, shweta.agrawal
>>>>>>             <shweta.agrawal@orkash.com
>>>>>>             <mailto:shweta.agrawal@orkash.com>> wrote:
>>>>>>
>>>>>>                 Hi,
>>>>>>
>>>>>>                 I have my data stored in D4M style. I also want to
>>>>>>                 maintain versions of different values on the basis
>>>>>>                 of time. In D4M style, data is stored only in the
>>>>>>                 row ID and the column qualifier.
>>>>>>
>>>>>>                 Is there any way to achieve versioning in D4M schema?
>>>>>>
>>>>>>                 Thanks
>>>>>>                 Shweta
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>     --
>>>
>>>     *Mohit Kaushik*
>>>     Software Engineer
>>>     A Square,Plot No. 278, Udyog Vihar, Phase 2, Gurgaon 122016, India
>>>     *Tel:* +91 (124) 4969352 |
>>>     *Fax:* +91 (124) 4033553
>>>
>>>

