kudu-user mailing list archives

From Juan Pablo Briganti <juan.briga...@globant.com>
Subject Re: Backup and restore of Kudu Metadata/Data
Date Mon, 14 Nov 2016 13:46:05 GMT
Hello everyone!

Following up on Amit's questions about backup and restore: is there any update
on this in the latest Kudu version? I'm also interested in this and would
like to avoid HDFS if possible.

Thanks for the hard work!

2016-10-28 9:32 GMT-03:00 Amit Adhau <amit.adhau@globant.com>:

> Hi Kudu Team/Mike,
>
> Now that Kudu 1.0.1 has been released, is there any plan to clarify the
> backup and restore procedures for Kudu? This is important for production
> deployments.
>
> Thanks,
> Amit
>
> On Thu, Aug 25, 2016 at 2:45 PM, Amit Adhau <amit.adhau@globant.com>
> wrote:
>
>> Thanks Mike, but this raises a question. I'm not sure, but I guess it
>> would be tricky to back up and restore if we altered the table by adding a
>> new partition (the range partition feature introduced in 0.10) and then
>> tried to restore the old backup.
>>
>> Thanks,
>> Amit
>>
>> On Wed, Aug 24, 2016 at 3:00 AM, Mike Percy <mpercy@apache.org> wrote:
>>
>>> Correction to my previous mail: it was pointed out to me that the create
>>> table statement in the web UI does not include the partitioning
>>> information. Actually, I filed that bug myself a while back and thought it
>>> had been fixed but that's not the case:
>>> https://issues.apache.org/jira/browse/KUDU-1253
>>>
>>> For now, it seems you would need to manually keep track of your
>>> partitioning scheme so that you can use the same one when recreating the
>>> table. Actually, at recreation time, you could choose whatever partitioning
>>> scheme you want before upserting the snapshot data.
>>>
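
One way to keep track of the partitioning scheme by hand, as suggested above, is to write it to a small file stored next to the exported data. The following is only a sketch: the function names, the file-name suffix, and the layout of the scheme dictionary are made up for illustration and are not part of any Kudu API.

```python
import json
from pathlib import Path

# Hypothetical file-name suffix; not a Kudu convention.
SUFFIX = ".partitioning.json"


def save_partition_scheme(backup_dir, table_name, scheme):
    """Write the table's partitioning scheme next to its exported data."""
    path = Path(backup_dir) / (table_name + SUFFIX)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(scheme, indent=2, sort_keys=True))
    return path


def load_partition_scheme(backup_dir, table_name):
    """Read the scheme back so the table can be recreated before the reload."""
    path = Path(backup_dir) / (table_name + SUFFIX)
    return json.loads(path.read_text())


# Illustrative scheme only: column names, bucket count and bounds are assumptions.
EXAMPLE_SCHEME = {
    "hash": [{"columns": ["id"], "buckets": 4}],
    "range": {"columns": ["ts"], "bounds": [[0, 1000], [1000, 2000]]},
}
```

Saving the scheme at backup time and loading it at restore time means the same partitioning can be reused, or deliberately changed, when the table is recreated.
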
>>> Mike
>>>
>>> On Tue, Aug 23, 2016 at 1:15 PM, Mike Percy <mpercy@apache.org> wrote:
>>>
>>>> Hi Amit,
>>>> If you only want to restore a single table then the data part should be
>>>> easy, since you can only snapshot scan the data of a single table at a
>>>> given time.
>>>>
>>>> An alternative way to restore a table is to look at the web UI and
>>>> check out the create table statement shown there. It should list
>>>> partitions, etc. If you copy that DDL then you can use that to recreate the
>>>> table and then reload the data from the snapshot scan. I am assuming you
>>>> would use something like HDFS to store the results of the snapshot scan.
>>>> However, that would be a somewhat manual process; we need to improve this.
>>>>
>>>> A couple other things to note:
>>>> 1. Kudu doesn't currently provide table-wide snapshot scan consistency.
>>>> The snapshot scan will be consistent on a per-tablet basis, however.
>>>> 2. Using a snapshot scan will not restore the historical MVCC data
>>>> after you load the snapshot.
>>>>
>>>> Best,
>>>> Mike
>>>>
>>>>
>>>> On Tue, Aug 23, 2016 at 3:17 AM, Amit Adhau <amit.adhau@globant.com>
>>>> wrote:
>>>>
>>>>> Thanks a lot, Mike. Yes, a proper backup and restore mechanism will
>>>>> certainly help.
>>>>>
>>>>> I have one more question: if a need arises to restore a specific table
>>>>> (a partial restore), how can I identify which metadata and data files
>>>>> belong to that table and should be restored? Is that even possible?
>>>>>
>>>>> Thanks,
>>>>> Amit
>>>>>
>>>>> On Tue, Aug 23, 2016 at 4:44 AM, Mike Percy <mpercy@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I would recommend a snapshot scan for data backup. You can easily do
>>>>>> that with MapReduce.
>>>>>>
>>>>>> Metadata backup is tough. One thing you could do is back up the master
>>>>>> data and WAL directories. If your filesystem supports snapshots, then
>>>>>> taking a snapshot of those directories should give you a consistent
>>>>>> backup. Otherwise you should shut down the master, copy the master data
>>>>>> and WAL dirs, then bring the master back up.
>>>>>>
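
The copy step of the shut-down-and-copy procedure above could be sketched roughly as follows. This is an illustration under assumptions, not a supported tool: the function name, the `kudu-master-<timestamp>` folder layout, and the directory arguments are hypothetical, and stopping and restarting the master is left to whatever service manager is in use.

```python
import shutil
import time
from pathlib import Path


def backup_master_dirs(data_dir, wal_dir, backup_root):
    """Copy the master's data and WAL directories into a timestamped folder.

    The caller is responsible for stopping the master first and starting it
    again afterwards; this function only copies files.
    """
    stamp = time.strftime("%Y%m%dT%H%M%S")
    # Hypothetical naming scheme for the backup folder.
    target = Path(backup_root) / ("kudu-master-" + stamp)
    # copytree creates the destination (and missing parents) and fails if it
    # already exists, so an earlier backup is never overwritten.
    shutil.copytree(data_dir, target / "data")
    shutil.copytree(wal_dir, target / "wal")
    return target
```

Keeping the data and WAL copies under one timestamped folder makes it harder to restore a mismatched pair later.
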
>>>>>> For restoring a metadata backup, it's as simple as restoring the file
>>>>>> system data for the master. For restoring a data backup, you could
>>>>>> first drop the tables, recreate them, then run a MapReduce job that
>>>>>> upserts all the data from the snapshot scan.
>>>>>>
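
A property of the upsert-based reload worth noting: an upsert inserts a row or overwrites the existing row with the same primary key, so replaying the same snapshot twice ends in the same state. The sketch below shows this with a plain dictionary standing in for the table. It does not use the Kudu client, and the function name, rows, and key column are invented for the example.

```python
def upsert_rows(table, rows, key_columns):
    """Insert each row, or overwrite the existing row with the same key."""
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        table[key] = dict(row)
    return table


# Made-up snapshot rows; "id" plays the role of the primary key.
snapshot = [
    {"id": 1, "value": "a"},
    {"id": 2, "value": "b"},
]

table = {}
upsert_rows(table, snapshot, ["id"])
first_load = dict(table)

# Re-running the load (for example after a failed job) changes nothing.
upsert_rows(table, snapshot, ["id"])
```

With plain inserts, the second run would instead hit duplicate-key errors for rows that were already loaded.
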
>>>>>> All in all, backup and restore is something that is probably going to
>>>>>> get worked on very soon, so thanks for reminding us. We know we need to
>>>>>> document these procedures and make them easier and less rough around
>>>>>> the edges.
>>>>>>
>>>>>> Although I know this has been discussed in the past, I couldn't find
>>>>>> a JIRA so I filed https://issues.apache.org/jira/browse/KUDU-1575 to
>>>>>> track this work.
>>>>>>
>>>>>> Best,
>>>>>> Mike
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 17, 2016 at 7:05 PM, Mac Noland
>>>>>> <mcdonaldnoland@gmail.com> wrote:
>>>>>>
>>>>>>> From an Impala perspective, is making a scheduled copy of the table
>>>>>>> into HDFS an option for you?
>>>>>>>
>>>>>>> http://kudu.apache.org/faq.html
>>>>>>>
>>>>>>> How can I back up my Kudu data?
>>>>>>> <http://kudu.apache.org/faq.html#how-can-i-back-up-my-kudu-data>
>>>>>>>
>>>>>>> Kudu doesn’t yet have a built-in backup mechanism. Similar to bulk
>>>>>>> loading data, Impala can help if you have it available. You can use it
>>>>>>> to copy your data into Parquet format using a statement like:
>>>>>>>
>>>>>>> INSERT INTO TABLE some_parquet_table SELECT * FROM kudu_table
>>>>>>>
>>>>>>> then use distcp <http://hadoop.apache.org/docs/r1.2.1/distcp2.html> to
>>>>>>> copy the Parquet data to another cluster. While Kudu is in beta, we’re
>>>>>>> not expecting people to deploy mission-critical workloads on it yet.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Aug 17, 2016 at 7:07 AM, Amit Adhau <amit.adhau@globant.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Kudu team,
>>>>>>>>
>>>>>>>> Can you please suggest the best way/policy to back up and restore
>>>>>>>> Kudu metadata and data, on the Kudu side as well as on the Impala
>>>>>>>> side, and whether that can be automated?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks & Regards,
>>>>>>>>
>>>>>>>> *Amit Adhau* | Data Architect
>>>>>>>>
>>>>>>>> *GLOBANT* | IND:+91 9821518132
>>>>>>>>
>>>>>>>> [image: Facebook] <https://www.facebook.com/Globant>
>>>>>>>>
>>>>>>>> [image: Twitter] <http://www.twitter.com/globant>
>>>>>>>>
>>>>>>>> [image: Youtube] <http://www.youtube.com/Globant>
>>>>>>>>
>>>>>>>> [image: Linkedin] <http://www.linkedin.com/company/globant>
>>>>>>>>
>>>>>>>> [image: Pinterest] <http://pinterest.com/globant/>
>>>>>>>>
>>>>>>>> [image: Globant] <http://www.globant.com/>
>>>>>>>>
>>>>>>>> The information contained in this e-mail may be confidential. It
>>>>>>>> has been sent for the sole use of the intended recipient(s). If the
>>>>>>>> reader of this message is not an intended recipient, you are hereby
>>>>>>>> notified that any unauthorized review, use, disclosure,
>>>>>>>> dissemination, distribution or copying of this communication, or any
>>>>>>>> of its contents, is strictly prohibited. If you have received it by
>>>>>>>> mistake please let us know by e-mail immediately and delete it from
>>>>>>>> your system. Many thanks.
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>
>
>


-- 
*Juan Pablo Briganti* | Data Architect
*GLOBANT* | AR: +54 11 4109 1700 ext. 19508 | US: +1 877 215 5230 ext. 19508
[image: Facebook] <https://www.facebook.com/Globant>
[image: Twitter] <http://www.twitter.com/globant>
[image: Youtube] <http://www.youtube.com/Globant>
[image: Linkedin] <http://www.linkedin.com/company/globant>
[image: Pinterest] <http://pinterest.com/globant/>
[image: Globant] <http://www.globant.com>
