kudu-user mailing list archives

From Amit Adhau <amit.ad...@globant.com>
Subject Re: Kudu Storage size mismatch
Date Mon, 25 Apr 2016 20:45:08 GMT
Thanks Todd, I will check the suggested Python script. Also, I missed
mentioning one point: the data folder size before taking the backup and
after the restore is almost the same, i.e. close to 185GB, which means a
data size increase after restore is not the issue in this case. At this
moment, the gap is between the data folder and the total 'on disk size'.

Thank you,
On Apr 25, 2016 11:54 PM, "Todd Lipcon" <todd@cloudera.com> wrote:

On Mon, Apr 25, 2016 at 11:07 AM, Amit Adhau <amit.adhau@globant.com> wrote:

> Thanks a lot, Todd, for the quick response.
> Answers to your queries are inline in green, along with a few more queries:
> On Mon, Apr 25, 2016 at 10:57 PM, Todd Lipcon <todd@cloudera.com> wrote:
>> Hi Amit,
>> Answers inline below:
>> On Mon, Apr 25, 2016 at 10:12 AM, Amit Adhau <amit.adhau@globant.com>
>> wrote:
>>> Hi Kudu Team,
>>> I have queries related to the kudu storage structure.
>>> Few days back, we were able to restore the backup of kudu metadata and
>>> data [almost 200GB] with loss of few data.
>> Do you mean that you took a backup of the Kudu data folders using normal
>> Linux backup tools like rsync/tar/etc? Was this just a test of a backup and
>> restore scenario, or did you experience some problem with Kudu and
>> therefore have to restore from some backup?
>     Yes, we took a backup of the data folders, and for metadata we backed
> up the master's directory using a simple Linux copy command. No, it was
> neither a backup-and-restore test nor a Kudu issue; we were having serious
> issues with CDH and the server partition on the Kudu master and slave
> servers and were forced to re-install everything, including Kudu [we
> didn't get a chance to take a backup using Impala either, as it was also
> not working], while at the same time facing the challenge of preserving
> the Kudu data. Hence we took a backup of the folders as mentioned.

Got it, thanks.

>>> At present, if we look at the Kudu tablet server dashboard, we observe
>>> that none of the parameters, like overall mem-trackers (Memory
>>> (detail)), overall memz (Memory (total)), or the overall tablets'
>>> on-disk size, exceeds 4-5 GB; however, the tablet server folder details
>>> are as below:
>> The 'on disk size' is listed per tablet, so if you sum up all of those,
>> you should have a total which is similar to the amount of data in the data/
>> directories. Is that not the case?
>     No, as mentioned, the data folder size is 185 GB; however, the total
> 'on disk size' does not exceed even 4-5 GB, and that is why we are
> wondering about the huge ~180 GB gap between the data folder and the
> on-disk size. Can you please suggest anything? This would help us with
> proper Kudu storage management in production.

OK. If you used the standard Linux 'cp' command, it has some heuristics to
try to detect sparse files, but maybe it got them wrong.
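Whether a file is sparse can be checked programmatically by comparing its
apparent size (`st_size`) with the blocks actually allocated (`st_blocks`),
which are the same two numbers `du --apparent-size` and plain `du` report. A
minimal sketch (the 10 MiB hole is just an illustrative choice, and whether
the hole stays sparse depends on the filesystem):

```python
import os
import tempfile

CHUNK = 4096

# Create an illustrative sparse file: seek past a 10 MiB hole, then
# write one real 4 KiB block at the end.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.seek(10 * 1024 * 1024)
        f.write(b"x" * CHUNK)
    st = os.stat(path)
    apparent = st.st_size          # what `du --apparent-size` reports
    allocated = st.st_blocks * 512 # what plain `du` reports
    print("apparent=%d bytes, allocated=%d bytes" % (apparent, allocated))
finally:
    os.remove(path)
```

On a filesystem with sparse-file support, `allocated` stays far below
`apparent`; a non-sparse copy of the same file would show the two nearly
equal.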

>> You can also check on a per-tablet-server basis how much space is used on
>> disk by looking at the 'log_block_manager_bytes_under_management' metric.
>> This is exposed in Cloudera Manager, or you can visit a URL like:
>> http://my-tablet-server:8051/metrics?metrics=bytes_under
>> which will dump the metric in JSON.
> We get the number below for the metric:
> "name": "log_block_manager_bytes_under_management",
>          "value": 2421794
Sorry, I accidentally gave you the port number for the Kudu _master_
above. Try with port 8050 instead of 8051 on one of the tablet servers.
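For reference, the JSON dumped by the metrics endpoint can be filtered with
a few lines of Python. The sample document below is a hypothetical sketch of
the layout (a list of entities, each carrying a list of name/value metrics);
the real output may differ between Kudu versions:

```python
import json

# Hypothetical sample of the /metrics JSON layout; only the metric name
# and value are taken from the thread above, the rest is illustrative.
sample = """
[
  {
    "type": "server",
    "id": "kudu.tabletserver",
    "metrics": [
      {"name": "log_block_manager_bytes_under_management", "value": 2421794}
    ]
  }
]
"""

def bytes_under_management(doc):
    # Walk every entity's metric list and pick out the one we care about.
    for entity in json.loads(doc):
        for metric in entity.get("metrics", []):
            if metric["name"] == "log_block_manager_bytes_under_management":
                return metric["value"]
    return None

print(bytes_under_management(sample))
```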

>> One thing to note here is that the design of our on-disk storage uses
>> sparse files. In other words, the total logical size of the data files
>> can be much larger than the actual size. Depending on which backup and
>> restore process you've used, it's possible that you ended up restoring a
>> non-sparse file in place of the original sparse file, which would make
>> it increase in size substantially.
>    Backup and restore were done using the Linux copy command, but would
> the increase be so substantial, an almost 180 GB gap? Is there a way to
> understand the .metadata and .data files correctly, so that we can either
> remove any unwanted data or shrink them somehow?

You can try using this little Python script to see if this is the issue:

#!/usr/bin/env python
import sys

CHUNK_SIZE = 4096
EMPTY_CHUNK = b"\0" * CHUNK_SIZE

empty = 0
total = 0

# Count how many 4 KiB chunks of the file are entirely zero; in a sparse
# file these are the chunks that can be stored as holes.
with open(sys.argv[1], "rb") as f:
    for c in iter(lambda: f.read(CHUNK_SIZE), b""):
        total += 1
        if c == EMPTY_CHUNK:
            empty += 1

print("%d/%d chunks are empty (%f%%)" % (
    empty, total, (empty / float(total) * 100)))

For example, when I run it on my system on a data file I see:

[todd@vd0340 data]$ /tmp/count.py ffe8c52849e3459fb28b3d8e9aa46ee0.data
2304099/2629085 chunks are empty (87.638817%)
[todd@vd0340 data]$ du -h --apparent-size
11G ffe8c52849e3459fb28b3d8e9aa46ee0.data
[todd@vd0340 data]$ du -h ffe8c52849e3459fb28b3d8e9aa46ee0.data
1.3G ffe8c52849e3459fb28b3d8e9aa46ee0.data

(the 87% empty seems to line up correctly with the actual space used: 1.3G
out of an 11G apparent size, i.e. roughly 88% of the file is holes)
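If the restore did inflate formerly sparse files, the zero runs can in
principle be punched back out by copying each file while seeking over
all-zero chunks instead of writing them (GNU `cp --sparse=always` does
essentially the same thing). A hedged sketch, assuming an extra copy of each
file is acceptable; `sparse_copy` is an illustrative helper, not part of
Kudu:

```python
CHUNK = 4096

def sparse_copy(src, dst):
    """Copy src to dst, turning all-zero 4 KiB chunks into holes."""
    zeros = b"\0" * CHUNK
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            c = fin.read(CHUNK)
            if not c:
                break
            if c == zeros:
                fout.seek(len(c), 1)  # extend with a hole, don't write zeros
            else:
                fout.write(c)
        fout.truncate()  # fix the final size if the file ends in a hole
```

Whether the holes actually save space again depends on the destination
filesystem supporting sparse files.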

Todd Lipcon
Software Engineer, Cloudera


The information contained in this e-mail may be confidential. It has been 
sent for the sole use of the intended recipient(s). If the reader of this 
message is not an intended recipient, you are hereby notified that any 
unauthorized review, use, disclosure, dissemination, distribution or 
copying of this communication, or any of its contents, 
is strictly prohibited. If you have received it by mistake please let us 
know by e-mail immediately and delete it from your system. Many thanks.


