hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-15506) FSDataOutputStream.write() allocates new byte buffer on each operation
Date Wed, 30 Mar 2016 04:52:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217374#comment-15217374
] 

Lars Hofhansl edited comment on HBASE-15506 at 3/30/16 4:51 AM:
----------------------------------------------------------------

My point is that this is not a problem as such; it is how Java is designed to work.
It's only a problem when it actually is a problem :)  As in, when it measurably slows
things down, causes long GC pauses, etc.

I don't have to tell you, but for completeness of the discussion here:
The principal costs to the garbage collector are (1) tracing all live objects from the "root"
objects and (2) collecting all unreachable objects.
Obviously #1 is expensive when many objects need to be traced, and #2 is expensive when objects
have to be moved (for example, to reduce heap fragmentation).

64KB objects do not worry me; even if we have many GBs of them, that is still not many
references to track. Further, since they are all the same size, they won't fragment the heap
in bad ways.

Reusing objects is (IMHO) simply a questionable technique, especially when you have to
reset the objects, which is more expensive than fitting a new object into a free slot of the
same size.
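To make the reset cost concrete, here is a minimal sketch (hypothetical names, not actual
HBase or HDFS code) contrasting fresh allocation with a pool that must zero buffers on reuse:

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Hypothetical sketch: per-write allocation vs. a reuse pool that has to
// reset each buffer before handing it out again.
public class BufferReuse {
    static final int BUF_SIZE = 64 * 1024;
    static final ArrayDeque<byte[]> pool = new ArrayDeque<>();

    // Allocation path: the JVM hands out a pre-zeroed 64KB chunk; for
    // same-sized short-lived objects this is close to a pointer bump.
    static byte[] allocate() {
        return new byte[BUF_SIZE];
    }

    // Reuse path: the caller must explicitly reset the buffer, an
    // O(size) write that the allocator would otherwise do for us.
    static byte[] borrow() {
        byte[] buf = pool.poll();
        if (buf == null) {
            return new byte[BUF_SIZE];
        }
        Arrays.fill(buf, (byte) 0); // explicit reset cost
        return buf;
    }

    static void release(byte[] buf) {
        pool.push(buf);
    }
}
```

The point of the sketch is that the "cheap" reuse path still pays a full pass over the
buffer, while the allocation path amortizes that work into the collector.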

For what it's worth, I have seen bad GC behaviour during heavy loading phases. I was always
able to configure the GC accordingly, though.
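For illustration only, tuning of the kind I mean might look like this in hbase-env.sh
(hypothetical G1 values, not the settings from this issue or any specific deployment):

```shell
# Illustrative G1 settings: cap pause times and keep region size large
# enough that 64KB buffers stay ordinary (non-humongous) allocations,
# so short-lived buffers die cheaply in the young generation.
export HBASE_OPTS="$HBASE_OPTS \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=100 \
  -XX:G1HeapRegionSize=8m"
```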

In any case, we should be able to create a single-server test workload that exhibits these
problems, if there are any. Those are good tests to have anyway, not just a way to appease me.
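A minimal sketch of such a workload (hypothetical class; a discarding sink stands in for
FSDataOutputStream) would hammer the write path with fresh 64KB buffers and be run under
-verbose:gc or -Xlog:gc to see whether the allocation pattern actually causes long pauses:

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical single-server workload: repeated 64KB writes, each with a
// freshly allocated buffer, mirroring the allocation pattern reported in
// the DFSOutputStream.createPacket stack trace.
public class WriteWorkload {
    public static long run(OutputStream out, int writes) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < writes; i++) {
            byte[] buf = new byte[64 * 1024]; // fresh buffer per write
            out.write(buf);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        // Discarding sink; swap in a real FSDataOutputStream to test HDFS.
        OutputStream sink = new OutputStream() {
            @Override public void write(int b) {}
            @Override public void write(byte[] b, int off, int len) {}
        };
        long nanos = run(sink, 100_000);
        System.out.println("100k x 64KB writes took " + (nanos / 1_000_000) + " ms");
    }
}
```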



was (Author: lhofhansl):
My point is that this is a problem as such; it is how Java is designed to work.
It's only a problem when it actually is a problem :)  As in, when it measurably slows
things down, causes long GC pauses, etc.

I don't have to tell you, but for completeness of the discussion here:
The principal costs to the garbage collector are (1) tracing all live objects from the "root"
objects and (2) collecting all unreachable objects.
Obviously #1 is expensive when many objects need to be traced, and #2 is expensive when objects
have to be moved (for example, to reduce heap fragmentation).

64KB objects do not worry me; even if we have many GBs of them, that is still not many
references to track. Further, since they are all the same size, they won't fragment the heap
in bad ways.

Reusing objects is (IMHO) simply a questionable technique, especially when you have to
reset the objects, which is more expensive than fitting a new object into a free slot of the
same size.

For what it's worth, I have seen bad GC behaviour during heavy loading phases. I was always
able to configure the GC accordingly, though.

In any case, we should be able to create a single-server test workload that exhibits these
problems, if there are any. Those are good tests to have anyway, not just a way to appease me.


> FSDataOutputStream.write() allocates new byte buffer on each operation
> ----------------------------------------------------------------------
>
>                 Key: HBASE-15506
>                 URL: https://issues.apache.org/jira/browse/HBASE-15506
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>
> Deep inside stack trace in DFSOutputStream.createPacket.
> This should be opened in HDFS. This JIRA is to track HDFS work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
