hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-559) Add a spilling message queue
Date Fri, 02 Nov 2012 20:44:12 GMT

    [ https://issues.apache.org/jira/browse/HAMA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489738#comment-13489738 ]

Thomas Jungblut commented on HAMA-559:

Hey Suraj,

I managed to improve this further: I have written my own OutputStream, which enables
asynchronous flushing to disk.
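
To illustrate what asynchronous flushing means here, below is a minimal sketch. It is not the code from the link; the class name AsyncFlushOutputStream, the two-buffer scheme and the single background thread are all assumptions made for illustration. The writer fills an in-memory buffer, and when the buffer is full it is handed to a background thread that writes it to the underlying stream while the writer continues with a fresh buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: full buffers are written to the sink on a background thread. */
class AsyncFlushOutputStream extends OutputStream {
  private final OutputStream sink;
  private final int bufferSize;
  // A single thread keeps the spilled buffers in write order.
  private final ExecutorService flusher = Executors.newSingleThreadExecutor();
  private byte[] buffer;
  private int count;
  private Future<?> pending; // the spill currently in flight, if any

  AsyncFlushOutputStream(OutputStream sink, int bufferSize) {
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive");
    }
    this.sink = sink;
    this.bufferSize = bufferSize;
    this.buffer = new byte[bufferSize];
  }

  @Override
  public void write(int b) throws IOException {
    if (count == buffer.length) {
      spill();
    }
    buffer[count++] = (byte) b;
  }

  /** Hands the current buffer to the background thread and continues with a new one. */
  private void spill() throws IOException {
    awaitPending(); // at most one spill in flight, so memory stays at two buffers
    final byte[] full = buffer;
    final int length = count;
    pending = flusher.submit(() -> {
      sink.write(full, 0, length);
      return null;
    });
    buffer = new byte[bufferSize];
    count = 0;
  }

  /** Blocks until the in-flight spill (if any) is done and surfaces its failure. */
  private void awaitPending() throws IOException {
    if (pending == null) {
      return;
    }
    try {
      pending.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("interrupted while waiting for flush", e);
    } catch (ExecutionException e) {
      throw new IOException("background flush failed", e.getCause());
    } finally {
      pending = null;
    }
  }

  @Override
  public void flush() throws IOException {
    if (count > 0) {
      spill();
    }
    awaitPending();
    sink.flush();
  }

  @Override
  public void close() throws IOException {
    try {
      flush();
    } finally {
      flusher.shutdown();
      sink.close();
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    byte[] data = "hello spilling world".getBytes(StandardCharsets.UTF_8);
    try (AsyncFlushOutputStream out = new AsyncFlushOutputStream(sink, 4)) {
      out.write(data);
    }
    System.out.println(Arrays.equals(data, sink.toByteArray()));
  }
}
```

In a real spill the sink would presumably be a file stream rather than the in-memory stream used in main. Waiting for the previous spill before submitting the next one trades some overlap for bounded memory; a deeper queue of buffers would be the obvious variation.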

Have a look here:


Looks a bit simpler than your solution and is also faster ;)

 0% Scenario{vm=java, trial=0, benchmark=Spill, size=1000000, type=SPILLING_BUFFER, memoryMax=-Xmx8g}
22447456,65 ns; σ=2294787,10 ns @ 10 trials
17% Scenario{vm=java, trial=0, benchmark=Spill, size=10000000, type=SPILLING_BUFFER, memoryMax=-Xmx8g}
208675630,29 ns; σ=129832,20 ns @ 3 trials
33% Scenario{vm=java, trial=0, benchmark=Spill, size=100000000, type=SPILLING_BUFFER, memoryMax=-Xmx8g}
2179034450,50 ns; σ=159017861,44 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=Spill, size=1000000, type=DISK_LIST, memoryMax=-Xmx8g}
17597941,21 ns; σ=156327,75 ns @ 3 trials
67% Scenario{vm=java, trial=0, benchmark=Spill, size=10000000, type=DISK_LIST, memoryMax=-Xmx8g}
174984110,83 ns; σ=5126997,66 ns @ 10 trials
83% Scenario{vm=java, trial=0, benchmark=Spill, size=100000000, type=DISK_LIST, memoryMax=-Xmx8g}
1731678008,00 ns; σ=5150036,16 ns @ 3 trials

     size            type     ms linear runtime
  1000000 SPILLING_BUFFER   22,4 =
  1000000       DISK_LIST   17,6 =
 10000000 SPILLING_BUFFER  208,7 ==
 10000000       DISK_LIST  175,0 ==
100000000 SPILLING_BUFFER 2179,0 ==============================
100000000       DISK_LIST 1731,7 =======================

vm: java
trial: 0
benchmark: Spill
memoryMax: -Xmx8g

Note: benchmarks printed 11516 characters to System.out and 0 characters to System.err. Use
--debug to see this output.


So the disklist can be updated by using the stream like this:


> Add a spilling message queue
> ----------------------------
>                 Key: HAMA-559
>                 URL: https://issues.apache.org/jira/browse/HAMA-559
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp core
>    Affects Versions: 0.5.0
>            Reporter: Thomas Jungblut
>            Assignee: Suraj Menon
>            Priority: Minor
>             Fix For: 0.7.0
>         Attachments: HAMA-559.patch-v1, spillbench_code.tar.gz, spilling_buffer_cpu_usage_text_write.png,
SpillingBufferProfile-2012-10-27.snapshot, spilling_buffer_profile_cpu_graph_test_write.png,
spilling_buffer_profile_cpugraph_writeUTF.png, spillingbuffer_profile_cpu_writeUTF.png, spilling_buffer_profile_LOCK.JPG,
spilling_buffer_profile_timesplit_text_write.png, spilling_buffer_profile_writeUTF.png
> After HAMA-521 is done, we can add a spilling queue which just holds the messages in
RAM that fit into the heap space. The rest can be flushed to disk.
> We may call this a HybridQueue or something like that.
> The benefits should be that we don't have to flush to disk as often, which should make it
faster. However, we may incur more GC, so it may not always be faster overall.
> The requirements for this queue also include:
> - The message object once written to the queue (after returning from the write call)
could be modified, but the changes should not be reflected in the messages stored in the queue.
> - For now let's implement a queue that does not support concurrent reading and writing.
This feature is needed when we implement asynchronous communication.
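
The first requirement in the quoted description (later changes to a message must not show up in the queue) can be met by serializing each message at write time. A minimal sketch of that idea follows, under stated assumptions: the class name SnapshotQueue is hypothetical, and plain Java serialization stands in for whatever serialization the real queue would use. It keeps everything in memory and does no spilling.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;

/** Hypothetical sketch: stores a serialized snapshot of each message, not the object itself. */
class SnapshotQueue<M extends Serializable> {
  private final Deque<byte[]> entries = new ArrayDeque<>();

  /** Serializes the message now, so later changes to the caller's object are not seen. */
  void add(M message) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(message);
    }
    entries.addLast(bytes.toByteArray());
  }

  /** Returns the oldest message as a fresh copy, or null if the queue is empty. */
  @SuppressWarnings("unchecked")
  M poll() throws IOException, ClassNotFoundException {
    byte[] next = entries.pollFirst();
    if (next == null) {
      return null;
    }
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(next))) {
      return (M) in.readObject();
    }
  }

  int size() {
    return entries.size();
  }

  public static void main(String[] args) throws Exception {
    ArrayList<String> message = new ArrayList<>();
    message.add("a");
    SnapshotQueue<ArrayList<String>> queue = new SnapshotQueue<>();
    queue.add(message);
    message.add("b"); // modified after the write call returned
    System.out.println(queue.poll()); // the stored snapshot, taken before the modification
  }
}
```

Not supporting concurrent reads and writes, as in the second requirement, is what lets this sketch use an unsynchronized ArrayDeque.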

