phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-1711) Improve performance of CSV loader
Date Mon, 09 Mar 2015 16:00:42 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353145#comment-14353145 ]

James Taylor commented on PHOENIX-1711:
---------------------------------------

Thanks for the review, [~gabriel.reid]. The patch is a bit on the raw side - just wanted to
see if it makes a significant difference before cleaning it up. I've fixed the swallowing
of that exception and gotten rid of the PArrayDataType change. I'll separate out the ConstraintViolationException
change into a different change list. I'm thinking along the same lines as you - if it improves
perf we can use this to speed up the general case of UPSERT VALUES by caching the MutationPlan
and continually re-executing it. It's possible that the CsvUpsertExecutor wouldn't need to
change at all. I'm curious about one thing, though - do you think there's any overhead in
calling CallRunner.run() per row versus once per batch?
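
To make the caching idea above concrete, here is a minimal sketch in plain Java, not Phoenix code. It assumes a hypothetical Plan interface standing in for MutationPlan and a hypothetical compile() step standing in for statement compilation; the only point is that the expensive step runs once per statement text while the cheap execute step runs once per row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PlanCacheSketch {
    /** Hypothetical stand-in for a compiled plan; NOT Phoenix's MutationPlan. */
    interface Plan {
        String execute(String[] values);
    }

    /** Counts how often the (pretend) expensive compile step runs. */
    static int compileCount = 0;

    /** Cache keyed by statement text. Not thread-safe; fine for a single mapper thread. */
    static final Map<String, Plan> cache = new HashMap<>();

    /** Hypothetical compile step: the work we want to pay for only once. */
    static Plan compile(String sql) {
        compileCount++;
        return values -> sql + " <- " + String.join(",", values);
    }

    /** Returns the cached plan for this statement, compiling on first use only. */
    static Plan cachedPlan(String sql) {
        return cache.computeIfAbsent(sql, PlanCacheSketch::compile);
    }

    /** Re-executes the same cached plan for every row. */
    static List<String> upsertAll(String sql, List<String[]> rows) {
        List<String> results = new ArrayList<>();
        for (String[] row : rows) {
            results.add(cachedPlan(sql).execute(row));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "a"});
        rows.add(new String[] {"2", "b"});
        rows.add(new String[] {"3", "c"});
        List<String> results = upsertAll("UPSERT INTO T VALUES (?, ?)", rows);
        System.out.println("rows executed: " + results.size());
        System.out.println("compile calls: " + compileCount);
    }
}
```

Whether the real compile step is expensive enough for this to matter is exactly what the patch is meant to measure; the sketch only shows the shape of the change.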

> Improve performance of CSV loader
> ---------------------------------
>
>                 Key: PHOENIX-1711
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1711
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>         Attachments: PHOENIX-1711.patch
>
>
> Here is a breakdown of percentage execution time for some of the steps in the mapper:
> csvParser: 18%
> csvUpsertExecutor.execute(ImmutableList.of(csvRecord)): 39%
> PhoenixRuntime.getUncommittedDataIterator(conn, true): 9%
> while (uncommittedDataIterator.hasNext()): 15%
> Read IO & custom processing: 19%
> See details here: http://s.apache.org/6rl



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
