phoenix-dev mailing list archives

From "James Taylor (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-1711) Improve performance of CSV loader
Date Mon, 09 Mar 2015 16:00:42 GMT


James Taylor commented on PHOENIX-1711:

Thanks for the review, [~gabriel.reid]. The patch is a bit on the raw side - I just wanted to
see if it makes a significant difference before cleaning it up. I've fixed the swallowing
of that exception and gotten rid of the PArrayDataType change. I'll separate out the ConstraintViolationException
change into a different change list. I'm thinking along the same lines as you - if it improves
perf, we can use this to speed up the general case of UPSERT VALUES by caching the MutationPlan
and continually re-executing it. It's possible that the CsvUpsertExecutor wouldn't need to
change at all. I'm curious about one thing, though - do you think there's any overhead in
calling it per row versus once per batch?
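To make the caching idea concrete, here is a minimal, Phoenix-free sketch. The class names (CachedUpsertPlan, Demo) and the string-map "execution" are hypothetical stand-ins, not Phoenix's actual MutationPlan or CsvUpsertExecutor APIs: the point is only that the plan is compiled once and then re-executed per CSV record, rather than being rebuilt for every row.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the "plan" is compiled once (here, just a fixed list of
// target column names), then re-executed for each record. In Phoenix the cached
// object would be the MutationPlan produced for the UPSERT statement.
class CachedUpsertPlan {
    private final List<String> columns; // fixed at "compile" time

    CachedUpsertPlan(List<String> columns) {
        this.columns = columns;
    }

    // "Executing" the plan binds one CSV record's values to the cached columns.
    Map<String, String> execute(List<String> csvRecord) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            row.put(columns.get(i), csvRecord.get(i));
        }
        return row;
    }
}

public class Demo {
    public static void main(String[] args) {
        // Compile the plan once...
        CachedUpsertPlan plan = new CachedUpsertPlan(List.of("ID", "NAME"));
        // ...then re-execute it per record; no per-row statement parsing.
        for (List<String> record : List.of(List.of("1", "a"), List.of("2", "b"))) {
            System.out.println(plan.execute(record));
        }
    }
}
```

The per-row-versus-per-batch question above is then just where the execute() loop lives - inside the executor once per batch, or called once per record by the mapper.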

> Improve performance of CSV loader
> ---------------------------------
>                 Key: PHOENIX-1711
>                 URL:
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>         Attachments: PHOENIX-1711.patch
> Here is a break-up of percentage execution time for some of the steps in the mapper:
> csvParser: 18%
> csvUpsertExecutor.execute(ImmutableList.of(csvRecord)): 39%
> PhoenixRuntime.getUncommittedDataIterator(conn, true): 9%
> while (uncommittedDataIterator.hasNext()): 15%
> Read IO & custom processing: 19%
> See details here:

This message was sent by Atlassian JIRA
