kudu-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: [kudu] import from hdfs
Date Wed, 16 Aug 2017 18:39:03 GMT
Huh, this is confusing. How much memory did you say you have per node? You
mentioned 256GB, but I'm not sure what that refers to anymore because I see
you gave 400GB to Kudu in there.

Also, why a single disk? Is HDFS using more than one?

On Tue, Aug 15, 2017 at 9:40 AM, Andrey Kuznetsov <Andrey_Kuznetsov@epam.com> wrote:

> Hi Jean-Daniel,
>
> No problem, you can find a screenshot in the attachment.
>
> I could not provide the log due to security reasons, sorry…
>
>
>
> Best regards,
>
> *ANDREY KUZNETSOV*
>
> *Software Engineering Team Leader, Assessment Global Discipline Head
> (Java)*
>
>
>
> *Office:* +7 482 263 00 70 x 42766   *Cell:* +7 920 154 05 72   *Email:* andrey_kuznetsov@epam.com
>
> *Tver,* *Russia *  *epam.com <http://www.epam.com/>*
>
>
>
> CONFIDENTIALITY CAUTION AND DISCLAIMER
> This message is intended only for the use of the individual(s) or
> entity(ies) to which it is addressed and contains information that is
> legally privileged and confidential. If you are not the intended recipient,
> or the person responsible for delivering the message to the intended
> recipient, you are hereby notified that any dissemination, distribution or
> copying of this communication is strictly prohibited. All unintended
> recipients are obliged to delete this message and destroy any printed
> copies.
>
>
>
> *From:* Jean-Daniel Cryans [mailto:jdcryans@apache.org]
> *Sent:* Thursday, August 10, 2017 6:55 PM
>
> *To:* user@kudu.apache.org
> *Cc:* Special SBER-BPOC Team <SpecialSBER-BPOCTeam@epam.com>
> *Subject:* Re: [kudu] import from hdfs
>
>
>
> Hi Andrey,
>
>
>
> Can you double check how much memory is actually given to Kudu? That's
> --memory_limit_hard_bytes. Providing us with a full kudu-tserver log could
> be useful, as long as it starts with this line "Tablet server non-default
> flags".
>
>
>
> Without more data about your situation it's going to be really hard to
> help you.
>
>
>
> Thx,
>
>
>
> J-D
>
>
>
> On Thu, Aug 10, 2017 at 4:46 AM, Andrey Kuznetsov <
> Andrey_Kuznetsov@epam.com> wrote:
>
> Hi Jean-Daniel,
>
> Nice to hear from you :)
>
>
>
> I use Kudu 1.3, and I hope Kudu has enough memory (about 256 GB on each node).
>
> I have played with the threads parameter, but it makes little difference:
> it is still extremely slow…
>
>
>
> Best regards,
>
> *ANDREY KUZNETSOV*
>
>
>
>
> *From:* Jean-Daniel Cryans [mailto:jdcryans@apache.org]
> *Sent:* Wednesday, August 9, 2017 10:52 PM
> *To:* user@kudu.apache.org
> *Cc:* Special SBER-BPOC Team <SpecialSBER-BPOCTeam@epam.com>
> *Subject:* Re: [kudu] import from hdfs
>
>
>
> Hi Andrey,
>
>
>
> Which version of Kudu and Impala are you using? That alone can make a huge
> difference.
>
>
>
> Apart from that, make sure Kudu has enough memory (no memory back
> pressure), you have enough maintenance manager threads (1/3 or 1/4 the
> number of disks), and that your partitioning favors good load distribution.
>
>
>
> But TBH writing to Parquet will remain faster than writing to Kudu,
> because Kudu isn't just dropping the rows into a file and has to do more
> than that.
>
>
>
> Hope this helps,
>
>
>
> J-D
>
>
>
> On Wed, Aug 9, 2017 at 9:05 AM, Andrey Kuznetsov <
> Andrey_Kuznetsov@epam.com> wrote:
>
> Hi folks,
>
> I have a problem with HDFS-to-Kudu import performance. I created an external
> table over CSV data and ran “insert as select” from it into a Kudu table and
> into a Parquet table.
>
> Importing into the Parquet table is 3x faster than into Kudu. Do you know any
> tips or tricks to speed up the import?
>
> I am actually importing 8TB of data, so this is critical for me.
>
>
>
> Best regards,
>
> *ANDREY KUZNETSOV*
>

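Two of the checks discussed in this thread are simple arithmetic and can be sketched in a few lines of Python. This is only an illustration, not something posted by the participants: the function names are hypothetical, the thread-count helper just encodes the "1/3 or 1/4 the number of disks" rule of thumb quoted above (the value would typically be set via Kudu's `--maintenance_manager_num_threads` flag), and the memory check assumes a positive `--memory_limit_hard_bytes` value and simply compares it with the node's physical RAM.

```python
import math
from typing import Tuple

GIB = 1024 ** 3  # bytes in one GiB


def maintenance_thread_range(num_disks: int) -> Tuple[int, int]:
    """Return (low, high) thread counts for the 1/4 to 1/3-of-disks rule of thumb.

    Hypothetical helper: always suggests at least one thread.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    low = max(1, num_disks // 4)              # 1/4 of the disks, rounded down
    high = max(1, math.ceil(num_disks / 3))   # 1/3 of the disks, rounded up
    return low, high


def memory_limit_fits(memory_limit_hard_bytes: int, node_ram_bytes: int) -> bool:
    """True if a positive hard memory limit does not exceed the node's RAM.

    Assumption: only explicit positive limits are considered here.
    """
    return 0 < memory_limit_hard_bytes <= node_ram_bytes


print(maintenance_thread_range(12))             # (3, 4)
print(maintenance_thread_range(1))              # (1, 1)
print(memory_limit_fits(400 * GIB, 256 * GIB))  # False
```

With the numbers mentioned in the thread, a 400GB limit on a node with 256GB of RAM fails the check, which is the inconsistency being asked about, and a single data disk leaves room for only one maintenance thread under this rule.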