spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Spark streaming
Date Fri, 27 Mar 2015 14:43:56 GMT
jamborta:
Please also describe the format of your csv files.

Cheers

On Fri, Mar 27, 2015 at 6:42 AM, DW @ Gmail <deanwampler@gmail.com> wrote:

> Show us the code. This shouldn't happen for the simple process you
> described.
>
> Sent from my rotary phone.
>
>
> > On Mar 27, 2015, at 5:47 AM, jamborta <jamborta@gmail.com> wrote:
> >
> > Hi all,
> >
> > We have a workflow that pulls in data from csv files. The original setup
> > of the workflow was to parse the data as it comes in (turn it into an
> > array), then store it. This resulted in out-of-memory errors with larger
> > files (as a result of increased GC?).
> >
> > It turns out that if the data gets stored as a string first, then parsed,
> > the issue does not occur.
> >
> > Why is that?
> >
> > Thanks,
> >
> >
> >
> > --
> > View this message in context:
> > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-tp22255.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> > For additional commands, e-mail: user-help@spark.apache.org
> >
>
>
>
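
For reference, the two orderings described in the quoted message could be sketched roughly as below. This is a hypothetical illustration in plain Python, not the poster's code and not Spark Streaming: the sample data, the all-numeric csv layout, and the function names parse_line, parse_then_store and store_then_parse are all assumptions, since the thread gives neither the code nor the file format. It only shows what "parse, then store" and "store as string, then parse" might look like; it does not explain the memory behaviour.

```python
import csv
import io

# Hypothetical sample; the real csv format is not given in the thread.
SAMPLE = "1,2,3\n4,5,6\n"


def parse_line(line):
    """Split one csv line into a list of floats (assumes numeric fields)."""
    return [float(field) for field in next(csv.reader([line]))]


def parse_then_store(stream):
    """Variant 1: parse each line as it arrives and store the parsed arrays."""
    return [parse_line(line) for line in stream if line.strip()]


def store_then_parse(stream):
    """Variant 2: store the raw strings first, then parse in a later step."""
    raw = [line.rstrip("\n") for line in stream if line.strip()]
    # The generator defers parsing until the rows are actually consumed.
    return raw, (parse_line(line) for line in raw)


eager = parse_then_store(io.StringIO(SAMPLE))
raw, lazy = store_then_parse(io.StringIO(SAMPLE))
```

In the first variant every parsed row is held in memory as soon as it is read; in the second only the raw strings are held until the parsed rows are consumed. Whether that difference accounts for the reported errors depends on the actual code and csv format.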
