sdap-dev mailing list archives

From "Joseph Jacob (JIRA)" <>
Subject [jira] [Updated] (SDAP-151) Determine parallelism automatically for Spark analytics
Date Wed, 16 Jan 2019 00:13:00 GMT


Joseph Jacob updated SDAP-151:
    Resolution: Fixed
        Status: Done  (was: In Progress)

* Removed the spark configuration, added an nparts configuration, and now compute parallelism
automatically for the Spark-based time series, time averaged map, correlation map, and
climatological map analytics.

> Determine parallelism automatically for Spark analytics
> -------------------------------------------------------
>                 Key: SDAP-151
>                 URL:
>             Project: Apache Science Data Analytics Platform
>          Issue Type: Improvement
>            Reporter: Joseph Jacob
>            Assignee: Joseph Jacob
>            Priority: Major
> Some of the built-in NEXUS analytics, such as TimeSeries and TimeAvgMap, currently get the
> desired parallelism from a job request parameter like "spark=mesos,16,32".  If that parameter
> is omitted, we currently default to "spark=local,1,1", which runs on a single core.  Instead,
> we would like to determine the appropriate level of parallelism automatically, based on the
> job's input data size.
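
For illustration only, here is a minimal Python sketch of the idea described above. It is not
the code committed for this issue. The ticket does not say how input data size is measured or
which heuristic is used, so the tile count, the tiles_per_part and max_parts values, and all
function names below are hypothetical. The reading of the "spark=" value as three
comma-separated fields (master, executor count, partition count) is also an assumption.

```python
import math


def parse_spark_param(value):
    """Split a legacy value such as 'mesos,16,32' into its three fields.

    Assumption: the fields are master, executor count, partition count.
    """
    master, nexecs, nparts = value.split(",")
    return master, int(nexecs), int(nparts)


def auto_nparts(ntiles, tiles_per_part=8, max_parts=128):
    """Pick a partition count from the input size.

    Hypothetical heuristic: one partition per `tiles_per_part` input tiles,
    clamped to the range [1, max_parts].
    """
    if ntiles <= 0:
        return 1
    return max(1, min(max_parts, math.ceil(ntiles / tiles_per_part)))


def resolve_nparts(ntiles, nparts=None):
    """Use an explicit nparts if the request supplied one, else compute it."""
    return nparts if nparts is not None else auto_nparts(ntiles)
```

With a rule of this shape, a request that omits nparts scales with the amount of input data
instead of falling back to a single core, while an explicit nparts still takes precedence.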

This message was sent by Atlassian JIRA
