cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11574) COPY FROM command in cqlsh throws error
Date Thu, 21 Apr 2016 06:25:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251361#comment-15251361 ]

Stefania commented on CASSANDRA-11574:
--------------------------------------

bq. spark already took the available 1 core in that machine which Cassandra is getting zero
for that value. This is the main problem I guess. please let me know if this is issue.

No, I don't think so. {{get_num_processes()}} never returns zero: it calls {{get_num_cores()}},
which relies on {{mp.cpu_count()}} (doc [here|https://docs.python.org/2/library/multiprocessing.html]).
{{mp.cpu_count()}} returns the number of cores in the system, so I don't think it would know
that Spark has taken one. Besides, even if the system has only one core, COPY should still work
with 1 process. We have an environment variable that we set to simulate 1-core machines in
our tests, and Datastax tested the COPY code on single-core VMs as well. Also, it is the cores
of the machine that runs cqlsh that matter, not those of the machine that runs the Cassandra
server (in case this wasn't clear before).
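
To illustrate why the result cannot be zero, here is a minimal standalone sketch of the same clamp. The function name {{num_processes}} and the cap of 16 are hypothetical values chosen for the example; they are not taken from cqlsh.

```python
import multiprocessing as mp


def num_processes(cap, num_cores):
    """Leave one core for the parent process, but never go below 1 worker."""
    return max(1, min(cap, num_cores - 1))


# mp.cpu_count() reports the cores of the machine running this script;
# it does not subtract cores that other programs happen to be using.
local_cores = mp.cpu_count()

# Hypothetical cap of 16, evaluated for a few core counts.
results = {cores: num_processes(16, cores) for cores in (1, 2, 4, 32)}
print(results)
print(num_processes(16, local_cores))
```

Even with a single core, {{min(cap, 0)}} is 0 and the outer {{max(1, ...)}} raises it back to 1.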

Are you still getting the exact same error with the two lines above? If you don't call
{{get_num_processes}} at all and hard-code {{num_processes}} to 1, does that work?

Full code here:

{code}
    @staticmethod
    def get_num_processes(cap):
        """
        Pick a reasonable number of child processes. We need to leave at
        least one core for the parent or feeder process.
        """
        return max(1, min(cap, CopyTask.get_num_cores() - 1))

    @staticmethod
    def get_num_cores():
        """
        Return the number of cores if available. If the test environment variable
        is set, then return the number carried by this variable. This is to test single-core
        machine more easily.
        """
        try:
            num_cores_for_testing = os.environ.get('CQLSH_COPY_TEST_NUM_CORES', '')
            ret = int(num_cores_for_testing) if num_cores_for_testing else mp.cpu_count()
            printdebugmsg("Detected %d core(s)" % (ret,))
            return ret
        except NotImplementedError:
            printdebugmsg("Failed to detect number of cores, returning 1")
            return 1
{code}
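
A minimal sketch of simulating a 1-core machine with the test variable follows. This is a standalone rewrite for illustration only: the {{environ}} parameter and the cap of 16 are assumptions made for the example, and the real methods live on {{CopyTask}} as shown above.

```python
import multiprocessing as mp
import os


def get_num_cores(environ=os.environ):
    """Standalone variant: honour CQLSH_COPY_TEST_NUM_CORES if set, else ask the OS."""
    value = environ.get('CQLSH_COPY_TEST_NUM_CORES', '')
    try:
        return int(value) if value else mp.cpu_count()
    except NotImplementedError:
        # cpu_count() may not be implemented on every platform
        return 1


# Pass a fake environment (hypothetical helper argument) to pretend we have one core.
simulated = get_num_cores({'CQLSH_COPY_TEST_NUM_CORES': '1'})
workers = max(1, min(16, simulated - 1))  # 16 is an arbitrary cap for this example
print(simulated, workers)
```

Since the code above reads the variable from {{os.environ}}, setting it in the environment of the cqlsh process before running COPY should have the same effect, which may help to reproduce the single-core case.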

> COPY FROM command in cqlsh throws error
> ---------------------------------------
>
>                 Key: CASSANDRA-11574
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11574
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>         Environment: Operating System: Ubuntu Server 14.04
> JDK: Oracle JDK 8 update 77
> Python: 2.7.6
>            Reporter: Mahafuzur Rahman
>            Assignee: Stefania
>             Fix For: 3.0.6
>
>
> Any COPY FROM command in cqlsh is throwing the following error:
> "get_num_processes() takes no keyword arguments"
> Example command: 
> COPY inboxdata (to_user_id,to_user_network,created_time,attachments,from_user_id,from_user_name,from_user_network,id,message,to_user_name,updated_time)
> FROM 'inbox.csv';
> Similar commands worked perfectly in previous versions such as 3.0.4.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
