Date: Wed, 25 Jul 2012 16:17:35 +0000 (UTC)
From: "Brandon Williams (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Message-ID: <922206382.101695.1343233055112.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <329057108.92293.1343069555051.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Updated] (CASSANDRA-4459) pig driver casts ints as bytearray

     [ https://issues.apache.org/jira/browse/CASSANDRA-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-4459:
----------------------------------------

    Attachment: 4459-v2.txt

Update with a comment explaining that IntegerType is wrong, but we're doing it anyway. Also switched all the IntegerTypes to Int32Types in the tests, which pass. I don't see any point in explicitly testing IntegerType as well until pig has a BigInteger.

> pig driver casts ints as bytearray
> ----------------------------------
>
>                 Key: CASSANDRA-4459
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4459
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: C* 1.1.2 embedded in DSE
>            Reporter: Cathy Daw
>            Assignee: Brandon Williams
>             Fix For: 1.1.3
>
>         Attachments: 4459-v2.txt, 4459.txt
>
>
> we seem to be auto-mapping C* int columns to bytearray in Pig, and farther down I can't seem to find a way to cast that to int and do an average.
> {code}
> grunt> cassandra_users = LOAD 'cassandra://cqldb/users' USING CassandraStorage();
> grunt> dump cassandra_users;
> (bobhatter,(act,22),(fname,bob),(gender,m),(highSchool,Cal High),(lname,hatter),(sat,500),(state,CA),{})
> (alicesmith,(act,27),(fname,alice),(gender,f),(highSchool,Tuscon High),(lname,smith),(sat,650),(state,AZ),{})
>
> // notice sat and act columns are bytearray values
> grunt> describe cassandra_users;
> cassandra_users: {key: chararray,act: (name: chararray,value: bytearray),fname: (name: chararray,value: chararray),
> gender: (name: chararray,value: chararray),highSchool: (name: chararray,value: chararray),lname: (name: chararray,value: chararray),
> sat: (name: chararray,value: bytearray),state: (name: chararray,value: chararray),columns: {(name: chararray,value: chararray)}}
> grunt> users_by_state = GROUP cassandra_users BY state;
> grunt> dump users_by_state;
> ((state,AX),{(aoakley,(highSchool,Phoenix High),(lname,Oakley),state,(act,22),(sat,500),(gender,m),(fname,Anne),{})})
> ((state,AZ),{(gjames,(highSchool,Tuscon High),(lname,James),state,(act,24),(sat,650),(gender,f),(fname,Geronomo),{})})
> ((state,CA),{(philton,(highSchool,Beverly High),(lname,Hilton),state,(act,37),(sat,220),(gender,m),(fname,Paris),{}),(jbrown,(highSchool,Cal High),(lname,Brown),state,(act,20),(sat,700),(gender,m),(fname,Jerry),{})})
> // Error - use explicit cast
> grunt> user_avg = FOREACH users_by_state GENERATE cassandra_users.state, AVG(cassandra_users.sat);
> grunt> dump user_avg;
> 2012-07-22 17:15:04,361 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.AVG as multiple or none of them fit. Please use an explicit cast.
> // Unable to cast as int
> grunt> user_avg = FOREACH users_by_state GENERATE cassandra_users.state, AVG((int)cassandra_users.sat);
> grunt> dump user_avg;
> 2012-07-22 17:07:39,217 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1052: Cannot cast bag with schema sat: bag({name: chararray,value: bytearray}) to int
> {code}
> *Seed data in CQL*
> {code}
> CREATE KEYSPACE cqldb with
> strategy_class = 'org.apache.cassandra.locator.SimpleStrategy'
> and strategy_options:replication_factor=3;
> use cqldb;
> CREATE COLUMNFAMILY users (
> KEY text PRIMARY KEY,
> fname text, lname text, gender varchar,
> act int, sat int, highSchool text, state varchar);
> insert into users (KEY, fname, lname, gender, act, sat, highSchool, state)
> values (gjames, Geronomo, James, f, 24, 650, 'Tuscon High', 'AZ');
> insert into users (KEY, fname, lname, gender, act, sat, highSchool, state)
> values (aoakley, Anne, Oakley, m , 22, 500, 'Phoenix High', 'AX');
> insert into users (KEY, fname, lname, gender, act, sat, highSchool, state)
> values (jbrown, Jerry, Brown, m , 20, 700, 'Cal High', 'CA');
> insert into users (KEY, fname, lname, gender, act, sat, highSchool, state)
> values (philton, Paris, Hilton, m , 37, 220, 'Beverly High', 'CA');
> select * from users;
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira