cassandra-commits mailing list archives

From "Sasha Dolgy (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-3745) contrib/PIG example fails when column metadata exists for CF
Date Sat, 14 Jan 2012 16:43:39 GMT
contrib/PIG example fails when column metadata exists for CF
------------------------------------------------------------

                 Key: CASSANDRA-3745
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3745
             Project: Cassandra
          Issue Type: Bug
          Components: Contrib
    Affects Versions: 1.0.6
            Reporter: Sasha Dolgy


I have a sandbox CF for prototyping with 17 secondary indexes defined.  When I run the
contrib/PIG example with Cassandra 1.0.6, using pig 0.8.1 (and also with the pig 0.8.3 jar),
the second line of the example script [ cols = FOREACH rows GENERATE flatten(columns); ]
fails with the following error:

2012-01-14 06:54:27,551 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1007: Found duplicates in schema. : 18 columns. Please alias the columns with unique names.

I then dropped all of the indexes and tried again, with the same error.  On further inspection,
show schema showed that the column metadata left by the indexes still existed on the CF.  I ran
the following:

update column family user with column_metadata = [];

I can now run the full contrib/pig example against my CF.  


If I select another CF with 2 secondary indexes, the same behaviour persists:

2012-01-14 08:34:31,413 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1007: Found duplicates in schema. : 3 columns. Please alias the columns with unique names.

grunt> describe users;
2012-01-14 08:36:58,227 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
users: {key: bytearray,columns: {T: (name: chararray,value: bytearray,column_family: chararray,value: bytearray,owner_id: chararray,value: bytearray)}}
grunt>
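
The describe output above lists the field name "value" more than once inside the columns tuple, which is presumably what ERROR 1007 complains about when the bag is flattened. The following is a minimal Python sketch, not Pig's own check: it only parses the schema string copied from that output and counts repeated field names. The helper name duplicate_fields is hypothetical and exists only for this illustration.

```python
import re
from collections import Counter

# Tuple schema copied verbatim from the `describe users` output in this report.
SCHEMA = (
    "users: {key: bytearray,columns: {T: (name: chararray,value: bytearray,"
    "column_family: chararray,value: bytearray,owner_id: chararray,value: bytearray)}}"
)


def duplicate_fields(schema: str) -> dict:
    """Return field names that occur more than once in the innermost tuple.

    Hypothetical helper for illustration; this is not how Pig validates schemas.
    """
    # The innermost tuple is the only parenthesised group without nested parens.
    inner = re.search(r"\(([^()]*)\)", schema)
    if inner is None:
        return {}
    # Each field looks like "name: type"; keep only the name part.
    names = [field.split(":")[0].strip() for field in inner.group(1).split(",")]
    return {name: count for name, count in Counter(names).items() if count > 1}


if __name__ == "__main__":
    print(duplicate_fields(SCHEMA))  # prints {'value': 3}
```

Each indexed column contributes its own name plus a field called "value", so the tuple ends up with one "value" per metadata entry in addition to the generic one.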

grunt> dump users;
<-- removed INFO/WARN output -->

HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
0.20.2  0.8.1   sasha   2012-01-14 08:37:24     2012-01-14 08:37:43     UNKNOWN

Success!

Job Stats (time in seconds):
JobId   Alias   Feature Outputs
job_local_0001  users   MAP_ONLY        file:/tmp/temp-1366421017/tmp-1001688304,

Input(s):
Successfully read records from: "cassandra://sdo/entity_relations"

Output(s):
Successfully stored records in: "file:/tmp/temp-1366421017/tmp-1001688304"

Job DAG:
job_local_0001


(d1540edc-cb16-47dd-96e3-90e1657c2d77:a721966c6026ee85ef35f2108b75d3784b52bf1217f0b62564bdefe67b9504d9,{(content_id,d1540edc-cb16-47dd-96e3-90e1657c2d77:a721966c6026ee85ef35f2108b75d3784b52bf1217f0b62564bdefe67b9504d9),(owner_id,d1540edc-cb16-47dd-96e3-90e1657c2d77)})
grunt>

I have also tried this with Pig 0.9.1, but then I run into CASSANDRA-3371: https://issues.apache.org/jira/browse/CASSANDRA-3371


