accumulo-user mailing list archives

From Robert Vesse <rve...@yarcdata.com>
Subject RE: Determine columns in a table?
Date Sun, 25 Mar 2012 22:24:52 GMT
That would work great if I had control over the data, but I don't.

I'm writing an Accumulo plugin for an ETL tool (Pentaho) so users are going to point it at
an arbitrary table in an arbitrary Accumulo instance and then I just have to pull the data
out knowing nothing about what is in there or any available secondary indexes.

I guess I'll probably go with something like scanning the first 1000 rows to see what columns
there are. Pentaho likes it if steps can expose what fields they will produce in advance,
but the columns I declare don't have to be the full set. As long as I sample enough of the
table to get a good sense of what's in there, steps further along the pipeline should be
sufficiently informed that a user can configure them appropriately.
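
[Editor's sketch] The sampling idea above could look roughly like the following. This is only a minimal illustration, not code from the thread: it assumes the scan is available as an iterable of (row, column family, column qualifier) tuples already sorted by row, which is the order an Accumulo scan returns keys in. The name sample_columns and the tuple layout are hypothetical; a real plugin would read these fields from the keys returned by a scanner created with the user's authorizations.

```python
from typing import Iterable, List, Tuple

# Hypothetical entry layout: (row, column family, column qualifier).
Entry = Tuple[str, str, str]


def sample_columns(entries: Iterable[Entry], max_rows: int = 1000) -> List[Tuple[str, str]]:
    """Collect the distinct columns seen in the first max_rows rows.

    Assumes entries arrive sorted by row, so a change in the row value
    marks the start of a new row.
    """
    columns = {}  # dict keeps insertion order, so columns come back in first-seen order
    rows_seen = 0
    current_row = None
    for row, family, qualifier in entries:
        if row != current_row:
            if rows_seen == max_rows:
                break  # the sample is full; stop before reading the next row
            rows_seen += 1
            current_row = row
        columns.setdefault((family, qualifier), None)
    return list(columns)
```

The result is only a sample: columns that first appear after the sampled rows are missed, which matches the point that the declared set does not have to be complete.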

Rob

________________________________
From: John Vines [john.w.vines@ugov.gov]
Sent: 25 March 2012 12:40
To: accumulo-user@incubator.apache.org
Subject: RE: Determine columns in a table?


Another option for you would be to create a table for indexing your column information. That
way a quick scan can give you everything.
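
[Editor's sketch] One way to picture the index-table suggestion, assuming the writer controls ingest: every write to the data table also records its column in a second, much smaller structure, so listing columns never touches the data itself. The class below is a hypothetical in-memory stand-in (IndexedTable, put, and columns are illustrative names, not Accumulo API); in a real deployment the index would be a separate table written alongside the data.

```python
from typing import Dict, List, Set, Tuple


class IndexedTable:
    """In-memory stand-in for a data table plus a column index table."""

    def __init__(self) -> None:
        self.data: Dict[Tuple[str, str, str], str] = {}
        self.column_index: Set[Tuple[str, str]] = set()

    def put(self, row: str, family: str, qualifier: str, value: str) -> None:
        # Write the cell, and record its column in the index at the same time.
        self.data[(row, family, qualifier)] = value
        self.column_index.add((family, qualifier))

    def columns(self) -> List[Tuple[str, str]]:
        # A "quick scan" of the index: its size depends on the number of
        # distinct columns, not on the number of rows.
        return sorted(self.column_index)
```

This sketch ignores deletes and per-key visibilities, both of which a real index would have to account for.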

Sent from my phone, so pardon the typos and brevity.

On Mar 25, 2012 3:06 PM, "Robert Vesse" <rvesse@yarcdata.com> wrote:
To clarify my question: I'm not looking to see why a scan doesn't return data. What I want
to know is which columns a full table scan (taking the relevant scan authorizations into account)
will yield, without actually doing the full table scan.

But it sounds like the only way is to do the table scan, because in my usage scenario users
won't have HDFS access.

Rob

________________________________
From: John Vines [john.w.vines@ugov.gov]
Sent: 24 March 2012 16:27
To: accumulo-user@incubator.apache.org
Subject: Re: Determine columns in a table?


Individual keys have their own visibility, and it is possible for keys with similar columns
to have different visibilities.

That said, we don't track the visibilities being used, so the only way is the mechanism Eric
suggested.

Sent from my phone, so pardon the typos and brevity.

On Mar 24, 2012 5:30 PM, "Robert Vesse" <rvesse@yarcdata.com> wrote:
Obviously Accumulo is completely schema-free, but is there any easy way, given a table name
and optionally one or more scan authorizations, to determine what columns are visible to a user?

Or is the only way to do this by scanning the table?

Cheers

Rob

Rob Vesse -- YarcData.com -- A Division of Cray Inc
Software Engineer, Bay Area
m: 925.960.3941  |  o: 925.264.4729  |  @: rvesse@yarcdata.com  |  Skype: rvesse
6210 Stoneridge Mall Rd  |  Suite 120  |  Pleasanton CA, 94588


