accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Querying Accumulo From Inside Mapper
Date Tue, 17 Apr 2012 12:49:45 GMT
I am reading from a text file of linked IDs but I want to store the
lookup values inside Accumulo.

RDB FOO
------
FOO_ID <-- this is the autoincrement key
ALT_ID  <-- this is the natural key
NAME
AGE

RDB BAR
------
BAR_ID <-- this is the autoincrement key
TAG       <-- zero or more per person

RDB LINK
------
FOO_ID
BAR_ID

* RDB is a relational database table.

Inside Accumulo, I want to use the ALT_ID as the row id because other
data keyed by ALT_ID will also be stored in that row. I will process
the FOO text file first, which results in:

FOO
-------
ALT_ID  NAME   XXX
ALT_ID  AGE     XXX
FOO_ID ALT_ID  XXXX

Can I write to two Accumulo tables using one mapper? If I can, then I
can store the FOO_ID/ALT_ID record in a separate table.
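To make the two-table idea concrete, here is a rough sketch in plain
Java with no Accumulo dependency. As far as I understand it,
AccumuloOutputFormat takes a Text key that names the destination table
and a Mutation as the value, so one map() call could emit to both
tables. The sketch only models that routing. The table names "foo" and
"foo_id_lookup", the class name, and the comma-separated input layout
(FOO_ID,ALT_ID,NAME,AGE) are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Models what one mapper call would emit for one FOO line.
// Each entry is [table, row, column, value]; in a real job the table
// name would be the output key and the rest would go into a Mutation.
final class FooLineRouter {
    // Hypothetical table names, not taken from the original message.
    static final String DATA_TABLE = "foo";
    static final String LOOKUP_TABLE = "foo_id_lookup";

    // Assumed input layout: FOO_ID,ALT_ID,NAME,AGE
    public static List<List<String>> route(String line) {
        String[] f = line.split(",", -1);
        if (f.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + line);
        }
        String fooId = f[0].trim();
        String altId = f[1].trim();
        String name = f[2].trim();
        String age = f[3].trim();

        List<List<String>> out = new ArrayList<>();
        // Data rows are keyed by the natural key (ALT_ID).
        out.add(Arrays.asList(DATA_TABLE, altId, "NAME", name));
        out.add(Arrays.asList(DATA_TABLE, altId, "AGE", age));
        // The FOO_ID -> ALT_ID mapping goes to the separate lookup table.
        out.add(Arrays.asList(LOOKUP_TABLE, fooId, "ALT_ID", altId));
        return out;
    }

    public static void main(String[] args) {
        for (List<String> entry : route("17,A-100,Alice,34")) {
            System.out.println(entry);
        }
    }
}
```

Keeping the lookup in its own table would also make the cleanup at the
end a single table delete.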

Processing the BAR text file provides:

BAR
------
BAR_ID  TAG  XXXX

Then, when I process the LINK table, I can query the FOO table to find
the ALT_ID and query the BAR table to find the tag, and combine the
information for the mutation:

FOO
------
ALT_ID TAG XXX

Is there a best practice to query from inside a mapper?
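
One possible shape for the LINK step, offered as a sketch and not as a
confirmed best practice: create the lookups once per mapper (for
example in setup(), where a Scanner or BatchScanner could be opened)
and cache the results so repeated IDs do not trigger repeated queries.
The code below is plain Java with the two Accumulo lookups replaced by
Function stand-ins. The class name, the comma-separated LINK layout
(FOO_ID,BAR_ID), and the caching policy are assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Resolves one LINK line into the entry [ALT_ID, "TAG", tag].
// The two Function arguments stand in for queries against Accumulo.
final class LinkResolver {
    private final Function<String, String> fooLookup; // FOO_ID -> ALT_ID
    private final Function<String, String> barLookup; // BAR_ID -> TAG
    private final Map<String, String> fooCache = new HashMap<>();
    private final Map<String, String> barCache = new HashMap<>();

    public LinkResolver(Function<String, String> fooLookup,
                        Function<String, String> barLookup) {
        this.fooLookup = fooLookup;
        this.barLookup = barLookup;
    }

    // Returns [row, column, value], or null if either ID is unknown.
    public List<String> resolve(String linkLine) {
        String[] f = linkLine.split(",", -1);
        if (f.length != 2) {
            throw new IllegalArgumentException("expected 2 fields: " + linkLine);
        }
        String altId = cached(fooCache, fooLookup, f[0].trim());
        String tag = cached(barCache, barLookup, f[1].trim());
        if (altId == null || tag == null) {
            return null;
        }
        return Arrays.asList(altId, "TAG", tag);
    }

    // Remembers misses (null) as well as hits, so each ID is queried once.
    private static String cached(Map<String, String> cache,
                                 Function<String, String> lookup, String id) {
        if (cache.containsKey(id)) {
            return cache.get(id);
        }
        String value = lookup.apply(id);
        cache.put(id, value);
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> foo = new HashMap<>();
        foo.put("17", "A-100");
        Map<String, String> bar = new HashMap<>();
        bar.put("5", "red");
        LinkResolver resolver = new LinkResolver(foo::get, bar::get);
        System.out.println(resolver.resolve("17,5"));
    }
}
```

An unbounded HashMap is only reasonable if the set of distinct IDs seen
by one mapper fits in memory; otherwise a bounded cache would be needed.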

At the end of the work, I can delete the ALT_ID column (or table).

I know that this work is trivial using SQL, but <sigh> that's not an option.
