accumulo-dev mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Re: Search function
Date Thu, 07 May 2015 00:39:42 GMT
https://github.com/medined/D4M_Schema describes one way to handle
secondary indexes and provides some prototype Java code to experiment
with. Other projects exist as well, for example
https://github.com/joshelser/cosmos.

On Wed, May 6, 2015 at 5:08 PM, Christopher <ctubbsii@apache.org> wrote:
> Since Accumulo is essentially a big sorted map, it is most efficient
> when searching by row. When you search by other fields, you are
> scanning the entire data set and filtering, which is usually not very
> efficient. The API makes this relatively easy when you filter by
> family or family:qualifier, but it does not (as you've observed)
> make it easy to do this by Value.
>
> There are a few options:
>
> 1. You can configure the RegExFilter as a scan-time iterator. (This is
> going to be terribly inefficient.)
> 2. You can adopt a secondary indexing strategy.
>
> I would do option #2. As you've described, your data is indexed by ID.
> If you need an index on whatever you're storing in the Value, you
> should make a new table (or new family/locality group) which stores
> your data sorted by that instead of ID. You can either just store the
> ID in this secondary index, and do two lookups (the secondary index to
> find the ID, then the main data once you have the ID), or you can
> store all the data a second time, ordered by the contents of your
> Value (this trades space for performance).
>
> There are more complex strategies, but these are the basics.
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Wed, May 6, 2015 at 10:10 AM, Revan1988 <andrealeoni88@gmail.com> wrote:
>> Hi,
>> I've got another question about using Accumulo.
>>
>> My table is something like this:
>>
>> ID1 info:name JhonSmith
>> ID1 info:birth 1988-06-26
>> ID1 study:university ComputerEngineering
>> ID1 study:graduated Yes
>>
>> ID2 info:name GeorgeDuff
>> ID2 info:birth 1984-01-29
>> ID2 study:university Math
>> ID2 study:graduated Yes
>>
>> ...
>>
>>
>> I want all the info about JhonSmith, but in the Java API I have only found
>> methods to search by row, family, or family:qualifier ...
>>
>> I need to search by Value and then use the matching row (IDx) to fetch all
>> the other entries that have the same row (IDx).
>>
>> For example, I need all the info about JhonSmith (birth, university, graduated
>> ...).
>>
>> I hope I have explained my problem.
>> Sorry again for my bad English.
>>
>> ...and once again:
>> Thank you!!!
>>
>>
>>
>> -----
>> Andrea Leoni
>> Italy
>> Computer Engineer
>> --
>> View this message in context: http://apache-accumulo.1065345.n5.nabble.com/Search-function-tp14030.html
>> Sent from the Developers mailing list archive at Nabble.com.
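
The two-lookup strategy described in the quoted reply (find the ID in a secondary index, then fetch the main data by ID) can be sketched roughly as follows. This is only an illustration under stated assumptions: it does not use the Accumulo API at all, and instead models each table as a java.util.TreeMap, following the "big sorted map" description above. The class name SecondaryIndexSketch, its method names, and the key layout (parts joined with a NUL character, every Value indexed) are hypothetical choices made for this example. In a real deployment the index would be a second Accumulo table written alongside the main one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class SecondaryIndexSketch {
    // Separator between key parts (an assumption of this sketch).
    static final char SEP = '\u0000';

    // Main table: "row SEP family:qualifier" -> value, kept sorted by key.
    static final TreeMap<String, String> main = new TreeMap<>();
    // Index table: "value SEP row" -> "" (the row ID lives in the key).
    static final TreeMap<String, String> index = new TreeMap<>();

    // Write one entry to the main table and one to the index table.
    static void put(String row, String column, String value) {
        main.put(row + SEP + column, value);
        index.put(value + SEP + row, "");
    }

    // Range scan: every entry whose key starts with prefix + SEP.
    static SortedMap<String, String> prefixScan(TreeMap<String, String> table, String prefix) {
        return table.subMap(prefix + SEP, prefix + (char) (SEP + 1));
    }

    // Lookup 1: use the index to find the row IDs that hold the given value.
    static List<String> rowsForValue(String value) {
        List<String> rows = new ArrayList<>();
        for (String key : prefixScan(index, value).keySet()) {
            rows.add(key.substring(value.length() + 1));
        }
        return rows;
    }

    // Lookup 2: fetch every column of one row from the main table.
    static Map<String, String> columnsForRow(String row) {
        Map<String, String> columns = new TreeMap<>();
        for (Map.Entry<String, String> e : prefixScan(main, row).entrySet()) {
            columns.put(e.getKey().substring(row.length() + 1), e.getValue());
        }
        return columns;
    }

    // Sample data copied from the original question.
    static void loadSample() {
        put("ID1", "info:name", "JhonSmith");
        put("ID1", "info:birth", "1988-06-26");
        put("ID1", "study:university", "ComputerEngineering");
        put("ID1", "study:graduated", "Yes");
        put("ID2", "info:name", "GeorgeDuff");
        put("ID2", "info:birth", "1984-01-29");
        put("ID2", "study:university", "Math");
        put("ID2", "study:graduated", "Yes");
    }

    public static void main(String[] args) {
        loadSample();
        // Search by value first, then read each matching row in full.
        for (String row : rowsForValue("JhonSmith")) {
            System.out.println(row + " " + columnsForRow(row));
        }
    }
}
```

Both lookups are range scans over sorted keys, so neither has to read the whole data set, which is the point of option #2. The cost is the extra index entry written on every put. Storing the full record in the index instead of only the ID would remove the second lookup at the price of more space.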
