accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <>
Subject [jira] [Resolved] (ACCUMULO-1417) data storage efficiency
Date Fri, 18 Jul 2014 04:00:08 GMT


Eric Newton resolved ACCUMULO-1417.

    Resolution: Fixed

Code to ingest the Google Books ngrams was added.  I posted some numbers on the efficiency
of the ingest and storage [here|].

Other key-value stores can compare their numbers, if they like.  Beating compressed CSVs
was an unexpected result.

> data storage efficiency
> -----------------------
>                 Key: ACCUMULO-1417
>                 URL:
>             Project: Accumulo
>          Issue Type: Task
>            Reporter: Eric Newton
> David Medinets wrote to the user list:
> {quote}
> Are there any published numbers for the amount of disk space used by
> Accumulo versus other products? I'm thinking some dataset like dbpedia
> or something from If there is
> not such a comparison, what comparisons would you like to see? What
> about WordNet stored in CSV, MySQL, Cassandra, HBase, and Accumulo?
> WordNet is just a large set of CSV files so it would be a good
> candidate for this concept, I think.
> {quote}
> Good idea.

