avro-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: storing avro messages in hbase
Date Sat, 12 Feb 2011 03:13:09 GMT
Storing a schema hash, or some other pointer to the schema, alongside each value instead of the full schema is the typical way to deal with this.
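
As a rough sketch of the schema-hash idea (not taken from any of the projects linked below): each cell value is prefixed with a fixed-length hash of the writer schema, and the full schema text is kept once in a lookup keyed by that hash. The names `SCHEMA_REGISTRY`, `register_schema`, `wrap_cell` and `unwrap_cell` are hypothetical, the registry is a plain in-memory dict standing in for wherever the schemas would really live, and the payload is treated as opaque bytes rather than encoded with the Avro library. MD5 over the raw schema text is an assumption made for brevity; textually different but equivalent schemas would get different hashes.

```python
import hashlib

# Hypothetical in-memory registry: schema hash -> schema JSON text.
# In practice this could be a separate table or any shared store.
SCHEMA_REGISTRY = {}

FINGERPRINT_LEN = 16  # size of an MD5 digest in bytes


def register_schema(schema_json: str) -> bytes:
    """Hash the schema text, remember the schema, and return the hash."""
    fingerprint = hashlib.md5(schema_json.encode("utf-8")).digest()
    SCHEMA_REGISTRY[fingerprint] = schema_json
    return fingerprint


def wrap_cell(fingerprint: bytes, avro_payload: bytes) -> bytes:
    """Build the cell value: schema hash followed by the serialized record."""
    return fingerprint + avro_payload


def unwrap_cell(cell: bytes):
    """Split a cell value into (writer schema JSON, serialized record)."""
    fingerprint = cell[:FINGERPRINT_LEN]
    payload = cell[FINGERPRINT_LEN:]
    return SCHEMA_REGISTRY[fingerprint], payload
```

A reader would call `unwrap_cell`, then decode the payload using the returned writer schema together with its own reader schema. The per-cell overhead is 16 bytes no matter how large the schema is.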


http://www.quora.com/What-is-the-best-way-to-work-with-Avro-serialized-data-structures-in-a-database

http://www.javarants.com/2010/06/30/havrobase-a-searchable-evolvable-entity-store-on-top-of-hbase-and-solr/

Search-hadoop.com finds previous discussions on this topic:
http://search-hadoop.com/m/3iG061GVhHd2/HAvroBase&subj=Re+Versioning+of+an+array+of+a+record

http://search-hadoop.com/m/ZajsGoopYw/HAvroBase&subj=Re+question+about+completely+untagged+data+


http://search-hadoop.com/m/pz55F1beCEu1/HAvroBase&subj=Re+Setting+bytes+in+Java


In HBase you can also play tricks with column names to match up schemas with versions:
append or prepend a version number to the column name and query with a pattern match on the
column.  You might need 0.92 and its coprocessors to use different deserializations per record
returned, however.
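
A minimal sketch of the column-name trick, assuming a `<name>_v<version>` naming convention (the convention, `versioned_qualifier` and `select_columns` are all made up for illustration). A row is modelled as a plain dict of column name to bytes, so the pattern match here runs on the client and only approximates what a server-side filter on the column name would do.

```python
import re

# Assumed convention: the schema version is appended as "_v<number>".
QUALIFIER_RE = re.compile(r"^(?P<name>.+)_v(?P<version>\d+)$")


def versioned_qualifier(name: str, schema_version: int) -> str:
    """Build a column name that carries the writer schema version."""
    return f"{name}_v{schema_version}"


def select_columns(row: dict, name: str):
    """Return (version, value) pairs for all columns of `name`, ordered by version."""
    matches = []
    for qualifier, value in row.items():
        m = QUALIFIER_RE.match(qualifier)
        if m and m.group("name") == name:
            matches.append((int(m.group("version")), value))
    return sorted(matches)
```

Each returned version number tells the caller which writer schema to use when deserializing that value.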



On 2/11/11 6:32 PM, "Garrett Wu" <wugarrett@gmail.com> wrote:

If I use avro to store messages into cells in HBase, would I need to store the writer schema
along with it in every cell?

A problem that I foresee is that I might modify my schema and write new versions to some of
the cells in some rows of the table and then things would blow up unless I had stored the
writer schema in every cell.  Is there a better alternative?
