From: ryaneleary
To: dev@accumulo.apache.org
Reply-To: dev@accumulo.apache.org
Subject: [GitHub] accumulo pull request: ACCUMULO-2825 Add RowEncodingIterator
Message-Id: <20140519204038.9F21A9360D1@tyr.zones.apache.org>
Date: Mon, 19 May 2014 20:40:38 +0000 (UTC)

GitHub user ryaneleary commented on the pull request:

    https://github.com/apache/accumulo/pull/7#issuecomment-43555087

Sure. Here's an example of what I've done (by modifying the WholeRowIterator).

Assume a table that contains 'documents' and various document information. Assume a table format like this (format: ```> = value```):

```
> (long) 12345
> (String) metaFieldValue1
> (String) metaFieldValue2
> (long) 23456
```

For other reasons in the system, I have a protocol buffer definition that can be used for shipping this data around.
Its definition looks like this:

```
message DocumentMessage {
  optional int64 timestamp = 1;
  optional string meta_field_1 = 2;
  optional string meta_field_2 = 3;
}
```

I can define an iterator that is passed a generated protocol buffer class, iterates over the whole row, and automatically builds the protocol buffer object. getTopValue() now returns a protocol buffer message that the application knows how to parse. This is nice for bulk loading and saves the client from having to build the message it would have had to build anyway. When iterating over the results returned by the scanner, an entire document is returned at a time instead of individual columns.

I can provide a more concrete example/implementation of such a class if you'd like.
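
For illustration only, the row-folding idea described above might be sketched roughly as follows. This is plain Java with no Accumulo or protobuf dependency, not the iterator from the pull request: `Cell`, `Document`, `foldRows`, the row IDs, and the column names `timestamp`, `meta1`, and `meta2` are all hypothetical stand-ins, and a real version would presumably live inside an iterator and populate the generated `DocumentMessage` builder instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: plain-Java stand-ins, not the Accumulo iterator API or generated protobuf code.
final class RowToDocumentSketch {

    /** One table cell. Hypothetical stand-in for an Accumulo Key/Value pair. */
    static final class Cell {
        final String row;
        final String column;
        final String value;

        Cell(String row, String column, String value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    /** Mirrors the three DocumentMessage fields; null means the optional field is unset. */
    static final class Document {
        Long timestamp;
        String metaField1;
        String metaField2;
    }

    /** Folds cells into one Document per row, keyed by row ID in first-seen order. */
    static Map<String, Document> foldRows(List<Cell> cells) {
        Map<String, Document> docs = new LinkedHashMap<>();
        for (Cell cell : cells) {
            Document doc = docs.computeIfAbsent(cell.row, r -> new Document());
            // Column names are assumed; columns with no matching field are skipped.
            if ("timestamp".equals(cell.column)) {
                doc.timestamp = Long.parseLong(cell.value);
            } else if ("meta1".equals(cell.column)) {
                doc.metaField1 = cell.value;
            } else if ("meta2".equals(cell.column)) {
                doc.metaField2 = cell.value;
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        // Made-up sample data: two rows, the second with only a timestamp.
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("doc1", "timestamp", "12345"));
        cells.add(new Cell("doc1", "meta1", "metaFieldValue1"));
        cells.add(new Cell("doc1", "meta2", "metaFieldValue2"));
        cells.add(new Cell("doc2", "timestamp", "23456"));

        for (Map.Entry<String, Document> e : foldRows(cells).entrySet()) {
            Document d = e.getValue();
            System.out.println(e.getKey() + ": " + d.timestamp + ", " + d.metaField1 + ", " + d.metaField2);
        }
    }
}
```

The caller sees one object per row rather than one entry per column, which is the effect described above for results returned by the scanner.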