hbase-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Regarding rework in changing column family
Date Wed, 28 Nov 2012 09:36:51 GMT
Ram and Anoop have already said yes to that (that the programs will
have to be changed).

If you are crazy enough: I think you could use a coprocessor on the
write path (RegionObserver#prePut) to mutate each Put so that the
needed columns are also duplicated into the other family (via
Put.add(…)), and perhaps run a periodic job, until the upgrade is
complete, to prune or move the older/duplicate values out via Deletes.
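
To make the duplication step concrete, here is a rough, library-free
sketch of the logic such a prePut hook would carry out. It does not call
the HBase API: a "put" is modelled as nested maps (family -> qualifier ->
value), and the class name, the family names 'cf1'/'cf2' and the
qualifiers are all made up for illustration. In a real coprocessor the
same decision would be made inside RegionObserver#prePut, with Put.add(…)
doing the copy.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Library-free model of the dual-write idea: a "put" is family -> (qualifier -> value).
// All names here are hypothetical; this is not HBase API code.
class DualWriteSketch {

    static final String OLD_FAMILY = "cf1"; // assumed name of the existing family
    static final String NEW_FAMILY = "cf2"; // assumed name of the new family

    // Copies every migrating qualifier found under OLD_FAMILY into NEW_FAMILY.
    // The original cell stays in place so that unchanged readers keep working,
    // and a value the client already wrote to NEW_FAMILY is not overwritten.
    static void duplicateMigratingColumns(Map<String, Map<String, String>> put,
                                          Set<String> migratingQualifiers) {
        Map<String, String> oldCells = put.get(OLD_FAMILY);
        if (oldCells == null) {
            return; // this put does not touch the old family
        }
        for (Map.Entry<String, String> cell : oldCells.entrySet()) {
            if (migratingQualifiers.contains(cell.getKey())) {
                put.computeIfAbsent(NEW_FAMILY, f -> new LinkedHashMap<>())
                   .putIfAbsent(cell.getKey(), cell.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("name", "alice");
        cells.put("notes", "rarely read");
        Map<String, Map<String, String>> put = new LinkedHashMap<>();
        put.put(OLD_FAMILY, cells);

        // Only "notes" is scheduled to move to the new family.
        duplicateMigratingColumns(put, Collections.singleton("notes"));
        System.out.println(put);
    }
}
```

Leaving the old cell in place is deliberate here: the periodic cleanup
job mentioned above would be the one to delete it once readers have moved.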

However, I can't imagine how you'd be able to handle reads gracefully
if your reads already rely on a ColFam name (instead of iterating over
the whole result). The scenario would then be similar to a SQL query
breaking because a column in one of its tables was renamed.
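
For contrast, this is roughly what "iterating over the whole result"
looks like, again as a library-free sketch rather than HBase client code.
A fetched row is modelled as nested maps (family -> qualifier -> value),
and the class and method names are hypothetical. A reader written this
way does not care which family a qualifier currently lives in.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Library-free model of a family-agnostic read: a row is family -> (qualifier -> value).
// All names here are hypothetical; this is not HBase API code.
class FamilyAgnosticRead {

    // Scans every family in the row and returns the first value stored under
    // the given qualifier, whichever family holds it. If the qualifier exists
    // in several families, the first one in the map's iteration order wins.
    static Optional<String> findValue(Map<String, Map<String, String>> row,
                                      String qualifier) {
        for (Map<String, String> cells : row.values()) {
            String value = cells.get(qualifier);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> cf1 = new LinkedHashMap<>();
        cf1.put("name", "alice");
        Map<String, String> cf2 = new LinkedHashMap<>();
        cf2.put("notes", "rarely read");

        Map<String, Map<String, String>> row = new LinkedHashMap<>();
        row.put("cf1", cf1);
        row.put("cf2", cf2);

        // The caller names only the qualifier, never the family.
        System.out.println(findValue(row, "notes"));
        System.out.println(findValue(row, "missing"));
    }
}
```

A reader that instead asks for a fixed family and qualifier pair is the
case that breaks when the column moves.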

On Wed, Nov 28, 2012 at 1:41 PM, Ramasubramanian Narayanan
<ramasubramanian.narayanan@gmail.com> wrote:
> Ram,
>
> In our Java code, we use the following method, which takes the column
> family as one of the parameters of the "Add Record" function:
>
> *addrecord(String TableName, String RowKey, String ColumnFamilyName, String
> Qualifier, String Value);*
>
>
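> // Uses org.apache.hadoop.hbase.client.HTable and Put,
> // org.apache.hadoop.hbase.util.Bytes, and java.io.IOException.
> // conf is assumed to be an HBase Configuration defined elsewhere in the class.
> public static void addRecord(String tableName, String rowKey, String family,
>         String qualifier, String value) throws Exception {
>     HTable table = null;
>     try {
>         table = new HTable(conf, tableName);
>         Put put = new Put(Bytes.toBytes(rowKey));
>         // The column family is named explicitly in every Put.
>         put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier),
>                 Bytes.toBytes(value));
>         table.put(put);
>         System.out.println("insert record " + rowKey + " to table "
>                 + tableName + " ok.");
>     } catch (IOException e) {
>         e.printStackTrace();
>     } finally {
>         if (table != null) {
>             table.close(); // release the resources held by the HTable
>         }
>     }
> }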
>
>
> So if we change the column family, do we need to change all the Java
> programs that use the old column family for a field?
>
> regards,
> Rams
>
> On Wed, Nov 28, 2012 at 11:41 AM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
>> I am afraid they have to be changed, because for your puts to go to the
>> specified column family, the column family name has to appear in the Puts
>> that are created by the client.
>>
>> Regards
>> Ram
>>
>> On Wed, Nov 28, 2012 at 11:18 AM, Ramasubramanian Narayanan <
>> ramasubramanian.narayanan@gmail.com> wrote:
>>
>> > Thanks Ram!!!
>> >
>> > My question is like this:
>> >
>> > Suppose I have created a table with 100 columns in a single column
>> > family, 'cf1'.
>> >
>> > Now, in production, there are billions of records in that table, and
>> > there are multiple programs feeding into it (let us say some 50
>> > programs).
>> >
>> > In this scenario, suppose I change the layout so that the first 40
>> > columns stay in 'cf1' and the last 60 columns move to a new column
>> > family, 'cf2'. In this case, *do we need to change all 50 programs
>> > which are inserting into that table with 'cf1' for all columns?*
>> >
>> > regards,
>> > Rams
>> >
>> > On Wed, Nov 28, 2012 at 10:24 AM, ramkrishna vasudevan <
>> > ramkrishna.s.vasudevan@gmail.com> wrote:
>> >
>> > > As far as I can see, altering the table to add the new column family
>> > > should be the easier part:
>> > > -> Disable the table.
>> > > -> Issue the modify table command with the new column family.
>> > > -> Run a compaction.
>> > > After this, when you start doing your puts, they should be in
>> > > alignment with the new schema defined for the table. One thing you
>> > > may have to watch is how much your put rate is affected, because
>> > > both of your CFs will now flush whenever a memstore flush happens.
>> > >
>> > > Hope this helps.
>> > >
>> > > Regards
>> > > Ram
>> > >
>> > > On Wed, Nov 28, 2012 at 10:10 AM, Ramasubramanian <
>> > > ramasubramanian.narayanan@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I have created a table in HBase with one column family and plan to
>> > > > release it for development (in Pentaho).
>> > > >
>> > > > Suppose that later, after data profiling in production, I find that
>> > > > 200 of the 600 columns are not going to be used frequently. I am
>> > > > planning to group those into another column family.
>> > > >
>> > > > If I change the column family at a later point in time, I expect
>> > > > there will be a lot of rework to do (whether we use Java or
>> > > > Pentaho). Is my understanding correct? Is there any alternative
>> > > > available to avoid this?
>> > > >
>> > > > Regards,
>> > > > Rams
>> > >
>> >
>>



-- 
Harsh J
