hbase-user mailing list archives

From "Gesli, Nicole" <Nicole.Ge...@memorylane.com>
Subject Re: DATA UPLOADTION
Date Mon, 16 Jul 2012 18:00:50 GMT
If you are just trying to find certain text in the data files, only need a bulk process
that creates reports once a day or so, and prefer to use Hive: you can create a table
with a single string column. You need to either pre-process your data to replace the default
column delimiter wherever it occurs, or define a column delimiter that your data does not contain.
That makes sure the entire line is assigned to the column and is not cut off where
the column delimiter occurs. If your query will be different for each file type (flat files, logs,
xls,…), you can create a separate partition for each file type. Dump your files into the
table (or table partition) folder(s). Or you can create external table(s) if your data is
already in HDFS. You can then do a "like" (faster) or "rlike" search on the table.
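
A rough Python sketch of the two ideas above, for illustration only: keeping each raw line
intact in a single column, and the difference between a "like" and an "rlike" search. It assumes
the table uses Hive's default text field delimiter (Ctrl-A, \x01). The function names are
hypothetical, and Python's regex engine only approximates the Java regex syntax that Hive's
"rlike" uses. Escape characters in "like" patterns are not handled.

```python
import re

# Assumption: the table uses Hive's default text field delimiter, Ctrl-A.
HIVE_DEFAULT_FIELD_DELIM = "\x01"


def sanitize_line(line, replacement=" "):
    """Replace the default delimiter so the whole line stays in one column."""
    return line.rstrip("\n").replace(HIVE_DEFAULT_FIELD_DELIM, replacement)


def pick_unused_delimiter(lines, candidates=("\x01", "\x02", "\x1f", "\t", "|")):
    """Return the first candidate character that never occurs in the data, else None."""
    for cand in candidates:
        if not any(cand in line for line in lines):
            return cand
    return None


def like(line, pattern):
    """Emulate SQL LIKE: % matches any run of characters, _ matches exactly one.

    The pattern must match the whole value, so a substring search needs %...%.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), line, flags=re.DOTALL) is not None


def rlike(line, pattern):
    """Emulate RLIKE: true if the regex matches anywhere in the value."""
    return re.search(pattern, line) is not None


if __name__ == "__main__":
    raw = ["ERROR\x01disk full\n", "INFO|started\n"]
    cleaned = [sanitize_line(l) for l in raw]
    print([l for l in cleaned if like(l, "%disk%")])
    print([l for l in cleaned if rlike(l, r"disk\s+full")])
```

The "like" form only has to handle two wildcards, which is one reason it tends to be cheaper
than a full regular-expression match.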

-Nicole

From: Bejoy KS <bejoy_ks@yahoo.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>,
"bejoy_ks@yahoo.com" <bejoy_ks@yahoo.com>
Date: Monday, July 16, 2012 12:50 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Cc: "user@hbase.apache.org" <user@hbase.apache.org>
Subject: Re: DATA UPLOADTION

Hi Yogesh

If you are looking at an indexing and search kind of operation, you can take a look at Lucene.

Whether you are using Hive or HBase, you cannot do any operation without having a table structure
defined for the data. So you need to create tables for each dataset, and only then can you
go ahead and issue queries and generate reports on that data.
Regards
Bejoy KS

Sent from handheld, please excuse typos.
________________________________
From: <yogesh.kumar13@wipro.com>
Date: Mon, 16 Jul 2012 06:21:15 +0000
To: <user@hive.apache.org>
ReplyTo: user@hive.apache.org
Cc: <user@hbase.apache.org>
Subject: RE: DATA UPLOADTION

Hello Debarshi,

Please suggest what tool I should use for these operations over Hadoop DFS.

Regards
Yogesh Kumar

________________________________
From: Debarshi Basak [debarshi.basak@tcs.com]
Sent: Monday, July 16, 2012 11:25 AM
To: user@hive.apache.org
Cc: user@hive.apache.org; user@hbase.apache.org
Subject: Re: DATA UPLOADTION

Hive is not the right way to go about it if you are planning to do search-style operations.


Debarshi Basak
Tata Consultancy Services
Mailto: debarshi.basak@tcs.com
Website: http://www.tcs.com
____________________________________________
Experience certainty. IT Services
Business Solutions
Outsourcing
____________________________________________

----- wrote: -----
To: <user@hive.apache.org>
From: <yogesh.kumar13@wipro.com>
Date: 07/16/2012 09:11AM
cc: <user@hbase.apache.org>
Subject: DATA UPLOADTION

Hi all,

I have data consisting of flat files, log files, images and .xls files, many GB in total.

I need to run operations such as searching and querying over that raw data, and generate reports.
It is impossible to create tables manually for all of them in order to manage them. Is there any
other way out, or a way to manage them using Hive or HBase?

Please suggest how I can perform these operations over them. I want to use Hadoop DFS, and the
files have already been uploaded to HDFS (single user).


Thanks & Regards
Yogesh Kumar

Please do not print this email unless it is absolutely necessary.

The information contained in this electronic message and any attachments to this message are
intended for the exclusive use of the addressee(s) and may contain proprietary, confidential
or privileged information. If you are not the intended recipient, you should not disseminate,
distribute or copy this e-mail. Please notify the sender immediately and destroy all copies
of this message and any attachments.

WARNING: Computer viruses can be transmitted via email. The recipient should check this email
and any attachments for the presence of viruses. The company accepts no liability for any
damage caused by any virus transmitted by this email.

www.wipro.com

=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you
