apex-users mailing list archives

From Priyanka Gugale <pri...@apache.org>
Subject Re: Reading compressed file using FileSplitter
Date Mon, 03 Oct 2016 10:36:31 GMT
Hi Chiranjeevi,

The current operators have no direct support for decompressing data read
from a file, but you can do it in one of the following ways:
1. Extend AbstractBlockReader to use the right STREAM type by implementing
the `setupStream` function to initialize the appropriate stream reader
class, e.g. GZIPInputStream if your input were in gzip format, or in your
case a Snappy input stream.
2. Override `readBlock` from AbstractBlockReader, call decompress on the
input data using the Snappy Java API, and then emit the data.
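
The stream-wrapping idea behind option 1 can be sketched outside Apex
using only the JDK. This is a minimal illustration, not Malhar code: the
class `DecompressSketch` and its methods are hypothetical names, and gzip
(`java.util.zip`) stands in for Snappy because the Snappy library in use
is not yet known. It wraps the raw input in a decompressing stream and
splits the result on Ctrl+A (0x01). It also assumes the whole file is read
as one stream, since a gzip file generally cannot be decompressed starting
from an arbitrary block offset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration; not part of Apex Malhar.
class DecompressSketch {
    // Ctrl+A, the record separator mentioned in the original question.
    static final int CTRL_A = 0x01;

    // Helper for the demo: gzip-compress a byte array in memory.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Wrap the compressed stream in a decompressing stream (the step a
    // setupStream override would perform), then split on Ctrl+A.
    static List<String> readRecords(InputStream compressed) {
        List<String> records = new ArrayList<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(compressed)) {
            int b;
            while ((b = in.read()) != -1) {
                if (b == CTRL_A) {
                    records.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
                    current.reset();
                } else {
                    current.write(b);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Emit the trailing record if the data does not end with a separator.
        if (current.size() > 0) {
            records.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] packed = gzip("id1\u0001alice\u000142".getBytes(StandardCharsets.UTF_8));
        System.out.println(readRecords(new ByteArrayInputStream(packed)));
    }
}
```

With a Snappy library on the classpath, the `GZIPInputStream` line would be
replaced by that library's input stream class, provided it supports the
format your files were written in.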

I would suggest option 1, but what is achievable depends on which
Snappy Java library you use. Can you tell us which library you are using?


On Mon, Oct 3, 2016 at 2:42 PM, chiranjeevi vasupilli <chiru.vcj@gmail.com>
wrote:

> Hi Priyanka,
> We are getting a compressed file from the source, which we need to read
> and decompress so that we can process the actual data.
> Can you please point us to any readily available reader/operator that
> decompresses the data while reading it in DataTorrent?
> On Mon, Oct 3, 2016 at 1:07 PM, Priyanka Gugale <priyag@apache.org> wrote:
>> Hi,
>> Do you want to read the files in compressed form only, or do you want
>> your program to decompress and read them?
>> If you want to read it in compressed format you can use FSInputModule
>> (which uses FileSplitter and block reader) directly to read your files.
>> If you want to uncompress while reading, there are other options you can
>> choose. I will explain in detail once you confirm this is what you are
>> trying to achieve.
>> -Priyanka
>> On Mon, Oct 3, 2016 at 12:38 PM, chiranjeevi vasupilli <
>> chiru.vcj@gmail.com> wrote:
>>> Hi Team,
>>> Can you please provide any reader/operator that is capable of reading
>>> compressed data in DataTorrent?
>>> I have a requirement to read .snappy files with a Ctrl+A separator
>>> using FileSplitter. Can you please let me know how to do it?
>>> --
>>> thanks
>>> chiru
> --
> ur's
> chiru
