gobblin-user mailing list archives

From Mohan <mohandoss...@gmail.com>
Subject Re: Zero byte file, need help on Gobbli
Date Mon, 13 Nov 2017 07:02:56 GMT
We fetch data from a Kafka topic every 5 minutes and load it into HDFS.
During loading, the job sometimes creates a zero-byte file. Our job
configuration is:

bootstrap.with.offset=latest
converter.classes=com.mmk.gobblin.LogMessageToAvroConverter
data.publisher.final.dir=${env:DATA_DIR}
data.publisher.permissions=775
data.publisher.replace.final.dir=false
data.publisher.type=gobblin.publisher.TimePartitionedDataPublisher
extract.limit.enabled=true
extract.limit.time.limit=3
extract.limit.time.limit.timeunit=minutes
extract.limit.type=time
extract.namespace=mmk.extract.kafka
job.description=Gobblin job to extract Hotel Avail logs
job.lock.dir=${env:GOBBLIN_WORK_DIR}/${job.name}
job.name=sample_job
kafka.brokers=192.168.0.1:9092
launcher.type=MAPREDUCE
metrics.enabled=false
metrics.report.interval=60000
metrics.reporting.file.enabled=true
metrics.log.dir=/app/gobblin/0.9.0/logs
mr.job.root.dir=${env:GOBBLIN_WORK_DIR}/working
reset.on.offset.out.of.range=nearest
source.class=gobblin.source.extractor.extract.kafka.KafkaSimpleSource
state.store.dir=${env:GOBBLIN_WORK_DIR}/${job.name}/statestore
writer.builder.class=com.mmk.gobblin.ParquetDataWriterBuilder
writer.destination.type=HDFS
writer.dir.permissions=775
writer.file.path=logs
writer.file.permissions=644
writer.output.dir=${env:GOBBLIN_WORK_DIR}/${job.name}/output
writer.output.format=PARQUET
writer.staging.dir=${env:GOBBLIN_WORK_DIR}/${job.name}/staging
writer.partitioner.class=com.mmk.gobblin.writer.partitioner.MmkSchemaTimestampPartitioner
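
To narrow down which runs produce the empty files, a small check along these lines could be run after each job. This is only a sketch under assumptions: it assumes the published directory (data.publisher.final.dir) is reachable as a local or mounted path, which may not hold for a plain HDFS cluster, and the function name find_zero_byte_files is hypothetical, not part of Gobblin.

```python
import os
import sys


def find_zero_byte_files(root):
    """Return a sorted list of paths under `root` whose size is 0 bytes.

    Hypothetical helper for illustration; it only walks a local or
    mounted filesystem path and does not talk to HDFS directly.
    """
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip entries that are not regular files (e.g. broken symlinks).
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


if __name__ == "__main__":
    # Usage: python find_empty.py <published-output-directory>
    if len(sys.argv) != 2:
        sys.exit("usage: find_empty.py <directory>")
    for path in find_zero_byte_files(sys.argv[1]):
        print(path)
```

Comparing the timestamps of the reported files with the 5-minute schedule may show whether the empty files line up with runs in which the topic had no new messages.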


On Nov 13, 2017 11:42 AM, "Vicky Kak" <vicky.kak@gmail.com> wrote:

> Please explain your use case and attach the corresponding job
> configuration and gobblin log file if possible.
>
> On Mon, Nov 13, 2017 at 11:02 AM, Mohan <mohandoss.tr@gmail.com> wrote:
>
>> Sometimes I get a zero-byte Parquet file. Could you please tell me what
>> might cause this, and whether it depends on the size of the data?
>>
>> What is the maximum data range Gobblin can handle without any issue?
>>
>
>
