hive-user mailing list archives

From "Grant Overby (groverby)" <grove...@cisco.com>
Subject Re: External Table with unclosed orc files.
Date Tue, 14 Apr 2015 22:09:34 GMT
Submitting patches or test cases is tricky business for a Cisco employee.
I’ll put in the legal admin effort to get approval to do this. :/ The
majority of the issues I mentioned /should/ find their way to apache via
hortonworks.


Additional responses are inline.

On 4/14/15, 5:28 PM, "Gopal Vijayaraghavan" <gopalv@apache.org> wrote:

>
>>0.14. Acid tables have been a real pain for us. We don't believe they are
>>production ready. At least in our use cases, Tez crashes for assorted
>>reasons or only assigns 1 mapper to the partition. Having delta files and
>>no base files borks mapper assignments.
>
>Some of the chicken-egg problems for those were solved recently in
>HIVE-10114.
>
>Then TEZ-1993 is coming out in the next version of Tez, into which we're
>plugging in HIVE-7428 (no fix yet).
>
>Currently delta-only splits have 0 bytes as the "file size", so they get
>grouped together to make a 16 MB chunk (or rather, a huge single 0-sized
>split).
>
>Those patches are the effect of me shaving the yak from the "1 mapper"
>issue.
>
>After which the writer has to follow up on HIVE-9933 to get the locality
>of files fixed.

I’ll look into this. If the 1 mapper issue is solved, that would be a huge
win for streaming for us.


>
>>name are left scattered about, borking queries. Latency is higher with
>>streaming than writing to an orc file in hdfs, forcing obscene quantities
>>of buckets and orc files smaller than any reasonable orc stripe / hdfs
>>block size. The compactor hangs seemingly at random for no reason we've
>>been able to discern.
>
>I haven't seen these issues yet, but I am not dealing with a large volume
>insert rate, so haven't produced latency issues there.
>
>I work on Hive performance and haven't seen too many bugs filed, so I
>haven't paid attention to the performance of ACID.
>
>Please file bugs when you find them, so that it appears on the radar for
>folks like me.
>
>I'm poking about because I want a live stream into LLAP to work seamlessly
>& return sub-second query results when queried (pre-cache/stage & merge
>etc).

These files aren’t orc, but hive expects them to be, leading to errors.
They are made by using the hive streaming api.

root@twig13:~# hdfs dfs -ls -R /apps/hive/warehouse/events.db/connection_events4/ | grep flush | head -n 1
-rw-r--r-- 3 storm hadoop 200 2015-04-09 17:12 /apps/hive/warehouse/events.db/connection_events4/dt=1428613200/delta_11714703_11714802/bucket_00007_flush_length
root@twig13:~# hdfs dfs -ls -R /apps/hive/warehouse/events.db/connection_events4/ | grep flush | wc -l
283

This may be addressed by HIVE-8966, which is in the 1.0.0 release. A kill -9
to the process writing to hive is a near-guaranteed way to leave these
orphaned flush files, but we have also seen them on several occasions when
there was no indication that .close() was skipped.

Our insert rate is about 100k/s for a 4 box cluster. Storm, Kafka, Hdfs,
Hive, etc are ‘pancaked’ on this cluster. To keep up with this insert rate
we need somewhere between 64 and 128 buckets for streaming to support an
equal number of threads. We can keep up this same pace when writing orc
files directly to hdfs with only 8 threads and thus 8 orc files. The orc
files from streaming are on the order of 5 MB apiece (15min insert-time
base partitions). Even if orc stripes this small aren’t a problem, it’s
still going to waste a lot of disk space due to hdfs block size.
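
To put rough numbers on the figures above, here is a back-of-the-envelope sketch. It assumes every streaming bucket file is about 5 MB and that the same volume would be split evenly across the 8 direct writers; both are simplifying assumptions, not measurements, and the function names are made up for illustration.

```python
# Figures quoted in the message; everything derived from them is an estimate.
INSERT_RATE_PER_S = 100_000      # "about 100k/s"
PARTITION_SECONDS = 15 * 60      # 15-minute partitions
STREAMING_FILE_MB = 5            # "on the order of 5mb" per bucket file
DIRECT_WRITERS = 8               # threads when writing orc files directly


def partition_volume_mb(buckets: int, file_mb: float = STREAMING_FILE_MB) -> float:
    """Approximate data volume of one partition across all bucket files."""
    return buckets * file_mb


def per_file_mb(total_mb: float, writers: int = DIRECT_WRITERS) -> float:
    """File size if the same volume were split evenly across `writers` files."""
    return total_mb / writers


if __name__ == "__main__":
    rows = INSERT_RATE_PER_S * PARTITION_SECONDS
    print(f"rows per partition: {rows:,}")
    for buckets in (64, 128):
        total = partition_volume_mb(buckets)
        print(f"{buckets} buckets: ~{total:.0f} MB total, "
              f"~{per_file_mb(total):.0f} MB per file with {DIRECT_WRITERS} writers")
```

Under those assumptions a 15-minute partition holds roughly 320 to 640 MB, so 8 direct writers would produce files of roughly 40 to 80 MB each instead of 5 MB.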


>
>>An orc file without a footer is junk data (or, at least, the last stripe
>>is junk data). I suppose my question should have been 'what will the hive
>>query do when it encounters this? Skip the stripe / file? Error out the
>>query? Something else?'
>
>It should throw an exception, because that's a corrupt ORC file.
>
>The trucking demo uses Storm without ACID - this is likely to get better
>once we use Apache Falcon to move the data around.
>
>Cheers,
>Gopal
>
>

I suppose the best thing to do then is to write the orc file outside of
the partition directory, then issue an mv once the file is closed?
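
A minimal sketch of that write-then-move idea, using the local filesystem as a stand-in for hdfs (an assumption made so the example is runnable; on a real cluster the move would be an hdfs rename such as hdfs dfs -mv). The function name, the directory layout, and the byte payload standing in for an orc writer are all hypothetical.

```python
import os
from pathlib import Path


def publish_when_closed(staging_dir: Path, partition_dir: Path,
                        name: str, payload: bytes) -> Path:
    """Write a file in a staging directory, then move it into the partition.

    Readers of partition_dir never see a half-written file, because the
    file only appears there after it has been fully written and closed.
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    partition_dir.mkdir(parents=True, exist_ok=True)

    tmp_path = staging_dir / name
    with open(tmp_path, "wb") as f:
        f.write(payload)          # stand-in for the real orc writer
        f.flush()
        os.fsync(f.fileno())      # make sure the bytes are on disk first

    final_path = partition_dir / name
    # Rename is atomic when both paths are on the same filesystem.
    os.replace(tmp_path, final_path)
    return final_path
```

If the writer is killed before the rename, the partial file stays in the staging directory, where queries against the partition will not pick it up. The staging directory should live on the same filesystem as the table so the move stays a cheap rename rather than a copy.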

>
