Ok, I've been able to reproduce this. I've filed https://issues.apache.org/jira/browse/ORC-569 .

.. Owen

On Fri, Nov 8, 2019 at 1:30 PM Owen O'Malley <owen.omalley@gmail.com> wrote:
For bugs or feature requests, please file a jira: http://orc.apache.org/jira

If the positions list is empty, that is a bug. Can you describe how you generated the ORC file? Which version of the software was used?


On Fri, Nov 8, 2019 at 11:59 AM Andrew S. <luvshaknc@yahoo.com> wrote:
I'm starting to work with the ORC C++ API for a project I'm on.
In a very basic test, I am seeing an assertion failure that I think is a bug. As I'm not sure where to report bugs (and also, as I've just started looking at this API and may be misusing it), I thought I'd post here.

It is easy to reproduce: Take the Filecontents.cpp program which is the source for orc-contents example application and add the following code right before the while loop that calls rowReader->next(*batch):


Then, with the example file TestOrcFile.testSeek.orc as input, orc-contents crashes with an assertion failure because it dereferences an iterator that points to end(). Other example data files work fine in this case.

Looking at PositionProvider::PositionProvider in InputStream.cc, it assigns posns.begin() to the position variable. In this specific case, the posns list is empty for some of these assignments, so begin() equals end() and the position variable ends up holding an iterator that points to end(). After that, when seekToRowGroup is called, execution eventually reaches SeekableFileInputStream::seek(), which calls next() on that PositionProvider; next() dereferences the iterator, and that triggers the assertion.
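
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the ORC code: GuardedPositionProvider and emptyListIsDetected are hypothetical names made up for this example, and the guard is just one possible way to turn the bad dereference into a reportable error. It assumes the positions arrive as a std::list<uint64_t>, as the description of posns suggests.

```cpp
#include <cstdint>
#include <list>
#include <stdexcept>

// Hypothetical stand-in for a position provider; NOT the ORC implementation.
// It remembers end() as well as begin(), so next() can detect an exhausted
// or empty list instead of dereferencing an end() iterator.
class GuardedPositionProvider {
 public:
  // The list must outlive the provider, since only iterators are stored.
  explicit GuardedPositionProvider(const std::list<uint64_t>& posns)
      : position_(posns.begin()), end_(posns.end()) {}

  // Returns the next recorded position, or throws if none remain.
  uint64_t next() {
    if (position_ == end_) {
      throw std::out_of_range("position list is empty or exhausted");
    }
    return *position_++;
  }

 private:
  std::list<uint64_t>::const_iterator position_;
  std::list<uint64_t>::const_iterator end_;
};

// Returns true if calling next() on an empty list is reported as an
// exception rather than reaching the dereference.
bool emptyListIsDetected() {
  const std::list<uint64_t> empty;
  GuardedPositionProvider provider(empty);
  try {
    provider.next();
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}
```

Without such a check, dereferencing the end() iterator is undefined behavior; a debug build with checked iterators reports it as an assertion, which would match what I'm seeing with VC 2017.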

I'm on Win10 using VC 2017.

Any idea if this is a bug, or am I missing something?