If I understand correctly, in this test case, `result.to_array().equals(pa.array(range(1000), type=pa.uint32()))` asserts that the selection vector holds the integer indices in [0, 1000). What I am looking for, though, is the array in the filtered record batch, which here should be an array of floats. I know I can iterate over the indices in the selection vector and use each one to retrieve the corresponding row from the original record batch columns, but I am not certain this is the right way to do it. For example, if the original record batch has multiple columns, do I need to iterate over the selection vector once per column to filter each of them? Since this is a common task, I would expect there to be an easy and efficient API for it.
Basically, I am looking for something like:
selection_vector = filter.evaluate(record_batch, pa.default_memory_pool())
filtered_column_arrays_in_record_batch = record_batch.filter(selection_vector) # what is the API for doing this?
I wonder if the filtering can be done without creating a projection expression. At the same time, if a projector is expected to be used for this, which projector expressions should I use if I want to keep all the columns as they are, with only the rows that fail the given criteria filtered out?