incubator-blur-dev mailing list archives

From "Aaron McCurry (JIRA)" <>
Subject [jira] [Commented] (BLUR-220) Support for humongous Rows
Date Thu, 17 Oct 2013 12:30:44 GMT


Aaron McCurry commented on BLUR-220:

A row query queries all the records within a single row, that is, all the records whose
rowids are equal to one another.

And yes, the requirement that they be back-to-back is strictly for performance.
;a=blob;f=blur-query/src/main/java/org/apache/blur/lucene/search/;h=6939bd68e890e33b1e812769817e91837b502a17;hb=515b09a002cbbb67c1ed22af90303a5f69135eb0
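
To illustrate why back-to-back storage can matter for performance, here is a minimal sketch. It is a toy model, not Blur's or Lucene's code: documents are plain `(rowid, recordid)` tuples, a list position stands in for a document id, and both function names are hypothetical. When the records of a row are contiguous, collecting the row from its first document is a short forward scan. When they are scattered, every document has to be checked.

```python
from typing import List, Tuple

# Toy model (assumption): a document is (rowid, recordid) and its list
# position plays the role of a document id. This is not the Blur/Lucene API.
Doc = Tuple[str, str]


def records_in_row_contiguous(docs: List[Doc], first_doc: int) -> List[int]:
    """Collect the doc ids of one row, assuming its records are back-to-back."""
    rowid = docs[first_doc][0]
    result = []
    doc = first_doc
    # Stop as soon as the rowid changes; work is proportional to the row size.
    while doc < len(docs) and docs[doc][0] == rowid:
        result.append(doc)
        doc += 1
    return result


def records_in_row_scattered(docs: List[Doc], rowid: str) -> List[int]:
    """Collect the doc ids of one row with no ordering guarantee."""
    # Every document must be inspected; work is proportional to the index size.
    return [i for i, (r, _) in enumerate(docs) if r == rowid]


docs = [("a", "1"), ("a", "2"), ("b", "1"), ("b", "2"), ("b", "3"), ("c", "1")]
row_b_fast = records_in_row_contiguous(docs, 2)
row_b_slow = records_in_row_scattered(docs, "b")
```

Both calls return the same document ids for row `b`, but the first touches only the row's own records plus one more.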

I haven't had time to look at your code yet; I will try to tonight.  By the sound of it you are
heading down the road that I was thinking about, but I fear that at scale the time to merge the
filter segment will be very large.  So let me take a look and play with the code a bit.



> Support for humongous Rows
> --------------------------
>                 Key: BLUR-220
>                 URL:
>             Project: Apache Blur
>          Issue Type: Improvement
>          Components: Blur
>    Affects Versions: 0.3.0
>            Reporter: Aaron McCurry
>             Fix For: 0.3.0
>         Attachments: Blur_Query_Perf_Chart1.pdf,,,,,, test_results.txt,,
> One of the limitations of Blur is the size of the Rows stored, specifically the number of Records.
> Currently, updates in Lucene are performed by deleting the document and re-adding it to
> the index.  Unfortunately, when any update is performed on a Row in Blur, the entire Row has
> to be re-read (if the RowMutationType is UPDATE_ROW), whatever modifications are needed
> are made, and then it is reindexed in its entirety.
> Due to all of this overhead, there is a realistic limit on the size of a given Row.
> It may vary based on the kind of hardware being used, but as the Row grows in size, indexing
> (mutations) against that Row will slow.
> This issue is being created to discuss techniques on how to deal with this problem.
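
The cost described in the issue can be sketched with a toy model. This is an illustration under stated assumptions, not Blur's implementation: `ToyIndex`, `replace_row`, `update_row`, and `cost_of_appends` are hypothetical names, and "indexing" is reduced to counting records written. Each update re-reads the whole row, applies the change, and writes the whole row back, so appending records one at a time to a single row indexes 1 + 2 + ... + n records in total.

```python
from typing import Dict, List


class ToyIndex:
    """Toy stand-in for an index that can only delete and re-add whole rows."""

    def __init__(self) -> None:
        self.rows: Dict[str, List[dict]] = {}
        self.records_indexed = 0  # running count of records written

    def replace_row(self, rowid: str, records: List[dict]) -> None:
        # Delete the old row (if any), then re-add every record.
        self.rows.pop(rowid, None)
        self.rows[rowid] = list(records)
        self.records_indexed += len(records)

    def update_row(self, rowid: str, new_record: dict) -> None:
        # Re-read the entire row, apply the change, reindex it in its entirety.
        existing = list(self.rows.get(rowid, []))
        existing.append(new_record)
        self.replace_row(rowid, existing)


def cost_of_appends(n: int) -> int:
    """Total records indexed when n records are appended one by one to one row."""
    index = ToyIndex()
    for i in range(n):
        index.update_row("row-1", {"recordid": str(i)})
    return index.records_indexed
```

In this model `cost_of_appends(n)` equals n(n+1)/2, so the work per mutation grows with the size of the Row, which matches the slowdown the issue describes.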

This message was sent by Atlassian JIRA
