incubator-blur-user mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Re: Blur shell : command to view all data present in a table
Date Mon, 16 Dec 2013 12:54:34 GMT
Sorry for taking so long to respond.

On Fri, Dec 13, 2013 at 7:41 AM, Naresh Yadav <nyadav.ait@gmail.com> wrote:

> Hi aaron,
>
> I am a little confused about the problem of immediate visibility of data.
> In my case I need guaranteed immediate visibility of the index.
>

This is normal behavior for Lucene-based technologies (for the most part).
There is always some delay between the time data is posted to an index
writer and the time it becomes searchable.  We are going to try to improve
this behavior in 0.3, but more than likely there will always be some sort
of delay.


> I tried the flag on the RowMutation object called waitForVisiblity and set
> it to true. The same program for inserting 17000 rows then started taking
> more than 5 minutes and did not even complete fully; before, it took 1
> minute. It starts throwing an "All connections bad" exception after 5-6
> minutes. If I run with waitForVisiblity=false it works fine in a minute.
>

With only 17,000 rows I would try the batch update version of mutate.
Depending on the size of your rows, batch sizes of around 1,000 may work
well.  As for the exception, if you could send the stack trace back to the
list we can try to debug and fix what's going on.  It may already have been
fixed in the unreleased 0.2.2.
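
A minimal sketch of the batching idea, in plain Java with no Blur
dependency.  The class name BatchSketch, its methods, and the Consumer
used as the "sender" are all hypothetical; the sender only stands in for
whatever batch mutate call your client exposes, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: groups rows into fixed-size batches before sending.
class BatchSketch {

    // Split items into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Hand each batch to the sender and return the number of calls made.
    // The sender is an assumption: a placeholder for a real batch mutate.
    static <T> int sendInBatches(List<T> items, int batchSize,
                                 Consumer<List<T>> sender) {
        int calls = 0;
        for (List<T> batch : partition(items, batchSize)) {
            sender.accept(batch);
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 17000; i++) {
            rows.add(i);
        }
        // No-op sender; replace with the real batch call in practice.
        int calls = sendInBatches(rows, 1000, batch -> { });
        System.out.println(calls + " batch calls"); // prints "17 batch calls"
    }
}
```

With 17,000 rows and a batch size of 1,000 this makes 17 calls instead of
17,000, which is the point of batching.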

I think that something like transactions would likely help in this
situation.  Meaning:

Load all your data.
Commit (or Rollback)
After commit everything is visible.

I have been thinking about adding something like this to Blur for a while,
but with trying to get 0.2.2 production ready I haven't had time to work on
new features.
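
To illustrate the load / commit / rollback idea only, here is a small
in-memory sketch.  Nothing like this exists in Blur today; the class
StagedLoadSketch and all of its methods are hypothetical and exist purely
to show the intended visibility semantics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: rows are staged until commit makes them visible.
class StagedLoadSketch<T> {
    private final List<T> visible = new ArrayList<>();
    private final List<T> staged = new ArrayList<>();

    // Stage a row; it is not searchable yet.
    void load(T row) {
        staged.add(row);
    }

    // Make every staged row visible at once.
    void commit() {
        visible.addAll(staged);
        staged.clear();
    }

    // Discard staged rows without affecting visible data.
    void rollback() {
        staged.clear();
    }

    // Return a copy of what a search would currently see.
    List<T> search() {
        return new ArrayList<>(visible);
    }

    public static void main(String[] args) {
        StagedLoadSketch<String> table = new StagedLoadSketch<>();
        table.load("row-1");
        table.load("row-2");
        System.out.println(table.search().size()); // prints 0
        table.commit();
        System.out.println(table.search().size()); // prints 2
    }
}
```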



>
> My second question is about backups. I tried creating a snapshot and it
> succeeded. Can I see the snapshot in the Windows filesystem, copy it to
> another machine, and import it into HDFS there? (I found no command for
> this.)
>

Snapshots merely freeze the index at a particular point in time and prevent
those files from being deleted.  In a future release there will be a way to
perform MapReduce over these snapshots; you will also be able to control
the index data through snapshots and perform backups.  For now, unless you
write some code to use them, they aren't useful.


>
> Thanks
> Naresh
>
>
>
> On Fri, Dec 6, 2013 at 6:33 PM, Aaron McCurry <amccurry@gmail.com> wrote:
>
> > On Fri, Dec 6, 2013 at 7:54 AM, Naresh Yadav <nyadav.ait@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I have a few questions about Blur; please help me with these:
> > >
> > > 1. Is there a way I can see all rows of data in a Blur table? I did not
> > > find any Blur shell command for this.
> > >
> >
> > This will give you all the rows.
> >
> > query <tablename> *
> >
> >
> > >
> > > 2. Is it possible to delete data with a where clause as a query (similar
> > > to the query command)? I want to delete all data matching two column
> > > values through the Blur shell.
> > >
> >
> > Not yet.  https://issues.apache.org/jira/browse/BLUR-130
> >
> > This shouldn't be difficult to add.
> >
> >
> > >
> > > 3. After storing 17000 rows I ran queries to get each one, and they
> > > returned only 16900 rows. After 5 minutes I ran the queries again and
> > > they returned all 17000 rows. Is there a solution for this? In my
> > > case, just after inserting data, I need to immediately run queries
> > > over it.
> > >
> >
> > There is a delay on visibility of data within Blur.  I believe the default
> > for a given table is 3 seconds; this can be configured by changing this
> > setting:
> >
> > blur.shard.time.between.refreshs=3000
> >
> > In the table properties, or in the blur-site.properties file.
> >
> > Be aware that decreasing this time will also decrease the speed at which
> > mutates can occur.  Also, there is a flag on the RowMutation object called
> > waitForVisiblity; if this is set to true, the mutate command will not
> > return until the data is searchable.  NOTE: This will slow things down!
> > So only do this if you have to wait.
> >
> > Hope this helps.
> >
> > Aaron
> >
> >
> > >
> > > Thanks,
> > > Naresh
> > >
> >
>
