lucene-solr-user mailing list archives

From "Shalin Shekhar Mangar" <shalinman...@gmail.com>
Subject Re: cron job update index
Date Wed, 17 Sep 2008 16:01:56 GMT
On Wed, Sep 17, 2008 at 9:14 PM, sunnyfr <johanna.34@gmail.com> wrote:

>
> Am I doing something wrong?
> Every time I manually start a delta-import
> (.../dataimport?command=delta-import)
> and then go back to check the status at http://.../solr/books/dataimport,
> it is still running, as if it will never finish:
>
> <str name="status">busy</str>
> <str name="importResponse">A command is still running...</str>
> <lst name="statusMessages">
> <str name="Time Elapsed">0:13:54.194</str>
> <str name="Total Requests made to DataSource">881696</str>
> <str name="Total Rows Fetched">2418310</str>
> <str name="Total Documents Processed">125956</str>
> <str name="Total Documents Skipped">0</str>
> <str name="Delta Dump started">2008-09-17 17:24:07</str>
> <str name="Identifying Delta">2008-09-17 17:24:07</str>
> <str name="Deltas Obtained">2008-09-17 17:24:49</str>
> <str name="Building documents">2008-09-17 17:24:49</str>
> <str name="Total Changed Documents">390796</str>
> </lst>
>
>
> This happens even when I have just done a full-import. Where can I check,
> in the stats or elsewhere, what the delta-import actually changed?
>
> Does it loop because I didn't put a parentDeltaQuery in my
> data-config? Is that the reason?
>
>  <entity name="books"
>            pk="books.book_id"
>            transformer="RegexTransformer"
>            deltaQuery="SELECT book_id FROM book INNER JOIN user USING(user_id)
>                          WHERE book.modified > '${dataimporter.last_index_time}'
>                             OR user.modified > '${dataimporter.last_index_time}'"
>            query="SELECT ..."
>  >
>  ....
>

When you issue a delta-import command, the deltaQuery is executed first to
identify the primary keys of the rows that have changed since the last run
(using the last_index_time variable). Then the main "query" is executed once
for each primary key identified by the deltaQuery. This main query is used to
create the documents and index them.
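
The two phases can be sketched roughly as follows. This is only an
illustration in Python with an in-memory SQLite database, not how
DataImportHandler is implemented; the function name delta_import, the title
column and the sample rows are assumptions made up for the example.

```python
import sqlite3

def delta_import(conn, last_index_time):
    """Mimic the two phases of a delta-import (illustrative only, not DIH code)."""
    # Phase 1: the deltaQuery collects primary keys changed since the last run.
    changed_ids = [row[0] for row in conn.execute(
        "SELECT book_id FROM book WHERE modified > ?", (last_index_time,))]
    # Phase 2: the main query runs once per key to build each document.
    docs = []
    for book_id in changed_ids:
        row = conn.execute(
            "SELECT book_id, title FROM book WHERE book_id = ?",
            (book_id,)).fetchone()
        docs.append({"id": row[0], "title": row[1]})
    return docs

# Hypothetical sample data: only book 2 was modified after the last index time.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, modified TEXT)")
conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [
    (1, "Old title", "2008-09-01 10:00:00"),
    (2, "New title", "2008-09-17 18:00:00"),
])
docs = delta_import(conn, "2008-09-17 17:24:07")
```

Because the second phase issues at least one query per changed key, the
number of requests made to the data source grows with the number of changed
rows.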

In the output you pasted, "Total Changed Documents" is the number of rows that
have changed and "Total Documents Processed" is the number of documents it has
indexed so far. Once it completes, it will show the time at which it committed
the changes to the index.
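
If you want to check progress from a script (for example from the cron job),
something like the sketch below could read those two counters from the status
page. The <response> wrapper element and the helper name import_progress are
assumptions; only the element names and numbers inside it are taken from the
output you pasted.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the pasted status output; the <response> root is assumed.
SAMPLE = """<response>
  <str name="status">busy</str>
  <lst name="statusMessages">
    <str name="Total Documents Processed">125956</str>
    <str name="Total Changed Documents">390796</str>
  </lst>
</response>"""

def import_progress(xml_text):
    """Return (status, processed, changed, fraction done) from a status response."""
    root = ET.fromstring(xml_text)
    status = root.find("./str[@name='status']").text
    # Collect every statusMessages entry into a name -> value mapping.
    messages = {e.get("name"): e.text
                for e in root.findall("./lst[@name='statusMessages']/str")}
    processed = int(messages["Total Documents Processed"])
    changed = int(messages["Total Changed Documents"])
    return status, processed, changed, processed / changed

status, processed, changed, fraction = import_progress(SAMPLE)
```

With the numbers above, the import has processed roughly 32% of the changed
documents, so it is still working through them rather than looping.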

The parentDeltaQuery is needed only when you have nested entities and you
need to identify changes in the child entities.
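
As a rough picture of what a parentDeltaQuery does, assume for illustration
that user were a nested child entity of books (it is not in your config, where
it is joined in the deltaQuery instead). The child's deltaQuery finds changed
user rows, and the parentDeltaQuery maps each of them back to the book keys
that must be re-indexed. The helper name and the sample rows are made up.

```python
import sqlite3

def parent_ids_for_child_changes(conn, last_index_time):
    """Map changed child rows to parent keys (illustrative only, not DIH code)."""
    # Child deltaQuery: which user rows changed since the last run?
    changed_users = [r[0] for r in conn.execute(
        "SELECT user_id FROM user WHERE modified > ?", (last_index_time,))]
    # parentDeltaQuery: map each changed child row back to its parent keys.
    parents = set()
    for user_id in changed_users:
        for (book_id,) in conn.execute(
                "SELECT book_id FROM book WHERE user_id = ?", (user_id,)):
            parents.add(book_id)
    return sorted(parents)

# Hypothetical sample data: user 10 changed recently and owns books 1 and 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (user_id INTEGER PRIMARY KEY, modified TEXT)")
conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO user VALUES (?, ?)", [
    (10, "2008-09-17 18:00:00"), (11, "2008-09-01 09:00:00")])
conn.executemany("INSERT INTO book VALUES (?, ?)", [(1, 10), (2, 11), (3, 10)])
parent_ids = parent_ids_for_child_changes(conn, "2008-09-17 17:24:07")
```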




-- 
Regards,
Shalin Shekhar Mangar.
