From: Gian Maria Ricci - aka Alkampfer
To: solr-user@lucene.apache.org
Subject: Best practice for incremental Data Import Handler
Date: Mon, 14 Dec 2015 18:29:34 +0100

Hi,

I just want some feedback on best practices for running an incremental DIH. In recent years I have always preferred a dedicated application that pushes data into ElasticSearch / Solr, but now I am in a situation where we are forced to use DIH.

I have several SQL Server databases with a column of type timestamp (I'm trying to understand whether it is possible to use a standard DateTime column instead).

In the past I wrote a very simple C# routine that executes these macro steps:

1) Query Solr to check whether the DIH is already running (to avoid problems if multiple instances fire together)
2) Query Solr to get the document with the highest timestamp value
3) Launch DIH, passing the highest timestamp value to do an incremental population (greater than or equal)
4) Monitor DIH and wait for it to finish.

I have never had problems with this approach, but I am wondering whether there is a better approach than a custom routine that manages running DIH. I am also in a situation where we are not allowed to run C# code, so we would have to rewrite that simple program in Node.js or a plain bash shell script. My aim is not to reinvent the wheel :)

Thanks for any suggestions you can give me.
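For reference, the four macro steps above can be sketched as a bash routine against the DIH HTTP API. This is only a sketch under assumptions: the core name (mycore), the timestamp field name (last_modified), and the request parameter name (last_ts) are all placeholders, not names from my actual setup. The parameter passed on the URL is readable inside data-config.xml as ${dataimporter.request.last_ts}.

```shell
#!/usr/bin/env bash
# Sketch of the incremental-DIH routine. Assumed names: core "mycore",
# timestamp field "last_modified", request parameter "last_ts".
SOLR="${SOLR_URL:-http://localhost:8983/solr/mycore}"

# Pull a string field value out of a JSON response (avoids a jq dependency).
json_field() { grep -o "\"$1\":\"[^\"]*\"" | head -1 | cut -d'"' -f4; }

dih_status() { curl -s "$SOLR/dataimport?command=status&wt=json" | json_field status; }

run_incremental_import() {
  # 1) Bail out if another import is already running.
  if [ "$(dih_status)" = "busy" ]; then
    echo "DIH already running, exiting."
    return 0
  fi

  # 2) Ask Solr for the highest timestamp currently indexed.
  local last
  last=$(curl -s "$SOLR/select?q=*:*&sort=last_modified+desc&rows=1&fl=last_modified&wt=json" \
         | json_field last_modified)

  # 3) Launch an incremental import, passing the timestamp as a request
  #    parameter (visible in data-config.xml as ${dataimporter.request.last_ts}).
  curl -s "$SOLR/dataimport?command=full-import&clean=false&last_ts=$last" >/dev/null

  # 4) Poll the status endpoint until the handler goes idle again.
  while [ "$(dih_status)" = "busy" ]; do
    sleep 5
  done
  echo "Import finished."
}

# Run only when invoked explicitly, e.g.:  ./incremental-dih.sh run
if [ "${1:-}" = "run" ]; then
  run_incremental_import
fi
```

The clean=false flag keeps the existing index intact so the import only adds or updates documents; the SQL in data-config.xml would filter on the passed timestamp with a greater-than-or-equal condition.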
--
Gian Maria Ricci
Cell: +39 320 0136949
