manifoldcf-user mailing list archives

From Bert van Hoesel <bhoe...@scamander.com>
Subject Re: Oracle jdbc queried documents not read and not ingested into solr
Date Tue, 12 Feb 2013 10:15:33 GMT
Hi Karl,

To answer my own question (as usual :-X  just after the fact).

I was looking at manifoldcf.log in the wrong directory.

The problem was a malformed URL, which is now solved.

Thanks.

Regards,

Bert.

On 02/12/2013 09:36 AM, Bert van Hoesel wrote:
Hi Karl,

Thanks for the response.

There is nothing in the manifoldcf.log file; it is completely empty. I even tried setting
several properties in properties.xml (org.apache.manifoldcf.misc, plus the agents, jobs, perf
and cache equivalents) to INFO or DEBUG, but the log file stays completely empty at all times.

Am I missing another setting to activate the log?

As for the data query: as far as I can tell from the docs, it should be all right.
The query was shown at the end of the original email (see below). What columns other than
the documented minimal set of IDCOLUMN, DATACOLUMN, URLCOLUMN and IDLIST should be present
in the query?

Regards,

Bert

On 02/11/2013 02:01 PM, Karl Wright wrote:

Is there anything in the manifoldCF log?

The other point is that the queries may run but unless you return the
right information in the right columns it isn't much use to
ManifoldCF.  For the seeding query there isn't much that can go wrong,
except maybe for the start/end time clauses.  Might want to look into
that.

Karl

On Mon, Feb 11, 2013 at 3:57 AM, Bert van Hoesel <bhoesel@scamander.com> wrote:


Hi,

I have set up an Oracle JDBC repository and a Solr output. Obviously I want
to get the Oracle documents ingested into the Solr output. But when
running a job to get this done, nothing gets read and ingested into the
Solr output.

The Solr output is working, since I can get documents ingested into the
Solr output when using a file system repository.

The Oracle queries are correct as far as I can check. The seeding and
data queries work when issued directly to Oracle.

But the Oracle-Solr combination is not working. The Simple History
shows that the job is started and that the two external queries execute
without error. But no 'read' and 'ingest' actions show up in the history.
The document count in the job status shows the number of documents and the
number processed. These are the expected numbers.

As far as I can see in the Oracle database, the queries are indeed
executed by the database. When I copy the external queries from the
history, including the bind data shown, and run them (after a little
editing) in a direct Oracle interface, they do retrieve the expected rows.

What am I missing to get the documents (rows) from Oracle ingested into
Solr?

Thanks in advance.

Regards,

Bert van Hoesel.

PS:
Please find below the seeding and data queries, both as defined and as
shown in the history.

SEED:
=====
select mi.menu_id as "$(IDCOLUMN)"
from   sn_menu_items mi
where  1=1
and    mi.menu_type     = 'W'
and    greatest(mi.dt_created, nvl(mi.dt_updated, mi.dt_created)) >
to_date( '1970/01/01:00:00:00', 'yyyy/mm/dd:hh24:mi:ss') +
round($(STARTTIME)/86400000)
and    greatest(mi.dt_created, nvl(mi.dt_updated, mi.dt_created)) <=
to_date( '1970/01/01:00:00:00', 'yyyy/mm/dd:hh24:mi:ss') +
round($(ENDTIME)/86400000)
and    greatest(mi.dt_created, nvl(mi.dt_updated, mi.dt_created)) >
sysdate - 200 /* just for testing */
connect by prior mi.menu_id = mi.top_menu_id
start with mi.menu_id in (185837,275)

DATA:
=====
select mi.menu_id "$(IDCOLUMN)"
,      'Thiz iz id: ' || to_char(mi.menu_id) "$(DATACOLUMN)"
,      '<a href="http://www.somesite.com/xx/xx_wiki.ht_show?p_id='||mi.wiki_id||chr(38)||'p_fld='||mi.top_menu_id||'">'||mi.display_text||'</a>' "$(URLCOLUMN)"
from   sn_menu_items mi
,      sn_wiki_item  wi
where  mi.menu_id    in $(IDLIST)
and    mi.wiki_id     = wi.id

The external queries from the history:

SEED:
=====
select mi.menu_id as "lcf__id" from sn_menu_items mi wher...
e 1=1 and mi.menu_type = 'W' and greatest(mi.dt_...
created, nvl(mi.dt_updated, mi.dt_created)) > to_date( '1970...
/01/01:00:00:00', 'yyyy/mm/dd:hh24:mi:ss') + round(?/86400000...
) and greatest(mi.dt_created, nvl(mi.dt_updated, mi.dt_cr...
eated)) <= to_date( '1970/01/01:00:00:00', 'yyyy/mm/dd:hh24:m...
i:ss') + round(?/86400000) and greatest(mi.dt_created, nv...
l(mi.dt_updated, mi.dt_created)) > sysdate - 200 connect by...
prior mi.menu_id = mi.top_menu_id start with mi.menu_id in ...
(185837,275); arguments = (0,1360570171797)
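
The time-window arithmetic in the SEED query can be checked outside Oracle. The sketch below assumes the two bind arguments are milliseconds since 1970-01-01 UTC, and mirrors the expression to_date('1970/01/01...') + round(?/86400000), where adding a number to an Oracle date adds that many days. The helper names are hypothetical.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
MS_PER_DAY = 86_400_000

def rounded_bound(ms: int) -> datetime:
    """Epoch plus round(ms / 86400000) whole days, as in the SEED query.

    Python rounds exact .5 values to the nearest even number while Oracle
    rounds them away from zero; the two agree for every other input.
    """
    return EPOCH + timedelta(days=round(ms / MS_PER_DAY))

def exact_bound(ms: int) -> datetime:
    """Epoch plus the full millisecond offset, with no rounding."""
    return EPOCH + timedelta(milliseconds=ms)

end_ms = 1360570171797  # second bind argument from the logged SEED query
print(rounded_bound(end_ms))  # 2013-02-11 00:00:00
print(exact_bound(end_ms))    # 2013-02-11 08:09:31.797000
```

Because of the round(), both bounds snap to midnight. With these arguments the upper bound lands about eight hours before the actual end time, so rows created or updated in that gap would not be seeded in this run. Dropping the round() keeps the fractional day.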

DATA:
=====
select mi.menu_id "lcf__id" , 'Thiz iz id: ' || to_char...
(mi.menu_id) "lcf__data" , '<a href="http://www.some...
site.com/xx/xx_wiki.ht_show?p_id='||mi.wiki_id||chr(38)||'p...
_fld='||mi.top_menu_id||'">'||mi.display_text||'</a>'
"lcf__u...
rl" from sn_menu_items mi , sn_wiki_item wi where ...
mi.menu_id in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?...
,?,?,?,?,?,?,?,?,?,?,?,?) and mi.wiki_id = wi.id; ar...
guments = ('185904','185641','185885','188488','184738','1856...
12','185853','185889','186158','185117','184723','185901','18...
6249','185263','185886','185229','190366','185103','185900','...
185892','184696','185613','185104','185903','184822','185116'...
,'185979','185866','185896','157508','185893','186185','185902')
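
The logged DATA statement suggests how $(IDLIST) is handled: it appears to be replaced by a parenthesised list of ? placeholders, with the document IDs passed as string bind values. The sketch below imitates that substitution in Python. It illustrates the pattern seen in the history and is not ManifoldCF's actual code; expand_idlist is a hypothetical name.

```python
from typing import List, Tuple

def expand_idlist(query: str, ids: List[str]) -> Tuple[str, List[str]]:
    """Replace $(IDLIST) with one '?' per ID and return (sql, bind_values)."""
    placeholders = "(" + ",".join("?" for _ in ids) + ")"
    return query.replace("$(IDLIST)", placeholders), list(ids)

template = "select mi.menu_id from sn_menu_items mi where mi.menu_id in $(IDLIST)"
sql, args = expand_idlist(template, ["185904", "185641", "185885"])

print(sql)   # ... where mi.menu_id in (?,?,?)
print(args)  # ['185904', '185641', '185885']
```

When replaying such a statement by hand, each ? has to be filled in from the arguments list in order.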



