manifoldcf-user mailing list archives

From Nigel Thomas <nigel.tho...@york.ac.uk>
Subject Re: Using Generic Database Repository Connection without supplying a valid URLCOLUMN
Date Sun, 30 Sep 2012 20:03:04 GMT
Hi Karl,

Thanks for this response. Sorry, I may not have been clear about the
code change I made. I only changed the connector to allow an invalid
URL; I did *not*, as I implied, hardcode the URLCOLUMN to the same
value as the IDCOLUMN. My solution, as you suggested, was then to fake
the SQL so that it supplies a URLCOLUMN with the same value as the
IDCOLUMN.
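
A data query along these lines might look roughly like the sketch
below. It assumes the connector's usual substitution variables for the
data query ($(IDCOLUMN), $(URLCOLUMN), $(DATACOLUMN), $(IDLIST)), which
may differ by version; documenttable, idfield, and datafield are
hypothetical names used only for illustration.

```sql
-- Sketch only: documenttable, idfield, and datafield are hypothetical names.
-- The same source column is aliased to both the ID and the URL column.
SELECT idfield   AS $(IDCOLUMN),
       idfield   AS $(URLCOLUMN),
       datafield AS $(DATACOLUMN)
FROM documenttable
WHERE idfield IN $(IDLIST)
```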

The real issue is that the URLCOLUMN, rather than the IDCOLUMN, is
used as the key when indexing. I have tried setting the Solr id mapping
explicitly and removing it altogether; neither helped.

The simple logs confirm that the URL is used as the identifier, and
the Solr logs confirm this as well.

Nigel

On 30 September 2012 19:46, Karl Wright <daddywri@gmail.com> wrote:
> Hi Nigel,
>
> The connector is designed to allow you to supply your own queries, so
> in theory as long as you know what you are doing you can "fake it out"
> pretty easily without modifying the source code.  For example, if you
> want the URL column to always have the same value as the ID column,
> just provide the appropriate AS clause in your query to make that
> happen, e.g.:
>
> SELECT myidfield AS $(URLCOLUMN), ....
>
> SQL has plenty of power to allow you to build hacks of this kind.
>
> However, the reason you might want to consider supplying a real URL is
> because presumably you will want to click on the results of a Solr
> search and have something useful happen.  So you usually need a URL
> anyway, although in some cases you may have a setup where that is not
> true.
>
> Karl
>
> On Sun, Sep 30, 2012 at 1:22 PM, Nigel Thomas <nigel.thomas@york.ac.uk> wrote:
>> Hello,
>>
>> I am using the Generic Database Repository Connection to fetch some
>> metadata from an Oracle data source into a Solr instance using
>> ManifoldCF 0.6.
>>
>> My goal is just to ingest metadata from a SQL data source into Solr
>> without needing to specify a document URL (URLCOLUMN) or data
>> (DATACOLUMN).
>>
>> While attempting to set this up, I found that when the URL field was
>> set to null or an invalid URL, the ingest failed completely (despite
>> DATACOLUMN being set to some string). If a valid URL was provided,
>> the ingest progressed, but the supplied IDCOLUMN was not used;
>> instead, the document URL was used as the default ID in Solr (using
>> Solr 1.4).
>>
>> To work around both these issues I edited the connector code to
>> allow an invalid URL and set the URL to be equal to that of IDCOLUMN.
>>
>> My question is: am I using this connector for its intended purpose? I
>> guess the functionality I require is similar to that provided by
>> Solr's own data import handler
>> (http://wiki.apache.org/solr/DataImportHandler#Goals). Do its goals
>> match those of this connector?
>>
>> Thanks,
>>
>> Nigel Thomas
