lucene-java-user mailing list archives

From Steve McKay <shubalub...@gmail.com>
Subject Re: LuceneIndex export to SQL-database
Date Thu, 16 Aug 2012 18:10:26 GMT
#keyword values need to be stored in a separate table that references the
main table so that you can have multiple #keyword rows per document row in
the main table. I have no idea how to make your script do this and I don't
even know if it can. You might want to check the documentation or
discussion list for your script to find out how to use it with multiValued
fields; if it does handle them, it might be with a somewhat different
method from what I described above.
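
A minimal sketch of the two-table layout described above, assuming Python
with an in-memory SQLite database standing in for MySQL. The table names
(document, document_keyword), the column names, and the sample documents
are hypothetical and are not something your conversion script is known to
produce. On the Lucene side, Document.get(name) returns only the first
value of a field, while Document.getValues(name) returns all of them,
which would explain why only the first #keyword shows up.

```python
import sqlite3

# Hypothetical stand-in for documents read from the Lucene index; each
# document may carry zero, one, or several #keyword values.
docs = [
    {"title": "Page A", "dataSource": "site1", "keywords": ["lucene", "export", "sql"]},
    {"title": "Page B", "dataSource": "site1", "keywords": ["index"]},
    {"title": "Page C", "dataSource": "site2", "keywords": []},
]


def export(conn, docs):
    """Write one row per document, plus one row per keyword value."""
    # Main table: one row per document.
    conn.execute(
        "CREATE TABLE document ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT,"
        " dataSource TEXT)"
    )
    # Child table: one row per keyword, pointing back at its document.
    conn.execute(
        "CREATE TABLE document_keyword ("
        " document_id INTEGER NOT NULL REFERENCES document(id),"
        " keyword TEXT NOT NULL)"
    )
    for doc in docs:
        cur = conn.execute(
            "INSERT INTO document (title, dataSource) VALUES (?, ?)",
            (doc["title"], doc["dataSource"]),
        )
        doc_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO document_keyword (document_id, keyword) VALUES (?, ?)",
            [(doc_id, kw) for kw in doc["keywords"]],
        )
    conn.commit()


def keywords_for(conn, title):
    """Return all keywords of the document with the given title, in insert order."""
    rows = conn.execute(
        "SELECT k.keyword FROM document d"
        " JOIN document_keyword k ON k.document_id = d.id"
        " WHERE d.title = ? ORDER BY k.rowid",
        (title,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
export(conn, docs)
print(keywords_for(conn, "Page A"))
```

The join in keywords_for is how you would get every keyword of a document
back out; a document without keywords simply has no rows in the child
table.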

Good luck!

On Wed, Aug 15, 2012 at 1:39 PM, ANNO61 <andreas.nowitzki@anno-edv.de> wrote:

> I am using Lucene to produce several indexes from HTML sites.
> To work with them I convert the Lucene index into SQL via a small
> program. The main problem is that I take only a small part of the collected
> data fields (dataSource, plainTextContent, title, description and keyword).
> In most cases, though, a document has more than one field named #keyword,
> and I get only the first one. For my SQL database I want to use all values
> of #keyword, but how can this be done?
>
> Below you find the script which is used for converting.
> Can anyone help me to create a solution?
> With kind regards
>
> Andreas
>
> #######################################################
> # Character encoding is UTF-8!!!!
> #
> #
> # This file specifies all necessary parameters in order to build a csv file or a
> # database table out of a Lucene index. The fields that should be transferred
> # can be specified, together with the database location.
> #
>
>
> #The path to the lucene index
>
> luceneIndexPath=r:\23._Neue_Einteilung_Indexer\2._Indexer2\2012\2012-08\index165\
>
> #These attributes will be considered for conversion
> #attribute2convert=urn:catwiesel:attribute:uri
> attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#title
> attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#plainTextContent
> attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#description
> #attribute2convert=urn:dynaq:buzzwords
> attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#dataSource
> attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#keyword
>
>
> # This is for creating / appending / overwriting database tables
> dataBaseConversion=
> {
>     # The name of the database table that should be generated / appended / overwritten
>     tableName=165_082012
>     # true: the database table will be overwritten, if it exists. false: the data entries will be appended
>     overwriteIfExist=true
>
>     # These are the parameters for the connection. Yes, the password is NOT so secure here...I'm sorry for that
>     username=root
>     password=
>
>     # This is the connection string to the database. There, the database location and the database name are specified
>     # The connection string depends on your database.
>     # e.g. databaseURL=jdbc:mysql://[host:port]/[database]
>     databaseURL=jdbc:mysql://127.0.0.1:3306/luceneexport
>
>
>     # This is the driver for your database
>     # e.g. databaseDriver=org.hsqldb.jdbcDriver
>     # e.g. databaseDriver=com.mysql.jdbc.Driver
>     databaseDriver=com.mysql.jdbc.Driver
>
>     # This is the character that will be used to quote the table column names in the SQL statements. Examples are:
>     # No quoting (also could comment out the line): tableColumnsQuoteChar=
>     # ANSI-standard: tableColumnsQuoteChar="
>     # MySQL: tableColumnsQuoteChar=`
>     tableColumnsQuoteChar=`
>
>     # Further, the database type of each attribute has to be specified in order to create the database table. Also a
>     # new attribute name for the database column can be specified (note that databases sometimes have restrictions on
>     # the length of column names). E.g.:
>     # urn:dynaq:buzzwords=
>     # {
>     #   columnType=TEXT
>     #   columnName=buzzwords
>     # }
>
>     urn:catwiesel:attribute:uri=
>     {
>        columnType=TEXT
>        columnName=uri
>     }
>     http://www.semanticdesktop.org/ontologies/2007/01/19/nie#dataSource=
>     {
>        columnType=TEXT
>        columnName=dataSource
>     }
>     http://www.semanticdesktop.org/ontologies/2007/01/19/nie#plainTextContent=
>     {
>        columnType=LONGTEXT
>        columnName=plainTextContent
>     }
>     http://www.semanticdesktop.org/ontologies/2007/01/19/nie#description=
>     {
>        columnType=LONGTEXT
>        columnName=metadescription
>     }
>     http://www.semanticdesktop.org/ontologies/2007/01/19/nie#title=
>     {
>        columnType=TEXT
>        columnName=title
>     }
>
>     #urn:dynaq:buzzwords=
>     #{
>     #   columnType=TEXT
>     #   columnName=buzzwords
>     #}
>
>     http://www.semanticdesktop.org/ontologies/2007/01/19/nie#keyword=
>     {
>        columnType=LONGTEXT
>        columnName=metakeyword
>     }
>
>     #http://www.semanticdesktop.org/ontologies/2007/01/19/nie#mimeType=
>     #{
>     #   columnType=TEXT
>     #   columnName=mimeType
>     #}
>
>
>
> }
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/LuceneIndex-export-to-SQL-database-tp4001450.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
