lucene-java-user mailing list archives

From ANNO61 <andreas.nowit...@anno-edv.de>
Subject LuceneIndex export to SQL-database
Date Wed, 15 Aug 2012 17:39:42 GMT
I am using Lucene to build several indexes from HTML sites.
To work with them, I convert the Lucene index into an SQL database with a small
program. I export only a few of the collected
fields (dataSource, plainTextContent, title, description and keyword).
The main problem is that most documents contain more than one field named
#keyword, but I get only the first one. For my SQL database I want to use all
values of #keyword. How can this be done?

Below you will find the configuration script used for the conversion.
Can anyone help me find a solution?
With kind regards,

Andreas

#######################################################
# Character encoding is UTF-8!!!!
#
#
# This file specifies all necessary parameters in order to build a CSV file or a
# database table out of a Lucene index. The fields that should be transferred
# can be specified, together with the database location.
#


#The path to the lucene index
luceneIndexPath=r:\23._Neue_Einteilung_Indexer\2._Indexer2\2012\2012-08\index165\

#These attributes will be considered for conversion
#attribute2convert=urn:catwiesel:attribute:uri
attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#title
attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#plainTextContent

attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#description
#attribute2convert=urn:dynaq:buzzwords
attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#dataSource
attribute2convert=http://www.semanticdesktop.org/ontologies/2007/01/19/nie#keyword


# This is for creating / appending / overwriting database tables
dataBaseConversion=
{
    # The name of the database table that should be generated / appended / overwritten
    tableName=165_082012
    # true: the database table will be overwritten if it exists. false: the data entries will be appended
    overwriteIfExist=true
    
    # These are the parameters for the connection. Yes, the password is NOT so secure here... I'm sorry for that
    username=root
    password=
    
    # This is the connection string to the database. There, the database location and the database name are specified
    # The connection string depends on your database.
    # e.g. databaseURL=jdbc:mysql://[host:port]/[database]
    databaseURL=jdbc:mysql://127.0.0.1:3306/luceneexport
        
        
    # This is the driver for your database
    # e.g. databaseDriver=org.hsqldb.jdbcDriver
    # e.g. databaseDriver=com.mysql.jdbc.Driver
    databaseDriver=com.mysql.jdbc.Driver
    
    # This is the character that will be used to quote the table column names in the SQL statements. Examples are:
    # No quoting (the line could also be commented out): tableColumnsQuoteChar=
    # ANSI-standard: tableColumnsQuoteChar="
    # MySQL: tableColumnsQuoteChar=`
    tableColumnsQuoteChar=`
    
    # Further, the database type of each attribute has to be specified in order to create the database table.
    # A new attribute name for the database column can also be specified (note that databases sometimes
    # restrict the length of column names). E.g.:
    # urn:dynaq:buzzwords=
    # {
    #   columnType=TEXT
    #   columnName=buzzwords
    # }
    
    urn:catwiesel:attribute:uri=
    {
       columnType=TEXT
       columnName=uri
    }
    http://www.semanticdesktop.org/ontologies/2007/01/19/nie#dataSource=
    {
       columnType=TEXT
       columnName=dataSource
    }
   
    http://www.semanticdesktop.org/ontologies/2007/01/19/nie#plainTextContent=
    {
       columnType=LONGTEXT
       columnName=plainTextContent
    }
    http://www.semanticdesktop.org/ontologies/2007/01/19/nie#description=
    {
       columnType=LONGTEXT
       columnName=metadescription
    }
    http://www.semanticdesktop.org/ontologies/2007/01/19/nie#title=
    {
       columnType=TEXT
       columnName=title
    }

    #urn:dynaq:buzzwords=
    #{
    #   columnType=TEXT
    #   columnName=buzzwords
    #}
    
    http://www.semanticdesktop.org/ontologies/2007/01/19/nie#keyword=
    {
       columnType=LONGTEXT
       columnName=metakeyword
    }

    #http://www.semanticdesktop.org/ontologies/2007/01/19/nie#mimeType=
    #{
    #   columnType=TEXT
    #   columnName=mimeType
    #}

   

}
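
A possible approach to the multi-valued #keyword field, as a sketch only: in Lucene, `Document.get(name)` returns just the first value stored under a field name, while `Document.getValues(name)` returns all of them as a `String[]`. Whether the converter above calls `Document.get` internally is an assumption, since its source is not shown. If the converter code can be changed, the array could be joined into one string before it is written to the `metakeyword` column. The class `KeywordJoiner`, the method `joinValues`, and the `"; "` separator are hypothetical names chosen for illustration; the example uses plain Java and stands in a literal array for the result of `getValues`.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene or of the converter shown above.
class KeywordJoiner {

    // Joins all values of a multi-valued field into one string that fits
    // into a single SQL column. Null, blank, and duplicate values are
    // skipped; the original order of first occurrence is kept.
    static String joinValues(String[] values, String separator) {
        if (values == null) {
            return "";
        }
        Set<String> unique = new LinkedHashSet<>();
        for (String value : values) {
            if (value == null) {
                continue;
            }
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                unique.add(trimmed);
            }
        }
        return String.join(separator, unique);
    }

    public static void main(String[] args) {
        // Assumption: in the real converter this array would come from
        // doc.getValues("http://www.semanticdesktop.org/ontologies/2007/01/19/nie#keyword").
        String[] keywords = {"lucene", "index", " sql ", "lucene"};
        System.out.println(joinValues(keywords, "; "));
    }
}
```

Storing the joined string keeps the existing one-column table layout. A separate keyword table with one row per value would be the more normalized alternative, but it would require changes to the table definition in the configuration above.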



