lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Wiltshire <m...@redalertconsultants.co.uk>
Subject Re: Help with delimited text
Date Wed, 06 Apr 2011 10:33:39 GMT
Thanks Ian,

	I have managed to do that and through Luke I get My expected results.

	Here is now my Index Code.

                StringTokenizer st = buildSubjectArea(dbConnection, oid);
                int tokenCount = 0;
                while (st.hasMoreTokens()){
                	tokenCount++;
                	String categoryPath = st.nextToken();
                    
                    if (categoryPath.length() != 0) {
                        ////doc.add(Field.Text("category", category));
                    	doc.add(new Field("category_path",categoryPath,Field.Store.YES,Field.Index.NOT_ANALYZED));
                        
                    }
                }	

	Now using Luke with KeyworkAnalyser if I enter

		category_path:/Top/My Prods*

	I get my expected results back.
  

	But I cannot get this working in my search code.
	I am using this field to filter the results, i.e.

	If I want Top Books I want to filter by /Top/Books*, if I want Top CD's I want to filter
by /Top/CD*

	Filter string is generated from mapping file which gives me a category path

	e.g. /Top/Books

	Searcher searcher = new IndexSearcher(FSDirectory.open(new File(index)));
	Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
	...
	Query  subjectFilterQuery = new TermQuery(new Term("category_path","/Top/Books*"));
	QueryWrapperFilter filter = new QueryWrapperFilter(subjectFilterQuery);
	TopDocs searchResult = searcher.search(query,filter,MAX_SEARCH_RESULTS_SIZE); 

	If I debug the subjectFilterQuery and write out

			subjectFilterQuery.toString()

	I See

		"subjectFilterQuery.toString()"	category_path:/TOP/CD*
	
	So it looks like the query is constructed correctly ?
	But this does not bring back any results ?
	You say I need to be consistent in Index and Query, am I missing something.

	Many thanks

Mark

On 6 Apr 2011, at 10:06, Ian Lea wrote:

You can add multiple values for a field to a single document.

Document doc = new Document();
String[] paths = whatever.split(",");
for (String p : paths) {
 doc.add(new Field("path", p, whatever ...);
}


For searching, assuming you only want to be able to wildcard on path
delimiters, you could index

/Top/My Prods/Book Prods/Text Books
/Top/My Prods/Book Prods
/Top/My Prods
/Top

which would let you search on any of them.

You'll want to pick or build an analyzer that behaves as you want wrt
case matching and not splitting on the /.  Sometime it can be easier
to replace a character e.g. / to _.  I think there is a lucene class
that can do that, maybe MappingCharFilter, if you don't want to do it
yourself.  You will of course need to be consistent and do the same
processing at index and search time.


--
Ian.



On Wed, Apr 6, 2011 at 7:55 AM, Mark Wiltshire
<mark@redalertconsultants.co.uk> wrote:
> To add more information
> 
>        I am then wanting to search this field using part or all of the path using wildcards
> 
>        i.e.
> 
>        Search category_path with /Top/My Prods*
> 
> 
> Hi java-users
> 
>        I need some help.
> 
>        I am indexing categories into a single field category_path
> 
>        Which may contain items such as
> 
>        /Top/Books,/Top/My Prods/Book Prods/Text Books, /Maths/Books/TextBooks
> 
>        i.e. category paths delimited by ,
> 
>        I want to store this field, so the Analyser tokenizes the document only on ','
charaters and not on the '/' characters
> 
>        I am adding the field to the index using
> 
>        Where the categoryPath is a String containing list of the items above.
> 
>        doc.add(new Field("category_path",categoryPath,Field.Store.YES,Field.Index.ANALYZED));
> 
>        I think I need to split the string my self, but how do I pass this to Lucene,
do I have to setup different fields ?
> 
>        I need to keep the full path in the index, as I want to use this when redirecting
users, when clicking on the results.
> 
>        Any help would be great.
> 
>        Many thanks
> 
> Regards
> 
> Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



Regards

Mark

You can view my current Wolters Kluwer tasks at:
https://secure.nozbe.com/nozbe/page/feed/get-markRAC.5625.wolters_kluwer

You can view my calendar at:		   You can subscribe to my calendar at:
http://ical.me.com/markjwiltshire/Work    webcal://ical.me.com/markjwiltshire/Work.ics







Mark Wiltshire							Red Alert Consultants LTD
Director - Technical Web Consultant						Flat 6
									  168 Tower Bridge Road
												    London
											             SE1 3LS
mark@redalertconsultants.co.uk
iPhone email: markjwiltshire@me.com
IM: markjwiltshire@yahoo.com			    Desk:     0208 247 1547
http://www.redalertconsultants.co.uk	    Mobile: 07973 252 403





Mime
View raw message