lucene-java-user mailing list archives

From wangjing <ppm10...@gmail.com>
Subject about .frq file format in doc
Date Wed, 27 Jun 2012 09:40:17 GMT
http://lucene.apache.org/core/3_6_0/fileformats.html#Frequencies

The .frq file contains the lists of documents which contain each term,
along with the frequency of the term in that document (except when
frequencies are omitted: IndexOptions.DOCS_ONLY).

But in my example, the .frq file seems to record the frequency of the term per FIELD rather than per document?

My app is:
public void createLuceneIndex() {
		try {
			File file = new File(path);
			FSDirectory directory = FSDirectory.open(file);
			// Used to create the index
			IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_36,
					new StandardAnalyzer(Version.LUCENE_36));
			// Avoid appending to the index on every run, which would make queries return duplicate records
			conf.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
			IndexWriter iw = new IndexWriter(directory, conf);
			iw.setInfoStream(System.out);
			Document doc = new Document();
			// Store.YES: stored; searchable; content can be printed
			Field storeYes = new Field("content1", "my my my my storeyes test",
					Store.YES, Index.ANALYZED);
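			// Store.NO: not stored; searchable; content cannot be printed; saves space
			// because the content is not stored (note: this field actually uses Store.YES)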
			Field storeNo = new Field("content2", "my storeno test", Store.YES,
					Index.ANALYZED);

			doc.add(storeYes);
			doc.add(storeNo);

			Document doc1 = new Document();
			Field storeYes1 = new Field("doc1content1",
					"doc1 my storeyes test", Store.YES, Index.ANALYZED);
			doc1.add(storeYes1);

			Document doc2 = new Document();
			Field storeYesdoc2 = new Field("doc2content1",
					"doc2 my storeyes test", Store.YES, Index.ANALYZED);
			Field storeNodoc2 = new Field("doc2content2",
					"doc2 my storeno test", Store.YES, Index.ANALYZED);
			doc2.add(storeYesdoc2);
			doc2.add(storeNodoc2);

			iw.addDocument(doc);
			iw.addDocument(doc1);
			iw.addDocument(doc2);
			System.out.println("doc :" + doc);
			System.out.println("doc1 :" + doc1);
			System.out.println("doc2 :" + doc2);
			iw.forceMerge(1);
			iw.close();
		} catch (CorruptIndexException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (LockObtainFailedException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
	}


The hex dump of the .frq file is:
00 04 01 01 01 01 01 03  03 03 03 05 05 05 05 05
05 05 05

In my analysis, the possible readings as [docDelta,freq] pairs are:
1   [0,4] [0,1] [0,1] [0,1] [0,1]
[0,1][1,1][1,1][1,1][1,1][2,1][2,1][2,1][2,1][2,1][2,1][2,1][2,1]
2   [0,4] [0,1] [0,1] [0,1] [0,1] [0,1]
[1,1][1,1][1,1][1,1][2,1][2,1][2,1][2,1][2,1][2,1][2,1][2,1]
3   [0,4][1][1][1][1][1][3][3][3][3][5] [5][5][5][5][5][5][5]
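
For reference, here is a minimal sketch of how these bytes can be decoded, based on my reading of the 3.6 file format page linked above: each entry starts with a VInt DocDelta; DocDelta/2 is the document delta, an odd DocDelta means the frequency is 1, and an even DocDelta means the frequency follows as another VInt. The class FrqDecoder and its Entry type are hypothetical names made up for this illustration, not Lucene APIs. The sketch also assumes frequencies are indexed and no skip data is present, and it ignores term boundaries (those come from the docFreq stored in the term dictionary, not from the .frq file itself).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene.
public class FrqDecoder {
    /** One decoded entry: document delta and term frequency in that document. */
    static final class Entry {
        final int docDelta;
        final int freq;

        Entry(int docDelta, int freq) {
            this.docDelta = docDelta;
            this.freq = freq;
        }

        @Override
        public String toString() {
            return "[" + docDelta + "," + freq + "]";
        }
    }

    private final byte[] data;
    private int pos = 0;

    private FrqDecoder(byte[] data) {
        this.data = data;
    }

    // VInt: 7 data bits per byte, low-order bits first;
    // a set high bit means another byte follows.
    private int readVInt() {
        byte b = data[pos++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = data[pos++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Decodes a flat run of entries, assuming frequencies are stored and there is no skip data. */
    static List<Entry> decode(byte[] data) {
        FrqDecoder in = new FrqDecoder(data);
        List<Entry> entries = new ArrayList<>();
        while (in.pos < data.length) {
            int code = in.readVInt();
            int docDelta = code >>> 1;                        // DocDelta / 2
            int freq = (code & 1) != 0 ? 1 : in.readVInt();   // odd code: freq is 1
            entries.add(new Entry(docDelta, freq));
        }
        return entries;
    }

    public static void main(String[] args) {
        // The 19 bytes from the hex dump above.
        byte[] frq = {
            0x00, 0x04, 0x01, 0x01, 0x01, 0x01, 0x01, 0x03,
            0x03, 0x03, 0x03, 0x05, 0x05, 0x05, 0x05, 0x05,
            0x05, 0x05, 0x05
        };
        System.out.println(decode(frq));
    }
}
```

Under these assumptions the 19 bytes decode to 18 entries: [0,4], then five [0,1], four [1,1], and eight [2,1]. That is one entry per (field, term) pair in the three documents, which would be consistent with Lucene treating a term as a field name plus text, so "my" in content1 and "my" in content2 get separate postings even though both are in document 0.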
