lucene-java-user mailing list archives

From "Borkenhagen, Michael (ofd-ko zdfin)"
Subject AW: Using lucene with HSSF from Apache
Date Fri, 02 May 2003 07:23:55 GMT
You should have read the HSSF javadoc more thoroughly; I think that's a
question for POI users, but I'd like to help you anyway.
I'd extract the text from an Excel sheet like this:

import java.io.*;
import java.util.Iterator;
import org.apache.poi.hssf.usermodel.*;

public Reader getText(File f) throws IOException {
    StringBuffer contentBuffer = new StringBuffer();
    HSSFWorkbook wb = new HSSFWorkbook(new FileInputStream(f));
    int numberOfSheets = wb.getNumberOfSheets();
    for (int i = 0; i < numberOfSheets; i++) {
      HSSFSheet sheet = wb.getSheetAt(i);
      // getLastRowNum() is the 0-based index of the last row, so include it
      int lastRowNum = sheet.getLastRowNum();
      for (int j = 0; j <= lastRowNum; j++) {
        HSSFRow row = sheet.getRow(j);
        if (row != null) {      // empty lines : null :(
          Iterator it = row.cellIterator();
          while (it.hasNext()) {
            HSSFCell cell = (HSSFCell) it.next();
            int type = cell.getCellType();
            if (type == HSSFCell.CELL_TYPE_STRING) {
              // only string cells are collected, separated by a space
              contentBuffer.append(cell.getStringCellValue()).append(' ');
            }
          }
        }
      }
    }
    String contentAsStr = contentBuffer.toString();
    // wrap the collected text in a Reader
    byte[] ivContents = contentAsStr.getBytes();
    return new InputStreamReader(new ByteArrayInputStream(ivContents));
}
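
A side note on the last two lines of that method: going from String to byte[] and back to a Reader without naming a charset relies on the platform default encoding, so characters it cannot represent may get lost. The following is only a rough sketch of the difference, using nothing but the JDK; the class ReaderSketch and its method names are made up for illustration and are not part of POI or Lucene.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper class for illustration; not part of POI or Lucene.
final class ReaderSketch {

    // Wraps the extracted text directly; no charset is involved.
    static Reader viaStringReader(String text) {
        return new StringReader(text);
    }

    // Same byte[] round trip as in the method above, but with the charset
    // passed in explicitly so that its effect becomes visible.
    static Reader viaBytes(String text, Charset cs) {
        byte[] bytes = text.getBytes(cs);
        return new InputStreamReader(new ByteArrayInputStream(bytes), cs);
    }

    // Drains a Reader into a String (needed only for the demonstration).
    static String readAll(Reader r) {
        try (Reader in = r) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a charset such as US-ASCII, the round trip replaces umlauts with '?', while the StringReader variant hands the text through unchanged. If the cell contents may contain non-ASCII characters, returning new StringReader(contentAsStr) is probably the simpler choice.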


-----Original Message-----
From: Shoba Ramachandran
Sent: Wednesday, 30 April 2003 18:10
Subject: Using lucene with HSSF from Apache


Has anyone tried to index xls and doc files?
I'm trying to do this with HSSF from Apache.

This code returns binary data, and printing it out gives
junk characters. A file indexed like this returns nothing
upon search.

public static byte[] parse(File file) throws Exception {
    POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(file));
    HSSFWorkbook wb = new HSSFWorkbook(fs);
    byte[] xlsInfo = wb.getBytes();
    System.out.println("xls content :  " + new String(xlsInfo));
    return xlsInfo;
}

Thanks in advance for your help
