mahout-dev mailing list archives

From "Grant Ingersoll (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility
Date Thu, 06 Jun 2013 17:03:21 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677237#comment-13677237 ]

Grant Ingersoll commented on MAHOUT-944:
----------------------------------------

Hmm, I wonder if I should have squashed my local commits:
{quote}
Committed r1490329
W: 0a28b0f322ffe888553b9e2adf0b6f098b679f16 and refs/remotes/origin/trunk differ, using rebase:
:040000 040000 779e2a48da78d2f59f994c83eb1cb91a42b04d41 6e8221954eecd7ee27788976dc7b2665985cd7e6
M	integration
:100644 100644 492aa3aacbee4e33fb70a2e361d772a9d881ae04 09c5ae712a035af3eef2c3c56db708b8fa75e1b3
M	pom.xml
:040000 040000 39350289431946a74a7bd15fbf72947261055536 c7274b40f5de032b1668ed9d6f2d1fa24ff0a124
M	src
Current branch MAHOUT-944 is up to date.
# of revisions changed  
before:
 d668ddf606dbb0d046f0fe8e3eb97e06fcd4c406
9eafd07120a1810d778dfeb4502ba36b5b3eacfe
253a58c30d0a22150234975f782720248b51a8cb 

after:
 0a28b0f322ffe888553b9e2adf0b6f098b679f16
d668ddf606dbb0d046f0fe8e3eb97e06fcd4c406
9eafd07120a1810d778dfeb4502ba36b5b3eacfe
253a58c30d0a22150234975f782720248b51a8cb 
 If you are attempting to commit  merges, try running:
	 git rebase --interactive --preserve-merges  refs/remotes/origin/trunk 
Before dcommitting
{quote}
                
> LuceneIndexToSequenceFiles (lucene2seq) utility
> -----------------------------------------------
>
>                 Key: MAHOUT-944
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-944
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Integration
>    Affects Versions: 0.5
>            Reporter: Frank Scholten
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch
>
>
> Here is a lucene2seq tool I used in a project. It creates sequence files based on the stored fields of a Lucene index.
> The output from this tool can then be fed into seq2sparse, and from there you can do text clustering.
> Comes with Java bean configuration.
> Let me know what you think. Some CLI code can be added later on. I used this for a small-scale project (roughly 100,000 docs). Is an MR version useful or is that overkill?
> See https://github.com/frankscholten/mahout/tree/lucene2seq for commits and review comments from Simon Willnauer (Thanks Simon!)
> or the attached patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
