lucene-solr-user mailing list archives

From Walter Underwood <wun...@wunderwood.org>
Subject Re: indexing java byte code in classes / jars
Date Mon, 11 May 2015 14:53:39 GMT
How about Krugle?

http://opensearch.krugle.org/

Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On May 11, 2015, at 3:18 AM, Tomasz Borek <tomasz.borek@gmail.com> wrote:

> There's also Perl-backed ACK. http://beyondgrep.com/
> 
> Which does the job of searching code really well.
> 
> And I think at least once I came across something that stemmed from ACK and
> claimed it was faster/better... googling... aah! The Silver Searcher it
> was. :-)
> http://betterthanack.com/
> 
> pozdrawiam,
> LAFK
> 
> 2015-05-09 12:40 GMT+02:00 Mark <javamark@gmail.com>:
> 
>> Hi Alexandre,
>> 
>> Solr & ASM is the exact problem I'm looking to hack about with, so I'm keen
>> to consider any code, no matter how ugly or broken.
>> 
>> Regards
>> 
>> Mark
>> 
>> On 9 May 2015 at 10:21, Alexandre Rafalovitch <arafalov@gmail.com> wrote:
>> 
>>> If you only have classes/jars, use ASM. I have done this before, have
>>> some ugly code to share if you want.
>>> 
>>> If you have sources, javadoc 8 is a good way too. I am doing that now for
>>> solr-start.com, code on Github.
>>> 
>>> Regards,
>>>    Alex
>>> On 9 May 2015 7:09 am, "Mark" <javamark@gmail.com> wrote:
>>> 
>>>> To answer why bytecode: mostly because my use case is to index as much
>>>> detail as possible from jars/classes.
>>>> 
>>>> extract class names,
>>>> method names
>>>> signatures
>>>> packages / imports
>>>> 
>>>> I am considering using ASM to generate an analysis view of the
>>>> class.
>>>> 
>>>> The sort of use cases I have would be method / signature searches.
>>>> 
>>>> For example:
>>>> 
>>>> 1) show any classes with a method named parse*
>>>> 
>>>> 2) show any classes with a method named parse that takes a parameter of
>>>> type *json*
>>>> 
>>>> ...etc
>>>> 
>>>> In the past I have written something to reverse out javadocs from just
>>>> Java bytecode; using Solr would make this idea considerably more
>>>> powerful.
>>>> 
>>>> Thanks for the suggestions so far
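
The two example searches above could be expressed as Solr queries along the following lines. This is only a sketch: the field names `method_names` and `signatures` are hypothetical (nothing in the thread defines a schema), the core name and port are taken from Erik's example, and how wildcards match depends on the field types chosen in the schema.

```python
from urllib.parse import urlencode

# Assumption: the "java" core from the thread, on the default local port.
SOLR_SELECT = "http://localhost:8983/solr/java/select"


def select_url(query, rows=10):
    """Build a /select URL for a query string; nothing is sent to Solr here."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return SOLR_SELECT + "?" + urlencode(params)


# 1) classes with a method whose name starts with "parse"
#    ("method_names" is a hypothetical multi-valued field)
by_prefix = select_url("method_names:parse*")

# 2) classes with a method named parse whose signature mentions "json"
#    ("signatures" is a hypothetical field; leading wildcards can be slow)
by_param = select_url("method_names:parse AND signatures:*json*")

print(by_prefix)
print(by_param)
```

Leading-wildcard queries such as `*json*` can be expensive on large indexes, so an n-gram or reversed-wildcard field type may be worth considering if that search matters.
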
>>>> 
>>>> On 8 May 2015 at 21:19, Erik Hatcher <erik.hatcher@gmail.com> wrote:
>>>> 
>>>>> Oh, and sorry, I omitted a couple of details:
>>>>> 
>>>>> # creating the “java” core/collection
>>>>> bin/solr create -c java
>>>>> 
>>>>> # I ran this from my Solr source code checkout, so that
>>>>> SolrLogFormatter.class just happened to be handy
>>>>> 
>>>>>        Erik
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On May 8, 2015, at 4:11 PM, Erik Hatcher <erik.hatcher@gmail.com> wrote:
>>>>>> 
>>>>>> What kinds of searches do you want to run?  Are you trying to extract
>>>>>> class names, method names, and such and make those searchable?  If
>>>>>> that’s the case, you need some kind of “parser” to reverse engineer that
>>>>>> information from .class and .jar files before feeding it to Solr, which
>>>>>> would happen before analysis.  Java itself comes with a javap command
>>>>>> that can do this; whether this is the “best” way to go for your scenario
>>>>>> I don’t know, but here’s an interesting example pasted below (using Solr
>>>>>> 5.x).
>>>>>> 
>>>>>> —
>>>>>> Erik Hatcher, Senior Solutions Architect
>>>>>> http://www.lucidworks.com
>>>>>> 
>>>>>> 
>>>>>> javap build/solr-core/classes/java/org/apache/solr/SolrLogFormatter.class > test.txt
>>>>>> bin/post -c java test.txt
>>>>>> 
>>>>>> now search for "coreInfoMap"
>>>>>> http://localhost:8983/solr/java/browse?q=coreInfoMap
>>>>>> 
>>>>>> I tried to be cleverer and use the stdin option of bin/post, like this:
>>>>>> javap build/solr-core/classes/java/org/apache/solr/SolrLogFormatter.class | bin/post -c java -url http://localhost:8983/solr/java/update/extract -type text/plain -params "literal.id=SolrLogFormatter" -out yes -d
>>>>>> but something isn’t working right with the stdin detection like that (it
>>>>>> does work to `cat test.txt | bin/post…` though, hmmm)
>>>>>> 
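
One way to sidestep the stdin question is to run javap from a small script and send its output to Solr's JSON update handler directly. The sketch below assumes the "java" core and local URL from the example above; the field name `javap_txt` is hypothetical and would need to exist in, or be accepted by, the schema. It only builds the request, and the lines that would actually send it are left commented out.

```python
import json
import subprocess
import urllib.request

# Assumption: the "java" core from the thread, on the default local port.
UPDATE_URL = "http://localhost:8983/solr/java/update?commit=true"


def javap_text(class_file):
    """Run javap on one .class file and return its stdout as text."""
    result = subprocess.run(["javap", class_file], capture_output=True,
                            text=True, check=True)
    return result.stdout


def build_update_request(doc_id, text):
    """Build (but do not send) a JSON update request for one document."""
    # "javap_txt" is a hypothetical field name, not one from the thread.
    body = json.dumps([{"id": doc_id, "javap_txt": text}]).encode("utf-8")
    return urllib.request.Request(
        UPDATE_URL, data=body, method="POST",
        headers={"Content-Type": "application/json"})


# To index for real (needs a running Solr and javap on the PATH):
# path = "build/solr-core/classes/java/org/apache/solr/SolrLogFormatter.class"
# req = build_update_request("SolrLogFormatter", javap_text(path))
# urllib.request.urlopen(req)
```

Committing on every request is convenient for experiments but slow for bulk indexing; batching several documents per request would be the more usual choice.
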
>>>>>> test.txt looks like this, `cat test.txt`:
>>>>>> Compiled from "SolrLogFormatter.java"
>>>>>> public class org.apache.solr.SolrLogFormatter extends java.util.logging.Formatter {
>>>>>> long startTime;
>>>>>> long lastTime;
>>>>>> java.util.Map<org.apache.solr.SolrLogFormatter$Method, java.lang.String> methodAlias;
>>>>>> public boolean shorterFormat;
>>>>>> java.util.Map<org.apache.solr.core.SolrCore, org.apache.solr.SolrLogFormatter$CoreInfo> coreInfoMap;
>>>>>> public java.util.Map<java.lang.String, java.lang.String> classAliases;
>>>>>> static java.lang.ThreadLocal<java.lang.String> threadLocal;
>>>>>> public org.apache.solr.SolrLogFormatter();
>>>>>> public void setShorterFormat();
>>>>>> public java.lang.String format(java.util.logging.LogRecord);
>>>>>> public void appendThread(java.lang.StringBuilder, java.util.logging.LogRecord);
>>>>>> public java.lang.String _format(java.util.logging.LogRecord);
>>>>>> public java.lang.String getHead(java.util.logging.Handler);
>>>>>> public java.lang.String getTail(java.util.logging.Handler);
>>>>>> public java.lang.String formatMessage(java.util.logging.LogRecord);
>>>>>> public static void main(java.lang.String[]) throws java.lang.Exception;
>>>>>> public static void go() throws java.lang.Exception;
>>>>>> static {};
>>>>>> }
>>>>>> 
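
Posting the javap output as a single text blob makes it searchable, but method and signature searches need the pieces split into separate fields first. The following is a rough sketch of such a "parser" for plain `javap` output like the listing above. The output field names (`package`, `method_names`, `signatures`, `field_names`) are hypothetical, and the regular expressions only cover the simple one-declaration-per-line shapes shown here.

```python
import re

# Patterns for the plain (no -v, no -p) javap layout shown in the thread.
CLASS_RE = re.compile(r"\b(?:class|interface|enum)\s+([\w.$]+)")
METHOD_RE = re.compile(r"([\w.$]+)\(([^)]*)\)")
FIELD_RE = re.compile(r"([\w$]+);$")


def parse_javap(text):
    """Turn javap output into a dict of searchable fields (names are hypothetical)."""
    doc = {"id": None, "package": None, "method_names": [],
           "signatures": [], "field_names": []}
    for raw in text.splitlines():
        line = raw.strip()
        if doc["id"] is None:
            # Skip everything until the class declaration line.
            found = CLASS_RE.search(line)
            if found:
                fqcn = found.group(1)
                doc["id"] = fqcn
                doc["package"] = fqcn.rpartition(".")[0]
            continue
        method = METHOD_RE.search(line)
        if method:
            # Constructors appear fully qualified; keep the simple name.
            name = method.group(1).rpartition(".")[2]
            doc["method_names"].append(name)
            doc["signatures"].append(f"{name}({method.group(2)})")
            continue
        field = FIELD_RE.search(line)
        if field:
            doc["field_names"].append(field.group(1))
    return doc


SAMPLE = """\
Compiled from "SolrLogFormatter.java"
public class org.apache.solr.SolrLogFormatter extends java.util.logging.Formatter {
  long startTime;
  java.util.Map<org.apache.solr.core.SolrCore, org.apache.solr.SolrLogFormatter$CoreInfo> coreInfoMap;
  public org.apache.solr.SolrLogFormatter();
  public java.lang.String format(java.util.logging.LogRecord);
  public static void main(java.lang.String[]) throws java.lang.Exception;
  static {};
}
"""

print(parse_javap(SAMPLE))
```

A bytecode library such as ASM would give more reliable results than scraping text; this version is only meant to show the shape of the document that ends up in Solr.
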
>>>>>>> On May 8, 2015, at 3:31 PM, Mark <javamark@gmail.com> wrote:
>>>>>>> 
>>>>>>> I am looking to use Solr to search over the byte code in classes and
>>>>>>> jars.
>>>>>>> 
>>>>>>> Does anyone know of or have experience with Analyzers, Tokenizers, and
>>>>>>> Token Filters for such a task?
>>>>>>> 
>>>>>>> Regards
>>>>>>> 
>>>>>>> Mark
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 

