lucene-solr-user mailing list archives

From Raymond Xie <xie3208...@gmail.com>
Subject Re: How do I create a schema file for FIX data in Solr
Date Wed, 04 Apr 2018 01:09:59 GMT
I'm talking to the author to find out, thanks.

~~~sent from my cell phone, sorry if there is any typo

Adhyan Arizki <a.arizki@gmail.com> wrote on Tue, Apr 3, 2018 at 1:38 PM:

> Raymond,
>
> It seems you are having an issue with the Node.js environment. Judging from
> the error message, the path to node is likely not registered correctly.
> Note, though, that this is no longer a Solr issue.
>
> On Tue, 3 Apr 2018, 23:00 Raymond Xie, <xie3208080@gmail.com> wrote:
>
> > Hi Rick,
> >
> > Following your suggestion I found https://github.com/SunGard-Labs/fix2json
> > which seems to be a fit.
> >
> > I followed the installation instructions and successfully installed
> > fix2json on my Ubuntu host:
> >
> > sudo npm install -g fix2json
> >
> > I ran the same command as shown in the GitHub repository:
> >
> > fix2json -p dict/FIX50SP2.CME.xml XCME_MD_GE_FUT_20160315.gz
> >
> >
> > and I received this error:
> >
> > /usr/bin/env: ‘node’: No such file or directory
> >
> > I would appreciate it if you could point out what is missing here.
> >
> > Thank you again for your kind help.
> >
> >
> >
> > *------------------------------------------------*
> > *Sincerely yours,*
> >
> >
> > *Raymond*
> >
> > On Mon, Apr 2, 2018 at 9:30 AM, Raymond Xie <xie3208080@gmail.com> wrote:
> >
> > > Thank you, Rick, for the enlightening suggestion.
> > >
> > > I will get the FIX message parsed first and come back here later.
> > >
> > >
> > > *------------------------------------------------*
> > > *Sincerely yours,*
> > >
> > >
> > > *Raymond*
> > >
> > > On Mon, Apr 2, 2018 at 9:15 AM, Rick Leir <rleir@leirtech.com> wrote:
> > >
> > >> Google "fix to json"; there are a few interesting leads.
> > >>
> > >> On April 2, 2018 12:34:44 AM EDT, Raymond Xie <xie3208080@gmail.com>
> > >> wrote:
> > >> >Thank you, Shawn, Rick and other readers,
> > >> >
> > >> >To Shawn:
> > >> >
> > >> >For *8=FIX.4.4 9=653 35=RIO* as an example, in the FIX standard: 8
> > >> >means BeginString, and in this example its value is FIX.4.4; 9 means
> > >> >BodyLength, which is 653 for this message; 35 means MsgType, which is
> > >> >RIO here; 122 stands for OrigSendingTime and has the UTCTimestamp
> > >> >format.
> > >> >You can refer to this page for details:
> > >> >https://www.onixs.biz/fix-dictionary/4.2/fields_by_tag.html
> > >> >
> > >> >All the values are described as strings.
> > >> >
> > >> >All the tag numbers come from the FIX standard, so they don't change (in
> > >> >my case).
> > >> >
> > >> >I expect a Python program might be needed to parse the message and
> > >> >extract each tag's value; an index would then be built on those
> > >> >extracted values along with their field (tag) names.
> > >> >
> > >> >With the index in place, users would ideally be able to search for any
> > >> >keyword. In this case, however, most queries would be based on tag 37
> > >> >(Order ID) and tag 75 (Trade Date). There is also a custom tag (not in
> > >> >the standard), Order Version, to be queried on.
> > >> >
> > >> >I understand that creating the parser would be a manual process. As long
> > >> >as I know how, or have a small sample program, I will do it myself and
> > >> >adjust it as needed.
> > >> >
> > >> >To Rick:
> > >> >
> > >> >You mentioned creating a JSON document. My understanding is that a parser
> > >> >would be needed to generate that JSON document. Do you have any existing
> > >> >example code?
> > >> >
> > >> >
> > >> >Thank you guys very much.
> > >> >
> > >> >*------------------------------------------------*
> > >> >*Sincerely yours,*
> > >> >
> > >> >
> > >> >*Raymond*
> > >> >
> > >> >On Sun, Apr 1, 2018 at 2:16 PM, Shawn Heisey <apache@elyograg.org>
> > >> >wrote:
> > >> >
> > >> >> On 4/1/2018 10:12 AM, Raymond Xie wrote:
> > >> >>
> > >> >>> FIX is a format standard for financial data. It consists of many
> > >> >>> numbered tags, each with a value, like 8=asdf, where 8 is the tag and
> > >> >>> asdf is the tag's value. Each tag has its own definition.
> > >> >>>
> > >> >>> The sample msg in FIX format was in the original question.
> > >> >>>
> > >> >>> All I need is to know how to paste the msg and get every tag's value.
> > >> >>>
> > >> >>> So far I have found that a parser is what I need to start with. But I
> > >> >>> am more concerned about how to create an index in Solr on the
> > >> >>> extracted tag values. That is the first step; the next would be to
> > >> >>> customize the dashboard so users can search with a value, find out
> > >> >>> which msg contains that value in which tag, and be shown the whole
> > >> >>> msg as proof.
> > >> >>>
> > >> >>>
> > >> >>
> > >> >> Most of Solr's functionality is provided by Lucene.  Lucene is a Java
> > >> >> API that implements search functionality.  Solr bolts on some
> > >> >> functionality on top of Lucene, but doesn't really do anything to
> > >> >> fundamentally change the fact that you're dealing with a Lucene index.
> > >> >> So I'm going to mostly talk about Lucene below.
> > >> >>
> > >> >> Lucene organizes data in a unit that we call a "document." An easy
> > >> >> analogy for this is that it is a lot like a row in a single database
> > >> >> table. It has fields, each field has a type. Unless custom software is
> > >> >> used, there is really no support for data other than basic primitive
> > >> >> types -- numbers and strings.  The only complex type that I can think
> > >> >> of that Solr supports out of the box is geospatial coordinates, and it
> > >> >> might even support multi-dimensional coordinates, but I'm not sure.
> > >> >> It's not all that complex -- the field just stores and manipulates
> > >> >> multiple numbers instead of one. The Lucene API does support a FEW
> > >> >> things that Solr doesn't implement. I don't think those are applicable
> > >> >> to what you're trying to do.
> > >> >>
> > >> >> Let's look at the first part of the data that you included in the
> > >> >> first message:
> > >> >>
> > >> >> 8=FIX.4.4 9=653 35=RIO
> > >> >>
> > >> >> Is "8" always a mixture of letters and numbers and periods? Is "9"
> > >> >> always a number, and is it always a WHOLE number?  Is "35" always
> > >> >> letters? Looking deeper to data that I didn't quote ... is "122"
> > >> >> always a date/time value?  Are the tag numbers always picked from a
> > >> >> well-defined set, or do they change?
> > >> >>
> > >> >> Assuming that the answers in the previous paragraph are found and a
> > >> >> configuration is created to deal with all of it ... how are you
> > >> >> planning to search it?  What kind of queries would you expect somebody
> > >> >> to make? That's going to have a huge influence on how you configure
> > >> >> things.
> > >> >>
> > >> >> Writing the schema is usually where people spend the most time when
> > >> >> they're setting up Solr.
> > >> >>
> > >> >> Thanks,
> > >> >> Shawn
> > >> >>
> > >> >>
> > >>
> > >> --
> > >> Sorry for being brief. Alternate email is rickleir at yahoo dot com
> > >
> > >
> > >
> >
>
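
The parsing step discussed in this thread (splitting a FIX message into tag=value pairs before indexing) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name parse_fix, the tag_<n> fallback naming, and the raw_message field are hypothetical; the tag-to-name table is a small hand-picked subset of the FIX dictionary; and real FIX messages normally use the SOH character (0x01) as the delimiter rather than the spaces shown in the sample quoted above. Repeating groups, where a tag occurs more than once, are not handled: a later value simply overwrites an earlier one.

```python
import json

# Hand-picked subset of the FIX dictionary (tag number -> field name).
# A real setup would load the full dictionary for the FIX version in use.
TAG_NAMES = {
    "8": "BeginString",
    "9": "BodyLength",
    "35": "MsgType",
    "37": "OrderID",
    "75": "TradeDate",
    "122": "OrigSendingTime",
}


def parse_fix(message, delimiter="\x01"):
    """Split a FIX message into a flat dict keyed by field name.

    Tags missing from TAG_NAMES (for example custom tags) are kept under
    the hypothetical name "tag_<number>". The original message is stored
    under "raw_message" so the whole message can be shown alongside a hit.
    """
    doc = {"raw_message": message}
    for pair in message.split(delimiter):
        tag, sep, value = pair.partition("=")
        if not sep:
            continue  # skip empty fragments and anything without "="
        tag = tag.strip()
        doc[TAG_NAMES.get(tag, "tag_" + tag)] = value
    return doc


# The sample from the thread is shown with spaces, so split on spaces here.
# Assumption: no field value in the message contains a space itself.
sample = "8=FIX.4.4 9=653 35=RIO"
doc = parse_fix(sample, delimiter=" ")
print(json.dumps(doc, indent=2))
```

A dictionary like this maps onto one Solr document per message, with OrderID and TradeDate as the fields most queries would target and raw_message kept so the full message can be presented to the user.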
