nifi-users mailing list archives

From Igor Kravzov <igork.ine...@gmail.com>
Subject Re: ReplaceText processor configuration help
Date Tue, 26 Apr 2016 21:15:57 GMT
Hi Matt,

I actually tried the "manual transformation" because I know what the
resulting JSON fields should be. It works.
One problem I have is when the "text" field contains special characters like
newlines or quotes.  These have to be escaped before going into the actual
JSON.

For example, a line like

Who would you vote for?\nRetweet for Yoda

after EvaluateJsonPath becomes

Who would you vote for?
Retweet for Yoda

So it needs to be escaped again. Any suggestions on how to do that?
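
To make the problem concrete outside of NiFi, here is a minimal Python sketch of the escaping step I am after. It is only an illustration, not a NiFi configuration: the function name `fill_template` and the one-field template are hypothetical, and `json.dumps` stands in for whatever would do the escaping inside the flow.

```python
import json

def fill_template(text: str) -> str:
    """Place a raw string into a hand-written JSON template (hypothetical helper).

    json.dumps adds the surrounding quotes and escapes newlines, quotes
    and backslashes, so the result stays valid JSON.
    """
    return '{"text": %s}' % json.dumps(text)

# The value as it looks after extraction: a real newline, not the two characters \n.
raw = "Who would you vote for?\nRetweet for Yoda"

doc = fill_template(raw)
print(doc)

# Round trip: parsing the document gives back the original text.
assert json.loads(doc)["text"] == raw
```

Pasting the raw value between quotes without this step would put a literal line break inside a JSON string, which a JSON parser rejects.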


On Tue, Apr 26, 2016 at 12:11 PM, Matt Burgess <mattyb149@gmail.com> wrote:

> Yes, I think you'll be better off with Aldrin's suggestion of
> ReplaceText. Then you can put the value of the attribute(s) directly
> into the content.  For example, if you have two attributes "entities"
> and "users", and you want a JSON doc with those two objects inside,
> you can use ReplaceText with the following for replacement:
>
> {"entities": ${entities}, "users": ${users}}
>
> Note this "manually" transforms the JSON. Before we get the
> TransformJSON processor, this is a decent workaround if you know what
> the resulting JSON document should look like (and if you have
> attributes containing the desired values).
>
> If you're doing this to insert into Elasticsearch, you might want to
> handle entities and users separately and have "types" in ES for
> "entities" and "users". In that case you could use EvaluateJsonPath to
> get both attributes out, then wire the "success" relationship to two
> different ReplaceTexts, one to store the entities and one for users.
> Then you could add an attribute called "es.type" (for example), set to
> "entities" and "users" respectively. Then you can send both forks to a
> PutElasticsearch, setting the Type property to "${es.type}". That will
> put the entities documents into the entities type and the same for
> users. This will help with indexing versus one huge document.
>
> This process can be broken down into individual entities and users, if
> you'd like a separate ES document for each. In that case you'd likely
> need a SplitJson after the ReplaceText, pointing at the array of
> entity/user objects. Then you'll get a flow file per entity/user,
> meaning you'll get a separate ES doc for each entity and user,
> stored/indexed/categorized by its type.
>
> Does this help solve your use case? If not please let me know, I'm
> happy to help work through this :)
>
> Regards,
> Matt
>
> On Tue, Apr 26, 2016 at 11:51 AM, Igor Kravzov <igork.inexso@gmail.com>
> wrote:
> > I see.
> > But I think I found the problem. It's AttributesToJSON that escapes the
> > result.
> >
> > On Apr 26, 2016 11:46 AM, "McDermott, Chris Kevin (MSDU -
> > STaTS/StorefrontRemote)" <chris.mcdermott@hpe.com> wrote:
> >>
> >> Hi Igor,
> >>
> >> jsonPath will return JSON as an unescaped String.
> >>
> >> Chris
> >>
> >> From: Igor Kravzov <igork.inexso@gmail.com>
> >> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> >> Date: Monday, April 25, 2016 at 2:27 PM
> >> To: "users@nifi.apache.org" <users@nifi.apache.org>
> >> Subject: Re: ReplaceText processor configuration help
> >>
> >> Hi Chris,
> >>
> >> How will it help in my situation?
> >>
> >> On Mon, Apr 25, 2016 at 1:50 PM, McDermott, Chris Kevin (MSDU -
> >> STaTS/StorefrontRemote)
> >> <chris.mcdermott@hpe.com> wrote:
> >> Igor,
> >>
> >> I think the jsonPath extension to the EL is going to be the ticket [1].
> >> A patch is available if you are willing to build NiFi yourself to test
> >> it out.
> >>
> >> Cheers,
> >> Chris
> >>
> >> [1] https://issues.apache.org/jira/browse/NIFI-1660
> >>
> >>
> >> From: Igor Kravzov <igork.inexso@gmail.com>
> >> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> >> Date: Monday, April 25, 2016 at 11:45 AM
> >> To: "users@nifi.apache.org" <users@nifi.apache.org>
> >> Subject: Re: ReplaceText processor configuration help
> >>
> >> Aldrin,
> >>
> >> The overall goal is to extract a subset of attributes from a tweet's
> >> JSON, create a new JSON document and ingest it into Elasticsearch for
> >> indexing.
> >> Hope this helps.
> >>
> >> On Mon, Apr 25, 2016 at 11:18 AM, Aldrin Piri
> >> <aldrinpiri@gmail.com> wrote:
> >> Igor,
> >>
> >> Thanks for the template.  It looks like the trouble is with
> >> AttributesToJSON converting the attribute, which in your case is a JSON
> >> blob, into additional JSON, hence the escaping to ensure nothing is lost.
> >> Are you just trying to get that entity body out to a file?  If so, the
> >> AttributesToJSON is likely not needed and you should be able to use
> >> something like ReplaceText to write the attribute to the FlowFile body.
> >> Please let us know your overall goal and we can see if the right mix of
> >> components already exists or if we are running into a path that may need
> >> some additional functionality.
> >>
> >> Thanks!
> >> Aldrin
> >>
> >>
> >>
> >> On Mon, Apr 25, 2016 at 10:33 AM, Igor Kravzov
> >> <igork.inexso@gmail.com> wrote:
> >> Hi Aldrin,
> >>
> >>
> >> Attached please find the template.  In this workflow I want to pull the
> >> "entities" and "user" entries from the Twitter JSON as entire structures.
> >> I can only do that if I set Return Type to JSON.
> >> Subsequently I use AttributesToJSON to create a new JSON file. But the
> >> returned values for "entities" and "user" are escaped, so I had to clean
> >> these before converting to JSON.
> >>
> >> Hope this helps.
> >>
> >> On Mon, Apr 25, 2016 at 10:15 AM, Aldrin Piri
> >> <aldrinpiri@gmail.com> wrote:
> >> Hi Igor,
> >>
> >> That should certainly be possible.  Would you mind opening up a ticket
> >> (https://issues.apache.org/jira/browse/NIFI) and providing a template of
> >> your flow that is causing the issue?
> >>
> >> Thanks!
> >>
> >> On Mon, Apr 25, 2016 at 10:09 AM, Igor Kravzov
> >> <igork.inexso@gmail.com> wrote:
> >> Thanks Pierre. It worked. Looks like I was doing something wrong inside
> >> my workflow.
> >> Would it not be feasible for the EvaluateJsonPath processor to have an
> >> option to return either an escaped or an unescaped JSON result?
> >>
> >> On Mon, Apr 25, 2016 at 7:20 AM, Pierre Villard
> >> <pierre.villard.fr@gmail.com> wrote:
> >> Hi Igor,
> >>
> >> Please use ReplaceText processors.
> >>
> >> 1.
> >> Search value : \\
> >> Replace value : Empty string set
> >>
> >> 2.
> >> Search value : "\{
> >> Replace value : \{
> >>
> >> 3.
> >> Search value : \}"
> >> Replace value : \}
> >>
> >> Template example attached.
> >>
> >> HTH
> >> Pierre
> >>
> >>
> >> 2016-04-24 20:12 GMT+02:00 Igor Kravzov
> >> <igork.inexso@gmail.com>:
> >>
> >> I am not that good at regex. What would be the proper configuration to
> >> do the following:
> >>
> >>   1.  Remove backslash from text.
> >>   2.  Replace "{ with {
> >>   3.  Replace }" with }
> >>
> >> Basically I need to clean escaped JSON.
> >>
> >> Like before:
> >>
> >>
> >>
> >> "{\"hashtags\":[{\"text\":\"Apple\",\"indices\":[45,51]}],\"urls\":[{\"url\":\"\",\"expanded_url\":\"\",\"display_url\":\"owler.us/abdLas\",\"indices\":[64,87]}],\"user_mentions\":[],\"symbols\":[{\"text\":\"AAPL\",\"indices\":[88,93]}]}",
> >>
> >> after:
> >>
> >>
> >>
> >> {"hashtags":[{"text":"Apple","indices":[45,51]}],"urls":[{"url":"","expanded_url":"","display_url":"owler.us/abdLas","indices":[64,87]}],"user_mentions":[],"symbols":[{"text":"AAPL","indices":[88,93]}]},
> >>
> >> Thanks in advance.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >
>
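
For reference, the three ReplaceText steps Pierre lists in the quoted thread can be sketched in Python as below. This is an illustration under assumptions, not the attached template: the function name `strip_escapes` is hypothetical, Python's `re` module stands in for ReplaceText, and the sample blob is a shortened version of the "before" example.

```python
import json
import re

def strip_escapes(blob: str) -> str:
    """Apply the three search/replace steps from the thread, in order."""
    blob = re.sub(r'\\', '', blob)    # 1. remove every backslash
    blob = re.sub(r'"\{', '{', blob)  # 2. replace "{ with {
    blob = re.sub(r'\}"', '}', blob)  # 3. replace }" with }
    return blob

# Shortened version of the escaped "before" sample.
escaped = '"{\\"hashtags\\":[{\\"text\\":\\"Apple\\",\\"indices\\":[45,51]}]}"'

cleaned = strip_escapes(escaped)
print(cleaned)
assert json.loads(cleaned)["hashtags"][0]["text"] == "Apple"

# Decoding the blob as a JSON string gives the same result here
# without touching any backslashes by hand.
assert json.loads(escaped) == cleaned
```

Step 1 removes every backslash, including ones that belong to legitimate escapes such as \n inside a string value, which is the situation described at the top of this message. Decoding the blob as a JSON string avoids that loss.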
