oodt-dev mailing list archives

From "Luke liu" <shuai...@usc.edu>
Subject RE: re: Question about OODT file manager
Date Thu, 06 Nov 2014 01:48:33 GMT
I just signed up on the wiki (https://cwiki.apache.org) with the
following account details:
	Account name: luke
	Full Name: Shuai Liu (Luke)
	Email: hanson311biz@gmail.com
	Password: *******

But I am not sure where I can add my notes to the following wiki article,
which I had trouble with. I also tried to create a new article, but could not
find anywhere to edit. Could this be because my account does not yet have the
"edit" or "comment" actions enabled?
https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example


Thanks
Luke
-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:chris.a.mattmann@jpl.nasa.gov] 
Sent: Sunday, November 2, 2014 6:59 AM
To: Luke liu; dev@oodt.apache.org
Cc: 'Christian Alan Mattmann'; zhoujian@usc.edu; xiaoyanj@usc.edu; 'Zichuan
Wang'
Subject: Re: re: Question about OODT file manager

Yes Luke, making the instructions better would be much appreciated!

If you have an account on the wiki please share it, else sign up for an
Apache OODT wiki account and please share it with me or anyone else on
dev@oodt and we’ll add you.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398) NASA Jet
Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department University of
Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Luke liu <shuailiu@usc.edu>
Date: Sunday, November 2, 2014 at 1:32 AM
To: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov>, "dev@oodt.apache.org"
<dev@oodt.apache.org>
Cc: Chris Mattmann <mattmann@usc.edu>, "zhoujian@usc.edu"
<zhoujian@usc.edu>, "xiaoyanj@usc.edu" <xiaoyanj@usc.edu>, 'Zichuan Wang'
<zichuanw@usc.edu>
Subject: RE: re: Question about OODT file manager

>Thanks Professor Mattmann. Not running batch_stub was the main culprit,
>and there were some other issues such as missing jars. Sorry for not
>confirming this right away; my laptop was crashing, and I only just had
>time to fix it. BTW, I was able to get the CAS-PGE example to work: even
>though the log showed that the workflow failed its pre-condition, the
>combined file and some metadata files (i.e. 3 files) were still
>successfully ingested and placed in the output directory.
>
>BTW, I think there are a lot of mistakes in the documentation. Do you want
>us to help correct it (i.e.
>https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example)?
>If possible, I would like to share my notes on the problematic steps
>described there.
>
>Anyway, thanks for your help; it is appreciated.
>
>Thanks
>Luke
>-----Original Message-----
>From: Mattmann, Chris A (3980) [mailto:chris.a.mattmann@jpl.nasa.gov]
>Sent: Saturday, November 1, 2014 10:48 AM
>To: Luke; dev@oodt.apache.org
>Cc: 'Christian Alan Mattmann'; zhoujian@usc.edu; xiaoyanj@usc.edu; 
>'Zichuan Wang'
>Subject: Re: re: Question about OODT file manager
>
>Dear Luke, just confirming, we solved this in class right? It had to do 
>with the batch stub not being turned on.
>
>Cheers,
>Chris
>
>-----Original Message-----
>From: Luke <shuailiu@usc.edu>
>Date: Tuesday, October 28, 2014 at 12:52 PM
>To: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov>, "dev@oodt.apache.org"
><dev@oodt.apache.org>
>Cc: Chris Mattmann <mattmann@usc.edu>, "zhoujian@usc.edu"
><zhoujian@usc.edu>, "xiaoyanj@usc.edu" <xiaoyanj@usc.edu>, 'Zichuan Wang'
><zichuanw@usc.edu>
>Subject: RE: re: Question about OODT file manager
>
>>Dear Professor Mattmann,
>>Thanks a lot for the kind help; it is appreciated. Sorry for the delay in
>>getting back to you. I have been conducting tests with OODT based on your
>>advice, but unfortunately I am having another problem.
>>
>>I am following the steps at
>>https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example
>>to get a sense of how to get a workflow to work.
>>The problem is that the File-Concatenator-PGE (which I start by running the
>>wmgr-client command line) does not seem to be invoked or executed. Instead,
>>I see the tasks stacking up in the workflow manager with status either
>>"RSUBMIT" or "QUEUED", and they never get executed (please find attached
>>workflow_monitor.jpg). Please note that by default the workflow min pool
>>size is 6, which leads to another problem: I have 6 submitted tasks with
>>status "RSUBMIT", and any new incoming task is forwarded to the waiting
>>queue with status "QUEUED". Please refer to workflow_monitor.jpg for
>>details, where I have 3 QUEUED workflow tasks and 6 RSUBMIT tasks.
>>
>>Question 1): I am not sure why the workflow is not being executed and is
>>hanging in the "RSUBMIT" state. After enabling the log level, I see the
>>following entry in the log, but I am not sure whether it has anything to do
>>with the hang:
>>	Oct 28, 2014 3:35:07 AM org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread safeCheckJobComplete
>>	WARNING: Exception checking completion status for job: [2014-10-28T01:59:32.813-07:00]: Messsage: java.lang.Exception: java.lang.NullPointerException
>>
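When troubleshooting from the log, it can help to pull out only the warning and error records. The following is a minimal Python sketch, not part of OODT. It assumes each record is laid out like the entry quoted above: a header line with timestamp, class and method, followed by a "LEVEL: message" line. The function name find_problems and the second record in SAMPLE are made up for illustration.

```python
import re

# Assumption: header line looks like
#   "Oct 28, 2014 3:35:07 AM <class> <method>"
HEADER = re.compile(
    r"^[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M\s+(\S+)\s+(\S+)$"
)
# Assumption: the following line looks like "WARNING: <message>"
LEVEL = re.compile(r"^(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST): (.*)$")


def find_problems(log_text, levels=("SEVERE", "WARNING")):
    """Return one dict per log record whose level is in `levels`."""
    problems = []
    header = None
    for raw in log_text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            header = match  # remember which class/method the next line belongs to
            continue
        match = LEVEL.match(line)
        if match and header is not None and match.group(1) in levels:
            problems.append({
                "source": header.group(1),
                "method": header.group(2),
                "level": match.group(1),
                "message": match.group(2),
            })
    return problems


# First record is the one quoted in this thread; the second is invented.
SAMPLE = (
    "Oct 28, 2014 3:35:07 AM "
    "org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread "
    "safeCheckJobComplete\n"
    "WARNING: Exception checking completion status for job: "
    "[2014-10-28T01:59:32.813-07:00]: Messsage: java.lang.Exception: "
    "java.lang.NullPointerException\n"
    "Oct 28, 2014 3:35:08 AM example.Hypothetical run\n"
    "INFO: a made-up informational line\n"
)
```

Calling find_problems(SAMPLE) keeps only the WARNING record and reports the class and method that logged it, which gives a starting point for searching the source.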
>>Question 2): I think that, on my side, any new workflow task I send with
>>the following command is currently being directed to the waiting queue
>>because of the min pool size (i.e. 6), although I could increase this to a
>>larger number:
>>	./wmgr-client --url http://localhost:9200 --operation --sendEvent --eventName fileconcatenator-pge --metaData --key RunID testNumber1
>>If possible, I would like to know whether there is a way to purge the queue
>>and get rid of the workflow tasks in either "RSUBMIT" or "QUEUED" state
>>that I have already sent. Please kindly help.
>>
>>Very sorry for troubling you with this. To be honest, I find OODT a bit
>>challenging to grasp within a short time frame, probably because there is
>>no book like an "OODT in Action", as there is for Solr. What I am doing is
>>just trial and error blended with guesswork, and I don’t want to guess
>>blindly. It would be appreciated if you could also shed some light on where
>>I can get more logging information, or on other ways I can troubleshoot. I
>>think it might be worth tracking what happens when a workflow reaches the
>>"RSUBMIT" status, and how to get logging information specific to it.
>>
>>Again, your advice and kind help will be appreciated as usual.
>>
>>
>>Thanks
>>Luke
>>
>>> -----Original Message-----
>>> From: Mattmann, Chris A (3980) 
>>> [mailto:chris.a.mattmann@jpl.nasa.gov]
>>> Sent: October 26, 2014 22:18
>>> To: Luke; 'Zichuan Wang'
>>> Cc: 'Christian Alan Mattmann'; zhoujian@usc.edu; xiaoyanj@usc.edu; 
>>> dev@oodt.apache.org
>>> Subject: Re: re: Question about OODT file manager
>>> 
>>> Hi Luke,
>>> 
>>> Thanks and sorry it’s taken me a while to reply. Here are some details
>>> below:
>>> 
>>> 
>>> -----Original Message-----
>>> From: Luke <shuailiu@usc.edu>
>>> Date: Sunday, October 26, 2014 at 6:19 PM
>>> To: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov>, 'Zichuan Wang'
>>> <zichuanw@usc.edu>
>>> Cc: Chris Mattmann <mattmann@usc.edu>, "zhoujian@usc.edu"
>>> <zhoujian@usc.edu>, "xiaoyanj@usc.edu" <xiaoyanj@usc.edu>, 
>>> "dev@oodt.apache.org" <dev@oodt.apache.org>
>>> Subject: RE: re: Question about OODT file manager
>>> 
>>> >Hi Professor Mattmann and OODT DEV,
>>> >
>>> >Sorry to trouble you with this email. Our team has been struggling to
>>> >send JSON files to Solr through OODT.
>>> >One of the difficulties is still getting an OODT workflow to call
>>> >poster.py in etllib.
>>> 
>>> Sorry that you’re having difficulty; let me try and help.
>>> 
>>> >
>>> >I am not sure if my understanding of the OODT requirement is correct; I
>>> >hope you can kindly advise and help with our confusion.
>>> >
>>> >The set of goals I have in mind with OODT is as follows. Please kindly
>>> >confirm and clarify:
>>> >
>>> >1)
>>> >Get the File-Manager up and running.
>>> 
>>> Yep, hopefully as installed via OODT RADIX.
>>> 
>>> >2)
>>> >Send all JSON files to the File Manager server with the wmgr-client
>>> >command. (I believe we can achieve this with a bash script, or probably a
>>> >Python script, that calls the command line sequentially with each JSON
>>> >file name as an argument?)
>>> 
>>> Suggestion:
>>> 
>>> 1. Use the OODT crawler and file manager to crawl/index the JSON files
>>>    (in-place data transfer).
>>> 2. Take a look at CAS-PGE; it will help you write a workflow task that
>>>    will wrap ETLlib and the poster command.
>>> 3. Once you are confident with #2, whip up a script that pages through
>>>    all of your indexed JSON files and then, for each one, submits a
>>>    workflow event (you may need to look into aggregating them) that calls
>>>    your CAS-PGE-wrapped poster task from ETLlib.
>>> 
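Step 3 of the suggestion could be sketched roughly as below. This is an illustrative Python sketch under several assumptions, not a tested OODT recipe. It walks a local directory as a stand-in for paging through the File Manager catalog, and it reuses the wmgr-client flags and URL from the command quoted elsewhere in this thread. The event name "poster-pge" and the metadata key "InputFile" are hypothetical placeholders, and all function names are made up for the example.

```python
import os
import shlex
import subprocess

# Taken from the wmgr-client command quoted elsewhere in this thread.
WMGR_CLIENT = "./wmgr-client"
WMGR_URL = "http://localhost:9200"


def build_send_event_command(event_name, key, value,
                             url=WMGR_URL, client=WMGR_CLIENT):
    """Build the argument list for one sendEvent call with one metadata key."""
    return [
        client, "--url", url,
        "--operation", "--sendEvent",
        "--eventName", event_name,
        "--metaData", "--key", key, value,
    ]


def find_json_files(root):
    """Stand-in for a catalog query: list *.json files under `root`."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".json"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def plan_events(root, event_name, key="InputFile"):
    """One command per JSON file; `key` is a hypothetical metadata key."""
    return [build_send_event_command(event_name, key, path)
            for path in find_json_files(root)]


def submit(commands, dry_run=True):
    """Print the commands, or run them one at a time when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(shlex.quote(part) for part in cmd))
        else:
            subprocess.run(cmd, check=True)
```

For example, submit(plan_events("/path/to/json", "poster-pge")) prints one command per file without contacting the workflow manager. Passing dry_run=False would actually send the events, so it is worth checking the printed commands first.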
>>> >3)
>>> >Once we have the JSON files sent and stored in the File Manager, we need
>>> >to get the workflow manager up and running, and we can create a workflow
>>> >that sends those JSON files from the File Manager to Solr.
>>> 
>>> See above.
>>> 
>>> >4)
>>> >Create a workflow according to the Workflow2 User Guide
>>> ><https://cwiki.apache.org/confluence/display/OODT/Workflow2+User+Guide>.
>>> >Here comes the problem:
>>> >I am not sure how to create a workflow task that can call poster.py in
>>> >the Python etllib. It looks like we need to create our own Java class
>>> >that extends TaskInstance, which is an abstract Java class with one
>>> >abstract method that has the following signature:
>>> >
>>> >protected abstract ResultsState performExecution(ControlMetadata crtlMetadata);
>>> >
>>> >However, that page does not explain where to find the corresponding
>>> >libs or where to put our implementation in the workflow manager. I am
>>> >not sure if we should use TaskInstance, but it seems the workflow has to
>>> >have an interface through which it can call the Python code, i.e.
>>> >poster.py. It looks like we need to implement
>>> >TaskInstance::performExecution by adding code that calls poster.py and
>>> >returns the ResultsState.
>>> >
>>> >
>>> >It would be greatly appreciated if you could shed some light and advise
>>> >how we can get a task instance to call poster.py. BTW, I am also not
>>> >sure if my understanding is correct; please kindly correct it if it is
>>> >wrong. Your help will be appreciated as usual.
>>> >
>>> >
>>> >
>>> >Thanks
>>> >Luke
>>> 
>>> Thanks Luke, see above. Let me know if it helps.
>>> 
>>> Cheers!
>>> 
>>> Chris
>>> 
>>> >
>>> >From: Mattmann, Chris A (3980) 
>>> >[mailto:chris.a.mattmann@jpl.nasa.gov]
>>> >
>>> >Sent: October 25, 2014 13:34
>>> >To: Zichuan Wang
>>> >Cc: Christian Alan Mattmann; Luke; zhoujian@usc.edu; 
>>> >xiaoyanj@usc.edu
>>> >Subject: Re: Re: Question about OODT file manager
>>> >
>>> >
>>> >
>>> >Please cc dev@oodt.apache.org. I will reply in detail soon.
>>> >
>>> >Sent from my iPhone
>>> 
>>> 
>>> >
>>> >
>>> >On Oct 25, 2014, at 1:26 PM, "Zichuan Wang" <zichuanw@usc.edu> wrote:
>>> >
>>> >
>>> >Dear Professor,
>>> >
>>> >
>>> >
>>> >Could you please also explain how I can crawl all JSON file names under
>>> >a specific directory using CAS-PGE? I’ll work through this example,
>>> >https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example,
>>> >but it doesn’t mention anything about crawling; instead it manually
>>> >sets the input file paths...
>>> >
>>> >
>>> >
>>> >
>>> >--
>>> >
>>> >Zichuan Wang
>>> >
>>> >University of Southern California, Department of Computer Science
>>> >
>>> >
>>> >
>>> >
>>> >On Saturday, October 25, 2014, at 12:10 PM, Zichuan Wang wrote:
>>> >
>>> >Dear Professor,
>>> >
>>> >
>>> >
>>> >In the assignment 2 specification I noticed that you mentioned the OODT
>>> >File Manager, but from my understanding, we are using the ETLlib poster,
>>> >which talks directly to Solr. So how can we use the OODT File Manager
>>> >in this assignment?
>>> >
>>> >
>>> >
>>> >--
>>> >
>>> >Zichuan Wang
>>> >
>>> >University of Southern California, Department of Computer Science
>>> >
>>
>
>


