singa-dev mailing list archives

From "Ngin Yun Chuan (JIRA)" <>
Subject [jira] [Commented] (SINGA-399) Rafiki cannot test rebuilt image
Date Sun, 28 Oct 2018 03:34:00 GMT


Ngin Yun Chuan commented on SINGA-399:

Hi Zhu Lei,

Regarding your email describing the following issue:

> I want to add a new function in the code. However, after I modify the code and
> rebuild the Rafiki_predictor image from the predictor.Dockerfile and run the
> code, I find Rafiki does not run my new code; instead, it seems that Rafiki is
> still running the original

This is an issue we have encountered numerous times. Since Docker Hub also contains `rafikiai/predictor:0.0.4`
(, when we run `scripts/` with a locally built
`rafikiai/predictor:0.0.4`, it seems to use Docker Hub's version instead of the local one. The
workaround I have been using is to increment the version in `` to the next version, i.e. 0.0.5,
in your working directory. This works as long as the new tag, e.g. `rafikiai/predictor:0.0.5`,
has not been pushed to Docker Hub. In the future, we should update the scripts to allow the use
of locally built images even when such a version conflict exists.
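
To illustrate the version-bump workaround, here is a minimal Python sketch. It is not part of Rafiki: the helpers `bump_patch`, `image_ref` and `next_local_ref` are hypothetical names, and `published_versions` is assumed to be a set of tags you already know exist on Docker Hub (the sketch does not query Docker Hub itself).

```python
import re


def bump_patch(version: str) -> str:
    """Return `version` (MAJOR.MINOR.PATCH) with PATCH incremented by one."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"{major}.{minor}.{patch + 1}"


def image_ref(repository: str, version: str) -> str:
    """Build an image reference such as 'rafikiai/predictor:0.0.5'."""
    return f"{repository}:{version}"


def next_local_ref(repository: str, current_version: str,
                   published_versions: set) -> str:
    """Pick the next patch version that is NOT in `published_versions`.

    `published_versions` is a caller-supplied assumption about which tags
    have already been pushed to the registry.
    """
    candidate = bump_patch(current_version)
    while candidate in published_versions:
        candidate = bump_patch(candidate)
    return image_ref(repository, candidate)


if __name__ == "__main__":
    # 0.0.4 is published, so the locally built image should be tagged 0.0.5.
    print(next_local_ref("rafikiai/predictor", "0.0.4", {"0.0.4"}))
```

The loop keeps bumping until it reaches a tag that is not in the published set, which mirrors the condition above that the chosen version must not already exist on Docker Hub.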

> Rafiki cannot test rebuilt image
> --------------------------------
>                 Key: SINGA-399
>                 URL:
>             Project: Singa
>          Issue Type: Bug
>            Reporter: Zhu Lei
>            Priority: Major
>         Attachments: rafiki-1.PNG, rafiki-2.PNG, rafiki-3.PNG, rafiki-4.PNG
> After downloading the newest rafiki code, at commit 7b3b04e15c62233e515c4d82051cd5dfb799215f,
> with the comment "Add more error handling to notify user of invalid train job; compact exceptions",
> I ran "bash ./scripts/" to build the new admin, advisor, predictor and worker
> images. I got the images shown in the attached image 'rafiki-1.PNG'. Then I ran "bash ./script/"
> to build the containers, as shown in the attached image 'rafiki-2.PNG'. Finally, when I ran
> the example, I got the error shown in the attached image 'rafiki-3.PNG'.
> I was very surprised to find that the admin, advisor, predictor and worker images I
> had just built had been replaced by images built weeks ago, as shown in the attached image
> 'rafiki-4.PNG'. Could you kindly explain why this happens? I really do not understand it.
> Finally, when I ran "bash ./script/", left the swarm, and repeated my previous
> procedure, there were no errors. I think the only difference between the two runs is
> that the images are different. So my speculation is that the current rafiki code does not
> support newly built images.

This message was sent by Atlassian JIRA
