Deploy script #2
I think the best option would be to deploy by rpm and run directly on sundries. The following steps are needed:
Optionally:
Ah, I've just realized that I don't have rights to create a new release...
You should have the rights now.
Also you don't necessarily need a review ticket. We can just build and tag the rpm in the infra repo.
Also I think the sundries are RHEL7 boxes, so using the infra repo should be easier
I have created a release and packaged it, I've set up a test repository on COPR: https://copr.fedorainfracloud.org/coprs/mattia/review_stats/
I can't find any information about an infra repo: how can I build and tag the package into it?
You can't build directly there; I think only a few people from the Infra team can build against it. Basically it is just a Koji target (for example epel7-infra https://koji.fedoraproject.org/koji/buildtargetinfo?targetID=185) that we build packages against, and from which we generate a package repo that we use on the boxes in the infra (https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/files/common/rhel-infra-tags.repo).
If you do a scratch build of your package in Koji and create an infra ticket with the link to the build, asking for the package to be built against the epel7-infra target, that should do it.
Hope that makes things more clear.
Opened https://pagure.io/fedora-infrastructure/issue/8718
I have tried to deploy this in staging [0][1][2], but it seems that the app requires python-systemd, which is not available for python3 on RHEL7.
Do we really need bindings to systemd? It might be easier to just drop that requirement or make it optional.
[0] - https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=ba005c4364522007726a0fb80f954b21d9367830
[1] - https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=38904071a3a05bf16dd98a6e1293ea0da8ddff08
[2] - https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=41221aa4f084b7d4aca079ebd60852049f3eabfb
To drop the python-systemd requirement I will need to modify the script to make logging to the journal optional.
If we decide to do so, we will lose the ability to log run information (and errors) to the journal. Is there any other way to inform infrastructure that something went wrong while running the script?
Currently it is also possible to send emails when the script encounters an error, but that can become annoying since it sends one email for every log message... or we can continue to have cron send the emails, just like it works now.
I can try to drop python-systemd and use the method described here, but it refers to python2 and it cannot be tested locally...
Yes, I think the current way is that the cronjob sends emails only on errors. So if everything runs correctly there are no emails; otherwise we get an email with the errors.
Yeah, I am not sure how up to date this is, TBH.
I made logging to python-systemd optional via the `-l` flag and rebuilt the package as version 5.0.3: https://koji.fedoraproject.org/koji/taskinfo?taskID=42192505
I'm going to open another ticket to have it tagged in infra-repo: https://pagure.io/fedora-infrastructure/issue/8723
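For readers following along, here is a minimal sketch of what "optional journal logging behind a flag" can look like. It is not the actual review_stats code: the logger name, the `--log-journal` long option, and the assumption that `-l` turns journal logging on (rather than off) are all hypothetical. Only `systemd.journal.JournalHandler` comes from the real python-systemd bindings, and it is imported lazily so the script still runs where they are missing.

```python
import argparse
import logging


def configure_logging(use_journal: bool) -> logging.Logger:
    """Return a logger that writes to stderr and, optionally, to the journal."""
    logger = logging.getLogger("review_stats_sketch")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    # Always log to stderr, so cron can still mail the output on failure.
    logger.addHandler(logging.StreamHandler())
    if use_journal:
        try:
            # Imported only on request: the bindings may not be installed.
            from systemd.journal import JournalHandler
        except ImportError:
            logger.warning("systemd bindings not found; journal logging disabled")
        else:
            logger.addHandler(JournalHandler())
    return logger


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="optional journal logging sketch")
    # Assumption: -l switches journal logging on; the real flag may differ.
    parser.add_argument("-l", "--log-journal", action="store_true",
                        help="also send log records to the systemd journal")
    return parser.parse_args(argv)
```

A caller would do something like `log = configure_logging(parse_args().log_journal)` at startup. Keeping the import inside the function means the module-level dependency on python-systemd disappears, which is the point of making it optional.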
@mattia I think you need to remove https://pagure.io/Fedora-Infra/review_stats/blob/master/f/setup.py#_23
Otherwise it will still try to find the dependency
I've put a conditional in specfile:
that will remove that line on EPEL7 in the `%setup` step, so that the same code/specfile can eventually be used in future on systems with python3-systemd.
It seems that the OpenShift deployment is failing because `/etc/review-stats/config.cfg` can't be found. Maybe the ansible playbook just needs to be run again?
Just ran the playbook again, let's see if that helps :-)
I think the cronjob file specified for staging is wrong.

OK, the issue was with the Dockerfile entrypoint [0]. Now the cronjob starts but fails because it does not have write permission to /tmp/review-stats.
This is only temporary; we now need to request an NFS persistent volume in OpenShift so that we can store the output of the cronjob there, and then mount that NFS directory on the sundries to be served.
I'll file the infra ticket for that.
[0] - https://pagure.io/Fedora-Infra/review_stats/c/165b4d43e8602c3f5680df2f4c2a5d0d4d4a4c9d?branch=master
@cverna the config file in ansible is wrong, this patch should make the script run.
@mattia ah cool, I'll update that :)
OK, so the cron job runs and the static files are generated on the sundries box. Now the cron job that rsyncs the files from the sundries to the proxies is failing because of permission issues.
I'll investigate that further on Monday :)
Thanks.
As far as I understand, we have some leftovers of the old deployment method that conflict with the new one. Also, the sync script moves files with root:root ownership instead of apache:apache.
I made two patches, see if they're correct (I also don't think the sync script is needed on prod, because files are generated directly on sundries by the old script, but I left it for now).
Forgive me, I'm totally confused on how things are working.
From what I understand, the review-stats OpenShift app is run (for staging) by `roles/openshift-apps/review-stats`; here the app image is built, the script is run and puts the output in the OpenShift persistent volume. Then what? Where's the cron script that syncs the persistent volume to sundries?
So, everything related to staging should be removed from `roles/review-stats/build`. (?)
Looking in `/srv/web/review-stats` on sundries01.stg.phx2.fedoraproject.org I can see pages from the new script updated every hour, but I suppose these aren't coming from the OpenShift app, but from the old deployment... however, https://stg.fedoraproject.org/PackageReviewStatus/ has not been updated since 21/3, so I don't understand where those pages come from.
Sorry, I meant to look at this during this week, but I have been busy with other stuff :(
So the persistent volume in OpenShift is an NFS shared drive; this is the same drive that is mounted on the sundries, so we don't need a cron script for that.
I wanted to have everything working in staging before moving to prod and then deleting the old role.
No, these are coming from OpenShift. We have a cron job that syncs the files from the sundries to all of our proxies to be served by httpd. This is what is currently failing.
Does that help?
The permissions of the generated files are too restrictive, I think.
I think instead of 600 the HTML files should be at least 644. I am not sure if this is because of how these files are created by the app.
Thanks. I think this was caused by using `shutil.copy2`, which preserves file permissions while copying from the temporary file to the destination. I've changed the script to use `shutil.copyfile`.
It is now working; I made the script set permissions to 644 on the produced files. The static content is copied with proper permissions, so no need to make changes there.
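The permission fix described above can be illustrated with a small sketch. This is not the script's real code: the `publish` helper is hypothetical, and it assumes the temporary file comes from `tempfile.mkstemp`, which creates files with mode 600. `shutil.copyfile` copies only the data, so the restrictive mode is not carried over, and an explicit `os.chmod` then sets 644 regardless of the umask.

```python
import os
import shutil
import stat
import tempfile


def publish(content: str, destination: str) -> None:
    """Write content to a temp file, then copy it to destination as 0644."""
    # mkstemp creates the temporary file readable by its owner only (0600).
    fd, tmp_path = tempfile.mkstemp(suffix=".html")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
        # copyfile copies data only; copy2 would also carry over the 0600 mode.
        shutil.copyfile(tmp_path, destination)
        # Make the result world-readable so httpd can serve it after the sync.
        os.chmod(destination, 0o644)
    finally:
        os.remove(tmp_path)
```

Calling `publish("<html></html>", "/some/output/index.html")` leaves a file whose mode can be checked with `stat.S_IMODE(os.stat(path).st_mode)`; the output path here is only a placeholder.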
Cool, I'll try to switch production later today or tomorrow 😄
Thanks for working on this.
OK, I have deployed this to production and it is working fine: https://fedoraproject.org/PackageReviewStatus/.
I'll let you announce it on the different mailing lists if you want :-)
Thank you! I will write something on devel soon.
One last question: is there any chance to get access to os-master*, at least in staging, to be able to deploy and test any changes? Or to automatically redeploy the app whenever a new build is done?
I made a few changes yesterday on the staging branch and I was able to rebuild the image from the OpenShift web UI. I thought that was sufficient for the new code to run, but I think I also have to redeploy it? In that case, I can't find anything in the web UI...
So yeah, I think you should be able to rebuild from the web console. You can also log in using the `oc` command-line tool, see https://fedora-infra-docs.readthedocs.io/en/latest/dev-guide/openshift.html#command-line-interface
The cluster was in bad shape, so I guess some of the failures you have experienced come from that. In the UI the cron jobs are in the cluster console. You should be able to access it by clicking on the "Application Console" drop-down menu at the top. Then under workload there will be info about the cronjobs.
Should I request permissions by opening an infra ticket (at least for staging)? Or will I need to ask the infra team to deploy the app every time I create a new build?
We don't give people rights to deploymentconfigs, because that means it could be changed out of ansible and mess things up.
That said, if you run the playbook it should do a rollout at the end... or if you do that a lot we can make a special playbook for it.
Thank you for all the support in getting this done! Everything works well, so I'm going to close the ticket.
Yeah, also that deploymentconfig is not useful; I created it to debug the permission issues we had with the generated files. I will delete it.
What runs in OpenShift is a cronjob every hour at HH:00, so all you need to do is trigger a new build (you can do that from the UI) and wait for the next run of the cronjob.
Hope that clarifies some things 😄
Thanks for working on this 🎆