Pagure staging is not accessible #13128

Closed
opened 2026-02-09 06:10:06 +00:00 by amedvede · 6 comments

### Description of request

When you try to access the [page](https://stg.pagure.io/dashboard/projects), it returns `Service Unavailable The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.`
There is no info on the status page about it, and [zabbix](https://zabbix.fedoraproject.org/zabbix.php?name=&evaltype=0&tags%5B0%5D%5Btag%5D=&tags%5B0%5D%5Boperator%5D=0&tags%5B0%5D%5Bvalue%5D=&show_tags=3&tag_name_format=0&tag_priority=&state=-1&filter_name=&filter_show_counter=0&filter_custom_time=0&sort=name&sortorder=ASC&show_details=0&action=latest.view&hostids%5B%5D=11026) shows that everything is alright.

zlopez self-assigned this 2026-02-09 14:20:52 +00:00
Owner

I'm checking what the problem is here. `httpd` is running OK, but Anubis is not running and is failing on start. I also noticed a Kerberos error, but that was resolved by resetting `sssd`.

I will look into Anubis and see if I can figure out why it's not running.

Owner

From what I was able to find, this has been happening since 30th January. Here is the output from the anubis systemd unit:

```
Feb 09 14:27:44 pagure-stg02.fedoraproject.org systemd[1]: Starting Anubis Container...
Feb 09 14:27:44 pagure-stg02.fedoraproject.org podman[884489]: time="2026-02-09T14:27:44Z" level=error msg="invalid internal status, try resetting the pause process with \"/usr/bin/podman system migrate\": could not find any running process: no such process"
Feb 09 14:27:44 pagure-stg02.fedoraproject.org podman[884502]: time="2026-02-09T14:27:44Z" level=error msg="invalid internal status, try resetting the pause process with \"/usr/bin/podman system migrate\": could not find any running process: no such process"
Feb 09 14:27:44 pagure-stg02.fedoraproject.org systemd[1]: Started Anubis Container.
Feb 09 14:27:44 pagure-stg02.fedoraproject.org podman[884515]: time="2026-02-09T14:27:44Z" level=error msg="invalid internal status, try resetting the pause process with \"/usr/bin/podman system migrate\": could not find any running process: no such process"
Feb 09 14:27:44 pagure-stg02.fedoraproject.org systemd[1]: anubis.service: Main process exited, code=exited, status=1/FAILURE
Feb 09 14:27:44 pagure-stg02.fedoraproject.org systemd[1]: anubis.service: Failed with result 'exit-code'.
```

I tried to redeploy it by running `ansible-playbook /srv/web/infra/ansible/playbooks/groups/pagure.yml -l pagure_stg -t anubis`, but it didn't help.
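
The podman errors in the unit output name their own suggested recovery command. As a rough sketch, a small filter like the one below could pull that hint out of a longer journal. The function name `podman_suggested_commands` is hypothetical and not part of podman or of this infrastructure, and nothing in this ticket establishes that running the suggested command would have fixed the outage.

```shell
#!/bin/sh
# Hypothetical helper (not part of podman): read journal lines on stdin and
# print each distinct command that podman suggests in its
# "try resetting the pause process with" error messages.
podman_suggested_commands() {
    # The journal shows the command wrapped in escaped quotes (\"...\"),
    # so the pattern matches a literal backslash followed by a quote.
    sed -n 's/.*try resetting the pause process with \\"\([^"\\]*\)\\".*/\1/p' \
        | sort -u
}

# Possible usage on the affected host (assumes the unit is named anubis.service):
#   journalctl -u anubis.service --no-pager | podman_suggested_commands
```

Repeated error lines collapse into a single entry because of `sort -u`, which keeps the result readable when the unit restarts in a loop.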
Owner

I tried rebooting the machine (in case some update was hanging there), but that didn't help either.

Owner

Something was strange with the container image.

I did a new pull:

```
[root@pagure-stg02 tmp]# sudo -u anubis podman pull ghcr.io/techarohq/anubis:latest
Trying to pull ghcr.io/techarohq/anubis:latest...
Getting image source signatures
Copying blob 267f0d71ccc2 done   |
Copying blob d270249e0778 done   |
Copying blob 250c06f7c38e skipped: already exists
Copying config 0807bcc631 done   |
Writing manifest to image destination
0807bcc631f077563e59f108a79a92ba1e09cc7772049c3cd6c27b568887ef0f
```

And after that, just doing a `systemctl restart anubis` worked?

So I am unsure what happened here.
Owner

I tried that as well, but not as the anubis user, so maybe that was the problem.

Anyway, the issue is now resolved, so I'm closing this as fixed.
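
The steps that brought the service back can be sketched as a small dry-run script. The image, user, and unit names are taken from the commands quoted in this ticket; the `step` and `recover_anubis` helpers and the `RUN` variable are hypothetical. One plausible explanation, not confirmed here, is that rootless podman keeps a separate image store per user, so a pull done as a different user would not refresh the image that the anubis unit runs.

```shell
#!/bin/sh
# Values taken from the commands quoted in this ticket.
SERVICE_USER="anubis"
IMAGE="ghcr.io/techarohq/anubis:latest"
UNIT="anubis"

# Hypothetical helper: print the command by default, execute it only when RUN=1.
step() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Hypothetical helper: re-pull the image as the service user, restart the
# unit, then ask systemd whether the unit is active. Stops at the first failure.
recover_anubis() {
    step sudo -u "$SERVICE_USER" podman pull "$IMAGE" &&
    step systemctl restart "$UNIT" &&
    step systemctl is-active "$UNIT"
}
```

Calling `recover_anubis` with `RUN` unset only prints the three commands, which makes it safe to review before running it as root with `RUN=1`.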
Author

Thanks @zlopez and @kevin for the fix.
Reference
infra/tickets#13128