httpd on pkgs01 regularly dumps core #12670

Closed
opened 2025-07-17 14:50:27 +00:00 by nphilipp · 11 comments
Member

We discussed this in this morning’s (08:15 UTC) Infra/RelEng standup call (I think @zlopez mentioned it first): the httpd process on pkgs01.rdu3 regularly dumps core, every minute or three:

[root@pkgs01 pagure][PROD-RDU3]# coredumpctl list --reverse | head
TIME                            PID   UID   GID SIG COREFILE  EXE
Thu 2025-07-17 14:16:30 GMT  1267788  1000 1494358698  11 present   /usr/sbin/httpd
Thu 2025-07-17 14:14:46 GMT  1267162  1000 1494358698  11 present   /usr/sbin/httpd
Thu 2025-07-17 14:13:19 GMT  1266272  1000 1494358698   6 present   /usr/sbin/httpd
Thu 2025-07-17 14:10:02 GMT  1265106  1000 1494358698  11 present   /usr/sbin/httpd
Thu 2025-07-17 14:09:00 GMT  1264831  1000 1494358698   6 present   /usr/sbin/httpd
Thu 2025-07-17 14:08:43 GMT  1264737  1000 1494358698  11 present   /usr/sbin/httpd
Thu 2025-07-17 14:07:36 GMT  1264147  1000 1494358698  11 present   /usr/sbin/httpd
Thu 2025-07-17 14:06:31 GMT  1263119  1000 1494358698  11 present   /usr/sbin/httpd
Thu 2025-07-17 14:06:19 GMT  1263060  1000 1494358698  11 present   /usr/sbin/httpd
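
To see at a glance which signals are involved, the listing can be tallied with a few lines of Python. This is only a sketch: it assumes the column layout shown above (weekday, date, time, timezone, then PID, UID, GID, SIG, COREFILE, EXE), and `count_signals` is a made-up helper name, not part of any tool.

```python
from collections import Counter


def count_signals(listing: str) -> Counter:
    """Count core dumps per signal number in `coredumpctl list` text output.

    Assumes the column layout shown above: weekday, date, time, timezone,
    PID, UID, GID, SIG, COREFILE, EXE.
    """
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        # Skip the header row, blank lines and anything that does not
        # have a numeric PID and signal in the expected columns.
        if len(fields) < 10 or not fields[4].isdigit() or not fields[7].isdigit():
            continue
        counts[int(fields[7])] += 1
    return counts


# A few rows copied from the listing above.
sample = """\
TIME                            PID   UID   GID SIG COREFILE  EXE
Thu 2025-07-17 14:16:30 GMT  1267788  1000 1494358698  11 present   /usr/sbin/httpd
Thu 2025-07-17 14:13:19 GMT  1266272  1000 1494358698   6 present   /usr/sbin/httpd
Thu 2025-07-17 14:09:00 GMT  1264831  1000 1494358698   6 present   /usr/sbin/httpd
"""
print(count_signals(sample))
```

Counted by hand over the nine rows above, seven dumps are signal 11 (SEGV) and two are signal 6 (ABRT), so not every crash is a segfault.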

Here’s a representative one:

[root@pkgs01 pagure][PROD-RDU3]# coredumpctl info 1267788
           PID: 1267788 (httpd)
           UID: 1000 (pagure)
           GID: 1494358698 (xpackager)
        Signal: 11 (SEGV)
     Timestamp: Thu 2025-07-17 14:16:29 GMT (15min ago)
  Command Line: pagure          -DFOREGROUND
    Executable: /usr/sbin/httpd
 Control Group: /
         Slice: -.slice
       Boot ID: d6921eff650c4d358b02b8be1d34b576
    Machine ID: 2cb1691b08bd4a8fa022e3bf3a140ff3
      Hostname: pkgs01.rdu3.fedoraproject.org
       Storage: /var/lib/systemd/coredump/core.httpd.1000.d6921eff650c4d358b02b8be1d34b576.1267788.17>
       Message: Process 1267788 (httpd) of user 1000 dumped core.
                
                Stack trace of thread 1269789:
                #0  0x00007fe279df7b53 PyErr_SetObject (libpython3.6m.so.1.0)
                #1  0x00007fe279e02c21 PyErr_SetString (libpython3.6m.so.1.0)
                #2  0x00007fe279e10be2 PyException_SetTraceback (libpython3.6m.so.1.0)
                #3  0x00007fe279d61dc1 _PyErr_FormatFromCause (libpython3.6m.so.1.0)
                #4  0x00007fe279dc8679 call_function.cold.3302 (libpython3.6m.so.1.0)
                #5  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #6  0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #7  0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #8  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #9  0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #10 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #11 0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #12 0x00007fe279dec154 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)
                #13 0x00007fe279e6c370 fast_function (libpython3.6m.so.1.0)
                #14 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #15 0x00007fe279e90ef5 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #16 0x00007fe279decac5 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)
                #17 0x00007fe279e8e42f function_call (libpython3.6m.so.1.0)
                #18 0x00007fe279df522b PyObject_Call (libpython3.6m.so.1.0)
                #19 0x00007fe279e92120 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #20 0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #21 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #22 0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #23 0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #24 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #25 0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #26 0x00007fe279ded682 _PyFunction_FastCallDict (libpython3.6m.so.1.0)
                #27 0x00007fe279dee45e _PyObject_FastCallDict (libpython3.6m.so.1.0)
                #28 0x00007fe279e00290 method_call (libpython3.6m.so.1.0)
                #29 0x00007fe279df522b PyObject_Call (libpython3.6m.so.1.0)
                #30 0x00007fe279f00872 t_bootstrap (libpython3.6m.so.1.0)
                #31 0x00007fe279ea6a14 pythread_wrapper (libpython3.6m.so.1.0)
                #32 0x00007fe28c4d51ca start_thread (libpthread.so.0)
                #33 0x00007fe28bf2b8d3 __clone (libc.so.6)
                
                Stack trace of thread 1267788:
                #0  0x00007fe28c4db6e5 pthread_cond_timedwait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fe279deb68c take_gil (libpython3.6m.so.1.0)
                #2  0x00007fe279e9665e _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #3  0x00007fe279dec154 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)
                #4  0x00007fe279ded87a _PyFunction_FastCallDict (libpython3.6m.so.1.0)
                #5  0x00007fe279dee45e _PyObject_FastCallDict (libpython3.6m.so.1.0)
                #6  0x00007fe279e0f588 PyObject_CallFunctionObjArgs (libpython3.6m.so.1.0)
                #7  0x00007fe279e6aa47 collect (libpython3.6m.so.1.0)
                #8  0x00007fe279ee27fb _PyGC_CollectNoFail (libpython3.6m.so.1.0)
                #9  0x00007fe279efc839 PyImport_Cleanup (libpython3.6m.so.1.0)
                #10 0x00007fe279dc402c Py_EndInterpreter (libpython3.6m.so.1.0)
                #11 0x00007fe27a2441e7 Interpreter_dealloc (mod_wsgi_python3.so)
                #12 0x00007fe279dd926e free_keys_object (libpython3.6m.so.1.0)
                #13 0x00007fe27a24d9ba wsgi_python_child_cleanup (mod_wsgi_python3.so)
                #14 0x00007fe28c70dd6e apr_pool_destroy (libapr-1.so.0)
                #15 0x00007fe27a258f95 wsgi_start_process (mod_wsgi_python3.so)
                #16 0x00007fe27a25b924 wsgi_manage_process (mod_wsgi_python3.so)
                #17 0x00007fe28c71025f apr_proc_other_child_alert (libapr-1.so.0)
                #18 0x00007fe2811e17aa event_run (mod_mpm_event.so)
                #19 0x0000562dc4d112ee ap_run_mpm (httpd)
                #20 0x0000562dc4d097a3 main (httpd)
                #21 0x00007fe28bf2c7e5 __libc_start_main (libc.so.6)
                #22 0x0000562dc4d098ee _start (httpd)
                
                Stack trace of thread 1269790:
                #0  0x00007fe28c0272df __select (libc.so.6)
                #1  0x00007fe279ee276a time_sleep (libpython3.6m.so.1.0)
                #2  0x00007fe279e8f52d call_function (libpython3.6m.so.1.0)
                #3  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #4  0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #5  0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #6  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #7  0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #8  0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #9  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #10 0x00007fe279ded682 _PyFunction_FastCallDict (libpython3.6m.so.1.0)
                #11 0x00007fe279dee45e _PyObject_FastCallDict (libpython3.6m.so.1.0)
                #12 0x00007fe279e00290 method_call (libpython3.6m.so.1.0)
                #13 0x00007fe279df522b PyObject_Call (libpython3.6m.so.1.0)
                #14 0x00007fe279f00872 t_bootstrap (libpython3.6m.so.1.0)
                #15 0x00007fe279ea6a14 pythread_wrapper (libpython3.6m.so.1.0)
                #16 0x00007fe28c4d51ca start_thread (libpthread.so.0)
                #17 0x00007fe28bf2b8d3 __clone (libc.so.6)
                
                Stack trace of thread 1269788:
                #0  0x00007fe28c031477 epoll_wait (libc.so.6)
                #1  0x00007fe271fa95d8 pyepoll_poll (select.cpython-36m-x86_64-linux-gnu.so)
                #2  0x00007fe279e8f8d1 call_function (libpython3.6m.so.1.0)
                #3  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #4  0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #5  0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #6  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #7  0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #8  0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #9  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #10 0x00007fe279dec154 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)
                #11 0x00007fe279e6c370 fast_function (libpython3.6m.so.1.0)
                #12 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #13 0x00007fe279e90ef5 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #14 0x00007fe279ded682 _PyFunction_FastCallDict (libpython3.6m.so.1.0)
                #15 0x00007fe279dee45e _PyObject_FastCallDict (libpython3.6m.so.1.0)
                #16 0x00007fe279e00290 method_call (libpython3.6m.so.1.0)
                #17 0x00007fe279df522b PyObject_Call (libpython3.6m.so.1.0)
                #18 0x00007fe279e92120 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #19 0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #20 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #21 0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #22 0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #23 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #24 0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #25 0x00007fe279ded682 _PyFunction_FastCallDict (libpython3.6m.so.1.0)
                #26 0x00007fe279dee45e _PyObject_FastCallDict (libpython3.6m.so.1.0)
                #27 0x00007fe279e00290 method_call (libpython3.6m.so.1.0)
                #28 0x00007fe279df522b PyObject_Call (libpython3.6m.so.1.0)
                #29 0x00007fe279f00872 t_bootstrap (libpython3.6m.so.1.0)
                #30 0x00007fe279ea6a14 pythread_wrapper (libpython3.6m.so.1.0)
                #31 0x00007fe28c4d51ca start_thread (libpthread.so.0)
                #32 0x00007fe28bf2b8d3 __clone (libc.so.6)
                
                Stack trace of thread 1270890:
                #0  0x00007fe28bffa178 __nanosleep (libc.so.6)
                #1  0x00007fe28bffa07e sleep (libc.so.6)
                #2  0x00007fe27a24daf9 wsgi_reaper_thread (mod_wsgi_python3.so)
                #3  0x00007fe28c4d51ca start_thread (libpthread.so.0)
                #4  0x00007fe28bf2b8d3 __clone (libc.so.6)
                
                Stack trace of thread 1267789:
                #0  0x00007fe28c0272df __select (libc.so.6)
                #1  0x00007fe28c719919 apr_sleep (libapr-1.so.0)
                #2  0x00007fe27a24a75b wsgi_monitor_thread (mod_wsgi_python3.so)
                #3  0x00007fe28c4d51ca start_thread (libpthread.so.0)
                #4  0x00007fe28bf2b8d3 __clone (libc.so.6)
                
                Stack trace of thread 1267790:
                #0  0x00007fe28c0272df __select (libc.so.6)
                #1  0x00007fe28c719919 apr_sleep (libapr-1.so.0)
                #2  0x00007fe27a24d43a wsgi_deadlock_thread (mod_wsgi_python3.so)
                #3  0x00007fe28c4d51ca start_thread (libpthread.so.0)
                #4  0x00007fe28bf2b8d3 __clone (libc.so.6)
                
                Stack trace of thread 1269791:
                #0  0x00007fe28c4dda46 do_futex_wait.constprop.1 (libpthread.so.0)
                #1  0x00007fe28c4ddb38 __new_sem_wait_slow.constprop.0 (libpthread.so.0)
                #2  0x00007fe279deadf2 PyThread_acquire_lock_timed (libpython3.6m.so.1.0)
                #3  0x00007fe279e698f4 lock_PyThread_acquire_lock (libpython3.6m.so.1.0)
                #4  0x00007fe279e8f8d1 call_function (libpython3.6m.so.1.0)
                #5  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #6  0x00007fe279dec154 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)
                #7  0x00007fe279e6c370 fast_function (libpython3.6m.so.1.0)
                #8  0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #9  0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #10 0x00007fe279dec154 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)
                #11 0x00007fe279ded8df _PyFunction_FastCallDict (libpython3.6m.so.1.0)
                #12 0x00007fe279dee45e _PyObject_FastCallDict (libpython3.6m.so.1.0)
                #13 0x00007fe279e00290 method_call (libpython3.6m.so.1.0)
                #14 0x00007fe279dee43c _PyObject_FastCallDict (libpython3.6m.so.1.0)
                #15 0x00007fe279ec13f0 calliter_iternext (libpython3.6m.so.1.0)
                #16 0x00007fe279e90646 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #17 0x00007fe279decac5 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)
                #18 0x00007fe279e8e42f function_call (libpython3.6m.so.1.0)
                #19 0x00007fe279df522b PyObject_Call (libpython3.6m.so.1.0)
                #20 0x00007fe279e92120 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #21 0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #22 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #23 0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #24 0x00007fe279e6c188 fast_function (libpython3.6m.so.1.0)
                #25 0x00007fe279e8f657 call_function (libpython3.6m.so.1.0)
                #26 0x00007fe279e902b8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)
                #27 0x00007fe279ded682 _PyFunction_FastCallDict (libpython3.6m.so.1.0)
                #28 0x00007fe279dee45e _PyObject_FastCallDict (libpython3.6m.so.1.0)
                #29 0x00007fe279e00290 method_call (libpython3.6m.so.1.0)
                #30 0x00007fe279df522b PyObject_Call (libpython3.6m.so.1.0)
                #31 0x00007fe279f00872 t_bootstrap (libpython3.6m.so.1.0)
                #32 0x00007fe279ea6a14 pythread_wrapper (libpython3.6m.so.1.0)
                #33 0x00007fe28c4d51ca start_thread (libpthread.so.0)
                #34 0x00007fe28bf2b8d3 __clone (libc.so.6)

The SIGSEGV is delivered to the offending thread, whose backtrace always seems to have this structure:

pthread_cond_timedwait@@GLIBC_2.3.2 => take_gil => … => [Python Garbage Collector related] => … => PyImport_Cleanup => Py_EndInterpreter => … => wsgi_python_child_cleanup => apr_pool_destroy

I take it as: httpd winds down a worker, mod_wsgi cleans up the Python (sub-)interpreter, and garbage collection trips over something that doesn’t exist any longer.
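
To check other dumps for the same shape without eyeballing them, something like the following could be run over the `Message:` part of `coredumpctl info`. It is a rough sketch under assumptions: the frame names in `SHUTDOWN_PATTERN` are taken from the structure described above (starting at `take_gil`), the parsing assumes the `#N 0xADDR function (library)` frame format shown in the dump, and all function names here are invented for illustration.

```python
import re

# Frames expected in the shutdown path, innermost first, as described above.
SHUTDOWN_PATTERN = [
    "take_gil",
    "PyImport_Cleanup",
    "Py_EndInterpreter",
    "wsgi_python_child_cleanup",
    "apr_pool_destroy",
]

THREAD_RE = re.compile(r"Stack trace of thread (\d+):")
FRAME_RE = re.compile(r"#\d+\s+0x[0-9a-f]+\s+(\S+)")


def parse_threads(message: str) -> dict:
    """Map thread ID -> list of function names (innermost frame first)."""
    threads = {}
    current = None
    for line in message.splitlines():
        header = THREAD_RE.search(line)
        if header:
            current = threads.setdefault(int(header.group(1)), [])
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current.append(frame.group(1))
    return threads


def matches_in_order(frames, pattern) -> bool:
    """True if `pattern` occurs in `frames` as an ordered subsequence."""
    remaining = iter(frames)
    return all(any(name == frame for frame in remaining) for name in pattern)


def shutdown_threads(message: str) -> list:
    """Return IDs of threads whose backtrace follows the shutdown pattern."""
    return [
        tid
        for tid, frames in parse_threads(message).items()
        if matches_in_order(frames, SHUTDOWN_PATTERN)
    ]


# Abridged frames from two of the threads in the dump above.
sample = """\
Stack trace of thread 1267788:
#0  0x00007fe28c4db6e5 pthread_cond_timedwait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007fe279deb68c take_gil (libpython3.6m.so.1.0)
#9  0x00007fe279efc839 PyImport_Cleanup (libpython3.6m.so.1.0)
#10 0x00007fe279dc402c Py_EndInterpreter (libpython3.6m.so.1.0)
#13 0x00007fe27a24d9ba wsgi_python_child_cleanup (mod_wsgi_python3.so)
#14 0x00007fe28c70dd6e apr_pool_destroy (libapr-1.so.0)

Stack trace of thread 1270890:
#0  0x00007fe28bffa178 __nanosleep (libc.so.6)
#2  0x00007fe27a24daf9 wsgi_reaper_thread (mod_wsgi_python3.so)
"""
print(shutdown_threads(sample))
```

Matching on an ordered subsequence rather than exact frame positions keeps the check tolerant of the varying number of interpreter frames between `take_gil` and `PyImport_Cleanup`.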

I don’t know why this didn’t happen before the DC move. My only guess is that we had this problem before and a configuration hot-fix was applied which never ended up in Ansible, so the fix was lost in the move. Or it happened and we just didn’t notice?

Anyway, let’s collect more info here.

Author
Member

I was curious and checked on pagure02:

[root@pagure02 ~][PROD]# coredumpctl --reverse | head
No coredumps found.

Perhaps comparing the src.fp.o and pagure.io instances can give some insight into how to fix it.

Owner

Metadata Update from @kevin:

  • Issue priority set to: Waiting on Assignee (was: Needs Review)
  • Issue tagged with: medium-gain, medium-trouble, ops
Owner

@nphilipp src.fp.o didn't produce core dumps until I added CoreDumpDirectory /tmp to the httpd dist-git conf. But you can still see the errors being reported in the httpd error log.

Author
Member

It seems that pkgs01.iad2 was already dumping core before the DC move; we just didn’t notice:

[root@log01 30][PROD-RDU3]# pwd
/var/log/hosts/pkgs01.iad2.fedoraproject.org/2025/06/30
[root@log01 30][PROD-RDU3]# xzgrep systemd-coredump messages.log.xz | tail -n 5
Jun 30 14:21:13 pkgs01 systemd-coredump[1782251]: Removed old coredump core.httpd.1000.f6c697a505ae4311887ed864512cdd0a.1649482.1751282904000000.lz4.
Jun 30 14:21:14 pkgs01 systemd-coredump[1782251]: Process 1780105 (httpd) of user 1000 dumped core.#012#012Stack trace of thread 1781734:#012#0  0x00007fc9ae3afa12 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)#012#1  0x00007fc9ae42f2f0 fast_function (libpython3.6m.so.1.0)#012#2  0x00007fc9ae452537 call_function (libpython3.6m.so.1.0)#012#3  0x00007fc9ae453198 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)#012#4  0x00007fc9ae42f108 fast_function (libpython3.6m.so.1.0)#012#5  0x00007fc9ae452537 call_function (libpython3.6m.so.1.0)#012#6  0x00007fc9ae453198 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)#012#7  0x00007fc9ae3af0a4 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)#012#8  0x00007fc9ae42f2f0 fast_function (libpython3.6m.so.1.0)#012#9  0x00007fc9ae452537 call_function (libpython3.6m.so.1.0)#012#10 0x00007fc9ae453dd8 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)#012#11 0x00007fc9ae3afa15 _PyEval_EvalCodeWithName (libpython3.6m.so.1.0)#012#12 0x00007fc9ae45130f function_call (libpython3.6m.so.1.0)#012#13 0x00007fc9ae3b822b PyObject_Call (libpython3.6m.so.1.0)#012#14 0x00007fc9ae455005 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)#012#15 0x00007fc9ae42f108 fast_function (libpython3.6m.so.1.0)#012#16 0x00007fc9ae452537 call_function (libpython3.6m.so.1.0)#012#17 0x00007fc9ae453198 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)#012#18 0x00007fc9ae42f108 fast_function (libpython3.6m.so.1.0)#012#19 0x00007fc9ae452537 call_function (libpython3.6m.so.1.0)#012#20 0x00007fc9ae453198 _PyEval_EvalFrameDefault (libpython3.6m.so.1.0)#012#21 0x00007fc9ae3b05d2 _PyFunction_FastCallDict (libpython3.6m.so.1.0)#012#22 0x00007fc9ae3b13ae _PyObject_FastCallDict (libpython3.6m.so.1.0)#012#23 0x00007fc9ae3c32c0 method_call (libpython3.6m.so.1.0)#012#24 0x00007fc9ae3b822b PyObject_Call (libpython3.6m.so.1.0)#012#25 0x00007fc9ae4c3672 t_bootstrap (libpython3.6m.so.1.0)#012#26 0x00007fc9ae469914 pythread_wrapper (libpython3.6m.so.1.0)#012#27 0x00007fc9c0a971ca n/a 
(/usr/lib64/libpthread-2.28.so (deleted))
Jun 30 14:21:14 pkgs01 systemd[1]: systemd-coredump@19564-1782250-0.service: Succeeded.
Jun 30 14:23:35 pkgs01 systemd-coredump[1783019]: Process 1781599 (httpd) of user 1000 dumped core.#012#012Stack trace of thread 1781766:#012#0  0x00007fc9c050352f n/a (/usr/lib64/libc-2.28.so (deleted))#012#1  0x00007fc9c054ba2c n/a (/usr/lib64/libc-2.28.so (deleted))
Jun 30 14:23:35 pkgs01 systemd[1]: systemd-coredump@19565-1783018-0.service: Succeeded.
[root@log01 30][PROD-RDU3]#

I can’t find anything in the pagure02 logs, but note that there are only pagure0[12].vpn.* directories (neither .iad2. nor .rdu3.).
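
The `#012` sequences in the log lines above look like control characters escaped as `#` plus three octal digits (`#012` being a newline), which is what rsyslog does by default as far as I know; whether that is how log01 is configured is an assumption. Under that assumption, a small sketch for turning such a line back into a readable multi-line trace (`unescape_syslog` is a made-up name):

```python
import re

# '#' followed by a three-digit octal code below 040, i.e. a control
# character such as '#012' (newline). Requiring the leading zero keeps
# frame markers like '#10 ' or '#27 ' from being touched.
ESCAPE_RE = re.compile(r"#(0[0-3][0-7])")


def unescape_syslog(line: str) -> str:
    """Replace '#0oo' octal escapes with the control characters they stand for."""
    return ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 8)), line)


# Message part of the 14:23:35 log line above.
sample = (
    "Process 1781599 (httpd) of user 1000 dumped core.#012#012"
    "Stack trace of thread 1781766:#012"
    "#0  0x00007fc9c050352f n/a (/usr/lib64/libc-2.28.so (deleted))#012"
    "#1  0x00007fc9c054ba2c n/a (/usr/lib64/libc-2.28.so (deleted))"
)
print(unescape_syslog(sample))
```

The unescaped text has the same layout as the `Message:` field of `coredumpctl info`, so old traces from the log host can be compared with fresh ones.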

Author
Member

> I don’t find anything in the pagure02 logs, …

Hmm, probably also because it doesn’t have `CoreDumpDirectory /tmp`.
Owner

@nphilipp Is this still something we need to resolve?

Member

We ran across this as part of backlog refinement during today's infra weekly and apparently it's still happening. Does anybody know what could be the cause? Have you had a chance to look deeper @nphilipp?


Hi @nphilipp, @patrikp and @zlopez. I looked into the stack traces provided and they point to a mod_wsgi subinterpreter teardown crash.

I've submitted a PR to force Pagure into the global application group, which is the recommended fix to stop these specific core dumps: infra/ansible#3171

Also a link to the documentation from which I got the solution:
https://modwsgi.readthedocs.io/en/master/user-guides/application-issues.html#python-simplified-gil-state-api
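
For readers unfamiliar with the term: in mod_wsgi, putting an application into the global application group is normally done with the `WSGIApplicationGroup` directive, which makes the application run in the main Python interpreter rather than in a sub-interpreter. The fragment below is only a sketch of that idea. The actual change is in the linked PR and may look different, and the process name and script path here are hypothetical placeholders.

```apache
# Hypothetical vhost fragment; names and paths are placeholders, not taken from the PR.
WSGIDaemonProcess example_app display-name=example_app
WSGIScriptAlias / /path/to/example_app.wsgi
WSGIProcessGroup example_app

# Run the application in the main (first) Python interpreter
# instead of a per-application sub-interpreter.
WSGIApplicationGroup %{GLOBAL}
```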

Owner

I merged the pr and deployed it.

I will keep an eye on it today and see how it does.

Thanks for the pr @victorkoycheff !


Always a pleasure 🫡

Let's see

Owner

I've seen 0 coredumps so far, so I think this is solved. Thanks again!

kevin closed this issue 2026-03-11 23:53:38 +00:00