ERROR on pagure - dist-git #13099

Closed
opened 2026-02-02 10:49:49 +00:00 by zlopez · 6 comments
Owner

Description of request

Dist-git recently started producing a large number of errors like these:

Process Details
---------------
host:     pkgs01.rdu3.fedoraproject.org
PID:      2952109
name:     celery
command:  /usr/bin/python3.6 /usr/bin/celery worker -A pagure.lib.tasks_services --loglevel=info -Q pagure_logcom

Message type:       ERROR
Location:           /usr/lib/python3.6/site-packages/celery/app/trace.py:256
Module:             trace
Function:           _log_error
Time:               2026-02-02 10:45:32,391


URL:    * Worker *
args:   
form:   -
user:   -


Message:
--------

Task %(name)s[%(id)s] %(description)s: %(exc)s


Callstack that lead to the logging statement
--------------------------------------------

Traceback (most recent call last):
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
psycopg2.DatabaseError: could not receive data from server: Connection timed out


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/celery/app/trace.py", line 648, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/pagure/lib/tasks_utils.py", line 36, in decorated_function
    return function(self, session, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/pagure/lib/tasks_services.py", line 202, in log_commit_send_notifications
    session, name, user=username, namespace=namespace
  File "/usr/lib/python3.6/site-packages/pagure/lib/query.py", line 2874, in _get_project
    return query.one()
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3275, in one
    ret = self.one_or_none()
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3244, in one_or_none
    ret = list(self)
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3317, in __iter__
    return self._execute_and_instances(context)
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3342, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 128, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DatabaseError: (psycopg2.DatabaseError) could not receive data from server: Connection timed out

[SQL: SELECT projects.id AS projects_id, projects.user_id AS projects_user_id, projects.namespace AS projects_namespace, projects.name AS projects_name, projects.description AS projects_description, projects.url AS projects_url, projects._settings AS projects__settings, projects.hook_token AS projects_hook_token, projects.avatar_email AS projects_avatar_email, projects.is_fork AS projects_is_fork, projects.read_only AS projects_read_only, projects.parent_id AS projects_parent_id, projects._priorities AS projects__priorities, projects.default_priority AS projects_default_priority, projects._milestones AS projects__milestones, projects._milestones_keys AS projects__milestones_keys, projects._quick_replies AS projects__quick_replies, projects._reports AS projects__reports, projects._notifications AS projects__notifications, projects._close_status AS projects__close_status, projects._block_users AS projects__block_users, projects.mirrored_from AS projects_mirrored_from, projects.mirrored_from_last_log AS projects_mirrored_from_last_log, projects.date_created AS projects_date_created, projects.date_modified AS projects_date_modified, projects.private AS projects_private, projects.repospanner_region AS projects_repospanner_region
FROM projects 
WHERE projects.name = %(name_1)s AND projects.namespace = %(namespace_1)s AND projects.is_fork = false]
[parameters: {'name_1': 'python-xeddsa', 'namespace_1': 'rpms'}]
(Background on this error at: http://sqlalche.me/e/4xp6)

I tried to check it out on Friday, but didn't find the root cause, and it stopped alerting during the investigation.

Author
Owner

I looked into it a little more, since the error e-mails it keeps sending are annoying, but I don't see any high load on `db01` that could cause this. It's also happening every few minutes.

Member

The query comes from `pagure.lib.query._get_project()`; it shouldn't be a big or long-running query. Maybe a limit on the number of connections is being reached here?
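
One way to test the connection-limit idea directly would be to compare the number of server processes with the configured maximum. This is a minimal sketch, assuming a DB-API cursor (for example one from psycopg2) that is already connected to the PostgreSQL server. The function names and the 0.9 threshold are hypothetical and not part of pagure.

```python
def connection_usage(cursor):
    """Return (in_use, limit) for the PostgreSQL server behind `cursor`.

    `cursor` is any DB-API cursor connected to the server.
    """
    # pg_stat_activity has one row per server process. On some versions this
    # includes background processes, so treat the count as an upper bound.
    cursor.execute("SELECT count(*) FROM pg_stat_activity")
    in_use = int(cursor.fetchone()[0])
    # SHOW returns the setting as text, so convert it explicitly.
    cursor.execute("SHOW max_connections")
    limit = int(cursor.fetchone()[0])
    return in_use, limit


def near_limit(in_use, limit, threshold=0.9):
    """True when usage is at or above `threshold` (a fraction) of the limit."""
    return in_use >= threshold * limit
```

If `near_limit(*connection_usage(cursor))` stays false while the timeouts keep occurring, the connection limit is unlikely to be the cause.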

Owner

There are also some other errors now going to admin (although these are the most common).

I don't see why it would time out... no connection limits are being hit, and there are no network problems that I can see.

Also, these all seem to come from the celery workers. Perhaps there's some limit there?

Author
Owner

There could also be some limit on tasks on the PostgreSQL side.

Owner

I'm not sure what it would be. It's never hitting max connections (it would log that, and I checked for it).
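
For reference, a check like that can be scripted. This is a minimal sketch, assuming the PostgreSQL server log is available as plain text. The function name is hypothetical, and the patterns are deliberately loose because the exact wording of these messages can vary between PostgreSQL versions.

```python
import re

# Messages PostgreSQL logs when it refuses a connection for lack of slots.
LIMIT_PATTERNS = re.compile(
    r"too many clients already|remaining connection slots are reserved"
)


def find_connection_limit_errors(lines):
    """Return the log lines that indicate the connection limit was reached."""
    return [line.rstrip("\n") for line in lines if LIMIT_PATTERNS.search(line)]
```

Passing an open log file to `find_connection_limit_errors()` and getting an empty list back would match the observation above that the limit is never reached. The location of the log file depends on the deployment.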

Member

I've lowered `DB_POOL_RECYCLE` in `pagure.cfg` (see commit `1e56a66`); it seems to have fixed these issues.
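
To illustrate what a pool recycle setting does in general, here is a toy sketch. It is not pagure's or SQLAlchemy's actual code: `RecyclingPool` is a hypothetical class, `sqlite3` stands in for PostgreSQL so the example runs anywhere, and the 300-second value is illustrative only (the value used in the commit is not stated here). The idea is that a connection older than the recycle age is replaced at checkout, so a long-lived worker never reuses a very old connection. One plausible reading, which this thread does not confirm, is that old idle connections were being dropped somewhere between the workers and the database.

```python
import sqlite3
import time


class RecyclingPool:
    """Toy single-connection pool that replaces connections older than `recycle` seconds."""

    def __init__(self, connect, recycle, clock=time.monotonic):
        self._connect = connect    # factory returning a new DB-API connection
        self._recycle = recycle    # maximum connection age in seconds
        self._clock = clock        # injectable clock, useful for testing
        self._conn = None
        self._created_at = None
        self.opened = 0            # total number of connections opened

    def checkout(self):
        now = self._clock()
        expired = (
            self._conn is not None
            and now - self._created_at > self._recycle
        )
        if expired:
            # Too old: close it instead of risking a query on a connection
            # that may no longer be usable.
            self._conn.close()
            self._conn = None
        if self._conn is None:
            self._conn = self._connect()
            self._created_at = now
            self.opened += 1
        return self._conn


# sqlite3 is only a stand-in for the real database driver.
pool = RecyclingPool(lambda: sqlite3.connect(":memory:"), recycle=300)
conn = pool.checkout()
print(conn.execute("SELECT 1").fetchone())
```

A lower recycle value trades a few extra reconnects for a smaller chance of running a query on a connection that has gone stale.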

Reference
infra/tickets#13099