Mailing list archive seems to not update its search index #13355
Reference
infra/tickets#13355
Description of request
Is the devel mailing list search not updating some index? "gcc snapshots in Koji and copr" is the exact title of a thread from 7 days ago, yet searching for it only finds a load of results from 3+ months ago: https://lists.fedoraproject.org/archives/search?q=gcc+snapshots+in+Koji+and+copr&page=1&mlist=devel%40lists.fedoraproject.org&sort=date-desc
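For anyone who wants to reproduce the query against another list or thread title, here is a minimal sketch that rebuilds the search URL above from its parts. The parameter names (`q`, `page`, `mlist`, `sort`) are taken from the URL in the report; the helper name `build_search_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Base path copied from the URL in the report.
SEARCH_BASE = "https://lists.fedoraproject.org/archives/search"


def build_search_url(query: str, mlist: str, page: int = 1, sort: str = "date-desc") -> str:
    """Build an archive search URL (hypothetical helper, not part of any tool)."""
    # urlencode encodes spaces as '+' and '@' as '%40', matching the reported URL.
    params = {"q": query, "page": page, "mlist": mlist, "sort": sort}
    return f"{SEARCH_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("gcc snapshots in Koji and copr", "devel@lists.fedoraproject.org"))
```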
The thread is present in the archive; it's just not searchable.
I checked the Mailman setup, and the index of the list should be updated every hour, and every month as well.
So I'm trying to generate the index for the devel mailing list manually, and there are around 333877 mails to be indexed. It's possible that the indexing library has an issue with a mailing list that big. I will wait for it to finish and see whether there is any error.
Running the indexing manually for the list fixed the index.
But looking at the indexer's log, I found this:
So I will try to find what is causing this and try to fix it.
After some digging I found out what is causing this. It's an issue in xapian_haystack: the indexing fails when there is an e-mail that is too long for it. And because the hourly indexing tries to index all new e-mails, it fails every time. I assume it started failing when it reached a mailing list containing a mail that triggers this issue during indexing.
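The failure mode described above, and the idea of skipping the offending mails, can be sketched roughly as follows. This is not the actual xapian_haystack patch and does not use its API: `DocumentTooLongError`, `index_one`, and the `max_length` limit are hypothetical stand-ins chosen only to show why one oversized e-mail blocks a whole run, and how catching the error per e-mail lets the rest get indexed.

```python
import logging
from typing import Iterable, List, Tuple

logger = logging.getLogger("indexer_sketch")


class DocumentTooLongError(Exception):
    """Hypothetical stand-in for the backend error raised on an oversized e-mail."""


def index_one(email_id: str, body: str, max_length: int = 1000) -> str:
    """Pretend to index one e-mail; the limit is an assumption, not the real one."""
    if len(body) > max_length:
        raise DocumentTooLongError(f"e-mail {email_id} is too long to index")
    return email_id


def index_all_fail_fast(emails: Iterable[Tuple[str, str]]) -> List[str]:
    """Unpatched behaviour: the first oversized e-mail aborts the whole run."""
    return [index_one(email_id, body) for email_id, body in emails]


def index_all_skip_bad(emails: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Workaround: log and skip e-mails that cannot be indexed, keep going."""
    indexed: List[str] = []
    skipped: List[str] = []
    for email_id, body in emails:
        try:
            indexed.append(index_one(email_id, body))
        except DocumentTooLongError as exc:
            logger.warning("Skipping e-mail: %s", exc)
            skipped.append(email_id)
    return indexed, skipped
```

With the fail-fast variant, every scheduled run hits the same e-mail and stops, so nothing newer ever reaches the index. The skipping variant trades completeness (the oversized e-mails stay unsearchable) for a run that finishes.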
I will introduce a patch to xapian_haystack that should just skip those mails during processing until this is resolved upstream.
The patch is now deployed on the machine and I'm manually re-indexing all the mailing lists again, so we have everything in place. I will close this once the re-indexing is finished and all the lists are fixed. In any case, the devel mailing list index is fixed.
The re-indexing is now finished. Closing this as done.