[Kinoite] Plasma Login issue with rawhide rebase #684
Reference
kde/tracker#684
As part of the KDE 6.6 test week - I rebased my Fedora Silverblue 43 device to Kinoite Rawhide.
The update process completed perfectly normally; however, when I rebooted the device, I was greeted with a blank screen/UEFI boot splash until I changed to tty2.
I can still start Plasma using startplasma-wayland and use it as normal, including locking my screen; however, logging out takes me back to tty2.
I then rebased to Kinoite 43, which worked as intended. This led me to believe it might be an issue with plasmalogin.
The logs that were generated are below, in case they are of any use.
Metadata Update from @siosm:
Unfortunately, it's a known issue that I haven't been able to figure out yet. It does not get triggered on fresh installs, so there is something that breaks on update.
I need help to debug this issue. Here is how I would proceed and how one may start looking at this:
in parallel:
Then take /etc from both of those systems and compare them.
This might be related to https://src.fedoraproject.org/rpms/plasma-setup/blob/rawhide/f/plasma-setup.spec#_109-114.
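For anyone who wants to try the /etc comparison, here is a rough sketch of how it could be done once both trees have been copied somewhere readable. The function name and the example paths are made up for illustration; this is not what was actually used to produce any of the diffs in this thread.

```python
import filecmp
from pathlib import Path


def diff_trees(left: str, right: str) -> dict[str, list[str]]:
    """Recursively compare two directory trees and collect the differences.

    Note: filecmp.dircmp compares files shallowly (type, size, mtime) first
    and only reads contents when those signatures differ.
    """
    result = {"only_left": [], "only_right": [], "changed": []}

    def walk(cmp: filecmp.dircmp, prefix: Path) -> None:
        result["only_left"] += [str(prefix / n) for n in cmp.left_only]
        result["only_right"] += [str(prefix / n) for n in cmp.right_only]
        result["changed"] += [str(prefix / n) for n in cmp.diff_files]
        # Descend into directories that exist on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(left, right), Path("."))
    return {key: sorted(value) for key, value in result.items()}


if __name__ == "__main__":
    # Hypothetical locations of the two copied /etc trees.
    report = diff_trees("etc-rebased", "etc-fresh")
    for kind, paths in report.items():
        for path in paths:
            print(f"{kind}: {path}")
```

Files such as shadow and gshadow are only readable by root, so the copies need to be taken as root for the comparison to be meaningful.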
Is there a mechanism to deal with transitions with rebases?
quick and dirty diff, will investigate later
https://gist.github.com/renner0e/d6aa10e4457c9eab56fbfdb2d472a022
Will this need a Beta Freeze Exception with the freeze tomorrow?
Yes, someone needs to file a BZ against plasma-login-manager and request an FE. We have no idea right now how to fix it.
I'll get it raised.
I created two VMs:
One installed from Rawhide Kinoite F44
One installed from the latest Kinoite F43.
I then rebased the F43 VM to Rawhide F44. It immediately showed the issue.
Upgraded system that is failing is having some issues with plasmalogin.service and user@967.service (uid 967 being "plasmalogin"):
The problem is almost assuredly this:
I.e., the plasmalogin and plasma-setup users, despite being defined in /lib/passwd, are not present in /etc/shadow.
On the working system they ARE present in /etc/shadow, presumably put there by plasma-setup. The problem is that on the rebased system plasma-setup never ran, so nothing created them, and PAM is failing.
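A quick way to check for this condition is to list every user that is visible in the passwd data but has no shadow line. The sketch below works on the file contents as strings; the helper names and the sample data are invented for illustration (only the plasmalogin name and uid 967 come from this thread).

```python
def names(db_text: str) -> set[str]:
    """Return the first colon-separated field of every non-empty, non-comment line."""
    found = set()
    for line in db_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            found.add(line.split(":", 1)[0])
    return found


def missing_shadow_entries(lib_passwd: str, etc_passwd: str, etc_shadow: str) -> list[str]:
    """Users defined in either passwd file that have no entry in shadow."""
    visible = names(lib_passwd) | names(etc_passwd)
    return sorted(visible - names(etc_shadow))


if __name__ == "__main__":
    # Made-up sample data in passwd/shadow format.
    lib_passwd = (
        "root:x:0:0::/root:/bin/bash\n"
        "plasmalogin:x:967:967::/var/lib/plasmalogin:/sbin/nologin\n"
    )
    etc_passwd = "alice:x:1000:1000::/home/alice:/bin/bash\n"
    etc_shadow = "root:!::0:99999:7:::\nalice:!::0:99999:7:::\n"
    print(missing_shadow_entries(lib_passwd, etc_passwd, etc_shadow))
```

On a real system the three strings would come from /lib/passwd, /etc/passwd and /etc/shadow (the last one needs root to read). The same check applies to /lib/group against /etc/gshadow.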
Beta Freeze Bug: https://bugzilla.redhat.com/show_bug.cgi?id=2440208
Beta Freeze Exception (Please vote): https://pagure.io/fedora-qa/blocker-review/issue/2037
Adding the missing entries to /etc/shadow and /etc/gshadow by hand resulted in plasma-setup running (is this correct on a rebase?) and then the login manager functioning as expected.
So it is in fact the missing /etc/shadow and /etc/gshadow entries causing the issue.
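For reference, this is roughly what such entries look like. The sketch only builds the lines and does not touch any file. The "!*" locked-password placeholder is an assumption on my part, not necessarily what was typed in the test above, and the function names are made up.

```python
def shadow_line(user: str, locked: str = "!*") -> str:
    """Build a shadow(5) line: name, password, then seven empty fields (9 in total)."""
    return ":".join([user, locked] + [""] * 7)


def gshadow_line(group: str, locked: str = "!*") -> str:
    """Build a gshadow(5) line: name, password, administrators, members (4 fields)."""
    return ":".join([group, locked, "", ""])


if __name__ == "__main__":
    for name in ("plasmalogin", "plasma-setup"):
        print(shadow_line(name))
        print(gshadow_line(name))
```

Anyone editing these files for real should go through vipw -s and vigr -s rather than appending blindly, and keep a backup of the originals.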
This may well be to do with the shadow-utils 4.19 update then. There were quite a lot of issues with that update, but one that may well be relevant here is that 4.19 no longer does SELinux context fixups when operating on a different root with -R. See e.g. https://bugzilla.redhat.com/show_bug.cgi?id=2435621#c10 for another case where this caused issues.
Thanks everyone for the investigation.
I can confirm as well that this fixes the issue on my test system.
Now I guess the question is: why do those entries not get created on the system for plasmalogin when they were created for all the other users?
There isn't unfortunately. But we can add scripts to run on boot if we figure out how to fix it.
We'll need one at least for the plasma-setup thing, since otherwise it won't get disabled properly.
Looks like the entry for plasma-setup is not created either but this one does not matter much. This points to an issue in systemd-sysusers or shadow-utils indeed.
Indeed, I don't have /etc/plasma-setup-done either, but that looks like another issue. I don't know how we would create this one. Maybe we add it to all existing F42 & F43 Kinoite systems via an update.
systemd-sysusers does nothing, since from the point of view of systemd-sysusers the user already exists, because NSS says so (it's in /lib/passwd).
Also note this is a general issue, there's also a pcssd (something like that) user that has the same problem, which is probably a bug waiting to happen.
This is a general "the image added a user that didn't exist before" issue; the plasmalogin thing is just a symptom.
Is there a reason that the config only uses NSS extrafiles for passwd and not shadow?
Having a /lib/shadow and a /lib/gshadow file seems like it would solve the issue completely.
Support for files beyond passwd & group was not in the initial version of nss-altfiles. It only got enabled in https://src.fedoraproject.org/rpms/nss-altfiles/pull-request/5. We could switch it on for shadow & gshadow but that would be a significant change this late in the cycle. This would need to happen in authselect and maybe for all bootc systems in general as well.
Unfortunately the real fix would be to move away from nss-altfiles completely but that requires work and testing as well:
Looks similar to https://github.com/bootc-dev/bootc/issues/1179
At this point I see two options within reach of the beta timeline:
Option 1 would look like:
/usr/libexec/fedora-kinoite-plasmalogin-workaround
/usr/lib/systemd/system/fedora-kinoite-plasmalogin-workaround.service
Edit: Updated, as I misclicked on the comment button too soon.
I've made https://pagure.io/workstation-ostree-config/pull-request/734 and this can be tested with the F44 image at https://quay.io/repository/fedora-atomic-desktops-staging/kinoite?tab=tags
From an F43 installation, rebasing to the F44 image with the workaround:
Ugly but looks like it works.
I tested the workaround in a virtual machine created using bcvk. I can log in to the F44 deployment via a graphical user interface using the Virtual Machine Manager Flatpak from Flathub. The following output is from an SSH session into the VM for easier copy/paste.
Great, thanks for testing.
F44 beta is GO, thus I'm pushing that now.
Closing, as this is worked around in F44 in time for the Beta. I'll create a new issue to track the real fix, which is moving away from nss-altfiles: kde/tickets#684 (comment).
Proper fix for this issue is tracked in atomic-desktops/tracker#108