I joined a new client’s box one morning to set up SSO via SSSD. By the afternoon, no one — including me, including root — could SSH in. The PAM stack had been rearranged subtly in /etc/pam.d/common-auth, and a five-line file that nobody normally reads had locked everyone out of the server. Console access via the cloud provider’s emergency tools saved us; the lesson stuck.
This is the PAM ordering rule I now follow on every Ubuntu 22.04 server I touch: what each module does, why the default order works, and which two changes will silently break SSH for everyone.
The default common-auth on Ubuntu 22
# /etc/pam.d/common-auth
auth [success=2 default=ignore] pam_unix.so nullok
auth [success=1 default=ignore] pam_sss.so use_first_pass
auth requisite pam_deny.so
auth required pam_permit.so
auth optional pam_cap.so
Reading this top to bottom: try local Unix passwords first. On success, skip the next 2 modules and land on pam_permit, which grants access. If pam_unix doesn’t succeed, fall through to SSSD (LDAP / AD), which reuses the password already typed because of use_first_pass. If SSSD says yes, skip 1 module and land on pam_permit again. If neither matched, pam_deny runs and, because it is requisite, the stack stops there with a failure. pam_cap.so is decorative: it sets capabilities for the session and, being optional, never fails auth.
The control values [success=2 default=ignore] are the entire ball game. They’re a tiny state machine, and rearranging modules without rearranging the success-jump counts is how you break the stack.
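To make the jump arithmetic concrete, here is a minimal sketch that prints where each success=N lands in a stack file. The helper name jump_targets is made up for this post, it only understands the simple bracketed success=N form, and it ignores @include lines, so treat it as a reading aid rather than a validator.

```shell
#!/usr/bin/env bash
# jump_targets: hypothetical helper, not part of PAM or pam-auth-update.
# For each auth line with success=N, print the module the jump lands on.
jump_targets() {
  awk '
    $1 == "auth" {
      n++
      # The module is the first field ending in .so
      for (i = 2; i <= NF; i++) if ($i ~ /\.so$/) { mod[n] = $i; break }
      # Remember N from success=N when the control value has one
      if (match($0, /success=[0-9]+/)) skip[n] = substr($0, RSTART + 8, RLENGTH - 8)
    }
    END {
      for (i = 1; i <= n; i++) {
        if (!(i in skip)) continue
        t = i + skip[i] + 1   # skipping N modules lands on line i + N + 1
        printf "%s success=%s -> %s\n", mod[i], skip[i], (t <= n ? mod[t] : "END OF STACK")
      }
    }
  ' "$1"
}

# Demo on a scratch copy of the default stack shown above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
auth [success=2 default=ignore] pam_unix.so nullok
auth [success=1 default=ignore] pam_sss.so use_first_pass
auth requisite pam_deny.so
auth required pam_permit.so
auth optional pam_cap.so
EOF
jump_targets "$demo"
# pam_unix.so success=2 -> pam_permit.so
# pam_sss.so success=1 -> pam_permit.so
rm -f "$demo"
```

Both jumps should name pam_permit.so. If you reorder lines and any jump names a different module, the counts need recomputing.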
The order that broke SSH for me
The change was reasonable on its face: “let’s prefer SSO before falling back to local accounts.” The intuitive flip:
# BROKEN — do not copy
auth [success=2 default=ignore] pam_sss.so
auth [success=1 default=ignore] pam_unix.so use_first_pass
auth requisite pam_deny.so
auth required pam_permit.so
Two things were wrong:
- The success counts didn’t get updated. success=N means “skip the next N modules”, so the original success=2 was relative to the original line position. After a reorder, a count carried along unchanged points at the wrong target line.
- use_first_pass on pam_unix. When pam_sss runs first and prompts the user for a password, that password is stored in the PAM_AUTHTOK item. When pam_unix runs second, use_first_pass says “use the password you already have” and never prompts. If pam_sss didn’t store one, pam_unix has nothing to check and silently fails.
Result: SSSD tried first, didn’t find the user (because we hadn’t joined the domain yet), pam_unix tried with no password, failed, pam_deny ran. auth.log showed nothing useful — just authentication failure for every user. Console-only access until I rolled it back.
The other order that’s also broken
Putting pam_systemd.so in the auth stack at all:
# Wrong stack — pam_systemd belongs in session, not auth
auth optional pam_systemd.so
auth [success=2 default=ignore] pam_unix.so nullok
...
pam_systemd.so is for the session phase: registering the new login with logind, creating the cgroup, setting up XDG_RUNTIME_DIR. Its man page lists session as the only module type it provides. In the auth phase, it does nothing useful and (depending on systemd version) can hang for tens of seconds waiting for a logind socket that’s not ready. That is long enough for logins to time out: sshd’s LoginGraceTime is 120 seconds by default, and hardened configs and impatient clients give up sooner. The file it belongs in is /etc/pam.d/common-session, not common-auth.
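A quick way to check for this mistake is to grep the auth lines for the module. This is a small sketch: misplaced_in_auth is a name I made up, and it only looks at the files you pass it, so it will not follow @include lines.

```shell
#!/usr/bin/env bash
# misplaced_in_auth: hypothetical helper. Prints (with line numbers) any
# auth line that loads pam_systemd.so; exits non-zero when there are none.
misplaced_in_auth() {
  grep -nE '^[[:space:]]*auth[[:space:]].*pam_systemd\.so' "$@"
}

# Only run against the real file when it exists and is readable.
if [ -r /etc/pam.d/common-auth ]; then
  misplaced_in_auth /etc/pam.d/common-auth \
    || echo "no pam_systemd.so in the auth stack"
fi
```

Any output line starting with a number is a line to move to common-session or delete.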
The correct way to add SSSD
Use pam-auth-update. It’s the Debian/Ubuntu tool that knows how to compose the success-jump counts correctly:
sudo apt install sssd libpam-sss libnss-sss
sudo pam-auth-update --enable sss
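Run with no arguments, pam-auth-update shows a TUI that lets you pick which PAM profiles are active; with --enable it switches the named profile on without prompting. Either way it regenerates common-auth, common-account, common-session, and common-password with correct success/jump arithmetic. Don’t hand-edit those files: on a later run pam-auth-update either overwrites your edits or, once it detects them, refuses to update the file until you let it override them.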
If you need a fixed order (SSO before local, or local before SSO), the Priority: field in pam-auth-update’s profile /usr/share/pam-configs/sss controls it. Edit that field, re-run pam-auth-update, and you get a correct stack with your desired ordering.
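Before editing a priority, it helps to see how the installed profiles currently rank. The sketch below lists each profile's Priority: value, highest first. As far as I know, higher-priority profiles are placed earlier in the generated stack, but check that against the result on your own box. list_priorities is a made-up helper name; the default directory is the one named above.

```shell
#!/usr/bin/env bash
# list_priorities: hypothetical helper. Prints "<priority> <profile>" for
# every profile file in the given directory, highest priority first.
list_priorities() {
  local dir=${1:-/usr/share/pam-configs}
  [ -d "$dir" ] || return 0
  awk '/^Priority:/ { n = split(FILENAME, p, "/"); print $2, p[n] }' \
    "$dir"/* 2>/dev/null | sort -rn
}

list_priorities
```

Run it before and after changing a Priority: value, then regenerate the stack and compare the module order in common-auth with the listing.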
The “always test from a second session” rule
Whenever you touch any file in /etc/pam.d, follow this protocol:
- Keep your existing SSH session open.
- Open a second SSH session from a different terminal, before making the change.
- Make the PAM change in the first session.
- From a third, brand-new SSH session, log in to verify auth still works.
- If the third session fails, the second session is your safety line — undo the change there.
If you’re remote and only have one connection: don’t touch PAM. Schedule a maintenance window, take a snapshot, get console access ready, then do the change.
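The safety-line session is much more useful if there is a known-good copy to roll back to. Here is a minimal sketch of a backup and restore pair. The function names and the /root default are my own choices, and the paths are parameters so you can rehearse on a scratch directory first.

```shell
#!/usr/bin/env bash
# backup_pam / restore_pam: hypothetical helpers, run from a root shell.

# Copy the PAM directory to a timestamped backup and print the backup path.
backup_pam() {
  local src=${1:-/etc/pam.d} dest
  dest="${2:-/root}/pam.d.$(date +%Y%m%d-%H%M%S)"
  cp -a "$src" "$dest" && printf '%s\n' "$dest"
}

# Copy every file from the backup over the live directory.
# Files created after the backup are left in place, not removed.
restore_pam() {
  local backup=$1 target=${2:-/etc/pam.d}
  cp -a "$backup"/. "$target"/
}

# Typical use in the first session, before editing anything:
#   saved=$(backup_pam)
# and in the safety-line session if the fresh login fails:
#   restore_pam "$saved"
```

Write the backup path down, or paste it into the second session, so the rollback is one command when you need it.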
Reading auth.log when it goes wrong
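# Watch live during auth attempts
sudo tail -f /var/log/auth.log /var/log/syslog
# Test the PAM stack without touching SSH (sudo apt install pamtester)
# Type the real password at the prompt; piping in an empty string only tests an empty password
sudo pamtester -v sshd youruser authenticate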
pamtester is the underrated tool here — it lets you simulate a PAM auth flow against any service (“sshd”, “su”, “login”) with verbose tracing. If pamtester fails, your SSH login was going to fail; debug from the same trace before you lose your shell.
PAM is one of those subsystems where 95% of admins never look at it, and the 5% who do usually break it in the same two ways. Use pam-auth-update, never hand-roll the success counts, keep two SSH sessions open, and test with pamtester before you close either of them.
Cover photo: Brett Sayles on Pexels.
