You've identified the Mongo volume that's causing the auth issue.
What this means
local mern-ludo_mongo_data
This volume still exists, so Mongo is:
reusing old, broken initialization
not recreating the admin user
continuing to fail authentication
We now need to delete this volume so Mongo can initialize cleanly.
✅ Do this EXACTLY (safe + correct)
🔴 Step 1: Stop containers
docker compose down
🔴 Step 2: Remove the Mongo volume
docker volume rm mern-ludo_mongo_data
If Docker complains it's in use, run:
docker compose down -v
(that removes volumes automatically)
✅ Step 3: Confirm it's gone
docker volume ls | grep ludo
👉 This should return nothing.
✅ Step 4: Start ONLY Mongo first (important)
docker compose up -d mongo
Wait 10 to 15 seconds.
Check logs:
docker logs ludo-mongo --tail=30
You should NOT see:
bad file
Unable to acquire security key
crash/restart loops
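The log check in Step 4 can be scripted. This is a minimal sketch, assuming the container name `ludo-mongo` used above; `scan_mongo_logs` is a hypothetical helper name. It only matches the two log signatures listed (restart loops show up in `docker ps`, not in the log text).

```shell
# Reads log text on stdin and prints any line matching the failure
# signatures listed above. Exit status: 1 if a match was found, 0 if clean.
scan_mongo_logs() {
  if grep -E -i 'bad file|Unable to acquire security key'; then
    return 1
  fi
  return 0
}

# Assumed usage, with the container name from this guide:
#   docker logs ludo-mongo --tail=30 2>&1 | scan_mongo_logs && echo "logs look clean"
```

Reading from stdin keeps the helper independent of Docker, so it can also be run against a saved log file.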
✅ Step 5: Test authentication (this MUST work)
docker exec -it ludo-mongo mongosh -u admin -p adminpassword --authenticationDatabase admin
If you get a > prompt → 🎉 auth is fixed
✅ Step 6: Initialize replica set (ONCE)
Inside mongosh:
rs.initiate({
_id: "rs0",
members: [{ _id: 0, host: "mongo:27017" }]
})
Then:
rs.status().members[0].stateStr
Must be:
"PRIMARY"
✅ Step 7: Start the app
docker compose up -d app
(or just)
docker compose up -d
Then:
docker restart mern-ludo-app
🎯 Expected final result
Mongo
Auth works
Replica set PRIMARY
No errors in logs
Backend
MongoDB connected
Server listening on port 18081
Browser
App loads
No crashes
No noscript / auth errors
🧾 One-line status
The Mongo volume still existed; deleting it allows Mongo to create the admin user correctly.
Once Step 5 succeeds, you are fully unblocked.
Run Step 2 now and tell me when mongosh login works 👍
You're very close, and the remaining issue is not random.
What the two messages really mean (important)
1. These messages are NOT the main problem ❌
Collection [local.oplog.rs] not found
ReadConcernMajorityNotAvailableYet
These appear when:
replica set is not initiated yet
PRIMARY is not elected yet
They are expected noise before rs.initiate() succeeds.
👉 They go away on their own once the replica set is initialized.
2. THIS is the real blocker 🔴
MongoServerError: Authentication failed
In this setup it almost certainly means:
🔐 MongoDB is running with authentication enabled,
❌ but the admin user does not exist.
This is why:
backend can't connect
mongosh -u admin fails
oplog can't be created
replica set can't complete
Why the admin user does NOT exist (root cause)
The official mongo Docker image creates the root user (from MONGO_INITDB_ROOT_USERNAME / MONGO_INITDB_ROOT_PASSWORD) only once, on first successful startup, when:
/data/db is empty
Mongo reaches a stable running state
Earlier, Mongo:
crashed due to keyFile issues
restarted several times
skipped user creation
but auth stayed enabled
Result:
Mongo requires auth, but no users exist → permanent auth failure
This is a classic Mongo + replica set + keyFile edge case.
✅ The ONLY correct fix (clean & guaranteed)
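We must let Mongo start once without auth, create the user manually, then re-enable auth.
This follows the same pattern MongoDB documents for enabling access control: start without it, create the administrator, then restart with it enabled.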
🔥 DO THIS STEP-BY-STEP (no skipping)
STEP 1: Stop everything
docker compose down
STEP 2: Temporarily DISABLE auth (Mongo only)
Edit docker-compose.yml mongo service:
🔴 TEMPORARY CHANGE
Comment out these lines for now:

    # environment:
    #   MONGO_INITDB_ROOT_USERNAME: admin
    #   MONGO_INITDB_ROOT_PASSWORD: adminpassword

And change the command to NO keyFile, NO auth:

    command:
      - mongod
      - --replSet
      - rs0
      - --bind_ip_all

(Leave volumes as-is.)
STEP 3: Start Mongo only
docker compose up -d mongo
Wait ~10 seconds.
STEP 4: Connect WITHOUT auth (this must work)
docker exec -it ludo-mongo mongosh
If you get a prompt → good.
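STEP 5: Create admin user MANUALLY
Inside mongosh (if this fails with NotWritablePrimary, run STEP 6 first and then come back; with --replSet, writes need a PRIMARY):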
use admin
db.createUser({
user: "admin",
pwd: "adminpassword",
roles: [ { role: "root", db: "admin" } ]
})
You must see:
{ ok: 1 }
STEP 6: Initialize replica set (NOW it will work)
rs.initiate({
_id: "rs0",
members: [{ _id: 0, host: "mongo:27017" }]
})
Wait ~5 to 10 seconds, then:
rs.status().members[0].stateStr
Must be:
"PRIMARY"
Exit mongosh:
exit
STEP 7: Re-enable auth + keyFile (FINAL config)
Restore your correct mongo config:

    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: adminpassword
    command:
      - bash
      - -c
      - |
        cp /tmp/mongo-keyfile /data/db/keyfile &&
        chown mongodb:mongodb /data/db/keyfile &&
        chmod 400 /data/db/keyfile &&
        exec mongod --replSet rs0 --keyFile /data/db/keyfile --bind_ip_all

STEP 8: Restart clean
docker compose down
docker compose up -d
STEP 9: Test auth (this WILL work now)
docker exec -it ludo-mongo mongosh -u admin -p adminpassword --authenticationDatabase admin
STEP 10: Start / restart backend
docker restart mern-ludo-app
✅ What you should see now
Mongo
Auth works
PRIMARY elected
local.oplog.rs exists
No auth errors
Backend
MongoDB connected
Server listening on port 18081
App
Loads normally
No crashes
Change Streams work
🧠 One-line final diagnosis
MongoDB authentication was enabled before any user existed; manual user creation was required to recover.
This is the last MongoDB trap — once done, you will not hit this again.
MongoServerError[NotWritablePrimary]: not primary
Perfect — that error is expected at this exact moment, and it tells us precisely what to do next 👍
You didn't break anything.
What this error means (very clear)
MongoServerError[NotWritablePrimary]: not primary
This means:
MongoDB is running as a replica set member,
but no PRIMARY has been elected yet.
Until a node becomes PRIMARY:
❌ you cannot create users
❌ you cannot write data
❌ db.createUser() will fail
So this is not an auth issue anymore.
✅ Correct order (important)
When running Mongo as a replica set, the order must be:
Start Mongo
Initiate replica set
Wait until PRIMARY
Create users
Enable auth (already planned)
Right now, you're at step 2½.
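To make the ordering concrete, here is a rough dry-run sketch of the sequence. It assumes the container and service names used in this guide; `run` and `bootstrap_mongo` are hypothetical helper names, and by default the script only prints the commands rather than executing them.

```shell
# Prints each command instead of running it unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

bootstrap_mongo() {
  # 1. Start Mongo only (auth and keyFile still disabled in docker-compose.yml)
  run docker compose up -d mongo
  run sleep 10
  # 2. Initiate the replica set
  run docker exec ludo-mongo mongosh --quiet --eval \
    'rs.initiate({ _id: "rs0", members: [{ _id: 0, host: "mongo:27017" }] })'
  # 3. Wait for PRIMARY (a fixed pause here; polling rs.status() is more robust)
  run sleep 10
  # 4. Create the admin user, which needs a writable PRIMARY
  run docker exec ludo-mongo mongosh --quiet --eval \
    'db.getSiblingDB("admin").createUser({ user: "admin", pwd: "adminpassword", roles: [{ role: "root", db: "admin" }] })'
  # 5. Enabling auth is a manual edit of docker-compose.yml, then a restart
  run docker compose down
}
```

Calling `bootstrap_mongo` prints the six commands in order; the point is only that `rs.initiate` comes before `createUser`.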
✅ DO THIS NOW (inside mongosh)
You are already connected with:
docker exec -it ludo-mongo mongosh
At the > prompt, run THIS FIRST 👇
🔥 Step 1: Initiate replica set
rs.initiate({
_id: "rs0",
members: [
{ _id: 0, host: "mongo:27017" }
]
})
Expected output
{ ok: 1 }
🔥 Step 2: Wait a few seconds, then verify PRIMARY
rs.status().members[0].stateStr
You must see:
"PRIMARY"
If it still says STARTUP or SECONDARY, wait 5 seconds and run again.
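Instead of re-running the check by hand, the wait can be scripted. This is a sketch only: `wait_until` is a hypothetical helper, and the usage line assumes that `mongosh --quiet --eval` prints the state string without quotes and that auth is still disabled at this point.

```shell
# wait_until EXPECTED RETRIES DELAY CMD...
# Runs CMD until its output equals EXPECTED, sleeping DELAY seconds
# between attempts. Returns 0 on a match, 1 after RETRIES failed attempts.
wait_until() {
  expected="$1"; retries="$2"; delay="$3"
  shift 3
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    state="$("$@" 2>/dev/null || true)"
    if [ "$state" = "$expected" ]; then
      echo "reached $expected after $attempt attempt(s)"
      return 0
    fi
    echo "attempt $attempt: got '${state:-no output}', waiting ${delay}s"
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}

# Assumed usage (12 attempts, 5 seconds apart):
#   wait_until PRIMARY 12 5 docker exec ludo-mongo mongosh --quiet \
#     --eval 'rs.status().members[0].stateStr'
```

The probe command is passed in as arguments, so the same loop works once auth is re-enabled by adding the `-u`, `-p` and `--authenticationDatabase` flags to it.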
✅ ONLY AFTER PRIMARY → create admin user
Now run:
use admin
db.createUser({
user: "admin",
pwd: "adminpassword",
roles: [ { role: "root", db: "admin" } ]
})
Expected:
{ ok: 1 }
🧠 Why this happened (important lesson)
You started Mongo with:
mongod --replSet rs0
That automatically puts Mongo into replica set mode, but:
replica set is inactive
no PRIMARY exists yet
Mongo rejects all writes until a PRIMARY is elected.
This is correct MongoDB behavior.
✅ After user creation (next steps preview)
Once { ok: 1 } appears:
Exit mongosh:
exit
Re-enable auth + keyFile in docker-compose (as discussed)
Restart containers
Verify:
docker exec -it ludo-mongo mongosh -u admin -p adminpassword --authenticationDatabase admin
🎯 One-line status update
MongoDB is in replica-set mode but not PRIMARY yet — initialize the replica set first.
👉 Action for you now
Run rs.initiate(...), wait for PRIMARY, then retry db.createUser.
Tell me what rs.status().members[0].stateStr prints, and we'll finish this cleanly 💪
```yaml
version: "3.9"
services:
  mongo:
    image: mongo:7.0
    container_name: ludo-mongo
    restart: unless-stopped
    networks:
      - ludo-net
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: adminpassword
    volumes:
      - ludo-mongo-data:/data/db
      - ./mongo-keyfile:/tmp/mongo-keyfile:ro
    command:
      - bash
      - -c
      - |
        cp /tmp/mongo-keyfile /data/db/keyfile &&
        chown mongodb:mongodb /data/db/keyfile &&
        chmod 400 /data/db/keyfile &&
        exec mongod --replSet rs0 --keyFile /data/db/keyfile --bind_ip_all
    ports:
      - "27017:27017"
  app:
    build: .
    container_name: mern-ludo-app
    restart: unless-stopped
    networks:
      - ludo-net
    env_file:
      - .env
    ports:
      - "18081:8080"
    depends_on:
      - mongo
networks:
  ludo-net:
volumes:
  ludo-mongo-data:
```
Step 2: Recreate keyfile cleanly (one last time)
cd ~/mern-ludo
rm -f mongo-keyfile
openssl rand -base64 756 > mongo-keyfile
chmod 400 mongo-keyfile
Verify:
wc -c mongo-keyfile
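For reference, 756 random bytes encode to 1008 base64 characters, so with openssl's line breaks `wc -c` should report 1024. A slightly stricter check is sketched below; `check_keyfile` is a hypothetical helper, and it assumes MongoDB's documented keyfile limit of 6 to 1024 characters with whitespace ignored.

```shell
# check_keyfile PATH
# Verifies two properties of a MongoDB keyfile:
#   1. 6 to 1024 characters once whitespace is stripped
#   2. permissions are exactly 400 (owner read only)
check_keyfile() {
  file="$1"
  [ -f "$file" ] || { echo "missing: $file"; return 1; }
  chars=$(( $(tr -d '[:space:]' < "$file" | wc -c) ))
  if [ "$chars" -lt 6 ] || [ "$chars" -gt 1024 ]; then
    echo "bad length: $chars characters"
    return 1
  fi
  perms="$(ls -l "$file" | cut -c1-10)"
  if [ "$perms" != "-r--------" ]; then
    echo "bad permissions: $perms"
    return 1
  fi
  echo "ok: $chars characters, $perms"
}

# Assumed usage, from the project directory:
#   check_keyfile mongo-keyfile
```

This checks the host copy only; ownership inside the container is handled by the cp + chown step in the compose command.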
Step 3: Full reset (MANDATORY)
docker compose down -v
docker compose up -d --build
Wait ~10 seconds.
Step 4: Confirm Mongo is RUNNING (not restarting)
docker ps
You should NOT see Restarting.
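If you prefer not to eyeball the `docker ps` table, the same check can be sketched as a filter. `find_restarting` is a hypothetical helper; it assumes the status text of a crash-looping container starts with "Restarting", which is how Docker normally reports it.

```shell
# Reads "NAME|STATUS" lines on stdin and prints the names of containers
# whose status starts with "Restarting". Exit status: 1 if any were found.
find_restarting() {
  awk -F '|' '
    BEGIN { found = 0 }
    $2 ~ /^Restarting/ { print $1; found = 1 }
    END { exit found }
  '
}

# Assumed usage:
#   docker ps -a --format '{{.Names}}|{{.Status}}' | find_restarting
```

An empty result with exit status 0 means no container is stuck in a restart loop.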
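Step 5: Now exec WILL work
docker exec -it ludo-mongo mongosh -u admin -p adminpassword --authenticationDatabase admin
If you get a shell → 🎉 Mongo is fixed
If authentication fails here instead, the admin user was never created. As far as I can tell, the official image's entrypoint only applies MONGO_INITDB_ROOT_* when the container command is mongod itself, so with the bash -c wrapper above you may need to create the user manually as in the earlier steps.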
Step 6: Initialize replica set (ONCE)
rs.initiate({
_id: "rs0",
members: [{ _id: 0, host: "mongo:27017" }]
})
Check:
rs.status().members[0].stateStr
// PRIMARY
Step 7: Restart app
docker restart mern-ludo-app
Change Streams need a replica set with a PRIMARY, so your Change Stream error should now be gone.
🧠 Why ALL previous attempts failed

| Attempt | Why it failed |
| --- | --- |
| Bind mount keyfile | Ownership mismatch |
| chmod 400 only | Not enough |
| DietPi ARM | Stricter FS behavior |
| Mongo 7 | Enforces keyfile rules hard |

This copy + chown pattern solves all of them.
✅ One-line final diagnosis
MongoDB rejected the keyfile because bind-mounted files keep host ownership, and Mongo runs as a non-root user.