diff --git a/DEBUGGING_GITOPS_STATUS.md b/DEBUGGING_GITOPS_STATUS.md
new file mode 100644
index 0000000..3492327
--- /dev/null
+++ b/DEBUGGING_GITOPS_STATUS.md
@@ -0,0 +1,187 @@
+# Debugging GitOps Status Issues
+
+This guide helps troubleshoot issues with the GitOps status reporting system.
+
+## Common Issue: Status shows OUT_OF_SYNC after deployment
+
+### Symptoms
+- You pushed changes to the repo
+- The deploy step succeeded
+- But `update-gitops-status` shows OUT_OF_SYNC
+- Changed files are not displayed
+
+### Root Causes
+
+#### 1. **Deployment didn't actually succeed**
+The deploy step might have failed silently or the configuration wasn't applied correctly.
+
+**How to check:**
+- Look at the deploy step logs in Woodpecker
+- SSH to the server and verify files match Git:
+  ```bash
+  diff /etc/rsyslog.conf files/rsyslog.conf
+  diff -r /etc/rsyslog.d/ files/rsyslog.d/
+  ```
+
+#### 2. **Parsing issue with DRIFTED_FILES output**
+The script might not be correctly extracting the file list from Ansible's output.
+
+**How to debug:**
+Run the status update script locally with debug mode:
+```bash
+export KEEP_PLAYBOOK_LOG=true
+./update-gitops-status.sh
+```
+
+This will save the playbook output to `drift-check-output.log`. Check:
+- Does the log contain `DRIFTED_FILES=` line?
+- What does the line look like exactly?
+- Are there ANSI color codes interfering?
+
+Look for these debug lines in the output:
+```
+DEBUG: Searching for DRIFTED_FILES in playbook output...
+DEBUG: Found DRIFTED_FILES pattern
+DEBUG: Raw line: ...
+DEBUG: Extracted value: '...'
+```
+
+#### 3. **Too many open files error**
+If you see "failed to create fsnotify watcher: too many open files":
+
+**Fixed in latest version:**
+- `.woodpecker.yml` now sets `ANSIBLE_CALLBACKS_ENABLED=""` and `ANSIBLE_GATHERING=explicit`
+- `update-gitops-status.sh` uses `ANSIBLE_CALLBACKS_ENABLED=""` when running playbooks
+- These settings prevent Ansible from exhausting inotify watches
+
+**If issue persists:**
+- The container might need privileged mode to adjust kernel parameters
+- Or reduce Ansible parallelism in inventory settings
+
+#### 4. **Ansible output format changed**
+If Ansible version changed, the debug output format might be different.
+
+**How to fix:**
+Check `drift-check-output.log` and adjust the parsing in `update-gitops-status.sh`:
+```bash
+# Current parsing (line ~110 in update-gitops-status.sh):
+DRIFTED_FILES_STR=$(echo "$DRIFTED_FILES_STR" | sed 's/.*DRIFTED_FILES=//' | sed 's/\x1b\[[0-9;]*m//g' | sed 's/".*$//' | xargs)
+```
+
+You might need to adjust the `sed` commands based on the actual format.
+
+## Testing the Fix
+
+### 1. Test locally
+```bash
+# Set up SSH key
+export SSH_PRIVATE_KEY="$(cat ~/.ssh/id_rsa)"
+
+# Run the script with debug output
+export KEEP_PLAYBOOK_LOG=true
+./update-gitops-status.sh
+
+# Check the log
+cat drift-check-output.log | grep -A 2 "DRIFTED_FILES="
+```
+
+### 2. Test in Woodpecker
+Push a small change and monitor the `update-gitops-status` step:
+```bash
+# Make a small comment change
+echo "# Test change $(date)" >> files/rsyslog.conf
+
+# Commit and push
+git add files/rsyslog.conf
+git commit -m "test: verify gitops status detection"
+git push
+
+# Watch the pipeline in Woodpecker UI
+# The update-gitops-status step should:
+# 1. Run deploy (apply.yml)
+# 2. Run drift-check immediately after
+# 3. Show SYNCED (because deploy just ran)
+# 4. Show no drifted files
+```
+
+### 3. Test drift detection (manual change on server)
+```bash
+# SSH to the server
+ssh rsyslog-lab
+
+# Make a manual change
+echo "# Manual change" >> /etc/rsyslog.conf
+
+# Wait for the cron job to run (every 2 minutes)
+# Or manually trigger it in Woodpecker
+
+# The status should now show:
+# - Status: OUT_OF_SYNC
+# - Files: rsyslog.conf
+```
+
+## Expected Behavior
+
+### After successful deployment (push to master)
+```
+Step 2/4: Analyzing drift detection results...
+  ✓ Status: SYNCED - server configuration matches Git
+  Total drift count: 0
+
+Step 3/4: Building JSON payload...
+  Generated JSON:
+  {
+    "repo": "rsyslog",
+    "server": "rsyslog-lab",
+    "sync_status": "SYNCED",
+    "drift_count": 0,
+    "files": [],
+    "last_check": "2026-04-22T14:30:00Z"
+  }
+```
+
+### When drift is detected (cron job or manual server change)
+```
+Step 2/4: Analyzing drift detection results...
+  ✗ Status: OUT OF SYNC - configuration drift detected
+    - Drift detected in: rsyslog.conf
+  Total drift count: 1
+
+Step 3/4: Building JSON payload...
+  Generated JSON:
+  {
+    "repo": "rsyslog",
+    "server": "rsyslog-lab",
+    "sync_status": "OUT_OF_SYNC",
+    "drift_count": 1,
+    "files": [
+      {"name": "rsyslog.conf"}
+    ],
+    "last_check": "2026-04-22T14:32:00Z"
+  }
+```
+
+## Quick Reference
+
+### Enable debug mode
+```bash
+export KEEP_PLAYBOOK_LOG=true
+./update-gitops-status.sh
+cat drift-check-output.log
+```
+
+### Manually run drift-check
+```bash
+ansible-playbook -i ansible/inventory/hosts.yml ansible/playbooks/drift-check.yml
+```
+
+### Manually run deployment
+```bash
+ansible-playbook -i ansible/inventory/hosts.yml ansible/playbooks/apply.yml
+```
+
+### Check server state
+```bash
+ssh rsyslog-lab "md5sum /etc/rsyslog.conf /etc/rsyslog.d/*.conf"
+md5sum files/rsyslog.conf files/rsyslog.d/*.conf
+```
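
The `DRIFTED_FILES` parsing described under root causes 2 and 4 can be tried in isolation against a saved `drift-check-output.log`. The following is a minimal sketch, not the code in `update-gitops-status.sh`: the function name `extract_drifted_files` is hypothetical, and it assumes the playbook prints a line of the form `"msg": "DRIFTED_FILES=<space-separated names>"`, which should be checked against the real log. It also swaps the guide's `\x1b` escape (a GNU `sed` extension) for a literal escape byte, and swaps the final `xargs` trim for a `sed` trim so that quote characters in the value are not reinterpreted.

```bash
#!/usr/bin/env bash
# Sketch only: extract the DRIFTED_FILES value from saved playbook output.
# ASSUMPTION: the marker line looks like
#   "msg": "DRIFTED_FILES=rsyslog.conf"
# Verify the real format in drift-check-output.log before relying on this.

extract_drifted_files() {   # hypothetical helper name
  local log_file="$1"
  local esc
  esc=$(printf '\033')      # literal ESC byte, portable across sed variants

  # 1. keep the last line containing the marker
  # 2. strip ANSI color sequences (ESC [ ... m)
  # 3. drop everything up to and including "DRIFTED_FILES="
  # 4. cut at the first closing double quote
  # 5. trim leading and trailing whitespace
  grep 'DRIFTED_FILES=' "$log_file" \
    | tail -n 1 \
    | sed "s/${esc}\[[0-9;]*m//g" \
    | sed 's/.*DRIFTED_FILES=//' \
    | sed 's/".*$//' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Usage (after running with KEEP_PLAYBOOK_LOG=true):
#   extract_drifted_files drift-check-output.log
```

If the function prints nothing for a log that visibly contains drift, the marker line most likely has a different shape than assumed here, which is the situation root cause 4 describes.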