fix: improve DRIFTED_FILES parsing and reduce file descriptor issues

- Add comprehensive debug output to track parsing steps
- Fix DRIFTED_FILES extraction from Ansible output
- Always output DRIFTED_FILES line (even when empty) for reliable parsing
- Add ANSIBLE_CALLBACKS_ENABLED='' to prevent inotify exhaustion
- Add KEEP_PLAYBOOK_LOG option for debugging
- Add validation warning when OUT_OF_SYNC but no files found
- Create DEBUGGING_GITOPS_STATUS.md guide
dvirlabs 2026-04-22 23:03:34 +03:00
parent 57911b7f52
commit 46b0bb449e

DEBUGGING_GITOPS_STATUS.md Normal file

@@ -0,0 +1,187 @@
# Debugging GitOps Status Issues
This guide helps troubleshoot issues with the GitOps status reporting system.
## Common Issue: Status shows OUT_OF_SYNC after deployment
### Symptoms
- You pushed changes to the repo
- The deploy step succeeded
- But `update-gitops-status` shows OUT_OF_SYNC
- Changed files are not displayed
### Root Causes
#### 1. **Deployment didn't actually succeed**
The deploy step might have failed silently or the configuration wasn't applied correctly.
**How to check:**
- Look at the deploy step logs in Woodpecker
- SSH to the server and verify files match Git:
```bash
diff /etc/rsyslog.conf files/rsyslog.conf
diff -r /etc/rsyslog.d/ files/rsyslog.d/
```
#### 2. **Parsing issue with DRIFTED_FILES output**
The script might not be correctly extracting the file list from Ansible's output.
**How to debug:**
Run the status update script locally with debug mode:
```bash
export KEEP_PLAYBOOK_LOG=true
./update-gitops-status.sh
```
This will save the playbook output to `drift-check-output.log`. Check:
- Does the log contain a `DRIFTED_FILES=` line?
- What does the line look like exactly?
- Are there ANSI color codes interfering?
Look for these debug lines in the output:
```
DEBUG: Searching for DRIFTED_FILES in playbook output...
DEBUG: Found DRIFTED_FILES pattern
DEBUG: Raw line: ...
DEBUG: Extracted value: '...'
```
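The log checks listed above can be scripted. The following is a minimal sketch, assuming the log is saved as `drift-check-output.log` as described above. The function name `inspect_drift_log` is hypothetical and is not part of `update-gitops-status.sh`.

```bash
# Hypothetical helper: report whether a playbook log contains a
# DRIFTED_FILES= line and whether ANSI escape sequences are present.
inspect_drift_log() {
  local log_file="${1:-drift-check-output.log}"
  local esc matches
  esc=$(printf '\033')   # literal ESC character, built portably

  if [ ! -r "$log_file" ]; then
    echo "log not found: $log_file"
    return 1
  fi

  # Count the lines that contain the marker.
  matches=$(grep -c 'DRIFTED_FILES=' "$log_file")
  echo "DRIFTED_FILES lines: $matches"

  # ANSI color codes start with ESC followed by '['.
  if grep -q "${esc}\[" "$log_file"; then
    echo "ANSI codes: yes"
  else
    echo "ANSI codes: no"
  fi

  # Print the raw line(s) with non-printing characters made visible.
  grep 'DRIFTED_FILES=' "$log_file" | cat -v
}
```

Calling `inspect_drift_log drift-check-output.log` after a run prints the match count, whether color codes are present, and the raw line with escape characters shown as `^[`.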
#### 3. **Too many open files error**
If you see "failed to create fsnotify watcher: too many open files":
**Fixed in latest version:**
- `.woodpecker.yml` now sets `ANSIBLE_CALLBACKS_ENABLED=""` and `ANSIBLE_GATHERING=explicit`
- `update-gitops-status.sh` uses `ANSIBLE_CALLBACKS_ENABLED=""` when running playbooks
- These settings prevent Ansible from exhausting inotify watches
**If issue persists:**
- The container might need privileged mode to adjust kernel parameters such as `fs.inotify.max_user_instances`
- Or reduce Ansible parallelism (for example, a lower `forks` value in `ansible.cfg`)
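Before changing anything, it can help to see the current inotify limits. On Linux the kernel exposes them under `/proc/sys/fs/inotify`. The following is a minimal sketch. The function name `show_inotify_limits` is hypothetical, and whether raising a limit is appropriate depends on your environment.

```bash
# Hypothetical helper: print the current inotify limits (Linux only).
# An alternate directory can be passed as the first argument.
show_inotify_limits() {
  local dir="${1:-/proc/sys/fs/inotify}"
  local name
  for name in max_user_instances max_user_watches max_queued_events; do
    if [ -r "$dir/$name" ]; then
      echo "$name=$(cat "$dir/$name")"
    else
      echo "$name=unavailable"
    fi
  done
}

show_inotify_limits
```

Raising a value (for example with `sysctl fs.inotify.max_user_instances=<value>`) requires root on the host, which is why an unprivileged container usually cannot do it.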
#### 4. **Ansible output format changed**
If the Ansible version changed, the debug output format might be different.
**How to fix:**
Check `drift-check-output.log` and adjust the parsing in `update-gitops-status.sh`:
```bash
# Current parsing (line ~110 in update-gitops-status.sh):
DRIFTED_FILES_STR=$(echo "$DRIFTED_FILES_STR" | sed 's/.*DRIFTED_FILES=//' | sed 's/\x1b\[[0-9;]*m//g' | sed 's/".*$//' | xargs)
```
You might need to adjust the `sed` commands based on the actual format.
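As a starting point for such adjustments, here is a minimal sketch of the same extraction wrapped in a function, so it can be tried against sample lines before editing the script. The function name `extract_drifted_files` is hypothetical, and the sample input is an assumption about what the debug line looks like, not a copy of real Ansible output.

```bash
# Hypothetical helper: extract the value after DRIFTED_FILES= from one
# line of playbook output, dropping ANSI color codes and the trailing quote.
extract_drifted_files() {
  local esc
  esc=$(printf '\033')   # literal ESC; avoids relying on GNU sed's \x1b
  printf '%s\n' "$1" \
    | sed "s/${esc}\[[0-9;]*m//g" \
    | sed 's/.*DRIFTED_FILES=//' \
    | sed 's/".*$//' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Assumed sample line; prints: rsyslog.conf 50-default.conf
extract_drifted_files '    "msg": "DRIFTED_FILES=rsyslog.conf 50-default.conf"'
```

This sketch trims whitespace with `sed` rather than `xargs`, because `xargs` also interprets quotes and backslashes in its input. An empty `DRIFTED_FILES=` line yields an empty string.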
## Testing the Fix
### 1. Test locally
```bash
# Set up SSH key
export SSH_PRIVATE_KEY="$(cat ~/.ssh/id_rsa)"
# Run the script with debug output
export KEEP_PLAYBOOK_LOG=true
./update-gitops-status.sh
# Check the log
grep -A 2 "DRIFTED_FILES=" drift-check-output.log
```
### 2. Test in Woodpecker
Push a small change and monitor the `update-gitops-status` step:
```bash
# Make a small comment change
echo "# Test change $(date)" >> files/rsyslog.conf
# Commit and push
git add files/rsyslog.conf
git commit -m "test: verify gitops status detection"
git push
# Watch the pipeline in Woodpecker UI
# The update-gitops-status step should:
# 1. Run deploy (apply.yml)
# 2. Run drift-check immediately after
# 3. Show SYNCED (because deploy just ran)
# 4. Show no drifted files
```
### 3. Test drift detection (manual change on server)
```bash
# SSH to the server
ssh rsyslog-lab
# Make a manual change
echo "# Manual change" >> /etc/rsyslog.conf
# Wait for the cron job to run (every 2 minutes)
# Or manually trigger it in Woodpecker
# The status should now show:
# - Status: OUT_OF_SYNC
# - Files: rsyslog.conf
```
## Expected Behavior
### After successful deployment (push to master)
```
Step 2/4: Analyzing drift detection results...
✓ Status: SYNCED - server configuration matches Git
Total drift count: 0
Step 3/4: Building JSON payload...
Generated JSON:
{
"repo": "rsyslog",
"server": "rsyslog-lab",
"sync_status": "SYNCED",
"drift_count": 0,
"files": [],
"last_check": "2026-04-22T14:30:00Z"
}
```
### When drift is detected (cron job or manual server change)
```
Step 2/4: Analyzing drift detection results...
✗ Status: OUT OF SYNC - configuration drift detected
- Drift detected in: rsyslog.conf
Total drift count: 1
Step 3/4: Building JSON payload...
Generated JSON:
{
"repo": "rsyslog",
"server": "rsyslog-lab",
"sync_status": "OUT_OF_SYNC",
"drift_count": 1,
"files": [
{"name": "rsyslog.conf"}
],
"last_check": "2026-04-22T14:32:00Z"
}
```
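The two payloads above differ only in `sync_status`, `drift_count`, and `files`. The following is a minimal sketch of how such a payload could be assembled from a space-separated file list in plain shell. The function name `build_status_json` and its arguments are hypothetical, and this is not the code in `update-gitops-status.sh`. It assumes the status follows directly from the file count, and it does no JSON escaping, so file names must not contain quotes, backslashes, spaces, or glob characters.

```bash
# Hypothetical helper: build a status payload from a space-separated file list.
# Arguments: repo, server, file list, timestamp.
build_status_json() {
  local repo="$1" server="$2" files_str="$3" last_check="$4"
  local status="SYNCED" count=0 files_json="" name

  # Word splitting on the unquoted list is intentional here.
  for name in $files_str; do
    files_json="${files_json:+$files_json, }{\"name\": \"$name\"}"
    count=$((count + 1))
  done

  # Assumption: any listed file means the server is out of sync.
  if [ "$count" -gt 0 ]; then
    status="OUT_OF_SYNC"
  fi

  printf '{"repo": "%s", "server": "%s", "sync_status": "%s", "drift_count": %d, "files": [%s], "last_check": "%s"}\n' \
    "$repo" "$server" "$status" "$count" "$files_json" "$last_check"
}

build_status_json rsyslog rsyslog-lab "rsyslog.conf" "2026-04-22T14:32:00Z"
```

Passing an empty file list produces `"sync_status": "SYNCED"`, `"drift_count": 0`, and `"files": []`, matching the first example.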
## Quick Reference
### Enable debug mode
```bash
export KEEP_PLAYBOOK_LOG=true
./update-gitops-status.sh
cat drift-check-output.log
```
### Manually run drift-check
```bash
ansible-playbook -i ansible/inventory/hosts.yml ansible/playbooks/drift-check.yml
```
### Manually run deployment
```bash
ansible-playbook -i ansible/inventory/hosts.yml ansible/playbooks/apply.yml
```
### Check server state
```bash
ssh rsyslog-lab "md5sum /etc/rsyslog.conf /etc/rsyslog.d/*.conf"
md5sum files/rsyslog.conf files/rsyslog.d/*.conf
```
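Comparing the two checksum listings by eye gets tedious with many files. The following is a minimal sketch that compares two saved `md5sum` listings by base file name. The function name `compare_md5_lists` and the file names `server.md5` and `repo.md5` are hypothetical. It assumes paths without spaces, unique base names, and a non-empty first listing.

```bash
# Hypothetical helper: given two md5sum listings, print the base names
# of files that appear in both listings with different hashes.
compare_md5_lists() {
  awk '
    { n = split($2, parts, "/"); name = parts[n] }   # base name of the path
    NR == FNR { first[name] = $1; next }             # remember first listing
    (name in first) && first[name] != $1 { print name }
  ' "$1" "$2"
}

# Possible usage (not run here):
#   ssh rsyslog-lab "md5sum /etc/rsyslog.conf /etc/rsyslog.d/*.conf" > server.md5
#   md5sum files/rsyslog.conf files/rsyslog.d/*.conf > repo.md5
#   compare_md5_lists server.md5 repo.md5
```

An empty result means every file present in both listings has the same hash. Files that exist on only one side are not reported by this sketch.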