# Implementation Summary: Pushgateway → gitops-status-server

## Status: ✅ Complete

This document summarizes the refactoring of the rsyslog GitOps monitoring flow to use a centralized gitops-status-server instead of Pushgateway.

---

## What Was Replaced

### Old Architecture (Pushgateway-based)
```
Drift-check runs
↓
Exit code: 0 (synced) or 1 (drift)
↓
Send metric to Pushgateway
↓
Prometheus scrapes Pushgateway
↓
gitops_sync_status{repo="rsyslog",server="rsyslog-lab"} = 0 or 1
↓
Grafana queries Prometheus
↓
Dashboard shows only: SYNCED or OUT_OF_SYNC
```

**Limitations:**
- Only a 0/1 metric (no file-level details)
- Requires Pushgateway and Prometheus infrastructure
- Cannot show which files changed

---

### New Architecture (gitops-status-server)
```
Drift-check runs + outputs DRIFTED_FILES=...
↓
update-gitops-status.sh script:
  1. Parse changed files
  2. Generate JSON
  3. POST to gitops-status-server
↓
gitops-status-server
↓
Serves /status.json with rich metadata
↓
Grafana Infinity datasource reads /status.json
↓
Dashboard shows:
  - Sync status
  - Drift count
  - List of changed files
  - Last check timestamp
```

**Advantages:**
- ✓ Rich metadata (file-level details)
- ✓ No Pushgateway/Prometheus needed for this use case
- ✓ Centralized gitops-status-server
- ✓ Easier to audit (JSON snapshot)
- ✓ Better suited to multi-server/multi-repo setups

---

## Files Changed

### 1. `.woodpecker.yml` (MAJOR UPDATE)

#### Before (Pushgateway):
```yaml
update-sync-metric:
  commands:
    - printf 'gitops_sync_status{repo="rsyslog",server="rsyslog-lab"} %s\n' "$STATUS" | \
      curl ... --data-binary @- "$PUSHGATEWAY_URL/metrics/job/gitops_rsyslog/..."
```

#### After (gitops-status-server):
```yaml
update-gitops-status:
  commands:
    - chmod +x update-gitops-status.sh
    - ./update-gitops-status.sh
  environment:
    GITOPS_STATUS_SERVER_URL: http://gitops-status-server.observability-stack.svc.cluster.local:80
    REPO_NAME: rsyslog
    SERVER_NAME: rsyslog-lab
```

**Changes:**
- Removed `PUSHGATEWAY_URL` environment variable
- Removed metric push command
- Added script execution
- Added `GITOPS_STATUS_SERVER_URL` configuration
- Both `update-gitops-status` and `gitops_sync_check` steps now use the script

---

### 2. `ansible/playbooks/drift-check.yml` (ADDED OUTPUT)

#### Before:
```yaml
- name: Fail if drift detected
  ansible.builtin.fail:
    msg: "Configuration drift detected..."
  when: drift_detected
```

#### After (ADDED before the fail task):
```yaml
# New: Build structured list of changed files
- name: Initialize list of drifted files
  ansible.builtin.set_fact:
    drifted_files: []

- name: Add main config to drifted files if changed
  ansible.builtin.set_fact:
    drifted_files: "{{ drifted_files + ['/etc/rsyslog.conf'] }}"
  when: main_config_check.changed

# ... (more file collection tasks)

# New: Output structured markers for parsing
- name: Output structured list of drifted files
  ansible.builtin.debug:
    msg: "DRIFTED_FILES={{ drifted_files | join(',') }}"
  when: drift_detected

- name: Output sync status marker
  ansible.builtin.debug:
    msg: "SYNC_STATUS=OUT_OF_SYNC"
  when: drift_detected
```

**Changes:**
- Builds list of drifted files in `drifted_files` fact
- Outputs `DRIFTED_FILES=file1,file2,file3` for script parsing
- Outputs `SYNC_STATUS=SYNCED` or `SYNC_STATUS=OUT_OF_SYNC` markers
- Original drift detection logic unchanged

---

### 3. `update-gitops-status.sh` (CORE SCRIPT)

**New file created:** Orchestrates the entire flow

**Key functionality:**
1. Runs `drift-check.yml` playbook
2. Captures output to temp file
3. Parses `DRIFTED_FILES=...` and `SYNC_STATUS=...` markers
4. Extracts changed file names
5. Converts `/etc/rsyslog.conf` → `rsyslog.conf` (relative paths)
6. Generates JSON with metadata
7. POSTs JSON to gitops-status-server API

**4-step process:**
```
Step 1/4: Running drift-check playbook...
Step 2/4: Analyzing drift detection results...
Step 3/4: Building JSON payload...
Step 4/4: Sending status to gitops-status-server...
```
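
The marker parsing and path conversion in steps 3 to 5 can be illustrated with a short sketch. It is written in Python for readability and is not taken from `update-gitops-status.sh`, whose shell source is not shown here. The function names, the `UNKNOWN` fallback, the sample output (which assumes Ansible's default `debug` formatting and a host named `rsyslog-lab`), and the assumption that file paths contain no spaces, commas, or double quotes are all hypothetical.

```python
import re

# The marker names come from the playbook's debug tasks; everything else here
# (function names, the UNKNOWN fallback) is an illustrative assumption.
DRIFTED_RE = re.compile(r'DRIFTED_FILES=([^"\s]*)')
STATUS_RE = re.compile(r'SYNC_STATUS=(SYNCED|OUT_OF_SYNC)')


def to_relative(path, prefix="/etc/"):
    """Strip the leading /etc/ so /etc/rsyslog.conf becomes rsyslog.conf."""
    return path[len(prefix):] if path.startswith(prefix) else path


def parse_drift_output(text):
    """Return (sync_status, relative_file_names) from captured playbook output."""
    status_match = STATUS_RE.search(text)
    files_match = DRIFTED_RE.search(text)
    raw = files_match.group(1) if files_match else ""
    files = [to_relative(p) for p in raw.split(",") if p]
    if status_match:
        status = status_match.group(1)
    else:
        # No status marker found: infer from the file list, else report UNKNOWN.
        status = "OUT_OF_SYNC" if files else "UNKNOWN"
    return status, files


# Hypothetical excerpt of captured playbook output.
sample = '''
ok: [rsyslog-lab] => {
    "msg": "DRIFTED_FILES=/etc/rsyslog.conf,/etc/rsyslog.d/30-lab.conf"
}
ok: [rsyslog-lab] => {
    "msg": "SYNC_STATUS=OUT_OF_SYNC"
}
'''
print(parse_drift_output(sample))
```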

---

## Generated JSON Format

### Synced State:
```json
{
  "repo": "rsyslog",
  "server": "rsyslog-lab",
  "sync_status": "SYNCED",
  "drift_count": 0,
  "files": [],
  "last_check": "2026-04-21T10:30:00Z"
}
```

### Out of Sync State:
```json
{
  "repo": "rsyslog",
  "server": "rsyslog-lab",
  "sync_status": "OUT_OF_SYNC",
  "drift_count": 2,
  "files": [
    { "name": "rsyslog.conf" },
    { "name": "rsyslog.d/30-lab.conf" }
  ],
  "last_check": "2026-04-21T10:30:00Z"
}
```
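
As a rough illustration of how a document in this shape could be assembled, the sketch below builds the same fields from a list of relative file names. It is a Python approximation, not the shell script's actual code; `build_payload` and its parameters are hypothetical names, and deriving `drift_count` from the length of the file list is an assumption based on the two examples above.

```python
import json
from datetime import datetime, timezone


def build_payload(repo, server, sync_status, files, checked_at=None):
    """Assemble a status document with the fields shown in the examples."""
    checked_at = checked_at or datetime.now(timezone.utc)
    return {
        "repo": repo,
        "server": server,
        "sync_status": sync_status,
        # Derived from the list so the count cannot disagree with it.
        "drift_count": len(files),
        "files": [{"name": name} for name in files],
        # UTC timestamp in the same format as the examples.
        "last_check": checked_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


when = datetime(2026, 4, 21, 10, 30, tzinfo=timezone.utc)
payload = build_payload(
    "rsyslog", "rsyslog-lab", "OUT_OF_SYNC",
    ["rsyslog.conf", "rsyslog.d/30-lab.conf"], when,
)
print(json.dumps(payload, indent=2))
```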

---

## Data Flow Example

### Scenario: Manual edit on server

1. **Manual change:** Someone edits `/etc/rsyslog.conf` directly on the server
2. **Cron trigger:** The scheduled cron job runs (every 2 minutes)
3. **Woodpecker step:** `gitops_sync_check` executes `update-gitops-status.sh`
4. **Drift detection:** `drift-check.yml` runs and detects the change
5. **Output parsing:** Script extracts:
   - `DRIFTED_FILES=/etc/rsyslog.conf`
   - `SYNC_STATUS=OUT_OF_SYNC`
6. **JSON generation:**
   ```json
   {
     "repo": "rsyslog",
     "server": "rsyslog-lab",
     "sync_status": "OUT_OF_SYNC",
     "drift_count": 1,
     "files": [{ "name": "rsyslog.conf" }],
     "last_check": "2026-04-21T10:32:00Z"
   }
   ```
7. **API POST:** Script POSTs JSON to:
   - URL: `http://gitops-status-server.observability-stack.svc.cluster.local:80/api/status`
   - Method: POST
   - Content-Type: application/json
8. **Server update:** gitops-status-server receives the JSON and updates `/status.json`
9. **Grafana update:** Infinity datasource refreshes and displays the new status
10. **Result:** Dashboard shows OUT_OF_SYNC with `rsyslog.conf` listed

**Time to detection:** ≤ 2 minutes (the cron interval), plus pipeline run time

---

## Integration Points

### Woodpecker Events Handled

1. **Pull Request:**
   - syntax-check → validate (no drift check)
   - No gitops-status update

2. **Push to Master:**
   - syntax-check → validate → deploy → **update-gitops-status**
   - After deployment, immediately verify sync and update status

3. **Scheduled Cron:**
   - **gitops_sync_check** (every 2 minutes by default)
   - Continuous drift monitoring

---

## Configuration

### Required Environment Variables
```yaml
GITOPS_STATUS_SERVER_URL: http://gitops-status-server.observability-stack.svc.cluster.local:80
REPO_NAME: rsyslog
SERVER_NAME: rsyslog-lab
SSH_PRIVATE_KEY:
  from_secret: SSH_PRIVATE_KEY
ANSIBLE_CONFIG: ansible.cfg
```
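
To show how these variables could feed the final POST, here is a minimal Python sketch using only the standard library. It is not the shell script's actual code: `build_status_request` and `send_status` are hypothetical names, the `/api/status` path is taken from the data flow example in this document, and a plain dictionary stands in for the real process environment.

```python
import json
import urllib.request


def build_status_request(env, payload):
    """Create (but do not send) the POST request for the status document."""
    base = env["GITOPS_STATUS_SERVER_URL"].rstrip("/")
    return urllib.request.Request(
        url=base + "/api/status",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_status(request, timeout=10):
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Stand-in for os.environ, using the values listed above.
env = {
    "GITOPS_STATUS_SERVER_URL": "http://gitops-status-server.observability-stack.svc.cluster.local:80",
    "REPO_NAME": "rsyslog",
    "SERVER_NAME": "rsyslog-lab",
}
payload = {"repo": env["REPO_NAME"], "server": env["SERVER_NAME"],
           "sync_status": "SYNCED", "drift_count": 0, "files": [],
           "last_check": "2026-04-21T10:30:00Z"}
request = build_status_request(env, payload)
print(request.get_method(), request.full_url)
# send_status(request) would perform the network call; it is not invoked here.
```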

### Cron Job Setup (Woodpecker UI)
- Name: `gitops_sync_check`
- Branch: `master`
- Schedule: `*/2 * * * *`

---

## Backward Compatibility

- ✓ **Existing deploy logic:** Unchanged (apply.yml still used)
- ✓ **Existing drift detection:** Enhanced (now outputs file names)
- ✓ **PR validation:** Unchanged (syntax-check, validate still used)
- ✓ **Server files:** No changes needed

---

## Security

- ✓ SSH credentials in Woodpecker secrets (not exposed)
- ✓ JSON contains only metadata (file names, counts, timestamps)
- ✓ No actual rsyslog config contents exposed
- ✓ Internal Kubernetes communication (ClusterIP)
- ✓ No Pushgateway exposure

---

## Testing Checklist

- [ ] Cron job is created in Woodpecker
- [ ] Cron job runs on schedule (every 2 minutes)
- [ ] `update-gitops-status.sh` script is executable
- [ ] Script runs successfully (HTTP 200 response)
- [ ] gitops-status-server receives JSON POSTs
- [ ] JSON format matches expected schema
- [ ] Grafana dashboard displays sync status
- [ ] Changed files appear in Grafana panel
- [ ] Manual file edit on server is detected
- [ ] Post-deployment status updates correctly
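
The "JSON format matches expected schema" item could be checked with a small script along these lines. This is a hedged Python sketch, not part of the repository: `validate_status` is a hypothetical helper, and the consistency rules (the count equals the number of files, `SYNCED` implies zero drift) are assumptions inferred from the example documents in this summary.

```python
from datetime import datetime

EXPECTED_FIELDS = {"repo": str, "server": str, "sync_status": str,
                   "drift_count": int, "files": list, "last_check": str}


def validate_status(doc):
    """Return a list of problems; an empty list means the document looks valid."""
    problems = []
    for key, kind in EXPECTED_FIELDS.items():
        if key not in doc:
            problems.append(f"missing field: {key}")
        elif not isinstance(doc[key], kind):
            problems.append(f"wrong type for {key}")
    if problems:
        return problems  # Skip deeper checks when the basic shape is off.
    if doc["sync_status"] not in ("SYNCED", "OUT_OF_SYNC"):
        problems.append("unexpected sync_status")
    if any(not isinstance(f, dict) or "name" not in f for f in doc["files"]):
        problems.append("each files entry needs a name")
    if doc["drift_count"] != len(doc["files"]):
        problems.append("drift_count does not match files")
    if (doc["sync_status"] == "SYNCED") != (doc["drift_count"] == 0):
        problems.append("sync_status disagrees with drift_count")
    try:
        datetime.strptime(doc["last_check"], "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        problems.append("last_check is not in YYYY-MM-DDTHH:MM:SSZ form")
    return problems


good = {"repo": "rsyslog", "server": "rsyslog-lab", "sync_status": "OUT_OF_SYNC",
        "drift_count": 1, "files": [{"name": "rsyslog.conf"}],
        "last_check": "2026-04-21T10:32:00Z"}
print(validate_status(good))
print(validate_status(dict(good, drift_count=3)))
```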

---

## Migration Steps

1. **Commit and push changes:**
   ```bash
   git add .woodpecker.yml ansible/playbooks/drift-check.yml update-gitops-status.sh
   git commit -m "refactor: replace pushgateway with gitops-status-server"
   git push
   ```

2. **Verify pipeline runs successfully**
   - Check Woodpecker logs for new steps

3. **Create Woodpecker cron job**
   - Name: `gitops_sync_check`
   - Schedule: `*/2 * * * *`

4. **Test cron execution**
   - Wait for cron trigger (within 2 minutes)
   - Verify JSON is sent to gitops-status-server

5. **Verify Grafana dashboard**
   - Confirm Infinity datasource can read gitops-status-server
   - Dashboard shows sync status and changed files

6. **Monitor for 24 hours**
   - Verify cron runs consistently
   - Check for any HTTP errors
   - Confirm drift detection works

7. **Decommission Pushgateway** (when confident)
   - Stop sending metrics to Pushgateway
   - Remove Pushgateway from infrastructure

---

## Rollback Plan

If issues arise:

1. **Revert Woodpecker changes:**
   ```bash
   git revert <commit-hash>
   git push
   ```

2. **Remove cron job:**
   - Delete `gitops_sync_check` from Woodpecker UI

3. **Restore Pushgateway metric push** (if keeping Prometheus monitoring)

---

## Key Improvements

| Aspect | Old | New |
|--------|-----|-----|
| Data richness | 0/1 only | JSON with file names |
| Setup complexity | Pushgateway + Prometheus | Single service call |
| Audit trail | Basic | Structured snapshots |
| File-level visibility | None | Complete list |
| Update frequency | After deployment | Every 2 minutes + post-deploy |
| Infrastructure | 2+ services | 1 service (gitops-status-server) |

---

## Documentation Files

1. **`GITOPS_STATUS_SERVER_INTEGRATION.md`** – Comprehensive documentation
2. **`QUICK_REFERENCE.md`** – Quick start and troubleshooting
3. **`IMPLEMENTATION_SUMMARY.md`** – This file

---

## Support

For issues, consult:
1. `.woodpecker.yml` comments
2. `update-gitops-status.sh` comments
3. `drift-check.yml` comments
4. Full documentation in `GITOPS_STATUS_SERVER_INTEGRATION.md`
5. Woodpecker pipeline logs
6. gitops-status-server application logs