Compare commits

1 Commits

Author SHA1 Message Date
7891389e89 migrate: applicationsets -> argocd-apps (prod only) 2026-03-04 06:29:08 +02:00
721 changed files with 63224 additions and 156861 deletions

MIGRATION.md Normal file
View File

@ -0,0 +1,43 @@
# ArgoCD Apps Migration

## Summary

This repository has been converted from `applicationsets/` to `argocd-apps/`.

- Previous model: `ApplicationSet` manifests under `applicationsets/`
- New model: individual `Application` manifests under `argocd-apps/`

`applicationsets/` is now **deprecated** and has been moved to `archive/applicationsets/`.
## What Changed

For each file in `applicationsets/`, an equivalent `Application` was created in `argocd-apps/`.

- API kind changed from `ApplicationSet` to `Application`
- Existing app name, project, destination server, destination namespace, repo URL, and target revision were carried over; the sync policy was carried over except for the `prune` change noted below
- `syncOptions` includes `CreateNamespace=true`
- `spec.syncPolicy.automated.prune` is temporarily set to `false` during takeover to avoid accidental deletions
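
The conversion described above can be sketched in Python. This is a minimal illustration, not the tool used for this migration: `render` and `to_application` are hypothetical helpers, manifests are assumed to be already loaded as plain dicts, and only `list` generators with flat `{{param}}` placeholders are handled. The sketch keeps the rendered template name and does not rewrite `valueFiles` paths.

```python
import re

# Matches flat ApplicationSet placeholders such as {{nameSuffix}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(value, params):
    """Recursively substitute {{param}} placeholders, returning new containers."""
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda m: str(params[m.group(1)]), value)
    if isinstance(value, dict):
        return {key: render(item, params) for key, item in value.items()}
    if isinstance(value, list):
        return [render(item, params) for item in value]
    return value


def to_application(appset, element):
    """Build one Application dict from an ApplicationSet template and one list element."""
    rendered = render(appset["spec"]["template"], element)
    spec = rendered["spec"]
    sync_policy = spec.setdefault("syncPolicy", {})
    # Takeover safety: keep pruning off until the new app owns the resources.
    sync_policy.setdefault("automated", {})["prune"] = False
    options = sync_policy.setdefault("syncOptions", [])
    if "CreateNamespace=true" not in options:
        options.append("CreateNamespace=true")
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {
            "name": rendered["metadata"]["name"],
            "namespace": appset["metadata"]["namespace"],
        },
        "spec": spec,
    }


# Dict form of the uptime-kuma ApplicationSet from this repository.
appset = {
    "metadata": {"name": "uptime-kuma", "namespace": "argocd"},
    "spec": {
        "generators": [{"list": {"elements": [
            {"env": "prod", "valuesFile": "values-prod.yaml",
             "nameSuffix": "uptime-kuma-prod", "host": "kuma.dvirlabs.com"},
        ]}}],
        "template": {
            "metadata": {"name": "{{nameSuffix}}"},
            "spec": {
                "project": "observability",
                "source": {
                    "repoURL": "https://git.dvirlabs.com/dvirlabs/observability-stack.git",
                    "targetRevision": "master",
                    "path": "charts/uptime-kuma",
                    "helm": {"valueFiles": ["my-values/{{valuesFile}}"]},
                },
                "destination": {"server": "https://kubernetes.default.svc",
                                "namespace": "monitoring"},
                "syncPolicy": {"automated": {"prune": True, "selfHeal": True},
                               "syncOptions": ["CreateNamespace=true"]},
            },
        },
    },
}

element = appset["spec"]["generators"][0]["list"]["elements"][0]
app = to_application(appset, element)
```

Because `render` returns new containers, the source `ApplicationSet` dict is left untouched, so the same template can be rendered once per list element.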
## Environment Values Policy

ArgoCD app definitions in this repo now reference **prod-only** chart values.

- `values-int.yaml` is no longer referenced by the generated ArgoCD apps
- Existing `values-int.yaml` files were not deleted
- Helm apps now read values from `manifests/<app>/values.yaml`
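
One way to check the prod-only policy is to scan each `Application` for `valueFiles` entries that still point at `values-int.yaml`. The sketch below assumes manifests are already loaded as dicts; `int_values_references` is a hypothetical helper name, not something shipped in this repo.

```python
def int_values_references(app):
    """Return the valueFiles entries of an Application dict that point at values-int.yaml."""
    helm = app.get("spec", {}).get("source", {}).get("helm") or {}
    return [
        entry for entry in helm.get("valueFiles", [])
        if entry.rsplit("/", 1)[-1] == "values-int.yaml"
    ]


# Shape of a migrated app: values come from manifests/<app>/values.yaml.
migrated = {"spec": {"source": {"helm": {
    "valueFiles": ["../../manifests/uptime-kuma/values.yaml"],
}}}}

# Shape of a legacy int app rendered from an ApplicationSet.
legacy_int = {"spec": {"source": {"helm": {
    "valueFiles": ["my-values/values-int.yaml"],
}}}}

# Directory-type apps have no helm block and therefore no value files.
directory_app = {"spec": {"source": {"directory": {"recurse": True}}}}

migrated_hits = int_values_references(migrated)
legacy_hits = int_values_references(legacy_int)
directory_hits = int_values_references(directory_app)
```

An empty result means the app complies with the policy; any returned entry names the offending path.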
## Archived Legacy Structure

- Old `applicationsets/` files were moved to `archive/applicationsets/`
- `charts/` and `manifests/` remain active in place
## Required Root App Change

Update your ArgoCD root app (or app-of-apps) to point to `argocd-apps/` instead of `applicationsets/`.

Typical change:

- From: `spec.source.path: applicationsets`
- To: `spec.source.path: argocd-apps`

Apply and sync the root app after this path update.
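
The path update can be sketched as a small, guarded edit on the root app manifest. This is an illustration only: `repoint_root_app` is a hypothetical helper, and the root app shown (including its name `root-app`) is an assumed example, since the actual root app is not part of this commit.

```python
import copy


def repoint_root_app(root_app, old_path="applicationsets", new_path="argocd-apps"):
    """Return a copy of the root Application with spec.source.path switched to new_path."""
    current = root_app["spec"]["source"].get("path", "").rstrip("/")
    if current != old_path:
        # Refuse to touch an app that is not pointing at the legacy directory.
        raise ValueError(f"expected path {old_path!r}, found {current!r}")
    updated = copy.deepcopy(root_app)
    updated["spec"]["source"]["path"] = new_path
    return updated


# Hypothetical root app; only spec.source.path matters for this sketch.
root_app = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Application",
    "metadata": {"name": "root-app", "namespace": "argocd"},
    "spec": {
        "source": {
            "repoURL": "https://git.dvirlabs.com/dvirlabs/observability-stack.git",
            "targetRevision": "HEAD",
            "path": "applicationsets",
        },
    },
}

updated = repoint_root_app(root_app)
```

The guard makes the edit safe to re-run: a root app that has already been repointed raises instead of being silently rewritten.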

View File

@ -0,0 +1,38 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: eck-resources
namespace: argocd
spec:
generators:
- list:
elements:
- env: prod
valuesFile: values-prod.yaml
nameSuffix: eck-prod
host: kibana.dvirlabs.com
- env: int
valuesFile: values-int.yaml
nameSuffix: eck-int
host: kibana-int.dvirlabs.com
template:
metadata:
name: '{{nameSuffix}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: master
path: charts/eck-resources
helm:
valueFiles:
- my-values/{{valuesFile}}
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,30 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: external-secrets-appset
namespace: argocd
spec:
generators:
- git:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
revision: master
directories:
- path: manifests/external-secrets
template:
metadata:
name: 'external-secret-{{path.basename}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: master
path: '{{path}}'
directory:
recurse: true
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true

View File

@ -0,0 +1,28 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: prometheus-scrape-secret
namespace: argocd
spec:
generators:
- list:
elements:
- name: prometheus-scrape-secret
template:
metadata:
name: '{{name}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: manifests/prometheus-scrape-secret
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,34 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: rancher-monitoring-appset
namespace: argocd
spec:
generators:
- list:
elements:
- env: prod
valuesFile: values-prod.yaml
nameSuffix: rancher-monitoring-prod
host: grafana.dvirlabs.com
template:
metadata:
name: '{{nameSuffix}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/rancher-monitoring
helm:
valueFiles:
- my-values/{{valuesFile}}
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,34 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: uptime-kuma
namespace: argocd
spec:
generators:
- list:
elements:
- env: prod
valuesFile: values-prod.yaml
nameSuffix: uptime-kuma-prod
host: kuma.dvirlabs.com
template:
metadata:
name: '{{nameSuffix}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: master
path: charts/uptime-kuma
helm:
valueFiles:
- my-values/{{valuesFile}}
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -1,20 +1,20 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: secrets-observability-stack
name: dcgm-exporter
namespace: argocd
spec:
project: observability-stack
project: ai-stack
source:
repoURL: ssh://git@gitea-ssh.dev-tools.svc.cluster.local:2222/dvirlabs/observability-stack.git
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/secrets
path: charts/ollama
helm:
valueFiles:
- ../../manifests/secrets-observability-stack/values.yaml
- ../../manifests/ollama/values.yaml
destination:
server: https://kubernetes.default.svc
namespace: observability-stack
namespace: ai-stack
syncPolicy:
automated:
prune: true

View File

@ -0,0 +1,38 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: eck-resources
namespace: argocd
spec:
generators:
- list:
elements:
- env: prod
valuesFile: values-prod.yaml
nameSuffix: eck-prod
host: kibana.dvirlabs.com
- env: int
valuesFile: values-int.yaml
nameSuffix: eck-int
host: kibana-int.dvirlabs.com
template:
metadata:
name: '{{nameSuffix}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: master
path: charts/eck-resources
helm:
valueFiles:
- my-values/{{valuesFile}}
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,30 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: external-secrets-appset
namespace: argocd
spec:
generators:
- git:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
revision: master
directories:
- path: manifests/external-secrets
template:
metadata:
name: 'external-secret-{{path.basename}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: master
path: '{{path}}'
directory:
recurse: true
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true

View File

@ -0,0 +1,28 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: prometheus-scrape-secret
namespace: argocd
spec:
generators:
- list:
elements:
- name: prometheus-scrape-secret
template:
metadata:
name: '{{name}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: manifests/prometheus-scrape-secret
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,34 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: rancher-monitoring-appset
namespace: argocd
spec:
generators:
- list:
elements:
- env: prod
valuesFile: values-prod.yaml
nameSuffix: rancher-monitoring-prod
host: grafana.dvirlabs.com
template:
metadata:
name: '{{nameSuffix}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/rancher-monitoring
helm:
valueFiles:
- my-values/{{valuesFile}}
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,34 @@
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: uptime-kuma
namespace: argocd
spec:
generators:
- list:
elements:
- env: prod
valuesFile: values-prod.yaml
nameSuffix: uptime-kuma-prod
host: kuma.dvirlabs.com
template:
metadata:
name: '{{nameSuffix}}'
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: master
path: charts/uptime-kuma
helm:
valueFiles:
- my-values/{{valuesFile}}
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,23 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: dcgm-exporter
namespace: argocd
spec:
project: ai-stack
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/ollama
helm:
valueFiles:
- ../../manifests/ollama/values.yaml
destination:
server: https://kubernetes.default.svc
namespace: ai-stack
syncPolicy:
automated:
prune: false
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,23 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: eck-resources
namespace: argocd
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/eck-resources
helm:
valueFiles:
- ../../manifests/eck-resources/values.yaml
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: false
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,22 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: external-secrets-appset
namespace: argocd
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: master
path: manifests/external-secrets
directory:
recurse: true
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: false
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -1,29 +0,0 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: gitops-status-server
namespace: argocd
spec:
project: observability-stack
source:
repoURL: ssh://git@gitea-ssh.dev-tools.svc.cluster.local:2222/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/gitops-status-server
helm:
valueFiles:
- ../../manifests/gitops-status-server/values.yaml
destination:
server: https://kubernetes.default.svc
namespace: observability-stack
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
retry:
limit: 5
backoff:
duration: 5s
factor: 2
maxDuration: 3m

View File

@ -1,29 +0,0 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: kube-prometheus-stack
namespace: argocd
spec:
project: observability-stack
source:
repoURL: ssh://git@gitea-ssh.dev-tools.svc.cluster.local:2222/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/kube-prometheus-stack
helm:
valueFiles:
- ../../manifests/kube-prometheus-stack/values.yaml
destination:
server: https://kubernetes.default.svc
namespace: observability-stack
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
retry:
limit: 5
backoff:
duration: 5s
factor: 2
maxDuration: 3m

View File

@ -0,0 +1,20 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: prometheus-scrape-secret
namespace: argocd
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: manifests/prometheus-scrape-secret
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: false
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,23 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: rancher-monitoring-appset
namespace: argocd
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/rancher-monitoring
helm:
valueFiles:
- ../../manifests/rancher-monitoring/values.yaml
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: false
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -1,22 +0,0 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: raw-resources-observability-stack
namespace: argocd
spec:
project: observability-stack
source:
repoURL: ssh://git@gitea-ssh.dev-tools.svc.cluster.local:2222/dvirlabs/observability-stack.git
targetRevision: HEAD
path: manifests/raw-resources-observability-stack
directory:
recurse: true
destination:
server: https://kubernetes.default.svc
namespace: observability-stack
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,23 @@
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: uptime-kuma
namespace: argocd
spec:
project: observability
source:
repoURL: https://git.dvirlabs.com/dvirlabs/observability-stack.git
targetRevision: HEAD
path: charts/uptime-kuma
helm:
valueFiles:
- ../../manifests/uptime-kuma/values.yaml
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: false
selfHeal: true
syncOptions:
- CreateNamespace=true

View File

@ -0,0 +1,5 @@
apiVersion: v2
name: eck-resources
description: Deploy ECK Elasticsearch and Kibana CRs
version: 0.1.0
appVersion: "8.12.0"

View File

@ -0,0 +1,3 @@
enabled: false
env: int
host: kibana-int.dvirlabs.com

View File

@ -0,0 +1,3 @@
enabled: true
env: prod
host: kibana.dvirlabs.com

View File

@ -0,0 +1,22 @@
# elasticsearch.yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
name: elasticsearch-{{ .Values.env }}
namespace: monitoring
spec:
version: 8.12.0
nodeSets:
- name: default
count: 1
config:
node.store.allow_mmap: false
volumeClaimTemplates:
- metadata:
name: elasticsearch-data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: nfs-client
resources:
requests:
storage: 100Gi

View File

@ -0,0 +1,25 @@
# ingress.yaml (Kibana)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: kibana-{{ .Values.env }}
namespace: monitoring
annotations:
kubernetes.io/ingress.class: traefik
# if behind Cloudflare, strongly recommended to disable cache for bundles:
traefik.ingress.kubernetes.io/browser-xss-filter: "true"
spec:
tls:
- hosts: [kibana.dvirlabs.com]
secretName: kibana-tls
rules:
- host: kibana.dvirlabs.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: kibana-{{ .Values.env }}-kb-http
port:
number: 5601

View File

@ -0,0 +1,27 @@
# kibana.yaml
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
name: kibana-{{ .Values.env }}
namespace: monitoring
spec:
version: 8.12.0
count: 1
elasticsearchRef:
name: elasticsearch-{{ .Values.env }} # same ns: monitoring
config:
# set correct external URL for Ingress
server.publicBaseUrl: "https://kibana.dvirlabs.com"
# if you use a path like /kibana, also set:
# server.basePath: "/kibana"
# server.rewriteBasePath: true
xpack.security.authc.providers:
basic.basic1:
order: 0
http:
tls:
selfSignedCertificate:
disabled: true # Ingress terminates TLS
service:
spec:
type: ClusterIP

View File

@ -1,14 +0,0 @@
apiVersion: v2
name: gitops-status-server
description: A minimal HTTP server that serves GitOps status information as JSON
type: application
version: 1.0.0
appVersion: "1.25.5"
keywords:
- gitops
- status
- monitoring
- nginx
maintainers:
- name: DevOps Team
home: https://github.com/your-org/observability-stack

View File

@ -1,478 +0,0 @@
# GitOps Status Server Helm Chart
A dual-container HTTP server that receives GitOps status updates via POST API and serves status information as JSON for monitoring and observability purposes.
## Overview
This chart deploys a two-container pod:
1. **Nginx** - Serves `/status.json` endpoint for monitoring tools and handles API routing
2. **Flask API** - Processes POST requests to `/api/status` and updates the status JSON
It's designed to be consumed by Grafana's Infinity datasource or other monitoring tools, and to receive updates from CI/CD pipelines like Woodpecker.
## Architecture
```
CI/CD Pipeline (Woodpecker)
POST /api/status
Kubernetes Service (port 80)
Nginx (port 8080)
├─→ /api/status → Proxies to Flask (localhost:5000)
└─→ /status.json → Serves static file
Shared Volume (emptyDir)
├─→ status.json (updated by Flask API)
└─→ Read by Nginx
Grafana Infinity Datasource
Reads /status.json
```
## Features
- **API-driven updates**: POST endpoint for CI/CD pipelines to update status
- **Read-only serving**: Grafana-friendly JSON endpoint
- **Minimal footprint**: nginx-unprivileged + Python-Alpine with minimal resources
- **Secure by default**: Runs as non-root with restricted filesystems
- **Internal only**: ClusterIP service for cluster-internal access
- **ArgoCD compatible**: Init container auto-initializes status from ConfigMap
- **Production-ready**: Includes health checks, security contexts, and resource limits
## Installation
### Using Helm
```bash
# Install with default values
helm install gitops-status ./gitops-status-server
# Install with custom namespace
helm install gitops-status ./gitops-status-server -n observability-stack --create-namespace
# Install with custom values
helm install gitops-status ./gitops-status-server -f custom-values.yaml
```
### Using ArgoCD
Create an Application manifest:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: gitops-status-server
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/your-org/observability-stack
targetRevision: main
path: gitops-status-server
helm:
values: |
replicaCount: 1
statusJson:
repo: "rsyslog"
server: "rsyslog-lab"
sync_status: "UNKNOWN"
```
## API Endpoints
### GET /status.json
Returns the current status JSON
```bash
curl http://gitops-status-server.observability-stack.svc.cluster.local:80/status.json
```
Response:
```json
{
"repo": "rsyslog",
"server": "rsyslog-lab",
"sync_status": "SYNCED",
"drift_count": 0,
"files": [],
"last_check": "2026-04-21T10:30:00Z"
}
```
### POST /api/status
Updates the status with new data
```bash
curl -X POST http://gitops-status-server.observability-stack.svc.cluster.local:80/api/status \
-H "Content-Type: application/json" \
-d '{
"repo": "rsyslog",
"server": "rsyslog-lab",
"sync_status": "OUT_OF_SYNC",
"drift_count": 2,
"files": [
{"name": "rsyslog.conf"},
{"name": "rsyslog.d/30-lab.conf"}
],
"last_check": "2026-04-21T10:30:00Z"
}'
```
Response (HTTP 200):
```json
{
"success": true,
"message": "Status updated successfully",
"status": { ... }
}
```
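
For pipelines that prefer Python over `curl`, the same POST can be sketched with the standard library. The function names `build_status_request` and `post_status` are hypothetical; the URL and payload fields are taken from the examples above, and the response is assumed to be the JSON body shown.

```python
import json
import urllib.request

BASE_URL = "http://gitops-status-server.observability-stack.svc.cluster.local:80"


def build_status_request(status, base_url=BASE_URL):
    """Build (but do not send) the POST /api/status request for a status dict."""
    return urllib.request.Request(
        f"{base_url}/api/status",
        data=json.dumps(status).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_status(status, base_url=BASE_URL, timeout=5):
    """Send the status update and return the decoded JSON response."""
    request = build_status_request(status, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


status = {
    "repo": "rsyslog",
    "server": "rsyslog-lab",
    "sync_status": "SYNCED",
    "drift_count": 0,
    "files": [],
    "last_check": "2026-04-21T10:30:00Z",
}
request = build_status_request(status)
# From inside the cluster: result = post_status(status)
```

Splitting request construction from sending keeps the payload inspectable without network access.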
### GET /health
Health check endpoint (returns HTTP 200)
```bash
curl http://gitops-status-server.observability-stack.svc.cluster.local:80/health
```
### GET /ready
Readiness check (verifies status file is readable)
```bash
curl http://gitops-status-server.observability-stack.svc.cluster.local:80/ready
```
## Integration with Woodpecker
The rsyslog CI/CD pipeline can update status by POSTing to the `/api/status` endpoint:
```bash
#!/bin/bash
GITOPS_STATUS_SERVER_URL="http://gitops-status-server.observability-stack.svc.cluster.local:80"
STATUS_JSON='{
"repo": "rsyslog",
"server": "rsyslog-lab",
"sync_status": "SYNCED",
"drift_count": 0,
"files": [],
"last_check": "2026-04-21T10:30:00Z"
}'
curl -X POST "$GITOPS_STATUS_SERVER_URL/api/status" \
-H "Content-Type: application/json" \
-d "$STATUS_JSON"
```
## Service Discovery
### Internal Kubernetes URL
```
http://gitops-status-server.observability-stack.svc.cluster.local:80/status.json
```
### Port Forwarding (for local testing)
```bash
kubectl port-forward -n observability-stack svc/gitops-status-server 8080:80
# Then access at http://localhost:8080/status.json
```
### NodePort (if service type is changed)
```bash
kubectl patch service -n observability-stack gitops-status-server -p '{"spec":{"type":"NodePort"}}'
# Then access at http://<node-ip>:<node-port>/status.json
```
## Configuration
See `values.yaml` for all configuration options:
- `replicaCount`: Number of replicas
- `image.repository`: Container image
- `image.tag`: Image tag
- `service.type`: Service type (ClusterIP, NodePort, LoadBalancer)
- `service.port`: Service port (default 80)
- `service.targetPort`: Container port (default 8080)
- `resources`: CPU/memory limits and requests
- `statusJson`: Default status JSON values
- `api.image.*`: Python/Flask image configuration
## Grafana Integration
### Infinity Datasource Configuration
1. Install Infinity datasource plugin:
```bash
grafana-cli plugins install yesoreyeram-infinity-datasource
```
2. Add datasource with URL:
```
http://gitops-status-server.observability-stack.svc.cluster.local:80/status.json
```
3. Create panels to visualize:
- `sync_status`: Current synchronization state
- `drift_count`: Number of drifted files
- `files[]`: List of changed files
- `last_check`: Timestamp of last check
### Example Query
```json
{
"url": "http://gitops-status-server.observability-stack.svc.cluster.local:80/status.json",
"format": "json"
}
```
## Security
- Runs as non-root user (UID 101)
- Read-only root filesystem (except for /tmp, /var/cache/nginx, /var/run)
- No privileged capabilities
- Network policies recommended for production
- Service Account with minimal RBAC
## Troubleshooting
### POST Request Returns 400 Error
**Issue**: "Invalid JSON" error
**Solution**: Verify JSON formatting with:
```bash
echo '{...}' | jq '.'
```
### POST Updates Not Appearing in GET Response
**Issue**: Update endpoint returns 200 but status.json isn't updated
**Possible causes**:
- Shared volume permission issue
- API container crashed after POST
- Status file permissions
**Debug**:
```bash
# Check logs
kubectl logs -f deployment/gitops-status-server -c api
kubectl logs -f deployment/gitops-status-server -c nginx
# Check shared volume
kubectl exec deployment/gitops-status-server -c nginx -- ls -la /usr/share/nginx/html/
# Test API directly (port-forward to 5000 first)
kubectl port-forward deployment/gitops-status-server 5000:5000
curl -X POST http://localhost:5000/api/status -H "Content-Type: application/json" -d '{...}'
```
### Connection Refused to gitops-status-server
**Issue**: Woodpecker can't reach the service
**Possible causes**:
- Service in different namespace
- Network policies blocking traffic
- Woodpecker outside cluster
- Service DNS name incorrect
**Solutions**:
- Verify service exists: `kubectl get svc gitops-status-server -n observability-stack`
- Use NodePort for external access (update service type in values)
- Use port-forward as a temporary solution
- Verify network policies allow traffic
## Performance
- **CPU**: 150m limit (100m nginx + 100m API)
- **Memory**: 192Mi limit (64Mi nginx + 128Mi API)
- **Startup time**: ~5 seconds (Flask app install + startup)
- **Update latency**: <100ms (direct file write)
- **Read performance**: <10ms (static file serving)
## License
Same as observability-stack repository
statusJson:
repo: "my-repo"
server: "my-server"
sync_status: "SYNCED"
drift_count: 0
files: []
last_check: "2026-04-21T10:00:00Z"
destination:
server: https://kubernetes.default.svc
namespace: monitoring
syncPolicy:
automated:
prune: true
selfHeal: true
```
## Configuration
### Key Values
| Parameter | Description | Default |
|-----------|-------------|---------|
| `replicaCount` | Number of replicas | `1` |
| `image.repository` | Container image repository | `nginxinc/nginx-unprivileged` |
| `image.tag` | Container image tag | `1.25-alpine` |
| `service.type` | Kubernetes service type | `ClusterIP` |
| `service.port` | Service port | `80` |
| `service.targetPort` | Container target port | `8080` |
| `resources.limits.cpu` | CPU limit | `100m` |
| `resources.limits.memory` | Memory limit | `64Mi` |
| `statusJson` | JSON content to serve | See values.yaml |
### Custom Status JSON
Override the status JSON content in your values:
```yaml
statusJson:
repo: "production-apps"
server: "prod-cluster-01"
sync_status: "SYNCED"
drift_count: 2
files:
- "deployment.yaml"
- "service.yaml"
last_check: "2026-04-21T12:30:00Z"
```
## Usage
### Access the Status Endpoint
From inside the cluster:
```bash
# Using the service DNS name
curl http://gitops-status-server/status.json
# With namespace
curl http://gitops-status-server.monitoring.svc.cluster.local/status.json
```
### Grafana Infinity Datasource Configuration
1. Add an Infinity datasource in Grafana
2. Configure URL: `http://gitops-status-server.monitoring.svc.cluster.local/status.json`
3. Parser: JSON
4. Use fields from the JSON response in your dashboard
Example query fields:
- `sync_status` - Current sync status
- `drift_count` - Number of drifted resources
- `files` - List of changed files
- `last_check` - Timestamp of last check
## Updating Status Data
### Manual Update
Edit the ConfigMap directly:
```bash
kubectl edit configmap gitops-status-server -n monitoring
```
The deployment will automatically roll out with the new content due to the ConfigMap checksum annotation.
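
The idea behind a checksum annotation can be sketched as follows. This assumes the annotation is a SHA-256 digest of the rendered status content placed on the pod template (a common Helm pattern; the chart's actual template is not shown here), and `config_checksum` is a hypothetical name.

```python
import hashlib
import json


def config_checksum(status):
    """Return a SHA-256 hex digest of the status JSON, serialized deterministically."""
    rendered = json.dumps(status, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()


before = config_checksum({"repo": "my-repo", "sync_status": "SYNCED", "drift_count": 0})
after = config_checksum({"repo": "my-repo", "sync_status": "OUT_OF_SYNC", "drift_count": 2})
print(before != after)  # True: a content change yields a new digest
```

When the digest in the pod template annotation changes, Kubernetes sees a changed pod template and rolls the Deployment.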
### Automated Update via Pipeline
Use `kubectl` in your CI/CD pipeline:
```bash
kubectl create configmap gitops-status-server \
--from-file=status.json=./status.json \
--dry-run=client -o yaml | kubectl apply -f -
```
### ArgoCD Hook (Advanced)
Create a PostSync hook that updates the ConfigMap with current sync status:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: update-status
annotations:
argocd.argoproj.io/hook: PostSync
spec:
template:
spec:
containers:
- name: update
image: bitnami/kubectl
command:
- /bin/sh
- -c
- |
# Update status.json with current sync status
kubectl patch configmap gitops-status-server \
--patch '{"data":{"status.json":"..."}}'
restartPolicy: Never
```
## Security Considerations
- Runs as non-root user (UID 101)
- Read-only root filesystem
- No privilege escalation
- Minimal capabilities (all dropped)
- No external network access required
- ClusterIP only (no external exposure)
## Resource Requirements
Minimal resource footprint suitable for small clusters:
- CPU: 50m request / 100m limit
- Memory: 32Mi request / 64Mi limit
## Troubleshooting
### Check pod status
```bash
kubectl get pods -l app.kubernetes.io/name=gitops-status-server
```
### View logs
```bash
kubectl logs -l app.kubernetes.io/name=gitops-status-server
```
### Test endpoint
```bash
kubectl run -it --rm curl --image=curlimages/curl --restart=Never -- \
curl http://gitops-status-server/status.json
```
### Common Issues
**Pod not starting**: Check security context compatibility with your cluster's PSP/PSA policies.
**Empty response**: Verify the ConfigMap is mounted correctly:
```bash
kubectl describe pod -l app.kubernetes.io/name=gitops-status-server
```
**Service not accessible**: Ensure you're accessing from within the cluster and using the correct namespace.
## License
This chart is part of the observability-stack project.
## Maintainers
- DevOps Team

View File

@ -1,142 +0,0 @@
{{/*
ConfigMap containing the API backend Python script
Handles POST requests to /api/status and updates the status.json file
*/}}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "gitops-status-server.fullname" . }}-api
  labels:
    {{- include "gitops-status-server.labels" . | nindent 4 }}
data:
  app.py: |
    #!/usr/bin/env python3
    """
    Simple Flask API for updating status.json
    Listens on port 5000 and handles POST requests to /api/status
    """
    import os
    import json
    import logging
    from flask import Flask, request, jsonify
    from datetime import datetime, timezone

    app = Flask(__name__)

    # Configuration (STATUS_FILE can be overridden by the Deployment's env)
    STATUS_FILE = os.environ.get('STATUS_FILE', '/usr/share/nginx/html/status.json')
    API_PORT = int(os.environ.get('API_PORT', 5000))
    API_HOST = os.environ.get('API_HOST', '127.0.0.1')

    # Setup logging
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
    )
    logger = logging.getLogger(__name__)

    def load_status():
        """Load the current status from file"""
        try:
            if os.path.exists(STATUS_FILE):
                with open(STATUS_FILE, 'r') as f:
                    return json.load(f)
            else:
                # Default status if file doesn't exist
                return {
                    "repo": "unknown",
                    "server": "unknown",
                    "sync_status": "UNKNOWN",
                    "drift_count": 0,
                    "files": [],
                    "last_check": ""
                }
        except Exception as e:
            logger.error(f"Error loading status: {e}")
            return {}

    def save_status(status):
        """Save the status to file"""
        try:
            # Ensure directory exists (should already exist from mount)
            os.makedirs(os.path.dirname(STATUS_FILE), exist_ok=True)
            # Write with proper formatting
            with open(STATUS_FILE, 'w') as f:
                json.dump(status, f, indent=2)
            # Use .get() so a missing key cannot turn a successful write into a failure
            logger.info(
                f"Status saved successfully: {status.get('repo')}/{status.get('server')}"
                f" -> {status.get('sync_status')}"
            )
            return True
        except Exception as e:
            logger.error(f"Error saving status: {e}")
            return False

    @app.route('/api/status', methods=['GET', 'POST', 'OPTIONS'])
    def api_status():
        """
        GET: Retrieve current status
        POST: Update status with new data
        """
        if request.method == 'OPTIONS':
            return '', 204

        if request.method == 'GET':
            status = load_status()
            return jsonify(status), 200

        if request.method == 'POST':
            try:
                # Parse incoming JSON; silent=True returns None on a missing or
                # malformed body instead of raising, so it maps to 400, not 500
                incoming_data = request.get_json(silent=True)
                if not isinstance(incoming_data, dict) or not incoming_data:
                    return jsonify({"error": "Request body must be a non-empty JSON object"}), 400

                # Load current status
                status = load_status()
                # Update with incoming data (merge)
                status.update(incoming_data)

                # Ensure required fields exist
                if 'last_check' not in status or not status['last_check']:
                    status['last_check'] = datetime.now(timezone.utc).isoformat().replace('+00:00', 'Z')

                # Save updated status
                if save_status(status):
                    return jsonify({
                        "success": True,
                        "message": "Status updated successfully",
                        "status": status
                    }), 200
                else:
                    return jsonify({
                        "error": "Failed to save status"
                    }), 500
            except Exception as e:
                logger.error(f"Error processing POST request: {e}")
                return jsonify({"error": str(e)}), 500

    @app.route('/health', methods=['GET'])
    def health():
        """Health check endpoint"""
        return jsonify({"status": "healthy"}), 200

    @app.route('/ready', methods=['GET'])
    def ready():
        """Readiness check - verify status file is accessible"""
        try:
            status = load_status()
            if status:
                return jsonify({"status": "ready"}), 200
            else:
                return jsonify({"status": "not_ready", "reason": "status file empty"}), 503
        except Exception as e:
            return jsonify({"status": "not_ready", "error": str(e)}), 503

    if __name__ == '__main__':
        logger.info(f"Starting gitops-status-server API on {API_HOST}:{API_PORT}")
        logger.info(f"Status file: {STATUS_FILE}")
        app.run(host=API_HOST, port=API_PORT, debug=False)

View File

@ -1,22 +0,0 @@
{{/*
ConfigMap for default status.json values
Used by init container to set up initial status if file doesn't exist
*/}}
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "gitops-status-server.fullname" . }}
labels:
{{- include "gitops-status-server.labels" . | nindent 4 }}
{{- with .Values.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
data:
# Default status.json values (used for initialization)
# This is not mounted directly; instead it's used by the init container
# to set up the initial status.json in the shared emptyDir volume.
# The actual status.json is stored on the emptyDir and updated via the API.
status.json: |
{{- .Values.statusJson | toJson | nindent 4 }}

View File

@ -1,94 +0,0 @@
{{/*
Deployment for the gitops-status-server
Runs a simple Flask API for status updates
Uses the gitops-status-api Docker image
*/}}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "gitops-status-server.fullname" . }}
labels:
{{- include "gitops-status-server.labels" . | nindent 4 }}
{{- with .Values.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "gitops-status-server.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "gitops-status-server.selectorLabels" . | nindent 8 }}
spec:
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "gitops-status-server.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: api
image: "{{ .Values.api.image.repository }}:{{ .Values.api.image.tag }}"
imagePullPolicy: {{ .Values.api.image.pullPolicy }}
ports:
- name: http
containerPort: 5000
protocol: TCP
env:
- name: API_HOST
value: "0.0.0.0"
- name: API_PORT
value: "5000"
- name: FLASK_ENV
value: "production"
- name: STATUS_FILE
value: "/data/status.json"
livenessProbe:
httpGet:
path: /health
port: http
initialDelaySeconds: 20
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: http
initialDelaySeconds: 20
periodSeconds: 5
timeoutSeconds: 5
failureThreshold: 2
resources:
{{- toYaml .Values.resources | nindent 10 }}
volumeMounts:
- name: data
mountPath: /data
volumes:
# Data volume for status.json (writable emptyDir)
- name: data
emptyDir:
sizeLimit: 1Mi
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}


@ -1,96 +0,0 @@
{{/*
ConfigMap containing the nginx configuration
Enables serving status.json via GET and updating via POST requests
*/}}
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "gitops-status-server.fullname" . }}-nginx-config
labels:
{{- include "gitops-status-server.labels" . | nindent 4 }}
data:
nginx.conf: |
# Minimal nginx config for serving and updating status.json
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
client_max_body_size 1M;
# Gzip compression
gzip on;
gzip_vary on;
gzip_types text/plain text/css text/xml text/javascript
application/x-javascript application/xml+rss
application/json;
upstream api_backend {
server 127.0.0.1:5000;
keepalive 32;
}
server {
listen 8080 default_server;
server_name _;
# Serve status.json as read-only
location /status.json {
alias /usr/share/nginx/html/status.json;
add_header Cache-Control "no-cache, no-store, must-revalidate";
add_header Pragma "no-cache";
add_header Expires "0";
}
# Health check endpoint
location /health {
access_log off;
return 200 "healthy\n";
add_header Content-Type text/plain;
}
# Proxy POST requests to the API backend (Python Flask)
location /api/ {
proxy_pass http://api_backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Buffer settings for POST requests
proxy_request_buffering off;
proxy_buffering off;
# Timeouts
proxy_connect_timeout 30s;
proxy_send_timeout 30s;
proxy_read_timeout 30s;
}
# Catch-all for root
location / {
return 301 /status.json;
}
}
}


@ -1,27 +0,0 @@
{{/*
Service for the gitops-status-server
Exposes the Flask API inside the cluster (ClusterIP)
This allows rsyslog pipeline and Grafana to query the API endpoints
*/}}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "gitops-status-server.fullname" . }}
  labels:
    {{- include "gitops-status-server.labels" . | nindent 4 }}
  {{- with .Values.service.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: {{ .Values.service.targetPort | default 5000 }}
      protocol: TCP
      name: http
      {{- if and (eq .Values.service.type "NodePort") .Values.service.nodePort }}
      nodePort: {{ .Values.service.nodePort }}
      {{- end }}
  selector:
    {{- include "gitops-status-server.selectorLabels" . | nindent 4 }}


@ -1,88 +0,0 @@
# Default values for gitops-status-server
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
# Number of replicas for the deployment
replicaCount: 1
# API backend container configuration
api:
image:
# Use the gitops-status-api image (Python Flask API)
# Build from: gitops-status-api/Dockerfile
# Tag with: docker build -t gitops-status-api:latest gitops-status-api/
# Can be from Harbor registry or built locally
repository: gitops-status-api
pullPolicy: IfNotPresent
tag: "latest"
# Image pull secrets for private registries
imagePullSecrets: []
# Override the name of the chart
nameOverride: ""
fullnameOverride: ""
# Service configuration
service:
# Service type - NodePort for external access, ClusterIP for internal-only
type: ClusterIP
# Port where the service will be exposed
port: 5000
# Target port on the container (API port)
targetPort: 5000
# NodePort (30000-32767) for external access when type is NodePort
nodePort: null
# Annotations to add to the service
annotations: {}
# Resource limits and requests
resources:
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 50m
memory: 64Mi
# Node selector for pod assignment
nodeSelector: {}
# Tolerations for pod assignment
tolerations: []
# Affinity rules for pod assignment
affinity: {}
# Security context for the pod
podSecurityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
# Security context for the container
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
# Labels to add to all resources
labels: {}
# Annotations to add to all resources
annotations: {}
# Pod annotations
podAnnotations: {}
# Service account configuration
serviceAccount:
# Specifies whether a service account should be created
create: true
# Annotations to add to the service account
annotations: {}
# The name of the service account to use.
# If not set and create is true, a name is generated using the fullname template
name: ""


@ -1,18 +0,0 @@
dependencies:
- name: crds
  repository: ""
  version: 0.0.0
- name: kube-state-metrics
  repository: https://prometheus-community.github.io/helm-charts
  version: 7.2.2
- name: prometheus-node-exporter
  repository: https://prometheus-community.github.io/helm-charts
  version: 4.53.1
- name: grafana
  repository: https://grafana-community.github.io/helm-charts
  version: 11.6.1
- name: prometheus-windows-exporter
  repository: https://prometheus-community.github.io/helm-charts
  version: 0.12.6
digest: sha256:e21304bc9748d1449437449b6e8819afeed2f1f68c473efb775f712790bdff40
generated: "2026-04-14T18:06:28.207180094Z"


@ -1,72 +0,0 @@
annotations:
artifacthub.io/license: Apache-2.0
artifacthub.io/links: |
- name: Chart Source
url: https://github.com/prometheus-community/helm-charts
- name: Upstream Project
url: https://github.com/prometheus-operator/kube-prometheus
- name: Upgrade Process
url: https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md#upgrading-chart
artifacthub.io/operator: "true"
apiVersion: v2
appVersion: v0.90.1
dependencies:
- condition: crds.enabled
name: crds
repository: ""
version: 0.0.0
- condition: kubeStateMetrics.enabled
name: kube-state-metrics
repository: https://prometheus-community.github.io/helm-charts
version: 7.2.2
- condition: nodeExporter.enabled
name: prometheus-node-exporter
repository: https://prometheus-community.github.io/helm-charts
version: 4.53.1
- condition: grafana.enabled
name: grafana
repository: https://grafana-community.github.io/helm-charts
version: 11.6.1
- condition: windowsMonitoring.enabled
name: prometheus-windows-exporter
repository: https://prometheus-community.github.io/helm-charts
version: 0.12.*
description: kube-prometheus-stack collects Kubernetes manifests, Grafana dashboards,
and Prometheus rules combined with documentation and scripts to provide easy to
operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus
Operator.
home: https://github.com/prometheus-operator/kube-prometheus
icon: https://raw.githubusercontent.com/prometheus/prometheus.github.io/master/assets/prometheus_logo-cb55bb5c346.png
keywords:
- operator
- prometheus
- kube-prometheus
kubeVersion: '>=1.25.0-0'
maintainers:
- email: andrew@quadcorps.co.uk
name: andrewgkew
url: https://github.com/andrewgkew
- email: gianrubio@gmail.com
name: gianrubio
url: https://github.com/gianrubio
- email: github.gkarthiks@gmail.com
name: gkarthiks
url: https://github.com/gkarthiks
- email: kube-prometheus-stack@sisti.pt
name: GMartinez-Sisti
url: https://github.com/GMartinez-Sisti
- email: github@jkroepke.de
name: jkroepke
url: https://github.com/jkroepke
- email: miroslav.hadzhiev@gmail.com
name: Xtigyro
url: https://github.com/Xtigyro
- email: quentin.bisson@gmail.com
name: QuentinBisson
url: https://github.com/QuentinBisson
name: kube-prometheus-stack
sources:
- https://github.com/prometheus-community/helm-charts
- https://github.com/prometheus-operator/kube-prometheus
type: application
version: 83.4.2


@ -1,3 +0,0 @@
apiVersion: v2
name: crds
version: 0.0.0


@ -1,3 +0,0 @@
# crds subchart
See: [https://github.com/prometheus-community/helm-charts/issues/3548](https://github.com/prometheus-community/helm-charts/issues/3548)

File diff suppressed because it is too large


@ -1,267 +0,0 @@
# https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.90.1/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.19.0
operator.prometheus.io/version: 0.90.1
name: prometheusrules.monitoring.coreos.com
spec:
group: monitoring.coreos.com
names:
categories:
- prometheus-operator
kind: PrometheusRule
listKind: PrometheusRuleList
plural: prometheusrules
shortNames:
- promrule
singular: prometheusrule
scope: Namespaced
versions:
- name: v1
schema:
openAPIV3Schema:
description: |-
The `PrometheusRule` custom resource definition (CRD) defines [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) and [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) rules to be evaluated by `Prometheus` or `ThanosRuler` objects.
`Prometheus` and `ThanosRuler` objects select `PrometheusRule` objects using label and namespace selectors.
properties:
apiVersion:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
spec:
description: spec defines the specification of desired alerting rule definitions
for Prometheus.
properties:
groups:
description: groups defines the content of Prometheus rule file
items:
description: RuleGroup is a list of sequentially evaluated recording
and alerting rules.
properties:
interval:
description: interval defines how often rules in the group are
evaluated.
pattern: ^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$
type: string
labels:
additionalProperties:
type: string
description: |-
labels define the labels to add or overwrite before storing the result for its rules.
The labels defined at the rule level take precedence.
It requires Prometheus >= 3.0.0.
The field is ignored for Thanos Ruler.
type: object
limit:
description: |-
limit defines the number of alerts an alerting rule and series a recording
rule can produce.
Limit is supported starting with Prometheus >= 2.31 and Thanos Ruler >= 0.24.
type: integer
name:
description: name defines the name of the rule group.
minLength: 1
type: string
partial_response_strategy:
description: |-
partial_response_strategy is only used by ThanosRuler and will
be ignored by Prometheus instances.
More info: https://github.com/thanos-io/thanos/blob/main/docs/components/rule.md#partial-response
pattern: ^(?i)(abort|warn)?$
type: string
query_offset:
description: |-
query_offset defines the offset the rule evaluation timestamp of this particular group by the specified duration into the past.
It requires Prometheus >= v2.53.0.
It is not supported for ThanosRuler.
pattern: ^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$
type: string
rules:
description: rules defines the list of alerting and recording
rules.
items:
description: |-
Rule describes an alerting or recording rule
See Prometheus documentation: [alerting](https://www.prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) or [recording](https://www.prometheus.io/docs/prometheus/latest/configuration/recording_rules/#recording-rules) rule
properties:
alert:
description: |-
alert defines the name of the alert. Must be a valid label value.
Only one of `record` and `alert` must be set.
type: string
annotations:
additionalProperties:
type: string
description: |-
annotations defines annotations to add to each alert.
Only valid for alerting rules.
type: object
expr:
anyOf:
- type: integer
- type: string
description: expr defines the PromQL expression to evaluate.
x-kubernetes-int-or-string: true
for:
description: for defines how alerts are considered firing
once they have been returned for this long.
pattern: ^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$
type: string
keep_firing_for:
description: keep_firing_for defines how long an alert
will continue firing after the condition that triggered
it has cleared.
minLength: 1
pattern: ^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$
type: string
labels:
additionalProperties:
type: string
description: labels defines labels to add or overwrite.
type: object
record:
description: |-
record defines the name of the time series to output to. Must be a valid metric name.
Only one of `record` and `alert` must be set.
type: string
required:
- expr
type: object
type: array
required:
- name
type: object
type: array
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
type: object
status:
description: |-
status defines the status subresource. It is under active development and is updated only when the
"StatusForConfigurationResources" feature gate is enabled.
Most recent observed status of the PrometheusRule. Read-only.
More info:
https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
properties:
bindings:
description: bindings defines the list of workload resources (Prometheus,
PrometheusAgent, ThanosRuler or Alertmanager) which select the configuration
resource.
items:
description: WorkloadBinding is a link between a configuration resource
and a workload resource.
properties:
conditions:
description: conditions defines the current state of the configuration
resource when bound to the referenced Workload object.
items:
description: ConfigResourceCondition describes the status
of configuration resources linked to Prometheus, PrometheusAgent,
Alertmanager or ThanosRuler.
properties:
lastTransitionTime:
description: lastTransitionTime defines the time of the
last update to the current status property.
format: date-time
type: string
message:
description: message defines the human-readable message
indicating details for the condition's last transition.
type: string
observedGeneration:
description: |-
observedGeneration defines the .metadata.generation that the
condition was set based upon. For instance, if `.metadata.generation` is
currently 12, but the `.status.conditions[].observedGeneration` is 9, the
condition is out of date with respect to the current state of the object.
format: int64
type: integer
reason:
description: reason for the condition's last transition.
type: string
status:
description: status of the condition.
minLength: 1
type: string
type:
description: |-
type of the condition being reported.
Currently, only "Accepted" is supported.
enum:
- Accepted
minLength: 1
type: string
required:
- lastTransitionTime
- status
- type
type: object
type: array
x-kubernetes-list-map-keys:
- type
x-kubernetes-list-type: map
group:
description: group defines the group of the referenced resource.
enum:
- monitoring.coreos.com
type: string
name:
description: name defines the name of the referenced object.
minLength: 1
type: string
namespace:
description: namespace defines the namespace of the referenced
object.
minLength: 1
type: string
resource:
description: resource defines the type of resource being referenced
(e.g. Prometheus, PrometheusAgent, ThanosRuler or Alertmanager).
enum:
- prometheuses
- prometheusagents
- thanosrulers
- alertmanagers
type: string
required:
- group
- name
- namespace
- resource
type: object
type: array
x-kubernetes-list-map-keys:
- group
- resource
- name
- namespace
x-kubernetes-list-type: map
type: object
required:
- spec
type: object
served: true
storage: true
subresources:
status: {}


@ -1,20 +0,0 @@
{{/* Shortened name suffixed with -upgrade */}}
{{- define "kube-prometheus-stack.crd.upgradeJob.name" -}}
{{- print (include "kube-prometheus-stack.fullname" .) "-upgrade" -}}
{{- end -}}
{{- define "kube-prometheus-stack.crd.upgradeJob.labels" -}}
{{- include "kube-prometheus-stack.labels" . }}
app: {{ template "kube-prometheus-stack.name" . }}-operator
app.kubernetes.io/name: {{ template "kube-prometheus-stack.name" . }}-prometheus-operator
app.kubernetes.io/component: crds-upgrade
{{- end -}}
{{/* Create the name of crd.upgradeJob service account to use */}}
{{- define "kube-prometheus-stack.crd.upgradeJob.serviceAccountName" -}}
{{- if .Values.upgradeJob.serviceAccount.create -}}
{{ default (include "kube-prometheus-stack.crd.upgradeJob.name" .) .Values.upgradeJob.serviceAccount.name }}
{{- else -}}
{{ default "default" .Values.upgradeJob.serviceAccount.name }}
{{- end -}}
{{- end -}}


@ -1,28 +0,0 @@
{{- if .Values.upgradeJob.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ template "kube-prometheus-stack.crd.upgradeJob.name" . }}
namespace: {{ template "kube-prometheus-stack.namespace" . }}
annotations:
"helm.sh/hook": pre-install,pre-upgrade,pre-rollback
"helm.sh/hook-weight": "-5"
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
labels:
{{- include "kube-prometheus-stack.crd.upgradeJob.labels" . | nindent 4 }}
rules:
- apiGroups:
- "apiextensions.k8s.io"
resources:
- "customresourcedefinitions"
verbs:
- create
- patch
- update
- get
- list
resourceNames:
{{- range $path, $_ := $.Files.Glob "crds/*.yaml" }}
- {{ ($.Files.Get $path | fromYaml ).metadata.name }}
{{- end }}
{{- end }}


@ -1,21 +0,0 @@
{{- if .Values.upgradeJob.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ template "kube-prometheus-stack.crd.upgradeJob.name" . }}
namespace: {{ template "kube-prometheus-stack.namespace" . }}
annotations:
"helm.sh/hook": pre-install,pre-upgrade,pre-rollback
"helm.sh/hook-weight": "-3"
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
labels:
{{- include "kube-prometheus-stack.crd.upgradeJob.labels" . | nindent 4 }}
subjects:
- kind: ServiceAccount
namespace: {{ template "kube-prometheus-stack.namespace" . }}
name: {{ template "kube-prometheus-stack.crd.upgradeJob.serviceAccountName" . }}
roleRef:
kind: ClusterRole
name: {{ template "kube-prometheus-stack.crd.upgradeJob.name" . }}
apiGroup: rbac.authorization.k8s.io
{{- end }}


@ -1,15 +0,0 @@
{{- if .Values.upgradeJob.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
  {{- /* Must match the configMap name referenced by the upgrade Job's "crds" volume */}}
  name: {{ template "kube-prometheus-stack.crd.upgradeJob.name" . }}
  namespace: {{ template "kube-prometheus-stack.namespace" . }}
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade,pre-rollback
    "helm.sh/hook-weight": "-2"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
  labels:
    {{- include "kube-prometheus-stack.crd.upgradeJob.labels" . | nindent 4 }}
binaryData:
  crds.bz2: {{ .Files.Get "files/crds.bz2" | b64enc }}
{{- end }}


@ -1,147 +0,0 @@
{{- if .Values.upgradeJob.enabled }}
apiVersion: batch/v1
kind: Job
metadata:
name: {{ template "kube-prometheus-stack.crd.upgradeJob.name" . }}
namespace: {{ template "kube-prometheus-stack.namespace" . }}
annotations:
"helm.sh/hook": pre-install,pre-upgrade,pre-rollback
"helm.sh/hook-weight": "5"
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
{{- with .Values.upgradeJob.annotations }}
{{- toYaml . | nindent 4 }}
{{- end }}
labels:
{{- include "kube-prometheus-stack.crd.upgradeJob.labels" . | nindent 4 }}
{{- with .Values.upgradeJob.labels }}
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
backoffLimit: 3
template:
metadata:
{{- with .Values.upgradeJob.podLabels }}
labels:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.upgradeJob.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
spec:
{{- if .Values.global.imagePullSecrets }}
imagePullSecrets:
{{- include "kube-prometheus-stack.imagePullSecrets" . | indent 8 }}
{{- end }}
automountServiceAccountToken: {{ .Values.upgradeJob.automountServiceAccountToken }}
serviceAccountName: {{ include "kube-prometheus-stack.crd.upgradeJob.serviceAccountName" . }}
initContainers:
- name: busybox
{{- $busyboxRegistry := .Values.global.imageRegistry | default .Values.upgradeJob.image.busybox.registry -}}
{{- if .Values.upgradeJob.image.busybox.sha }}
image: "{{ $busyboxRegistry }}/{{ .Values.upgradeJob.image.busybox.repository }}:{{ .Values.upgradeJob.image.busybox.tag }}@sha256:{{ .Values.upgradeJob.image.busybox.sha }}"
{{- else }}
image: "{{ $busyboxRegistry }}/{{ .Values.upgradeJob.image.busybox.repository }}:{{ .Values.upgradeJob.image.busybox.tag }}"
{{- end }}
imagePullPolicy: "{{ .Values.upgradeJob.image.busybox.pullPolicy }}"
workingDir: /tmp/
command:
- sh
args:
- -c
- bzcat /crds/crds.bz2 > /tmp/crds.yaml
{{- with .Values.upgradeJob.resources }}
resources:
{{- toYaml . | nindent 12 }}
{{- end }}
{{- with .Values.upgradeJob.containerSecurityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
volumeMounts:
- mountPath: /crds/
name: crds
- mountPath: /tmp/
name: tmp
{{- with .Values.upgradeJob.extraVolumeMounts }}
{{- toYaml . | nindent 12 }}
{{- end }}
{{- with .Values.upgradeJob.env }}
env:
{{- range $key, $value := . }}
- name: {{ $key }}
value: {{ $value | quote }}
{{- end }}
{{- end }}
containers:
- name: kubectl
{{- $kubectlRegistry := .Values.global.imageRegistry | default .Values.upgradeJob.image.kubectl.registry -}}
{{- $defaultKubernetesVersion := (ternary (printf "%s.0" .Capabilities.KubeVersion.Version) (regexFind "v\\d+\\.\\d+\\.\\d+" .Capabilities.KubeVersion.Version) (regexMatch "^v\\d+\\.\\d+$" .Capabilities.KubeVersion.Version)) -}}
{{- if .Values.upgradeJob.image.kubectl.sha }}
image: "{{ $kubectlRegistry }}/{{ .Values.upgradeJob.image.kubectl.repository }}:{{ .Values.upgradeJob.image.kubectl.tag | default $defaultKubernetesVersion }}@sha256:{{ .Values.upgradeJob.image.kubectl.sha }}"
{{- else }}
image: "{{ $kubectlRegistry }}/{{ .Values.upgradeJob.image.kubectl.repository }}:{{ .Values.upgradeJob.image.kubectl.tag | default $defaultKubernetesVersion }}"
{{- end }}
imagePullPolicy: "{{ .Values.upgradeJob.image.kubectl.pullPolicy }}"
command:
- kubectl
args:
- apply
- --server-side
{{- if .Values.upgradeJob.forceConflicts }}
- --force-conflicts
{{- end }}
- --filename
- /tmp/crds.yaml
{{- with .Values.upgradeJob.resources }}
resources:
{{- toYaml . | nindent 12 }}
{{- end }}
{{- with .Values.upgradeJob.containerSecurityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
volumeMounts:
- mountPath: /tmp/
name: tmp
{{- with .Values.upgradeJob.extraVolumeMounts }}
{{- toYaml . | nindent 12 }}
{{- end }}
{{- with .Values.upgradeJob.env }}
env:
{{- range $key, $value := . }}
- name: {{ $key }}
value: {{ $value | quote }}
{{- end }}
{{- end }}
volumes:
- name: tmp
emptyDir: {}
- name: crds
configMap:
name: {{ template "kube-prometheus-stack.crd.upgradeJob.name" . }}
{{- with .Values.upgradeJob.extraVolumes }}
{{- toYaml . | nindent 8 }}
{{- end }}
restartPolicy: OnFailure
{{- with .Values.upgradeJob.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.upgradeJob.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.upgradeJob.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.upgradeJob.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.upgradeJob.topologySpreadConstraints }}
topologySpreadConstraints:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}


@ -1,20 +0,0 @@
{{- if and .Values.upgradeJob.enabled .Values.upgradeJob.serviceAccount.create }}
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: {{ .Values.upgradeJob.serviceAccount.automountServiceAccountToken }}
metadata:
name: {{ include "kube-prometheus-stack.crd.upgradeJob.serviceAccountName" . }}
namespace: {{ template "kube-prometheus-stack.namespace" . }}
annotations:
"helm.sh/hook": pre-install,pre-upgrade,pre-rollback
"helm.sh/hook-weight": "-4"
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
{{- with .Values.upgradeJob.serviceAccount.annotations }}
{{- toYaml . | nindent 4 }}
{{- end }}
labels:
{{- include "kube-prometheus-stack.crd.upgradeJob.labels" . | nindent 4 }}
{{- with .Values.upgradeJob.serviceAccount.labels }}
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}


@ -1,4 +0,0 @@
## Check out kube-prometheus-stack/values.yaml for more information
## on this parameter
upgradeJob:
  enabled: false


@ -1,583 +0,0 @@
# Grafana Helm Chart
The leading tool for querying and visualizing time series and metrics.
## Source Code
* <https://github.com/grafana/grafana>
## Requirements
Kubernetes: `^1.25.0-0`
## Installing the Chart
### OCI Registry
OCI registries are preferred in Helm as they implement unified storage, distribution, and improved security.
```console
helm install RELEASE-NAME oci://ghcr.io/grafana-community/helm-charts/grafana
```
### HTTP Registry
```console
helm repo add grafana-community https://grafana-community.github.io/helm-charts
helm repo update
helm install RELEASE-NAME grafana-community/grafana
```
## Uninstalling the Chart
To remove all of the Kubernetes objects associated with the Helm chart release:
```console
helm delete RELEASE-NAME
```
## Changelog
See the [changelog](https://grafana-community.github.io/helm-charts/changelog/?chart=grafana).
---
## Upgrading
A major chart version change (like v1.2.3 -> v2.0.0) indicates that there is an
incompatible breaking change needing manual actions.
### To 4.0.0 (And 3.12.1)
This version requires Helm >= 2.12.0.
### To 5.0.0
You have to add `--force` to your `helm upgrade` command because the labels of the chart have changed.
### To 6.0.0
This version requires Helm >= 3.1.0.
### To 7.0.0
For consistency with other Helm charts, the `global.image.registry` parameter was renamed
to `global.imageRegistry`. If you were not previously setting `global.image.registry`, no action
is required on upgrade. If you were previously setting `global.image.registry`, you will
need to instead set `global.imageRegistry`.
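
The rename can be applied to an existing values structure mechanically. The following is only a sketch, assuming the values have already been loaded into a Python dict; the helper name `migrate_registry_key` and the registry host are hypothetical and not part of the chart.

```python
import copy


def migrate_registry_key(values: dict) -> dict:
    """Return a copy of `values` with global.image.registry moved to global.imageRegistry."""
    migrated = copy.deepcopy(values)
    global_section = migrated.get("global")
    if not isinstance(global_section, dict):
        return migrated
    image_section = global_section.get("image")
    if isinstance(image_section, dict) and "registry" in image_section:
        registry = image_section.pop("registry")
        # Keep a value the user already set under the new key.
        global_section.setdefault("imageRegistry", registry)
        # Drop the now-empty legacy section.
        if not image_section:
            del global_section["image"]
    return migrated


# Hypothetical registry host, for illustration only.
old_values = {"global": {"image": {"registry": "registry.example.com"}}}
new_values = migrate_registry_key(old_values)
print(new_values)
```

Values that never set `global.image.registry` pass through unchanged, which matches the "no action is required" case above.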
### To 10.0.0
Static alerting resources now support Helm templating. This means that alerting resources loaded from external files (`alerting.*.files`) are now processed by the Helm template engine.
If you already use template expressions intended for Alertmanager (for example, `{{ $labels.instance }}`), these must now be escaped to avoid unintended Helm or Go template evaluation. To escape them, wrap the braces with an extra layer like this:
`{{ "{{" }} $labels.instance {{ "}}" }}`
This ensures the expressions are preserved for Alertmanager instead of being rendered by Helm.
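
For files with many such expressions, the escaping can be scripted. This is a minimal sketch, not part of the chart: the function name `escape_for_helm` is hypothetical, and it assumes the input has not been escaped yet, since running it twice would escape the delimiters again.

```python
import re

# Matches either template delimiter.
_DELIMITER = re.compile(r"\{\{|\}\}")


def escape_for_helm(text: str) -> str:
    """Wrap every {{ and }} in a Helm string literal so Helm emits it verbatim."""
    return _DELIMITER.sub(lambda m: '{{ "' + m.group(0) + '" }}', text)


rule = "Instance {{ $labels.instance }} is down"
escaped = escape_for_helm(rule)
print(escaped)
```

The replacement text is not rescanned by `re.sub`, so the braces introduced by the escape itself are left alone within a single pass.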
### To 11.0.0
The minimum required Kubernetes version is now 1.25. All references to deprecated APIs have been removed.
## Configuration
### Example ingress with path
With Grafana 6.3 and above:
```yaml
grafana.ini:
server:
domain: monitoring.example.com
root_url: "%(protocol)s://%(domain)s/grafana"
serve_from_sub_path: true
ingress:
enabled: true
hosts:
- "monitoring.example.com"
path: "/grafana"
```
### Example of extraVolumeMounts and extraVolumes
Configure additional volumes with `extraVolumes` and volume mounts with `extraVolumeMounts`.
Example for `extraVolumeMounts` and corresponding `extraVolumes`:
```yaml
extraVolumeMounts:
- name: plugins
mountPath: /var/lib/grafana/plugins
subPath: configs/grafana/plugins
readOnly: false
- name: dashboards
mountPath: /var/lib/grafana/dashboards
hostPath: /usr/shared/grafana/dashboards
readOnly: false
extraVolumes:
- name: plugins
existingClaim: existing-grafana-claim
- name: dashboards
hostPath: /usr/shared/grafana/dashboards
```
Volumes default to `emptyDir`. Set to `persistentVolumeClaim`,
`hostPath`, `csi`, or `configMap` for other types. For a
`persistentVolumeClaim`, specify an existing claim name with
`existingClaim`.
## Import dashboards
There are a few methods to import dashboards to Grafana. Below are some examples and explanations as to how to use each method:
```yaml
dashboards:
default:
some-dashboard:
json: |
{
"annotations":
...
# Complete json file here
...
"title": "Some Dashboard",
"uid": "abcd1234",
"version": 1
}
custom-dashboard:
# This is a path to a file inside the dashboards directory inside the chart directory
file: dashboards/custom-dashboard.json
prometheus-stats:
# Ref: https://grafana.com/dashboards/2
# title: My Custom Title # optional; when set for a downloaded dashboard (gnetId or url), overrides the title displayed in Grafana
gnetId: 2
revision: 2
datasource: Prometheus
loki-dashboard-quick-search:
gnetId: 12019
revision: 2
datasource:
- name: DS_PROMETHEUS
value: Prometheus
- name: DS_LOKI
value: Loki
local-dashboard:
url: https://github.com/cloudnative-pg/grafana-dashboards/blob/main/charts/cluster/grafana-dashboard.json
# redirects to:
# https://raw.githubusercontent.com/cloudnative-pg/grafana-dashboards/refs/heads/main/charts/cluster/grafana-dashboard.json
# default: -skf
# -s - silent mode
# -k - allow insecure connections (skip TLS certificate verification)
# -f - fail fast
# -L - follow HTTP redirects
curlOptions: -Lf
```
## BASE64 dashboards
Dashboards may be stored on a server that does not return JSON directly but returns a base64-encoded file instead (e.g. Gerrit).
For the URL use case, set `b64content` to `true` after the URL entry to have the file base64-decoded before it is saved to disk.
If this entry is not set or is `false`, the file is saved to disk without decoding.
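
The effect of the flag can be illustrated as follows. This sketch only mirrors the behaviour described above and is not the chart's own download logic; the function name `dashboard_bytes` is hypothetical.

```python
import base64


def dashboard_bytes(payload: bytes, b64content: bool = False) -> bytes:
    """Return the bytes to write to disk, decoding base64 first when requested."""
    if b64content:
        return base64.b64decode(payload)
    return payload


raw = b'{"title": "Some Dashboard"}'
encoded = base64.b64encode(raw)  # what a server such as Gerrit would return

decoded = dashboard_bytes(encoded, b64content=True)
untouched = dashboard_bytes(raw)
print(decoded == raw, untouched == raw)
```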
### Gerrit use case
The Gerrit API for downloading files uses the following schema: <https://yourgerritserver/a/{project-name}/branches/{branch-id}/files/{file-id}/content>.
Because `{project-name}` and `{file-id}` usually contain `/`, each `/` in them MUST be replaced with `%2F`. For example, if project-name is `user/repo`, branch-id is `master`, and file-id is `dir1/dir2/dashboard`,
the URL value is <https://yourgerritserver/a/user%2Frepo/branches/master/files/dir1%2Fdir2%2Fdashboard/content>
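As a minimal sketch, the Gerrit URL above could be combined with `b64content` in the chart values as follows. The dashboard name `gerrit-dashboard` is hypothetical, and the server and paths are the placeholders used in this section.

```yaml
dashboards:
  default:
    # Hypothetical dashboard name, used only for illustration
    gerrit-dashboard:
      # Each "/" inside the project name and the file id is written as %2F
      url: https://yourgerritserver/a/user%2Frepo/branches/master/files/dir1%2Fdir2%2Fdashboard/content
      # Decode the base64 response before the file is saved to disk
      b64content: true
```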
## Sidecar for dashboards
If the parameter `sidecar.dashboards.enabled` is set, a sidecar container is deployed in the Grafana
pod. This container watches all configmaps (or secrets) in the cluster and selects the ones carrying
the label defined in `sidecar.dashboards.label`. The files defined in those configmaps are written
to a folder that Grafana reads. Changes to the configmaps are monitored, and the imported
dashboards are deleted or updated accordingly.
Use one configmap per dashboard: removing a single dashboard from a configmap that holds several
is currently not mirrored properly in Grafana.
Example dashboard config:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-grafana-dashboard
  labels:
    grafana_dashboard: "1"
data:
  k8s-dashboard.json: |-
    [...]
```
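For the ConfigMap above to be picked up, the sidecar has to be enabled in the chart values. A minimal sketch, assuming the label key `grafana_dashboard` used in the example:

```yaml
sidecar:
  dashboards:
    enabled: true
    # Label key the sidecar looks for; matches the ConfigMap example above
    label: grafana_dashboard
```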
## Sidecar for datasources
If the parameter `sidecar.datasources.enabled` is set, an init container is deployed in the Grafana
pod. This container lists all secrets (or configmaps, though not recommended) in the cluster and
selects the ones carrying the label defined in `sidecar.datasources.label`. The files defined in
those secrets are written to a folder that Grafana reads on startup. Grafana imports its
data sources from these YAML files.
To reload datasources in Grafana each time the config changes, set `sidecar.datasources.skipReload: false` and set `sidecar.datasources.reloadURL` to `http://<svc-name>.<namespace>.svc.cluster.local/api/admin/provisioning/datasources/reload`.
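A sketch of the corresponding values, assuming the Grafana Service is named `grafana` and runs in the `monitoring` namespace (both names are placeholders to adapt):

```yaml
sidecar:
  datasources:
    enabled: true
    # Reload datasources whenever a labelled secret changes
    skipReload: false
    # Assumed Service name "grafana" and namespace "monitoring"
    reloadURL: http://grafana.monitoring.svc.cluster.local/api/admin/provisioning/datasources/reload
```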
Secrets are recommended over configmaps for this use case because datasources usually contain private
data like usernames and passwords. Secrets are the more appropriate cluster resource to manage those.
Example values to add a postgres datasource as a kubernetes secret:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: grafana-datasources
  labels:
    grafana_datasource: 'true' # default value for: sidecar.datasources.label
stringData:
  pg-db.yaml: |-
    apiVersion: 1
    datasources:
      - name: My pg db datasource
        type: postgres
        url: my-postgresql-db:5432
        user: db-readonly-user
        secureJsonData:
          password: 'SUperSEcretPa$$word'
        jsonData:
          database: my_database
          sslmode: 'disable' # disable/require/verify-ca/verify-full
          maxOpenConns: 0 # Grafana v5.4+
          maxIdleConns: 2 # Grafana v5.4+
          connMaxLifetime: 14400 # Grafana v5.4+
          postgresVersion: 1000 # 903=9.3, 904=9.4, 905=9.5, 906=9.6, 1000=10
          timescaledb: false
        # <bool> allow users to edit datasources from the UI.
        editable: false
```
Example values to add a datasource adapted from [Grafana](http://docs.grafana.org/administration/provisioning/#example-datasource-config-file):
```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      # <string, required> name of the datasource. Required
      - name: Graphite
        # <string, required> datasource type. Required
        type: graphite
        # <string, required> access mode. proxy or direct (Server or Browser in the UI). Required
        access: proxy
        # <int> org id. will default to orgId 1 if not specified
        orgId: 1
        # <string> url
        url: http://localhost:8080
        # <string> database password, if used
        password:
        # <string> database user, if used
        user:
        # <string> database name, if used
        database:
        # <bool> enable/disable basic auth
        basicAuth:
        # <string> basic auth username
        basicAuthUser:
        # <string> basic auth password
        basicAuthPassword:
        # <bool> enable/disable with credentials headers
        withCredentials:
        # <bool> mark as default datasource. Max one per org
        isDefault:
        # <map> fields that will be converted to json and stored in json_data
        jsonData:
          graphiteVersion: "1.1"
          tlsAuth: true
          tlsAuthWithCACert: true
        # <string> json object of data that will be encrypted.
        secureJsonData:
          tlsCACert: "..."
          tlsClientCert: "..."
          tlsClientKey: "..."
        version: 1
        # <bool> allow users to edit datasources from the UI.
        editable: false
```
## Sidecar for notifiers
If the parameter `sidecar.notifiers.enabled` is set, an init container is deployed in the Grafana
pod. This container lists all secrets (or configmaps, though not recommended) in the cluster and
selects the ones carrying the label defined in `sidecar.notifiers.label`. The files defined in
those secrets are written to a folder that Grafana reads on startup. Grafana imports its
notification channels from these YAML files. The secrets must be created before
`helm install` so that the notifiers init container can list the secrets.
Secrets are recommended over configmaps for this use case because alert notification channels usually contain
private data like SMTP usernames and passwords. Secrets are the more appropriate cluster resource to manage those.
Example notifier config adapted from [Grafana](https://grafana.com/docs/grafana/latest/administration/provisioning/#alert-notification-channels):
```yaml
notifiers:
  - name: notification-channel-1
    type: slack
    uid: notifier1
    # either
    org_id: 2
    # or
    org_name: Main Org.
    is_default: true
    send_reminder: true
    frequency: 1h
    disable_resolve_message: false
    # See `Supported Settings` section for settings supported for each
    # alert notification type.
    settings:
      recipient: 'XXX'
      token: 'xoxb'
      uploadImage: true
      url: https://slack.com
delete_notifiers:
  - name: notification-channel-1
    uid: notifier1
    org_id: 2
  - name: notification-channel-2
    # default org_id: 1
```
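The notifier config above is delivered to the init container through a labelled secret. The following is a sketch only: the secret name `grafana-notifiers` and the file name `notifiers.yaml` are hypothetical, and the label key assumes `sidecar.notifiers.label` is set to `grafana_notifier`.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Hypothetical name, used only for illustration
  name: grafana-notifiers
  labels:
    # Assumes sidecar.notifiers.label is set to grafana_notifier
    grafana_notifier: "1"
stringData:
  # Hypothetical file name
  notifiers.yaml: |-
    notifiers:
      - name: notification-channel-1
        type: slack
        uid: notifier1
        org_id: 1
        is_default: true
        settings:
          recipient: 'XXX'
          token: 'xoxb'
          url: https://slack.com
```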
## Sidecar for alerting resources
If the parameter `sidecar.alerts.enabled` is set, a sidecar container is deployed in the Grafana
pod. This container watches all configmaps (or secrets) in the cluster (namespace defined by `sidecar.alerts.searchNamespace`) and selects the ones carrying
the label defined in `sidecar.alerts.label` (default is `grafana_alert`). The files defined in those configmaps are written
to a folder that Grafana reads. Changes to the configmaps are monitored and the imported alerting resources are updated. Deletions are more involved (see below).
This sidecar can be used to provision alert rules, contact points, notification policies, notification templates and mute timings as shown in [Grafana Documentation](https://grafana.com/docs/grafana/next/alerting/set-up/provision-alerting-resources/file-provisioning/).
To fetch the alert config which will be provisioned, use the alert provisioning API ([Grafana Documentation](https://grafana.com/docs/grafana/next/developers/http_api/alerting_provisioning/)).
You can use either JSON or YAML format.
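A minimal values sketch that enables this sidecar with the parameters named above. The namespace `monitoring` is an assumption for illustration.

```yaml
sidecar:
  alerts:
    enabled: true
    # Label key the sidecar looks for (the default named above)
    label: grafana_alert
    # Assumed namespace; limits the search to this namespace
    searchNamespace: monitoring
```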
Example config for an alert rule:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-grafana-alert
  labels:
    grafana_alert: "1"
data:
  k8s-alert.yml: |-
    apiVersion: 1
    groups:
      - orgId: 1
        name: k8s-alert
        [...]
```
Deleting provisioned alert rules is a two-step process: first delete the configmap that defined the alert rule,
then create a configuration that deletes the alert rule.
Example deletion configuration:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: delete-sample-grafana-alert
  namespace: monitoring
  labels:
    grafana_alert: "1"
data:
  delete-k8s-alert.yml: |-
    apiVersion: 1
    deleteRules:
      - orgId: 1
        uid: 16624780-6564-45dc-825c-8bded4ad92d3
```
## Statically provision alerting resources
If you don't need to change alerting resources (alert rules, contact points, notification policies and notification templates) regularly, you can use the `alerting` config option instead of the sidecar option above.
The chart then takes the alerting config and applies it statically when the Helm templates are rendered.
There are two methods to statically provision alerting configuration in Grafana. Below are some examples and explanations as to how to use each method:
```yaml
alerting:
  team1-alert-rules.yaml:
    file: alerting/team1/rules.yaml
  team2-alert-rules.yaml:
    file: alerting/team2/rules.yaml
  team3-alert-rules.yaml:
    file: alerting/team3/rules.yaml
  notification-policies.yaml:
    file: alerting/shared/notification-policies.yaml
  notification-templates.yaml:
    file: alerting/shared/notification-templates.yaml
  contactpoints.yaml:
    apiVersion: 1
    contactPoints:
      - orgId: 1
        name: Slack channel
        receivers:
          - uid: default-receiver
            type: slack
            settings:
              # Webhook URL to be filled in
              url: ""
              # We need to escape double curly braces for the tpl function.
              text: '{{ `{{ template "default.message" . }}` }}'
              title: '{{ `{{ template "default.title" . }}` }}'
```
The two possibilities for static alerting resource provisioning are:
* Inlining the file contents as shown for contact points in the above example.
* Importing a file using a relative path starting from the chart root directory as shown for the alert rules in the above example.
### Important notes on file provisioning
* The format of the files is defined in the [Grafana documentation](https://grafana.com/docs/grafana/next/alerting/set-up/provision-alerting-resources/file-provisioning/) on file provisioning.
* The chart supports importing YAML and JSON files.
* The filename must be unique, otherwise one volume mount will overwrite the other.
* Alerting configurations support Helm templating. Double curly braces that arise from the Grafana configuration format and are not intended as templates for the chart must be escaped.
* The number of total files under `alerting:` is not limited. Each file will end up as a volume mount in the corresponding provisioning folder of the deployed Grafana instance.
* The file size for each import is limited by what the function `.Files.Get` can handle, which suffices for most cases.
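
As an illustration of the inline method with a unique file name, a notification policy could be provisioned as sketched below. The key `default-notification-policy.yaml` is hypothetical, and the policy fields follow the Grafana file provisioning format as an assumption to check against the linked documentation.

```yaml
alerting:
  # Hypothetical file name; it must be unique among the keys under `alerting:`
  default-notification-policy.yaml:
    apiVersion: 1
    policies:
      - orgId: 1
        # Must match the name of a provisioned contact point
        receiver: Slack channel
        group_by: ['alertname']
```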
## How to serve Grafana with a path prefix (/grafana)
In order to serve Grafana with a prefix (e.g., <http://example.com/grafana>), add the following to your values.yaml.
```yaml
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/rewrite-target: /$1
    nginx.ingress.kubernetes.io/use-regex: "true"
  path: /grafana/?(.*)
  hosts:
    - k8s.example.dev
grafana.ini:
  server:
    root_url: http://localhost:3000/grafana # this host can be localhost
```
## How to securely reference secrets in grafana.ini
This example uses Grafana [file providers](https://grafana.com/docs/grafana/latest/administration/configuration/#file-provider) for secret values and the `extraSecretMounts` configuration flag (Additional grafana server secret mounts) to mount the secrets.
In grafana.ini:
```yaml
grafana.ini:
  # Rendered by the chart as the [auth.generic_oauth] section
  auth.generic_oauth:
    enabled: true
    client_id: $__file{/etc/secrets/auth_generic_oauth/client_id}
    client_secret: $__file{/etc/secrets/auth_generic_oauth/client_secret}
```
Existing secret, or created along with helm:
```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: auth-generic-oauth-secret
type: Opaque
stringData:
  client_id: <value>
  client_secret: <value>
```
Include in the `extraSecretMounts` configuration flag:
```yaml
extraSecretMounts:
  - name: auth-generic-oauth-secret-mount
    secretName: auth-generic-oauth-secret
    defaultMode: 0440
    mountPath: /etc/secrets/auth_generic_oauth
    readOnly: true
```
### extraSecretMounts using a Container Storage Interface (CSI) provider
This example uses a CSI driver, e.g. to retrieve secrets using the [Azure Key Vault Provider](https://github.com/Azure/secrets-store-csi-driver-provider-azure):
```yaml
extraSecretMounts:
  - name: secrets-store-inline
    mountPath: /run/secrets
    readOnly: true
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: "my-provider"
      nodePublishSecretRef:
        name: akv-creds
```
## Image Renderer Plug-In
This chart supports enabling [remote image rendering](https://github.com/grafana/grafana-image-renderer/blob/master/README.md#run-in-docker):
```yaml
imageRenderer:
  enabled: true
```
### Image Renderer NetworkPolicy
By default, the image-renderer pods have a network policy that only allows ingress traffic from the created Grafana instance.
### High Availability for unified alerting
If you want to run Grafana in a high availability cluster you need to enable
the headless service by setting `headlessService: true` in your `values.yaml`
file.
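A minimal sketch of that first step in `values.yaml`. The replica count is an illustrative assumption; any value greater than 1 runs several Grafana pods.

```yaml
# Illustrative replica count, not a recommendation
replicas: 3
# Creates the headless service that the alerting peers use to find each other
headlessService: true
```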
Next, set up `grafana.ini` in your `values.yaml` so that
it uses the headless service to obtain the IPs of all
cluster members. For example, use ``{{ .Release.Name }}`` to refer to the Helm release name in your values.
```yaml
grafana.ini:
  ...
  unified_alerting:
    enabled: true
    ha_peers: {{ .Release.Name }}-headless:9094
    ha_listen_address: ${POD_IP}:9094
    ha_advertise_address: ${POD_IP}:9094
    rule_version_record_limit: "5"
  alerting:
    enabled: false
```
### Installing plugins
To install a Grafana plugin using the helm chart, use the identifier of the plugin. For example, `digrich-bubblechart-panel` installs [Bubble Chart](https://grafana.com/grafana/plugins/digrich-bubblechart-panel/).
You can also install a specific version of a plugin by specifying the version and the URL of the download file, as shown in the example below:
```yaml
plugins:
  - digrich-bubblechart-panel
  - grafana-clock-panel
  ## You can also use other plugin download URLs, as long as they are valid zip files,
  ## and specify the name of the plugin as a prefix, with a version. Like this:
  # - marcusolsson-json-datasource@1.3.24@https://grafana.com/api/plugins/marcusolsson-json-datasource/versions/1.3.24/download
```
Generic documentation about plugins can be found in the [official documentation](https://grafana.com/docs/grafana/latest/administration/plugin-management/).

@ -1,56 +0,0 @@
{{- if and .Values.verticalPodAutoscaler.enabled (.Capabilities.APIVersions.Has "autoscaling.k8s.io/v1/VerticalPodAutoscaler") }}
{{- $vpa := .Values.verticalPodAutoscaler }}
{{- $resources := $vpa.controlledResources | default dict }}
{{- $target := $vpa.target | default dict }}
{{- $container := $vpa.container | default dict }}
{{- /* Match deployment.yaml condition */ -}}
{{- $isDeployment := and (not .Values.useStatefulSet) (or (not .Values.persistence.enabled) (eq .Values.persistence.type "pvc")) -}}
{{- /* Derived defaults */ -}}
{{- $defaultApiVersion := "apps/v1" -}}
{{- $defaultKind := ternary "Deployment" "StatefulSet" $isDeployment -}}
{{- $defaultName := include "grafana.fullname" . -}}
{{- /* Optional override (ONLY if you document it in values.yaml/schema) */ -}}
{{- $t := $vpa.targetRef | default dict -}}
{{- $apiVersion := default $defaultApiVersion $t.apiVersion -}}
{{- $kind := default $defaultKind $t.kind -}}
{{- $name := default $defaultName $t.name -}}
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: {{ include "grafana.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "grafana.labels" . | nindent 4 }}
spec:
targetRef:
apiVersion: {{ $apiVersion | quote }}
kind: {{ $kind | quote }}
name: {{ $name | quote }}
updatePolicy:
updateMode: {{ default "Off" $vpa.updateMode | quote }}
resourcePolicy:
containerPolicies:
- containerName: "grafana"
{{- if or (get $resources "cpu") (get $resources "memory") }}
controlledResources:
{{- if (get $resources "cpu") }}
- "cpu"
{{- end }}
{{- if (get $resources "memory") }}
- "memory"
{{- end }}
{{- end }}
{{- with $vpa.minAllowed }}
minAllowed:
{{ toYaml . | nindent 10 }}
{{- end }}
{{- with $vpa.maxAllowed }}
maxAllowed:
{{ toYaml . | nindent 10 }}
{{- end }}
{{- end }}

@ -1,60 +0,0 @@
{{- if .Values.prometheus.scrapeconfig.enabled }}
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
name: {{ template "kube-state-metrics.fullname" . }}
namespace: {{ template "kube-state-metrics.namespace" . }}
labels:
{{- include "kube-state-metrics.labels" . | indent 4 }}
{{- with .Values.prometheus.scrapeconfig.additionalLabels }}
{{- tpl (toYaml . | nindent 4) $ }}
{{- end }}
{{- with .Values.prometheus.scrapeconfig.annotations }}
annotations:
{{- tpl (toYaml . | nindent 4) $ }}
{{- end }}
spec:
{{- include "scrapeconfig.scrapeLimits" .Values.prometheus.scrapeconfig | indent 2 }}
staticConfigs:
- targets:
- {{ template "kube-state-metrics.fullname" . }}.{{ template "kube-state-metrics.namespace" . }}.svc:{{ .Values.service.port }}
{{- if .Values.prometheus.scrapeconfig.staticConfigLabels}}
labels:
{{- with .Values.prometheus.scrapeconfig.staticConfigLabels }}
{{- tpl (toYaml . | nindent 8) $ }}
{{- end }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.jobName }}
jobName: {{ .Values.prometheus.scrapeconfig.jobName }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.honorLabels }}
honorLabels: true
{{- end }}
{{- if .Values.prometheus.scrapeconfig.scrapeInterval }}
scrapeInterval: {{ .Values.prometheus.scrapeconfig.scrapeInterval }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.scrapeTimeout }}
scrapeTimeout: {{ .Values.prometheus.scrapeconfig.scrapeTimeout }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.proxyUrl }}
proxyUrl: {{ .Values.prometheus.scrapeconfig.proxyUrl }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.enableHttp2 }}
enableHttp2: {{ .Values.prometheus.scrapeconfig.enableHttp2 }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.metricRelabelings }}
metricRelabelings:
{{- toYaml .Values.prometheus.scrapeconfig.metricRelabelings | nindent 4 }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.relabelings }}
relabelings:
{{- toYaml .Values.prometheus.scrapeconfig.relabelings | nindent 4 }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.scheme }}
scheme: {{ .Values.prometheus.scrapeconfig.scheme }}
{{- end }}
{{- if .Values.prometheus.scrapeconfig.tlsConfig }}
tlsConfig:
{{- toYaml (.Values.prometheus.scrapeconfig.tlsConfig ) | nindent 4 }}
{{- end }}
{{- end }}

@ -1,41 +0,0 @@
{{- if and .Values.alertmanager.enabled .Values.alertmanager.verticalPodAutoscaler.enabled }}
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: {{ template "kube-prometheus-stack.fullname" . }}-alertmanager
  namespace: {{ template "kube-prometheus-stack-alertmanager.namespace" . }}
  labels:
    app: {{ template "kube-prometheus-stack.name" . }}-alertmanager
    {{- include "kube-prometheus-stack.labels" . | nindent 4 }}
spec:
  {{- with .Values.alertmanager.verticalPodAutoscaler.recommenders }}
  recommenders:
    {{- toYaml . | nindent 4 }}
  {{- end }}
  resourcePolicy:
    containerPolicies:
    - containerName: alertmanager
      {{- with .Values.alertmanager.verticalPodAutoscaler.controlledResources }}
      controlledResources:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- if .Values.alertmanager.verticalPodAutoscaler.controlledValues }}
      controlledValues: {{ .Values.alertmanager.verticalPodAutoscaler.controlledValues }}
      {{- end }}
      {{- if .Values.alertmanager.verticalPodAutoscaler.maxAllowed }}
      maxAllowed:
        {{- toYaml .Values.alertmanager.verticalPodAutoscaler.maxAllowed | nindent 8 }}
      {{- end }}
      {{- if .Values.alertmanager.verticalPodAutoscaler.minAllowed }}
      minAllowed:
        {{- toYaml .Values.alertmanager.verticalPodAutoscaler.minAllowed | nindent 8 }}
      {{- end }}
  targetRef:
    apiVersion: monitoring.coreos.com/v1
    kind: Alertmanager
    name: {{ template "kube-prometheus-stack.alertmanager.crname" . }}
  {{- with .Values.alertmanager.verticalPodAutoscaler.updatePolicy }}
  updatePolicy:
    {{- toYaml . | nindent 4 }}
  {{- end }}
{{- end }}

@ -1,15 +0,0 @@
{{- /* Normalize extraObjects to a list, easier to loop over */ -}}
{{- $extraObjects := .Values.extraManifests | default (list) -}}
{{- if kindIs "map" $extraObjects -}}
{{- $extraObjects = values $extraObjects -}}
{{- end -}}
{{- range $extraObjects }}
---
{{- if kindIs "map" . }}
{{- tpl (toYaml .) $ | nindent 0 }}
{{- else if kindIs "string" . }}
{{- tpl . $ | nindent 0 }}
{{- end }}
{{- end }}

File diff suppressed because one or more lines are too long

@ -1,15 +0,0 @@
{{- if .Values.prometheusOperator.podDisruptionBudget.enabled -}}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ template "kube-prometheus-stack.operator.fullname" . }}
  namespace: {{ template "kube-prometheus-stack.namespace" . }}
  labels:
    {{- include "kube-prometheus-stack.prometheus-operator.labels" . | nindent 4 }}
spec:
  selector:
    matchLabels:
      app: {{ template "kube-prometheus-stack.name" . }}-operator
      release: {{ $.Release.Name | quote }}
  {{- toYaml (omit .Values.prometheusOperator.podDisruptionBudget "enabled") | nindent 2 }}
{{- end }}

@ -1,37 +0,0 @@
{{- if .Values.additionalPrometheusRulesMap }}
{{- range $prometheusRuleName, $prometheusRule := .Values.additionalPrometheusRulesMap }}
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: {{ printf "%s-%s" (include "kube-prometheus-stack.fullname" $) $prometheusRuleName | trunc 63 | trimSuffix "-" }}
namespace: {{ template "kube-prometheus-stack.namespace" $ }}
labels:
app: {{ template "kube-prometheus-stack.name" $ }}
{{- include "kube-prometheus-stack.labels" $ | nindent 4 }}
{{- if $prometheusRule.additionalLabels }}
{{- toYaml $prometheusRule.additionalLabels | nindent 4 }}
{{- end }}
spec:
groups:
{{- toYaml $prometheusRule.groups | nindent 4 }}
{{- end }}
{{- else if .Values.additionalPrometheusRules }}
{{- range .Values.additionalPrometheusRules }}
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: {{ printf "%s-%s" (include "kube-prometheus-stack.fullname" $) .name | trunc 63 | trimSuffix "-" }}
namespace: {{ template "kube-prometheus-stack.namespace" $ }}
labels:
app: {{ template "kube-prometheus-stack.name" $ }}
{{- include "kube-prometheus-stack.labels" $ | nindent 4 }}
{{- if .additionalLabels }}
{{- toYaml .additionalLabels | nindent 4 }}
{{- end }}
spec:
groups:
{{- toYaml .groups | nindent 4 }}
{{- end }}
{{- end }}

@ -1,220 +0,0 @@
{{- /*
Generated from 'k8s.rules.pod-owner' group from https://github.com/prometheus-operator/kube-prometheus.git
Do not change in-place! In order to change this file first read following link:
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack
*/ -}}
{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }}
{{- if and (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.defaultRules.create .Values.defaultRules.rules.k8sPodOwner }}
{{- $kubeStateMetricsJob := include "kube-prometheus-stack-kube-state-metrics.name" . }}
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: {{ printf "%s-%s" (include "kube-prometheus-stack.fullname" .) "k8s.rules.pod-owner" | trunc 63 | trimSuffix "-" }}
namespace: {{ template "kube-prometheus-stack.namespace" . }}
labels:
app: {{ template "kube-prometheus-stack.name" . }}
{{ include "kube-prometheus-stack.labels" . | indent 4 }}
{{- if .Values.defaultRules.labels }}
{{ toYaml .Values.defaultRules.labels | indent 4 }}
{{- end }}
{{- if .Values.defaultRules.annotations }}
annotations:
{{ toYaml .Values.defaultRules.annotations | indent 4 }}
{{- end }}
spec:
groups:
- name: k8s.rules.pod_owner
rules:
- expr: |-
max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, workload, pod) (
label_replace(
label_replace(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="ReplicaSet"},
"replicaset", "$1", "owner_name", "(.*)"
) * on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, replicaset, namespace) group_left(owner_name) topk by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, replicaset, namespace) (
1, max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, replicaset, namespace, owner_name) (
kube_replicaset_owner{job="{{ $kubeStateMetricsJob }}", owner_kind=""}
)
),
"workload", "$1", "replicaset", "(.*)"
)
)
labels:
workload_type: replicaset
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- with .Values.defaultRules.additionalRuleLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
record: namespace_workload_pod:kube_pod_owner:relabel
- expr: |-
max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, workload, pod) (
label_replace(
label_replace(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="ReplicaSet"},
"replicaset", "$1", "owner_name", "(.*)"
) * on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}replicaset, namespace, cluster) group_left(owner_name) topk by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, replicaset, namespace) (
1, max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, replicaset, namespace, owner_name) (
kube_replicaset_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="Deployment"}
)
),
"workload", "$1", "owner_name", "(.*)"
)
)
labels:
workload_type: deployment
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- with .Values.defaultRules.additionalRuleLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
record: namespace_workload_pod:kube_pod_owner:relabel
- expr: |-
max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, workload, pod) (
label_replace(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="DaemonSet"},
"workload", "$1", "owner_name", "(.*)"
)
)
labels:
workload_type: daemonset
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- with .Values.defaultRules.additionalRuleLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
record: namespace_workload_pod:kube_pod_owner:relabel
- expr: |-
max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, workload, pod) (
label_replace(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="StatefulSet"},
"workload", "$1", "owner_name", "(.*)")
)
labels:
workload_type: statefulset
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- with .Values.defaultRules.additionalRuleLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
record: namespace_workload_pod:kube_pod_owner:relabel
- expr: |-
group by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, workload, pod) (
label_join(
group by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, job_name, pod, owner_name) (
label_join(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="Job"}
, "job_name", "", "owner_name")
)
* on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, job_name) group_left()
group by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, job_name) (
kube_job_owner{job="{{ $kubeStateMetricsJob }}", owner_kind=~"Pod|"}
)
, "workload", "", "owner_name")
)
labels:
workload_type: job
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- with .Values.defaultRules.additionalRuleLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
record: namespace_workload_pod:kube_pod_owner:relabel
- expr: |-
max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, workload, pod) (
label_replace(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="", owner_name=""},
"workload", "$1", "pod", "(.+)")
)
labels:
workload_type: barepod
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- with .Values.defaultRules.additionalRuleLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
record: namespace_workload_pod:kube_pod_owner:relabel
- expr: |-
max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, workload, pod) (
label_replace(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="Node"},
"workload", "$1", "pod", "(.+)")
)
labels:
workload_type: staticpod
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- with .Values.defaultRules.additionalRuleLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
record: namespace_workload_pod:kube_pod_owner:relabel
- expr: |-
group by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, workload, workload_type, pod) (
label_join(
label_join(
group by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, job_name, pod) (
label_join(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="Job"}
, "job_name", "", "owner_name")
)
* on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, job_name) group_left(owner_kind, owner_name)
group by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, job_name, owner_kind, owner_name) (
kube_job_owner{job="{{ $kubeStateMetricsJob }}", owner_kind!="Pod", owner_kind!=""}
)
, "workload", "", "owner_name")
, "workload_type", "", "owner_kind")
OR
label_replace(
label_replace(
label_replace(
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind="ReplicaSet"}
, "replicaset", "$1", "owner_name", "(.+)"
)
* on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, replicaset) group_left(owner_kind, owner_name)
group by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, replicaset, owner_kind, owner_name) (
kube_replicaset_owner{job="{{ $kubeStateMetricsJob }}", owner_kind!="Deployment", owner_kind!=""}
)
, "workload", "$1", "owner_name", "(.+)")
OR
label_replace(
group by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, pod, owner_name, owner_kind) (
kube_pod_owner{job="{{ $kubeStateMetricsJob }}", owner_kind!="ReplicaSet", owner_kind!="DaemonSet", owner_kind!="StatefulSet", owner_kind!="Job", owner_kind!="Node", owner_kind!=""}
)
, "workload", "$1", "owner_name", "(.+)"
)
, "workload_type", "$1", "owner_kind", "(.+)")
)
record: namespace_workload_pod:kube_pod_owner:relabel
{{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
labels:
{{- with .Values.defaultRules.additionalRuleLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.defaultRules.additionalRuleGroupLabels.k8sPodOwner }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
{{- end }}

View File

@ -1,46 +0,0 @@
{{- if and .Values.prometheus.enabled .Values.prometheus.verticalPodAutoscaler.enabled }}
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: {{ template "kube-prometheus-stack.fullname" . }}-prometheus
namespace: {{ template "kube-prometheus-stack.namespace" . }}
labels:
app: {{ template "kube-prometheus-stack.name" . }}-prometheus
{{- include "kube-prometheus-stack.labels" . | nindent 4 }}
spec:
{{- with .Values.prometheus.verticalPodAutoscaler.recommenders }}
recommenders:
{{- toYaml . | nindent 4 }}
{{- end }}
resourcePolicy:
containerPolicies:
- containerName: prometheus
{{- with .Values.prometheus.verticalPodAutoscaler.controlledResources }}
controlledResources:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.prometheus.verticalPodAutoscaler.controlledValues }}
controlledValues: {{ .Values.prometheus.verticalPodAutoscaler.controlledValues }}
{{- end }}
{{- if .Values.prometheus.verticalPodAutoscaler.maxAllowed }}
maxAllowed:
{{- toYaml .Values.prometheus.verticalPodAutoscaler.maxAllowed | nindent 8 }}
{{- end }}
{{- if .Values.prometheus.verticalPodAutoscaler.minAllowed }}
minAllowed:
{{- toYaml .Values.prometheus.verticalPodAutoscaler.minAllowed | nindent 8 }}
{{- end }}
targetRef:
{{- if .Values.prometheus.agentMode }}
apiVersion: monitoring.coreos.com/v1alpha1
kind: PrometheusAgent
{{- else }}
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
{{- end }}
name: {{ template "kube-prometheus-stack.prometheus.crname" . }}
{{- with .Values.prometheus.verticalPodAutoscaler.updatePolicy }}
updatePolicy:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}

View File

@ -0,0 +1,47 @@
# Changelog
All notable changes from the upstream Prometheus Operator chart will be added to this file.
## [Package Version 00] - 2020-07-19
### Added
- Added [Prometheus Adapter](https://github.com/helm/charts/tree/master/stable/prometheus-adapter) as a dependency to the upstream Prometheus Operator chart to allow users to expose custom metrics from the default Prometheus instance deployed by this chart
- Removed `prometheus-operator/cleanup-crds.yaml` and `prometheus-operator/crds.yaml` from the Prometheus Operator upstream chart in favor of just using the CRD directory to install the CRDs.
- Added support for `rkeControllerManager`, `rkeScheduler`, `rkeProxy`, and `rkeEtcd` PushProx exporters for monitoring k8s components within RKE clusters
- Added support for a `k3sServer` PushProx exporter that monitors k3s server components (`kubeControllerManager`, `kubeScheduler`, and `kubeProxy`) within k3s clusters
- Added support for `kubeAdmControllerManager`, `kubeAdmScheduler`, `kubeAdmProxy`, and `kubeAdmEtcd` PushProx exporters for monitoring k8s components within kubeAdm clusters
- Added support for `rke2ControllerManager`, `rke2Scheduler`, `rke2Proxy`, and `rke2Etcd` PushProx exporters for monitoring k8s components within rke2 clusters
- Exposed `prometheus.prometheusSpec.ignoreNamespaceSelectors` on values.yaml and set it to `false` by default. This value instructs the default Prometheus server deployed with this chart to ignore the `namespaceSelector` field within any created ServiceMonitor or PodMonitor CRs that it selects. This prevents ServiceMonitors and PodMonitors from configuring the Prometheus scrape configuration to monitor resources outside the namespace that they are deployed in; if a user needs to have one ServiceMonitor / PodMonitor monitor resources within several namespaces (such as the resources that are used to monitor Istio in a default installation), they should not enable this option since it would require them to create one ServiceMonitor / PodMonitor CR per namespace that they would like to monitor. Relevant fields were also updated in the default README.md.
- Added `grafana.sidecar.dashboards.searchNamespace` to `values.yaml` with a default value of `cattle-dashboards`. The namespace provided should contain all ConfigMaps with the label `grafana_dashboard` and will be searched by the Grafana Dashboards sidecar for updates. The namespace specified is also created along with this deployment. All default dashboard ConfigMaps have been relocated from the deployment namespace to the namespace specified
- Added `monitoring-admin`, `monitoring-edit`, and `monitoring-view` default `ClusterRoles` to allow admins to assign roles to users to interact with Prometheus Operator CRs. These can be enabled by setting `.Values.global.rbac.userRoles.create` (default: `true`). In a typical RBAC setup, you might want to use a `ClusterRoleBinding` to bind these roles to a Subject to allow them to set up or view `ServiceMonitors` / `PodMonitors` / `PrometheusRules` and view `Prometheus` or `Alertmanager` CRs across the cluster. If `.Values.global.rbac.userRoles.aggregateRolesForRBAC` is enabled, these ClusterRoles will aggregate into the respective default ClusterRoles provided by Kubernetes
- Added `monitoring-config-admin`, `monitoring-config-edit` and `monitoring-config-view` default `Roles` to allow admins to assign roles to users to be able to edit / view `Secrets` and `ConfigMaps` within the `cattle-monitoring-system` namespace. These can be enabled by setting `.Values.global.rbac.userRoles.create` (default: `true`). In a typical RBAC setup, you might want to use a `RoleBinding` to bind these roles to a Subject within the `cattle-monitoring-system` namespace to allow them to modify Secrets / ConfigMaps tied to the deployment, such as your Alertmanager Config Secret.
- Added `monitoring-dashboard-admin`, `monitoring-dashboard-edit` and `monitoring-dashboard-view` default `Roles` to allow admins to assign roles to users to be able to edit / view `ConfigMaps` within the `cattle-dashboards` namespace. These can be enabled by setting `.Values.global.rbac.userRoles.create` (default: `true`) and deploying Grafana as part of this chart. In a typical RBAC setup, you might want to use a `RoleBinding` to bind these roles to a Subject within the `cattle-dashboards` namespace to allow them to create / modify ConfigMaps that contain the JSON used to persist Grafana Dashboards on the cluster.
- Added default resource limits for `Prometheus Operator`, `Prometheus`, `AlertManager`, `Grafana`, `kube-state-metrics`, `node-exporter`
- Added a default template `rancher_defaults.tmpl` to AlertManager that Rancher will offer to users in order to help configure the way alerts are rendered on a notifier. Also updated the default template deployed with this chart to reference that template and added an example of a Slack config using this template as a comment in the `values.yaml`.
- Added support for private registries via introducing a new field for `global.cattle.systemDefaultRegistry` that, if supplied, will automatically be prepended onto every image used by the chart.
- Added a default `nginx` proxy container deployed with Grafana whose config is set in the `ConfigMap` located in `charts/grafana/templates/nginx-config.yaml`. The purpose of this container is to make it possible to view Grafana's UI through a proxy that has a subpath (e.g. Rancher's proxy). This proxy container is set to listen on port `8080` (with a `portName` of `nginx-http` instead of the default `service`), which is also where the Grafana service will now point to, and will forward all requests to the Grafana container listening on the default port `3000`.
- Added a default `nginx` proxy container deployed with Prometheus whose config is set in the `ConfigMap` located in `templates/prometheus/nginx-config.yaml`. The purpose of this container is to make it possible to view Prometheus's UI through a proxy that has a subpath (e.g. Rancher's proxy). This proxy container is set to listen on port `8081` (with a `portName` of `nginx-http` instead of the default `web`), which is also where the Prometheus service will now point to, and will forward all requests to the Prometheus container listening on the default port `9090`.
- Added support for passing CIS Scans in a hardened cluster by introducing a Job that patches the default service account within the `cattle-monitoring-system` and `cattle-dashboards` namespaces on install or upgrade and adding a default allow all `NetworkPolicy` to the `cattle-monitoring-system` and `cattle-dashboards` namespaces.
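As an illustration of the `RoleBinding` setup described for the `monitoring-config-*` Roles above, here is a minimal sketch. It assumes `global.rbac.userRoles.create` was left at `true` so that the `monitoring-config-edit` Role exists in `cattle-monitoring-system`; the binding name and the user are hypothetical placeholders, not values shipped with the chart.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  # Hypothetical binding name, chosen only for this example
  name: alertmanager-config-editors
  # The Role is namespaced, so the binding lives in the same namespace
  namespace: cattle-monitoring-system
subjects:
  - kind: User
    # Hypothetical subject; replace with a real user, group, or service account
    name: jane@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  # One of the default Roles listed in this changelog
  name: monitoring-config-edit
  apiGroup: rbac.authorization.k8s.io
```

The same shape should apply to the `monitoring-dashboard-*` Roles by switching the namespace to `cattle-dashboards`.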
### Modified
- Updated the chart name from `prometheus-operator` to `rancher-monitoring` and added the `io.rancher.certified: rancher` annotation to `Chart.yaml`
- Modified the default `node-exporter` port from `9100` to `9796`
- Modified the default `nameOverride` to `rancher-monitoring`. This change is necessary as the Prometheus Adapter's default URL (`http://{{ .Values.nameOverride }}-prometheus.{{ .Values.namespaceOverride }}.svc`) is based off of the value used here; if modified, the default Adapter URL must also be modified
- Modified the default `namespaceOverride` to `cattle-monitoring-system`. This change is necessary as the Prometheus Adapter's default URL (`http://{{ .Values.nameOverride }}-prometheus.{{ .Values.namespaceOverride }}.svc`) is based off of the value used here; if modified, the default Adapter URL must also be modified
- Configured some default values for `grafana.service` values and exposed them in the default README.md
- The default namespaces of the following ServiceMonitors were changed from the deployment namespace to allow them to continue to monitor metrics when `prometheus.prometheusSpec.ignoreNamespaceSelectors` is enabled:
- `core-dns`: `kube-system`
- `api-server`: `default`
- `kube-controller-manager`: `kube-system`
- `kubelet`: `{{ .Values.kubelet.namespace }}`
- Disabled the following deployments by default (can be enabled if required):
- `AlertManager`
- `kube-controller-manager` metrics exporter
- `kube-etcd` metrics exporter
- `kube-scheduler` metrics exporter
- `kube-proxy` metrics exporter
- Updated default Grafana `deploymentStrategy` to `Recreate` to prevent deployments from being stuck on upgrade if a PV is attached to Grafana
- Modified the default `<serviceMonitor|podMonitor|rule>SelectorNilUsesHelmValues` to default to `false`. As a result, we look for all CRs with any labels in all namespaces by default rather than just the ones tagged with the label `release: rancher-monitoring`.
- Modified the default images used by the `rancher-monitoring` chart to point to Rancher mirrors of the original images from upstream.
- Modified the behavior of the chart to create the Alertmanager Config Secret via a pre-install hook instead of using the normal Helm lifecycle to manage the secret. The benefit of this approach is that all changes to the Config Secret done on a live cluster will never get overridden on a `helm upgrade` since the secret only gets created on a `helm install`. If you would like the secret to be cleaned up on a `helm uninstall`, enable `alertmanager.cleanupOnUninstall`; however, this is disabled by default to prevent the loss of alerting configuration on an uninstall. This secret will never be modified on a `helm upgrade`.
- Modified the default `securityContext` for `Pod` templates across the chart to `{"runAsNonRoot": "true", "runAsUser": "1000"}` and replaced `grafana.rbac.pspUseAppArmor` with `grafana.rbac.pspAnnotations={}` in order to make it possible to deploy this chart on a hardened cluster which does not support Seccomp or AppArmor annotations in PSPs. Users can always choose to specify the annotations they want to use for the PSP directly as part of the values provided.
- Modified `.Values.prometheus.prometheusSpec.containers` to take in a string representing a template that should be rendered by Helm (via `tpl`) instead of allowing a user to provide YAML directly.
- Modified the default Grafana configuration to auto assign users who access Grafana to the Viewer role and enable anonymous access to Grafana dashboards by default. This default works well for a Rancher user who is accessing Grafana via the `kubectl proxy` on the Rancher Dashboard UI since anonymous users who enter via the proxy are authenticated by the k8s API Server, but you can / should modify this behavior if you plan on exposing Grafana in a way that does not require authentication (e.g. as a `NodePort` service).
- Modified the default Grafana configuration to add a default dashboard for Rancher on the Grafana home page.
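The coupling between `nameOverride`, `namespaceOverride`, and the Prometheus Adapter URL noted in the entries above can be sketched as a values override. This is a hedged example: `my-monitoring` and `my-monitoring-system` are hypothetical names, and the URL is derived by hand from the pattern quoted in this changelog.

```yaml
# Hypothetical overrides; the chart defaults are rancher-monitoring
# and cattle-monitoring-system
nameOverride: my-monitoring
namespaceOverride: my-monitoring-system

prometheus-adapter:
  prometheus:
    # Must follow http://<nameOverride>-prometheus.<namespaceOverride>.svc,
    # otherwise the adapter keeps pointing at the default service name
    url: http://my-monitoring-prometheus.my-monitoring-system.svc
```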

View File

@ -0,0 +1,158 @@
annotations:
artifacthub.io/license: Apache-2.0
artifacthub.io/links: |
- name: Chart Source
url: https://github.com/prometheus-community/helm-charts
- name: Upstream Project
url: https://github.com/prometheus-operator/kube-prometheus
- name: Upgrade Process
url: https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md#upgrading-chart
artifacthub.io/operator: "true"
catalog.cattle.io/auto-install: rancher-monitoring-crd=match
catalog.cattle.io/certified: rancher
catalog.cattle.io/deploys-on-os: windows
catalog.cattle.io/display-name: Monitoring
catalog.cattle.io/kube-version: '>= 1.30.0-0 < 1.33.0-0'
catalog.cattle.io/namespace: cattle-monitoring-system
catalog.cattle.io/permits-os: linux,windows
catalog.cattle.io/provides-gvr: monitoring.coreos.com.prometheus/v1
catalog.cattle.io/rancher-version: '>= 2.11.0-0 < 2.12.0-0'
catalog.cattle.io/release-name: rancher-monitoring
catalog.cattle.io/requests-cpu: 4500m
catalog.cattle.io/requests-memory: 4000Mi
catalog.cattle.io/type: cluster-tool
catalog.cattle.io/ui-component: monitoring
catalog.cattle.io/upstream-version: 69.8.2
apiVersion: v2
appVersion: v0.80.1
dependencies:
- condition: grafana.enabled
name: grafana
repository: file://./charts/grafana
version: 8.10.4
- condition: hardenedKubelet.enabled
name: hardenedKubelet
repository: file://./charts/hardenedKubelet
version: 0.1.5-rancher2
- condition: hardenedNodeExporter.enabled
name: hardenedNodeExporter
repository: file://./charts/hardenedNodeExporter
version: 0.1.5-rancher2
- condition: k3sServer.enabled
name: k3sServer
repository: file://./charts/k3sServer
version: 0.1.5-rancher2
- condition: kubeStateMetrics.enabled
name: kube-state-metrics
repository: file://./charts/kube-state-metrics
version: 5.30.1
- condition: kubeAdmControllerManager.enabled
name: kubeAdmControllerManager
repository: file://./charts/kubeAdmControllerManager
version: 0.1.5-rancher2
- condition: kubeAdmEtcd.enabled
name: kubeAdmEtcd
repository: file://./charts/kubeAdmEtcd
version: 0.1.5-rancher2
- condition: kubeAdmProxy.enabled
name: kubeAdmProxy
repository: file://./charts/kubeAdmProxy
version: 0.1.5-rancher2
- condition: kubeAdmScheduler.enabled
name: kubeAdmScheduler
repository: file://./charts/kubeAdmScheduler
version: 0.1.5-rancher2
- condition: prometheus-adapter.enabled
name: prometheus-adapter
repository: file://./charts/prometheus-adapter
version: 4.13.0
- condition: nodeExporter.enabled
name: prometheus-node-exporter
repository: file://./charts/prometheus-node-exporter
version: 4.44.1
- condition: rke2ControllerManager.enabled
name: rke2ControllerManager
repository: file://./charts/rke2ControllerManager
version: 0.1.5-rancher2
- condition: rke2Etcd.enabled
name: rke2Etcd
repository: file://./charts/rke2Etcd
version: 0.1.5-rancher2
- condition: rke2IngressNginx.enabled
name: rke2IngressNginx
repository: file://./charts/rke2IngressNginx
version: 0.1.5-rancher2
- condition: rke2Proxy.enabled
name: rke2Proxy
repository: file://./charts/rke2Proxy
version: 0.1.5-rancher2
- condition: rke2Scheduler.enabled
name: rke2Scheduler
repository: file://./charts/rke2Scheduler
version: 0.1.5-rancher2
- condition: rkeControllerManager.enabled
name: rkeControllerManager
repository: file://./charts/rkeControllerManager
version: 0.1.5-rancher2
- condition: rkeEtcd.enabled
name: rkeEtcd
repository: file://./charts/rkeEtcd
version: 0.1.5-rancher2
- condition: rkeIngressNginx.enabled
name: rkeIngressNginx
repository: file://./charts/rkeIngressNginx
version: 0.1.5-rancher2
- condition: rkeProxy.enabled
name: rkeProxy
repository: file://./charts/rkeProxy
version: 0.1.5-rancher2
- condition: rkeScheduler.enabled
name: rkeScheduler
repository: file://./charts/rkeScheduler
version: 0.1.5-rancher2
- condition: windowsExporter.enabled
name: windowsExporter
repository: file://./charts/windowsExporter
version: 0.9.1
description: kube-prometheus-stack collects Kubernetes manifests, Grafana dashboards,
and Prometheus rules combined with documentation and scripts to provide easy to
operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus
Operator.
home: https://github.com/prometheus-operator/kube-prometheus
icon: file://assets/logos/rancher-monitoring.png
keywords:
- operator
- prometheus
- kube-prometheus
kubeVersion: '>=1.19.0-0'
maintainers:
- email: andrew@quadcorps.co.uk
name: andrewgkew
url: https://github.com/andrewgkew
- email: gianrubio@gmail.com
name: gianrubio
url: https://github.com/gianrubio
- email: github.gkarthiks@gmail.com
name: gkarthiks
url: https://github.com/gkarthiks
- email: kube-prometheus-stack@sisti.pt
name: GMartinez-Sisti
url: https://github.com/GMartinez-Sisti
- email: github@jkroepke.de
name: jkroepke
url: https://github.com/jkroepke
- email: scott@r6by.com
name: scottrigby
url: https://github.com/scottrigby
- email: miroslav.hadzhiev@gmail.com
name: Xtigyro
url: https://github.com/Xtigyro
- email: quentin.bisson@gmail.com
name: QuentinBisson
url: https://github.com/QuentinBisson
name: rancher-monitoring
sources:
- https://github.com/prometheus-community/helm-charts
- https://github.com/prometheus-operator/kube-prometheus
type: application
version: 106.1.2+up69.8.2-rancher.7

View File

@ -11,26 +11,26 @@ _Note: This chart was formerly named `prometheus-operator` chart, now renamed to
- Kubernetes 1.19+
- Helm 3+
## Usage
The chart is distributed as an [OCI Artifact](https://helm.sh/docs/topics/registries/) as well as via a traditional [Helm Repository](https://helm.sh/docs/topics/chart_repository/).
- OCI Artifact: `oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack`
- Helm Repository: `https://prometheus-community.github.io/helm-charts` with chart `kube-prometheus-stack`
The installation instructions use the OCI registry. Refer to the [`helm repo`](https://helm.sh/docs/helm/helm_repo/) command documentation for information on installing charts via the traditional repository.
### Install Helm Chart
## Get Helm Repository Info
```console
helm install [RELEASE_NAME] oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```
_See [`helm repo`](https://helm.sh/docs/helm/helm_repo/) for command documentation._
## Install Helm Chart
```console
helm install [RELEASE_NAME] prometheus-community/kube-prometheus-stack
```
_See [configuration](#configuration) below._
_See [helm install](https://helm.sh/docs/helm/helm_install/) for command documentation._
### Dependencies
## Dependencies
By default this chart installs additional, dependent charts:
@ -42,17 +42,7 @@ To disable dependencies during installation, see [multiple releases](#multiple-r
_See [helm dependency](https://helm.sh/docs/helm/helm_dependency/) for command documentation._
#### Grafana Dashboards
This chart provisions a collection of curated Grafana dashboards that are automatically loaded into Grafana via ConfigMaps. These dashboards are rendered into the Helm chart under [`templates/grafana/`](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/templates/grafana/), but **this is not their source of truth**.
The dashboards originate from various upstream projects and are gathered and processed using scripts in the [`hack/`](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack) directory. For details on how these dashboards are sourced and kept up to date, refer to the [hack/README.md](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/hack/README.md).
> **Note:** The dashboards referenced in the `hack` scripts are usually **not the original source** either. Most originate from separate **Prometheus mixin repositories** (e.g., [kubernetes-mixin](https://github.com/kubernetes-monitoring/kubernetes-mixin)) and are processed through `jsonnet` tooling before being included here. To find the original source in case you want to modify it you may have to search even further upstream.
If you wish to contribute or modify dashboards, please follow the guidance in the `hack/README.md` to ensure consistency and reproducibility.
### Uninstall Helm Chart
## Uninstall Helm Chart
```console
helm uninstall [RELEASE_NAME]
@ -77,10 +67,10 @@ kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
```
### Upgrading Chart
## Upgrading Chart
```console
helm upgrade [RELEASE_NAME] [CHART]
helm upgrade [RELEASE_NAME] prometheus-community/kube-prometheus-stack
```
With Helm v3, CRDs created by this chart are not updated by default and should be manually updated.
@ -91,7 +81,7 @@ The Chart's [appVersion](https://github.com/prometheus-community/helm-charts/blo
_See [helm upgrade](https://helm.sh/docs/helm/helm_upgrade/) for command documentation._
#### Upgrading an existing Release to a new major version
### Upgrading an existing Release to a new major version
A major chart version change (like v1.2.3 -> v2.0.0) indicates that there is an incompatible breaking change needing manual actions.
@ -103,36 +93,41 @@ for breaking changes between versions.
See [Customizing the Chart Before Installing](https://helm.sh/docs/intro/using_helm/#customizing-the-chart-before-installing). To see all configurable options with detailed comments:
```console
helm show values oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack
helm show values prometheus-community/kube-prometheus-stack
```
You may also `helm show values` on this chart's [dependencies](#dependencies) for additional options.
You may also run `helm show values` on this chart's [dependencies](#dependencies) for additional options.
For templated Grafana datasource definitions (e.g. when using Helm flow control), use `grafana.additionalDataSourcesString`, which is rendered via `tpl`.
### Rancher Monitoring Configuration
### Prometheus High Availability (HA)
The following table shows values exposed by Rancher Monitoring's additions to the chart:
For a basic HA setup, run multiple Prometheus replicas:
| Parameter | Description | Default |
| ----- | ----------- | ------ |
| `nameOverride` | Provide a name that should be used instead of the chart name when naming all resources deployed by this chart |`"rancher-monitoring"`|
| `namespaceOverride` | Override the deployment namespace | `"cattle-monitoring-system"` |
| `global.rbac.userRoles.create` | Create default user ClusterRoles to allow users to interact with Prometheus CRs, ConfigMaps, and Secrets | `true` |
| `global.rbac.userRoles.aggregateToDefaultRoles` | Aggregate default user ClusterRoles into default k8s ClusterRoles | `true` |
| `prometheus-adapter.enabled` | Whether to install [prometheus-adapter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-adapter) within the cluster | `true` |
| `prometheus-adapter.prometheus.url` | A URL pointing to the Prometheus deployment within your cluster. The default value is set based on the assumption that you plan to deploy the default Prometheus instance from this chart where `.Values.namespaceOverride=cattle-monitoring-system` and `.Values.nameOverride=rancher-monitoring` | `http://rancher-monitoring-prometheus.cattle-monitoring-system.svc` |
| `prometheus-adapter.prometheus.port` | The port on the Prometheus deployment that Prometheus Adapter can make requests to | `9090` |
| `prometheus.prometheusSpec.ignoreNamespaceSelectors` | Ignore NamespaceSelector settings from the PodMonitor and ServiceMonitor configs. If true, PodMonitors and ServiceMonitors can only discover Pods and Services within the namespace they are deployed into | `false` |
```yaml
prometheus:
prometheusSpec:
replicas: 2
podAntiAffinity: "hard"
externalLabels:
cluster: prod-eu1
```
The following values are enabled for different distributions via [rancher-pushprox](https://github.com/rancher/dev-charts/tree/master/packages/rancher-pushprox). See the rancher-pushprox `README.md` for more information on what all values can be configured for the PushProxy chart.
Important notes:
1. `replicas` controls how many Prometheus pods are deployed for each shard.
2. Keep anti-affinity enabled (or hardened) to avoid scheduling all replicas on one node.
3. Do not clear replica/instance external labels in HA setups (`replicaExternalLabelNameClear` / `prometheusExternalLabelNameClear`), otherwise deduplication and alert/source identification become harder.
4. Querying replicas through a Kubernetes Service provides availability, but not sample deduplication across replicas by itself. For global/deduplicated querying, use a Thanos Query layer (or another backend that performs deduplication).
See also Prometheus Operator HA guidance:
- [Prometheus Operator HA docs](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/platform/high-availability.md#prometheus)
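To make note 3 concrete, the following is a minimal sketch of HA values that leave the replica and instance external labels in place. It assumes both `*Clear` keys sit under `prometheus.prometheusSpec`, as the names in the note suggest; verify against `helm show values` for your chart version.

```yaml
prometheus:
  prometheusSpec:
    # Two pods per shard, as in the basic HA setup above
    replicas: 2
    # Keep the per-replica external label so a deduplicating query layer
    # can tell the replicas apart
    replicaExternalLabelNameClear: false
    # Keep the per-instance external label so alerts can be traced
    # back to the Prometheus that fired them
    prometheusExternalLabelNameClear: false
```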
| Parameter | Description | Default |
| ----- | ----------- | ------ |
| `rkeControllerManager.enabled` | Create a PushProx installation for monitoring kube-controller-manager metrics in RKE clusters | `false` |
| `rkeScheduler.enabled` | Create a PushProx installation for monitoring kube-scheduler metrics in RKE clusters | `false` |
| `rkeProxy.enabled` | Create a PushProx installation for monitoring kube-proxy metrics in RKE clusters | `false` |
| `rkeIngressNginx.enabled` | Create a PushProx installation for monitoring ingress-nginx metrics in RKE clusters | `false` |
| `rkeEtcd.enabled` | Create a PushProx installation for monitoring etcd metrics in RKE clusters | `false` |
| `rke2IngressNginx.enabled` | Create a PushProx installation for monitoring ingress-nginx metrics in RKE2 clusters | `false` |
| `k3sServer.enabled` | Create a PushProx installation for monitoring k3s-server metrics (accounts for kube-controller-manager, kube-scheduler, and kube-proxy metrics) in k3s clusters | `false` |
| `kubeAdmControllerManager.enabled` | Create a PushProx installation for monitoring kube-controller-manager metrics in kubeAdm clusters | `false` |
| `kubeAdmScheduler.enabled` | Create a PushProx installation for monitoring kube-scheduler metrics in kubeAdm clusters | `false` |
| `kubeAdmProxy.enabled` | Create a PushProx installation for monitoring kube-proxy metrics in kubeAdm clusters | `false` |
| `kubeAdmEtcd.enabled` | Create a PushProx installation for monitoring etcd metrics in kubeAdm clusters | `false` |
### Multiple releases
@ -288,7 +283,7 @@ There is no simple and direct migration path between the charts as the changes a
The capabilities of the old chart are all available in the new chart, including the ability to run multiple prometheus instances on a single cluster - you will need to disable the parts of the chart you do not wish to deploy.
You can check out the tickets for this change at [prometheus-operator/prometheus-operator #592](https://github.com/prometheus-operator/prometheus-operator/issues/592) and [helm/charts #6765](https://github.com/helm/charts/pull/6765).
You can check out the tickets for this change [here](https://github.com/prometheus-operator/prometheus-operator/issues/592) and [here](https://github.com/helm/charts/pull/6765).
### High-level overview of Changes

View File

@ -0,0 +1,46 @@
# Rancher Monitoring and Alerting
This chart is based on the upstream [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) chart. The chart deploys [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator) and its CRDs along with [Grafana](https://github.com/grafana/helm-charts/tree/main/charts/grafana), [Prometheus Adapter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-adapter) and additional charts / Kubernetes manifests to gather metrics. It allows users to monitor their Kubernetes clusters, view metrics in Grafana dashboards, and set up alerts and notifications.
For more information on how to use the feature, refer to our [docs](https://rancher.com/docs/rancher/v2.x/en/monitoring-alerting/v2.5/).
The chart installs the following components:
- [Prometheus Operator](https://github.com/coreos/prometheus-operator) - The operator provides easy monitoring definitions for Kubernetes services, manages [Prometheus](https://prometheus.io/) and [AlertManager](https://prometheus.io/docs/alerting/latest/alertmanager/) instances, and adds default scrape targets for some Kubernetes components.
- [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus/) - A collection of community-curated Kubernetes manifests, Grafana Dashboards, and PrometheusRules that deploy a default end-to-end cluster monitoring configuration.
- [Grafana](https://github.com/grafana/helm-charts/tree/main/charts/grafana) - Grafana allows a user to create / view dashboards based on the cluster metrics collected by Prometheus.
- [node-exporter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-node-exporter) / [kube-state-metrics](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-state-metrics) / [rancher-pushprox](https://github.com/rancher/charts/tree/dev-v2.7/packages/rancher-monitoring/rancher-pushprox/charts) - These charts monitor various Kubernetes components across different Kubernetes cluster types.
- [Prometheus Adapter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-adapter) - The adapter allows a user to expose custom metrics, resource metrics, and external metrics on the default [Prometheus](https://prometheus.io/) instance to the Kubernetes API Server.
For more information, review the Helm README of this chart.
## Upgrading to Kubernetes v1.25+
Starting in Kubernetes v1.25, [Pod Security Policies](https://kubernetes.io/docs/concepts/security/pod-security-policy/) have been removed from the Kubernetes API.
As a result, **before upgrading to Kubernetes v1.25** (or on a fresh install in a Kubernetes v1.25+ cluster), users are expected to perform an in-place upgrade of this chart with `global.cattle.psp.enabled` set to `false` if it has been previously set to `true`.
> **Note:**
> In this chart release, all previous fields that were associated with PSP resources have been removed in favor of a single global field: `global.cattle.psp.enabled`.
> **Note:**
> If you upgrade your cluster to Kubernetes v1.25+ before removing PSPs via a `helm upgrade` (even if you manually clean up resources), **it will leave the Helm release in a broken state within the cluster such that further Helm operations will not work (`helm uninstall`, `helm upgrade`, etc.).**
>
> If your charts get stuck in this state, please consult the Rancher docs on how to clean up your Helm release secrets.
When `global.cattle.psp.enabled` is set to `false`, the chart removes any PSP resources deployed on its behalf from the cluster. `false` is the default setting for this chart.
As a replacement for PSPs, [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) should be used. Please consult the Rancher docs for more details on how to configure your chart release namespaces to work with the new Pod Security Admission and apply Pod Security Standards.
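As a minimal sketch of the setting described above, the values override for the in-place upgrade might look like the following. Only the key path comes from this README; how the override is applied (values file, `--set`, release name, chart reference) depends on your installation and is not specified here.

```yaml
# Values override for the in-place upgrade performed BEFORE moving the
# cluster to Kubernetes v1.25+. Only this key path is taken from the README;
# pass it to the chart release however you normally supply values.
global:
  cattle:
    psp:
      enabled: false
```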
## Upgrading from 100.0.0+up16.6.0 to 100.1.0+up19.0.3
### Noticeable changes:
Grafana:
- `sidecar.dashboards.searchNamespace`, `sidecar.datasources.searchNamespace`, and `sidecar.notifiers.searchNamespace` now support a list of namespaces.
Kube-state-metrics:
- The type of `collectors` changed from Dictionary to List.
- `kubeStateMetrics.serviceMonitor.namespaceOverride` was replaced by `kube-state-metrics.namespaceOverride`.
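For illustration only, the shape changes listed above could look like this in a values file. The namespace names and collector entries are hypothetical placeholders, and the top-level `grafana` key assumes the Grafana subchart is configured under that name; check the chart's own `values.yaml` for the authoritative structure.

```yaml
grafana:
  sidecar:
    dashboards:
      # A list of namespaces is now accepted (entries are placeholders).
      searchNamespace:
        - team-a-dashboards
        - team-b-dashboards
kube-state-metrics:
  # Replaces kubeStateMetrics.serviceMonitor.namespaceOverride
  # (the value shown is a placeholder).
  namespaceOverride: example-monitoring-namespace
  # Now a list rather than a dictionary (entries are illustrative).
  collectors:
    - pods
    - deployments
```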
### Known issues:
- Occasionally, the upgrade fails with errors related to the webhook `prometheusrulemutate.monitoring.coreos.com`. This is a known upstream issue; the workaround is to trigger the upgrade one more time. See [32416](https://github.com/rancher/rancher/issues/32416#issuecomment-828881726).


@ -16,18 +16,8 @@
*.tmp
*~
# Various IDEs
.vscode
.project
.idea/
*.tmproj
# helm/charts
OWNERS
hack/
ci/
kube-prometheus-*.tgz
unittests/
files/dashboards/
UPGRADE.md
CONTRIBUTING.md
.editorconfig


@ -2,28 +2,34 @@ annotations:
artifacthub.io/license: Apache-2.0
artifacthub.io/links: |
- name: Chart Source
url: https://github.com/grafana-community/helm-charts
url: https://github.com/grafana/helm-charts
- name: Upstream Project
url: https://github.com/grafana/grafana
apiVersion: v2
appVersion: 12.4.3
appVersion: 11.5.2
description: The leading tool for querying and visualizing time series and metrics.
home: https://grafana.com
icon: https://artifacthub.io/image/b4fed1a7-6c8f-4945-b99d-096efa3e4116
keywords:
- monitoring
- metric
kubeVersion: ^1.25.0-0
kubeVersion: ^1.8.0-0
maintainers:
- email: zanhsieh@gmail.com
name: zanhsieh
- email: rluckie@cisco.com
name: rtluckie
- email: maor.friedman@redhat.com
name: maorfr
- email: miroslav.hadzhiev@gmail.com
name: Xtigyro
- email: mail@torstenwalter.de
name: torstenwalter
- email: github@jkroepke.de
name: Jan-Otto Kröpke
url: https://github.com/jkroepke
- email: quentin.bisson@gmail.com
name: Quentin Bisson
url: https://github.com/QuentinBisson
name: jkroepke
name: grafana
sources:
- https://github.com/grafana/grafana
- https://github.com/grafana-community/helm-charts
- https://github.com/grafana/helm-charts
type: application
version: 11.6.1
version: 8.10.4

Some files were not shown because too many files have changed in this diff.