# Kubernetes Best Practices 2025

Comprehensive guide to production-ready Kubernetes deployments following 2025 industry standards.

## Table of Contents

1. [Security](#security)
2. [Resource Management](#resource-management)
3. [Health & Reliability](#health--reliability)
4. [Configuration Management](#configuration-management)
5. [Networking](#networking)
6. [Storage](#storage)
7. [Observability](#observability)
8. [High Availability](#high-availability)
9. [GitOps & CI/CD](#gitops--cicd)
10. [Organizational Practices](#organizational-practices)

## Security

### Pod Security Standards

Kubernetes defines three security profiles:

1. **Privileged**: Unrestricted, for system-level workloads only
2. **Baseline**: Minimally restrictive, prevents known privilege escalations
3. **Restricted**: Heavily restricted, follows pod hardening best practices (RECOMMENDED)

**Apply at namespace level:**

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/enforce-version: v1.34
```

### Security Context Configuration

**Always configure these security settings:**

```yaml
spec:
  # Pod-level security context
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault

  containers:
  - name: app
    # Container-level security context (overrides pod-level)
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 1000
      capabilities:
        drop:
        - ALL
        # Add ONLY required capabilities
        # add:
        # - NET_BIND_SERVICE
```

**Key settings:**

- `runAsNonRoot: true` - Never run as root user
- `readOnlyRootFilesystem: true` - Prevent filesystem modifications
- `allowPrivilegeEscalation: false` - Block privilege escalation
- `capabilities.drop: [ALL]` - Remove all Linux capabilities by default
- `seccompProfile.type: RuntimeDefault` - Enable seccomp filtering
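
With `readOnlyRootFilesystem: true`, an application that writes temporary files needs an explicit writable mount. A minimal sketch, assuming the app only writes to `/tmp` (the volume name `tmp` and the size limit are illustrative, not from the original):

```yaml
spec:
  containers:
  - name: app
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
    - name: tmp            # hypothetical volume name
      mountPath: /tmp      # assumed to be the only path the app writes to
  volumes:
  - name: tmp
    emptyDir:
      sizeLimit: 64Mi      # illustrative cap on scratch space
```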

### Service Accounts

```yaml
spec:
  serviceAccountName: my-app  # Use dedicated service account
  automountServiceAccountToken: false  # Don't auto-mount unless needed
```

**RBAC Best Practices:**

- Create dedicated ServiceAccounts for each application
- Follow principle of least privilege
- Use Roles/RoleBindings for namespace-scoped permissions
- Use ClusterRoles/ClusterRoleBindings sparingly
- Regularly audit permissions
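
The practices above might look like the following sketch: a dedicated ServiceAccount bound to a namespace-scoped Role that can read a single ConfigMap. The Role name `my-app-config-reader` and the choice of permissions are assumptions for illustration; a pod that actually calls the API with these permissions also needs its token mounted.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-app-config-reader   # hypothetical name
  namespace: production
rules:
- apiGroups: [""]              # core API group
  resources: ["configmaps"]
  resourceNames: ["app-config"]  # restrict to one named object
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-config-reader
  namespace: production
subjects:
- kind: ServiceAccount
  name: my-app
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: my-app-config-reader
```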

### Secrets Management

**Basic practices:**

```yaml
env:
- name: API_KEY
  valueFrom:
    secretKeyRef:
      name: app-secrets
      key: api-key
```

**Advanced practices:**

1. **Enable encryption at rest** - Configure EncryptionConfiguration
2. **Use external secrets management:**
   - HashiCorp Vault
   - AWS Secrets Manager
   - Azure Key Vault
   - Google Secret Manager
   - External Secrets Operator

3. **For GitOps:**
   - Use Sealed Secrets
   - Use SOPS (Secrets OPerationS)
   - Never commit plain secrets

4. **Prefer volume mounts over environment variables:**

```yaml
volumeMounts:
- name: secrets
  mountPath: /etc/secrets
  readOnly: true
volumes:
- name: secrets
  secret:
    secretName: app-secrets
    defaultMode: 0400  # Read-only for owner
```
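
For the encryption-at-rest item above, a minimal `EncryptionConfiguration` sketch for self-managed control planes is shown below. The key name and the key value are placeholders; the file is passed to the API server with `--encryption-provider-config`. Managed Kubernetes services usually expose this through their own settings instead.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  # The first provider encrypts new writes
  - aescbc:
      keys:
      - name: key1                              # placeholder key name
        secret: <base64-encoded 32-byte key>    # placeholder, generate your own
  # identity allows reading data written before encryption was enabled
  - identity: {}
```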

### Image Security

```yaml
spec:
  containers:
  - name: app
    image: nginx:1.25-alpine  # Use specific versions, never :latest
    imagePullPolicy: IfNotPresent
```

**Best practices:**

- Pin images to specific version tags or, better, to immutable digests (`image@sha256:...`)
- Scan images for vulnerabilities (Trivy, Snyk, Clair)
- Use minimal base images (alpine, distroless)
- Sign images with Cosign/Notary
- Use private registries with authentication
- Set `imagePullPolicy` explicitly (`IfNotPresent` works well with pinned tags or digests)
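
Pinning and private-registry authentication can be combined as in this sketch. The registry host, image path, and Secret name are hypothetical, and `<digest>` is a placeholder for the real image digest.

```yaml
spec:
  imagePullSecrets:
  - name: registry-credentials   # hypothetical Secret of type kubernetes.io/dockerconfigjson
  containers:
  - name: app
    # A digest reference cannot be repointed the way a tag can; <digest> is a placeholder
    image: registry.example.com/team/app@sha256:<digest>
    imagePullPolicy: IfNotPresent
```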

### Network Security

**Use NetworkPolicies to restrict traffic:**

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-network-policy
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  - Egress  # No egress rules follow, so all outbound traffic (including DNS) is denied
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
```
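
Because the policy above lists `Egress` in `policyTypes` without any egress rules, the selected pods cannot make outbound connections, DNS lookups included. A common pattern is a namespace-wide default-deny policy plus an explicit DNS allowance. The sketch below assumes cluster DNS runs in the `kube-system` namespace; the policy names are illustrative.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}          # empty selector matches every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system  # assumption: DNS lives here
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

NetworkPolicies are additive, so application-specific policies such as `api-network-policy` layer on top of these.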

## Resource Management

### Requests and Limits

**Always set both requests and limits:**

```yaml
resources:
  requests:
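    cpu: 100m      # Reserved for the container at scheduling time
    memory: 128Mi  # Reserved for the container at scheduling time
  limits:
    cpu: 200m      # Maximum CPU (throttled above this)
    memory: 256Mi  # Maximum memory (hard limit, OOM-killed above this)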
```

**QoS Classes:**

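1. **Guaranteed** - every container sets CPU and memory requests equal to its limits (RECOMMENDED for critical apps)
2. **Burstable** - at least one container sets a request or limit, but the pod does not qualify as Guaranteed
3. **BestEffort** - no container sets any requests or limits (avoid in production)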
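
As an illustration of the Guaranteed class, the fragment below sets requests equal to limits for both CPU and memory. The values are arbitrary; every container in the pod needs the same treatment for the pod to qualify.

```yaml
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 500m       # equal to the CPU request
    memory: 512Mi   # equal to the memory request
```

The assigned class can be checked with `kubectl get pod <name> -o jsonpath='{.status.qosClass}'`.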

**CPU units:**
- `1000m` = 1 CPU core
- `100m` = 0.1 CPU core (100 millicores)

**Memory units:**
- `Mi` = Mebibytes (1024-based)
- `Gi` = Gibibytes
- `M` = Megabytes (1000-based)
- `G` = Gigabytes

**Sizing guidelines:**

```yaml
# Microservice / Small API
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 200m
    memory: 256Mi

# Medium Application
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1Gi

# Database / Heavy Application
resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 2000m
    memory: 4Gi
```

### LimitRanges and ResourceQuotas

**LimitRange** - Set default limits per namespace:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 100m
      memory: 128Mi
    type: Container
```

**ResourceQuota** - Set namespace-wide limits:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
```

## Health & Reliability

### Health Probes

**Three types of probes:**

1. **Startup Probe** - For slow-starting applications; liveness and readiness probes are held back until it succeeds
2. **Liveness Probe** - Determines when to restart a container
3. **Readiness Probe** - Determines when a pod may receive traffic

```yaml
# Startup probe - only runs during initialization
startupProbe:
  httpGet:
    path: /health/startup
    port: 8080
  initialDelaySeconds: 0
  periodSeconds: 10
  timeoutSeconds: 3
  successThreshold: 1
  failureThreshold: 30  # Allow 5 minutes for startup

# Liveness probe - restarts if failing
livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 0
  periodSeconds: 10
  timeoutSeconds: 3
  successThreshold: 1
  failureThreshold: 3

# Readiness probe - removes from service if failing
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 0
  periodSeconds: 5
  timeoutSeconds: 3
  successThreshold: 1
  failureThreshold: 3
```

**Probe types:**

```yaml
# HTTP GET
httpGet:
  path: /health
  port: 8080
  scheme: HTTP
  httpHeaders:
  - name: Custom-Header
    value: value

# TCP Socket
tcpSocket:
  port: 8080

# Exec command
exec:
  command:
  - cat
  - /tmp/healthy

# gRPC (K8s 1.24+)
grpc:
  port: 9090
  service: my-service
```

**Best practices:**

- Use startup probes for slow-starting apps (> 30s)
- Liveness checks should be cheap and fast
- Readiness checks can be more thorough
- Don't check external dependencies in liveness (causes cascading failures)
- Set appropriate timeouts and thresholds

### Deployment Strategies

**RollingUpdate (default, recommended):**

```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1        # Extra pods during update
    maxUnavailable: 0  # Ensure zero downtime
```

**Recreate (causes downtime; for apps that cannot run two versions side by side):**

```yaml
strategy:
  type: Recreate
```

### Pod Disruption Budgets

**Prevent too many pods from being down:**

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 2  # Or use maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
```
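
The `maxUnavailable` alternative mentioned in the comment above would look like this sketch. Only one of `minAvailable` and `maxUnavailable` may be set on a given budget.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  maxUnavailable: 1   # at most one matching pod may be evicted voluntarily at a time
  selector:
    matchLabels:
      app: my-app
```

`maxUnavailable` keeps working when the replica count changes, whereas a `minAvailable` equal to the replica count blocks node drains entirely.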

### Graceful Shutdown

```yaml
spec:
  terminationGracePeriodSeconds: 30

  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - sleep 5  # Allow time for connection draining (requires a shell in the image)
```

**Application should:**
- Handle SIGTERM signal
- Stop accepting new requests
- Complete in-flight requests
- Close database connections
- Flush logs/metrics

## Configuration Management

### ConfigMaps

**Use for non-sensitive configuration:**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.properties: |
    server.port=8080
    log.level=info
```

**Mount as files (preferred):**

```yaml
volumeMounts:
- name: config
  mountPath: /etc/config
  readOnly: true
volumes:
- name: config
  configMap:
    name: app-config
    defaultMode: 0444  # Read-only
```

**Or as environment variables:**

```yaml
envFrom:
- configMapRef:
    name: app-config
```

### Environment Variables

**Preferred order:**

1. **ConfigMaps/Secrets** (externalized)
2. **Pod annotations/labels** (metadata)
3. **Downward API** (pod/container info)
4. **Container image ENV** (defaults only)

```yaml
env:
# From ConfigMap
- name: LOG_LEVEL
  valueFrom:
    configMapKeyRef:
      name: app-config
      key: log.level  # Assumes app-config defines a top-level log.level key

# From Secret
- name: API_KEY
  valueFrom:
    secretKeyRef:
      name: app-secrets
      key: api-key

# From field
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name

# From resource
- name: CPU_REQUEST
  valueFrom:
    resourceFieldRef:
      containerName: app
      resource: requests.cpu
```

## Networking

### Service Types

**ClusterIP (default, internal only):**

```yaml
spec:
  type: ClusterIP
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
```

**NodePort (exposes on each node):**

```yaml
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080  # 30000-32767
```

**LoadBalancer (cloud provider LB):**

```yaml
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local  # Preserve client IP
  loadBalancerSourceRanges:
  - 10.0.0.0/8  # Restrict access
```

### Ingress

**Use for HTTP/HTTPS routing:**

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/limit-rps: "100"  # Requests per second per client IP
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - example.com
    secretName: tls-cert
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend
            port:
              number: 80
```

### DNS

**Service DNS names:**

- Within namespace: `service-name`
- Cross-namespace: `service-name.namespace`
- Fully qualified: `service-name.namespace.svc.cluster.local`

**Pod DNS:**

- `pod-ip.namespace.pod.cluster.local`, with the dots in the IP replaced by dashes (for example `172-17-0-3.default.pod.cluster.local`)

## Storage

### Volumes

**emptyDir (ephemeral):**

```yaml
volumes:
- name: tmp
  emptyDir: {}
```

**PersistentVolumeClaim:**

```yaml
volumes:
- name: data
  persistentVolumeClaim:
    claimName: app-data-pvc
```

### PersistentVolumeClaims

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi
```

**Access modes:**
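- **ReadWriteOnce (RWO)** - Read/write by a single node (all pods on that node can share it)
- **ReadOnlyMany (ROX)** - Multiple nodes read-only
- **ReadWriteMany (RWX)** - Multiple nodes read/write
- **ReadWriteOncePod (RWOP)** - Read/write by a single pod (CSI volumes only)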
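
The claim above references a `fast-ssd` StorageClass, which has to exist in the cluster. A hedged sketch follows; the provisioner and its `type` parameter assume the AWS EBS CSI driver and should be swapped for whatever driver your cluster runs.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com              # assumption: AWS EBS CSI driver
parameters:
  type: gp3                               # driver-specific parameter
volumeBindingMode: WaitForFirstConsumer   # provision in the zone where the pod is scheduled
allowVolumeExpansion: true
reclaimPolicy: Delete
```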

### StatefulSets

**For stateful applications:**

```yaml
apiVersion: apps/v1
kind: StatefulSet
spec:
  serviceName: postgres-headless
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
```
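
The `serviceName` field above refers to a headless Service that must be created separately. A minimal sketch, assuming the StatefulSet pods carry the label `app: postgres` and listen on port 5432:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None        # headless: DNS resolves to individual pod IPs
  selector:
    app: postgres        # assumed pod label
  ports:
  - name: postgres
    port: 5432           # assumed database port
```

Each pod then gets a stable DNS name of the form `<pod-name>.postgres-headless.<namespace>.svc.cluster.local`.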

## Observability

### Labels and Annotations

**Standard labels (recommended):**

```yaml
metadata:
  labels:
    app.kubernetes.io/name: myapp
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/component: frontend
    app.kubernetes.io/part-of: web-application
    app.kubernetes.io/managed-by: kubectl
    app.kubernetes.io/instance: myapp-prod
```

**Annotations for metadata:**

```yaml
metadata:
  annotations:
    description: "Application description"
    contact: "team@example.com"
    docs: "https://docs.example.com"
    prometheus.io/scrape: "true"
    prometheus.io/port: "9090"
```

### Logging

**Best practices:**

- Log to stdout/stderr (12-factor app)
- Use structured logging (JSON)
- Include correlation IDs
- Use appropriate log levels
- Don't log sensitive data

### Metrics

**Expose Prometheus metrics:**

```yaml
spec:
  containers:
  - name: app
    ports:
    - name: metrics
      containerPort: 9090
```
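
If the cluster runs the Prometheus Operator (an assumption, it is not part of core Kubernetes), scraping is typically declared with a `ServiceMonitor` rather than annotations. The sketch below assumes a Service labeled `app.kubernetes.io/name: myapp` that exposes a port named `metrics`, and that the app serves metrics on `/metrics`.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: app-metrics        # hypothetical name
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: myapp
  endpoints:
  - port: metrics          # name of the Service port, not the number
    path: /metrics         # assumed metrics path
    interval: 30s
```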

## High Availability

### Replicas

```yaml
spec:
  replicas: 3  # Minimum 2-3 for HA
```

### Pod Anti-Affinity

**Spread pods across nodes (hard rule; replicas beyond the node count stay Pending):**

```yaml
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: my-app
        topologyKey: kubernetes.io/hostname
```
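
Topology spread constraints are a softer alternative that also works across zones. A minimal sketch, reusing the `app: my-app` label from above:

```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1                                # max difference in pod count between zones
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway         # prefer even spread, but never block scheduling
    labelSelector:
      matchLabels:
        app: my-app
```

Switching `whenUnsatisfiable` to `DoNotSchedule` turns the preference into a hard requirement.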

### Node Affinity

**Prefer specific nodes:**

```yaml
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
```

### Horizontal Pod Autoscaling

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
```
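
Utilization targets are calculated against container requests, so the target pods must set CPU and memory requests. To damp flapping, the same HPA can carry a `behavior` stanza; the numbers in this sketch are illustrative rather than recommended values.

```yaml
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # act on the highest recommendation of the last 5 minutes
      policies:
      - type: Percent
        value: 50                       # remove at most 50% of current replicas
        periodSeconds: 60               # per 60-second window
    scaleUp:
      stabilizationWindowSeconds: 0     # scale up immediately
```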

## GitOps & CI/CD

### Version Control

- Store all manifests in Git
- Use declarative YAML
- Tag releases
- Use pull requests for changes

### Tools

- **Flux** - GitOps for Kubernetes
- **Argo CD** - Declarative GitOps continuous delivery
- **Helm** - Package manager
- **Kustomize** - Template-free customization
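
As one illustration of how these tools tie Git to the cluster, an Argo CD `Application` might look like the sketch below. The repository URL, path, and names are hypothetical, and it assumes Argo CD is installed in the `argocd` namespace.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd                 # assumption: default Argo CD install namespace
spec:
  project: default
  source:
    repoURL: https://git.example.com/org/manifests.git  # hypothetical repository
    targetRevision: main
    path: overlays/production       # hypothetical path to a Kustomize overlay
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true                   # delete resources removed from Git
      selfHeal: true                # revert manual drift in the cluster
```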

### Kustomize Example

**kustomization.yaml:**

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- deployment.yaml
- service.yaml

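# `labels` replaces `commonLabels`, which recent Kustomize releases deprecate
labels:
- pairs:
    app: myapp
    env: production
  includeSelectors: true  # Also label selectors, as commonLabels did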

namespace: production

images:
- name: myapp
  newTag: v1.2.3
```

## Organizational Practices

### Namespaces

**Use namespaces for isolation:**

- `production` - Production workloads
- `staging` - Staging environment
- `development` - Dev environment
- `monitoring` - Observability stack
- `ingress` - Ingress controllers

### Naming Conventions

**Resources:**
- Use kebab-case: `my-app-deployment`
- Be descriptive: `frontend-web-app`
- Include environment: `api-production`

**Labels:**
- Use standard labels (app.kubernetes.io/*)
- Be consistent across resources

### Documentation

**Annotate resources:**

```yaml
metadata:
  annotations:
    description: "What this resource does"
    runbook: "https://wiki.example.com/runbook"
    contact: "team@example.com"
```

### Updates and Maintenance

- **Update Kubernetes regularly** (stay within 2 minor versions of latest)
- **Patch container images** (security updates)
- **Rotate secrets** (regularly and after incidents)
- **Review resource usage** (right-size requests/limits)
- **Audit RBAC** (remove unused permissions)

## Summary Checklist

- [ ] Security contexts configured (non-root, read-only FS, dropped capabilities)
- [ ] Resource requests and limits set
- [ ] Health probes configured (startup, liveness, readiness)
- [ ] Secrets externalized (not in manifests)
- [ ] Labels and annotations applied
- [ ] Multiple replicas for HA
- [ ] Pod disruption budgets configured
- [ ] Network policies defined
- [ ] Graceful shutdown implemented
- [ ] Monitoring and logging enabled
- [ ] Documentation added
- [ ] Tested in non-production first
- [ ] GitOps workflow established
- [ ] Backup and disaster recovery planned

## Additional Resources

- [Official Kubernetes Documentation](https://kubernetes.io/docs/)
- [Kubernetes Best Practices (Google)](https://cloud.google.com/kubernetes-engine/docs/best-practices)
- [Production Checklist](https://learnkube.com/production-best-practices)
- [Security Checklist](https://kubernetes.io/docs/concepts/security/security-checklist/)
