# Kubernetes Security Checklist

A security checklist for production Kubernetes deployments, based on practices current as of 2025.

## Pod Security Standards

### Namespace Configuration

- [ ] Pod Security Standards labels applied to namespaces
  ```yaml
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
  ```

- [ ] Appropriate profile selected:
  - **Restricted**: For production workloads (RECOMMENDED)
  - **Baseline**: For legacy apps that can't meet restricted
  - **Privileged**: Only for system workloads
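
As an illustration of profile selection, a namespace for a legacy workload might enforce Baseline while auditing and warning against Restricted, so violations surface before the namespace is tightened. This is a minimal sketch; the namespace name `legacy-app` is a placeholder and not part of this checklist.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: legacy-app  # hypothetical name
  labels:
    # Reject pods that violate the Baseline profile
    pod-security.kubernetes.io/enforce: baseline
    # Record and surface anything that would fail Restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

Once the audit and warn output is clean, `enforce` can be raised to `restricted`.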

### Pod Security Context

- [ ] `runAsNonRoot: true` - Never run containers as root
- [ ] `runAsUser` set to non-root UID (1000+)
- [ ] `fsGroup` configured for volume permissions
- [ ] `seccompProfile.type: RuntimeDefault` - Enable seccomp filtering
- [ ] `supplementalGroups` configured if needed

### Container Security Context

- [ ] `allowPrivilegeEscalation: false` - Block privilege escalation
- [ ] `readOnlyRootFilesystem: true` - Immutable container filesystem
- [ ] `runAsNonRoot: true` - Enforce non-root user
- [ ] `capabilities.drop: [ALL]` - Drop all Linux capabilities
- [ ] Only required capabilities added back (avoid if possible)
- [ ] Never use privileged containers (`privileged: false`)

### Restricted Profile Requirements

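The following pod spec satisfies the **Restricted** profile. Fields commented as "not required" are recommended by this checklist but are not checked by the profile itself:

```yaml
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000  # Not required by the profile, but must not be 0 if set
    fsGroup: 2000    # Not required by the profile
    seccompProfile:
      type: RuntimeDefault

  containers:
  - securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true  # Not required by the profile
      runAsNonRoot: true
      capabilities:
        drop:
        - ALL
```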

**Restrictions:**
- Must run as non-root user
- Cannot escalate privileges
- Must drop all capabilities
- Cannot access host namespaces
- Cannot use hostPath volumes
- Must use approved volume types
- May add back only the `NET_BIND_SERVICE` capability (used for binding to privileged ports below 1024)

## Container Image Security

### Image Configuration

- [ ] Use specific image tags (never `:latest`)
  ```yaml
  image: nginx:1.25-alpine
  ```

- [ ] Use minimal base images (alpine, distroless, scratch)
- [ ] Pull from private registries or verified sources
- [ ] `imagePullPolicy: IfNotPresent` or `Always`
- [ ] Image pull secrets configured for private registries
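
The image configuration items above could be combined in a pod spec along these lines. This is a sketch under assumptions: the registry host, image path, and the Secret name `registry-credentials` are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web  # hypothetical name
spec:
  imagePullSecrets:
  # Hypothetical Secret of type kubernetes.io/dockerconfigjson
  - name: registry-credentials
  containers:
  - name: web
    # Placeholder private registry and a pinned version tag (never :latest).
    # For stronger immutability, a digest reference (image@sha256:<digest>)
    # can be used instead of a tag.
    image: registry.example.com/team/web:1.4.2
    imagePullPolicy: IfNotPresent
```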

### Image Scanning

- [ ] Images scanned for vulnerabilities (Trivy, Snyk, Clair, Grype)
- [ ] Critical/High CVEs resolved before deployment
- [ ] Regular rescanning of running images
- [ ] Automated scanning in CI/CD pipeline
- [ ] Image signing with Cosign or Notation (Notary Project, formerly Notary v2)

### Image Registry

- [ ] Use private registries for production
- [ ] Registry authentication configured
- [ ] Image pull secrets created
- [ ] Registry supports vulnerability scanning
- [ ] Access to registry restricted (RBAC/IAM)

## Secrets Management

### Basic Security

- [ ] Sensitive values stored in Kubernetes Secrets (not ConfigMaps)
- [ ] Secrets never committed to Git
- [ ] Secrets mounted as volumes (preferred over env vars)
- [ ] Volume permissions set to read-only (`defaultMode: 0400`)
- [ ] `automountServiceAccountToken: false` unless needed
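
A minimal sketch of a pod that mounts a Secret as a read-only volume and opts out of the service account token. The pod name, image, mount path, and the Secret name `app-secrets` are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app  # hypothetical name
spec:
  # No API token is mounted unless the workload needs to call the API server
  automountServiceAccountToken: false
  containers:
  - name: app
    image: registry.example.com/team/app:1.0.0  # placeholder image
    volumeMounts:
    - name: app-secrets
      mountPath: /etc/app/secrets  # assumed path
      readOnly: true
  volumes:
  - name: app-secrets
    secret:
      secretName: app-secrets  # hypothetical Secret
      defaultMode: 0400        # owner read-only
```

If the container runs as a non-root user, it may also need `fsGroup` (see Pod Security Context) to be able to read the mounted files.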

### Advanced Security

- [ ] Encryption at rest enabled (EncryptionConfiguration)
- [ ] External secrets management used:
  - [ ] HashiCorp Vault
  - [ ] AWS Secrets Manager
  - [ ] Azure Key Vault
  - [ ] Google Secret Manager
  - [ ] External Secrets Operator

- [ ] For GitOps:
  - [ ] Sealed Secrets or SOPS used
  - [ ] Encrypted secrets in Git
  - [ ] Never commit plaintext secrets
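
For self-managed control planes, encryption at rest is configured through an `EncryptionConfiguration` file passed to the API server with `--encryption-provider-config`. The sketch below assumes the `aescbc` provider with a locally managed key; the key value is a placeholder, and a KMS provider may be a better fit where one is available. Managed Kubernetes services usually expose this through their own settings instead.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  # The first provider in the list is used to encrypt new writes
  - aescbc:
      keys:
      - name: key1
        secret: <BASE64_ENCODED_32_BYTE_KEY>  # placeholder, generate your own
  # Fallback so data written before encryption was enabled stays readable
  - identity: {}
```

Existing Secrets are only encrypted when they are next written, so they need to be rewritten after the configuration is applied.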

### Secret Rotation

- [ ] Secrets rotated regularly (90 days or less)
- [ ] Automated rotation implemented
- [ ] Secret age monitored
- [ ] Rotation after security incidents

### Secret Access

- [ ] RBAC limits secret access
- [ ] Service accounts have minimal permissions
- [ ] Secret access audited
- [ ] Applications don't log secret values

## Network Security

### Network Policies

- [ ] NetworkPolicies defined for all workloads
- [ ] Default deny all traffic policy
- [ ] Explicit allow rules for required traffic
- [ ] Both Ingress and Egress policies defined

Example default deny:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
```
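
With a default deny in place, each required flow needs an explicit allow rule. The following sketch assumes pods labeled `app: frontend` and `app: backend`, a backend port of 8080, and cluster DNS running in `kube-system`; all of these are illustrative assumptions.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend  # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  - Egress
  ingress:
  # Only frontend pods in the same namespace may reach the backend port
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  # Allow DNS lookups; without this, name resolution fails under default deny
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

NetworkPolicies are only enforced when the cluster's network plugin supports them.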

### Service Mesh (Optional)

- [ ] Service mesh for mTLS (Istio, Linkerd)
- [ ] Traffic encryption between services
- [ ] Fine-grained access control
- [ ] Traffic monitoring and observability

### Ingress Security

- [ ] TLS/SSL termination configured
- [ ] Valid SSL certificates (Let's Encrypt, etc.)
- [ ] TLS 1.2/1.3 only (no TLS 1.0/1.1)
- [ ] Strong cipher suites
- [ ] Security headers configured:
  - `X-Frame-Options: DENY`
  - `X-Content-Type-Options: nosniff`
  - `X-XSS-Protection: 0` (the legacy browser XSS filter is deprecated; rely on `Content-Security-Policy` instead)
  - `Strict-Transport-Security: max-age=31536000`
  - `Content-Security-Policy`

- [ ] Rate limiting configured
- [ ] WAF (Web Application Firewall) if applicable
- [ ] DDoS protection enabled
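
TLS termination at the Ingress can be sketched as follows. The hostname, Service name, TLS Secret name, and `ingressClassName` are placeholders. Minimum TLS version, cipher suites, security headers, and rate limiting are configured through controller-specific settings, which are not shown here.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web  # hypothetical name
spec:
  ingressClassName: example-class  # hypothetical; depends on your controller
  tls:
  - hosts:
    - app.example.com
    # Hypothetical Secret of type kubernetes.io/tls holding the certificate and key
    secretName: app-example-com-tls
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web
            port:
              number: 80
```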

## RBAC (Role-Based Access Control)

### Service Accounts

- [ ] Dedicated ServiceAccount per application
- [ ] `automountServiceAccountToken: false` by default
- [ ] No use of default ServiceAccount
- [ ] ServiceAccounts scoped to namespace
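
A dedicated ServiceAccount with token automounting disabled might look like this. The names `app-sa` and `app` are assumptions.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa    # hypothetical name
  namespace: app  # hypothetical namespace
# Pods using this account get no API token unless they opt in explicitly
automountServiceAccountToken: false
```

Pods select it with `serviceAccountName: app-sa` in their spec; a pod that does need API access can override the setting with `automountServiceAccountToken: true`.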

### Roles and RoleBindings

- [ ] Principle of least privilege applied
- [ ] Namespaced Roles preferred over ClusterRoles wherever possible
- [ ] Specific resources and verbs granted (avoid `*`)
- [ ] Regular RBAC audits
- [ ] Remove unused permissions

### User Access

- [ ] Individual user accounts (no shared credentials)
- [ ] Groups used for user management
- [ ] Developers have read-only access to production
- [ ] Separate admin accounts for elevated access
- [ ] MFA (Multi-Factor Authentication) enabled

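Example restrictive Role (see Roles and RoleBindings):
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-role
rules:
# One rule per resource type so each name applies only to its own kind.
# Only "get" is granted: "list" is not reliably restricted by resourceNames.
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["app-config"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["app-secrets"]
  verbs: ["get"]
```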
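
A Role has no effect until it is bound to a subject. The sketch below binds `app-role` from the example above to a ServiceAccount; the names `app-sa`, `app-role-binding`, and the namespace `app` are assumptions.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-role-binding  # hypothetical name
  namespace: app          # must match the namespace of the Role
subjects:
- kind: ServiceAccount
  name: app-sa            # hypothetical ServiceAccount
  namespace: app
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: app-role
```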

## Resource Management

### Resource Limits

- [ ] CPU requests and limits defined
- [ ] Memory requests and limits defined
- [ ] Resources sized appropriately
- [ ] QoS class considered (Guaranteed preferred)
- [ ] LimitRanges configured per namespace
- [ ] ResourceQuotas prevent resource exhaustion
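
Namespace-level defaults and caps can be sketched with a LimitRange and a ResourceQuota. All values and names below are illustrative assumptions and should be sized for the actual workloads.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits  # hypothetical name
  namespace: app        # hypothetical namespace
spec:
  limits:
  - type: Container
    # Applied to containers that declare no requests
    defaultRequest:
      cpu: 100m
      memory: 128Mi
    # Applied to containers that declare no limits
    default:
      cpu: 500m
      memory: 256Mi
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota  # hypothetical name
  namespace: app
spec:
  hard:
    # Totals across all pods in the namespace
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
```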

### Storage Security

- [ ] PersistentVolumes encrypted at rest
- [ ] Storage classes use encryption
- [ ] Volume permissions restricted
- [ ] No hostPath volumes (use PVs instead)
- [ ] EmptyDir size limits set
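
A size-limited `emptyDir` is a common way to give a container scratch space while keeping its root filesystem read-only. This sketch uses a placeholder image and an assumed limit of 64Mi.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app  # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/team/app:1.0.0  # placeholder image
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
    # Writable scratch space; everything else stays read-only
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir:
      sizeLimit: 64Mi  # assumed value; the pod is evicted if usage exceeds it
```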

## API Server Security

### Authentication

- [ ] OIDC/OAuth2 for user authentication
- [ ] Client certificate authentication
- [ ] Service account tokens for pods
- [ ] Anonymous authentication disabled
- [ ] Webhook token authentication if needed

### Authorization

- [ ] RBAC enabled (default in modern K8s)
- [ ] Node authorization enabled
- [ ] Webhook authorization if needed
- [ ] ABAC (Attribute-Based Access Control) disabled

### API Server Flags

- [ ] `--anonymous-auth=false`
- [ ] `--enable-admission-plugins` includes:
  - PodSecurity (enabled by default since v1.23; PodSecurityPolicy was removed in v1.25)
  - NodeRestriction
  - LimitRanger
  - ResourceQuota
  - ServiceAccount
- [ ] Audit logging enabled
- [ ] TLS encryption for API server
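
On self-managed clusters these settings usually live in the kube-apiserver static Pod manifest. The fragment below is a sketch, not a complete manifest: the file paths follow a kubeadm-style layout and are assumptions, and managed services generally do not expose these flags directly.

```yaml
# Fragment of a kube-apiserver static Pod manifest (paths are assumptions)
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --anonymous-auth=false
    - --authorization-mode=Node,RBAC
    # PodSecurity is the plugin name for Pod Security Admission
    - --enable-admission-plugins=NodeRestriction,PodSecurity,LimitRanger,ResourceQuota,ServiceAccount
    # Audit logging
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit.log
    - --audit-log-maxage=90
    # Serving certificate for TLS
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
```

Test `--anonymous-auth=false` in a non-production cluster first, since some setups rely on unauthenticated access for health probes.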

## Admission Control

### Pod Security Admission

- [ ] PodSecurity admission plugin enabled
- [ ] Namespace labels configured
- [ ] Appropriate profiles enforced

### Other Admission Controllers

- [ ] LimitRanger - Enforce resource limits
- [ ] ResourceQuota - Enforce quotas
- [ ] NodeRestriction - Restrict kubelet permissions
- [ ] AlwaysPullImages - Always pull images (optional)
- [ ] PodNodeSelector - Restrict node selection

### Policy Engines (Optional)

- [ ] OPA/Gatekeeper for custom policies
- [ ] Kyverno for Kubernetes-native policies
- [ ] Custom admission webhooks

## Monitoring and Auditing

### Audit Logging

- [ ] Audit logging enabled on API server
- [ ] Audit logs collected centrally
- [ ] Audit logs analyzed for suspicious activity
- [ ] Retention policy defined (90+ days)
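
Audit logging requires a policy file that tells the API server what to record. The following is a minimal sketch of one possible policy; the choice of levels per resource is an assumption and should be tuned to log volume and compliance needs.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the RequestReceived stage to reduce duplicate events
omitStages:
- RequestReceived
rules:
# Record who accessed Secrets and ConfigMaps, but never their contents
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
# Record request bodies for RBAC changes
- level: Request
  resources:
  - group: rbac.authorization.k8s.io
# Catch-all: metadata only for everything else
- level: Metadata
```

Rules are evaluated in order and the first match wins, so the specific rules come before the catch-all.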

### Security Monitoring

- [ ] Runtime security monitoring (Falco, Sysdig)
- [ ] Network traffic monitoring
- [ ] Log aggregation and analysis (ELK, Splunk)
- [ ] Intrusion detection system
- [ ] File integrity monitoring

### Vulnerability Scanning

- [ ] Regular cluster vulnerability scans
- [ ] Image vulnerability scanning
- [ ] Compliance scanning (CIS Benchmark)
- [ ] Automated remediation where possible

### Alerting

- [ ] Security alerts configured:
  - Privileged container creation
  - Secret access anomalies
  - Failed authentication attempts
  - Resource exhaustion
  - Suspicious network activity

## Cluster Security

### Control Plane

- [ ] Control plane components secured
- [ ] etcd encrypted at rest
- [ ] etcd access restricted
- [ ] API server behind firewall/security group
- [ ] Control plane updates applied regularly

### Node Security

- [ ] Nodes hardened (CIS benchmark)
- [ ] Minimal OS installation
- [ ] Automatic security updates
- [ ] SSH access restricted
- [ ] Node network access restricted (firewalls, security groups, or CNI host-level policies)

### Multi-Tenancy

- [ ] Namespaces for tenant isolation
- [ ] NetworkPolicies prevent cross-tenant traffic
- [ ] ResourceQuotas prevent resource hogging
- [ ] RBAC prevents cross-tenant access
- [ ] Consider separate clusters for strong isolation

## Compliance

### Standards

- [ ] CIS Kubernetes Benchmark compliance
- [ ] PCI DSS compliance (if applicable)
- [ ] HIPAA compliance (if applicable)
- [ ] SOC 2 compliance (if applicable)
- [ ] GDPR compliance (if applicable)

### Tools

- [ ] kube-bench for CIS compliance
- [ ] kubesec for manifest security
- [ ] Polaris for best practices
- [ ] kubeaudit for security audits

## Backup and Disaster Recovery

### Backups

- [ ] etcd backups automated
- [ ] PersistentVolume backups
- [ ] Configuration backups (manifests)
- [ ] Secrets backed up securely
- [ ] Backup restoration tested regularly

### Disaster Recovery

- [ ] DR plan documented
- [ ] RTO (Recovery Time Objective) defined
- [ ] RPO (Recovery Point Objective) defined
- [ ] Failover procedures tested
- [ ] Multi-region deployment (if required)

## Supply Chain Security

### Software Supply Chain

- [ ] Container images from trusted sources
- [ ] Base images regularly updated
- [ ] Dependencies scanned for vulnerabilities
- [ ] SBOM (Software Bill of Materials) generated
- [ ] Image provenance verified

### Deployment Pipeline

- [ ] CI/CD pipeline secured
- [ ] Pipeline secrets managed securely
- [ ] Code signing implemented
- [ ] Artifact signing (Cosign, Notary)
- [ ] Pipeline RBAC configured

## Incident Response

### Preparation

- [ ] Incident response plan documented
- [ ] Security contacts defined
- [ ] Escalation procedures established
- [ ] Playbooks for common scenarios

### Detection

- [ ] Real-time alerting configured
- [ ] Log correlation and analysis
- [ ] Anomaly detection
- [ ] Regular security reviews

### Response

- [ ] Automated response where possible
- [ ] Manual escalation procedures
- [ ] Communication plan
- [ ] Post-incident reviews

## Security Checklist Summary

### Critical (Must Have)

- [ ] Pod Security Standards enforced
- [ ] Non-root containers
- [ ] Read-only root filesystem
- [ ] Secrets not in code/manifests
- [ ] NetworkPolicies defined
- [ ] RBAC configured
- [ ] Resource limits set
- [ ] Image scanning enabled

### High Priority (Should Have)

- [ ] Encryption at rest
- [ ] External secrets management
- [ ] Audit logging
- [ ] Runtime security monitoring
- [ ] Regular vulnerability scans
- [ ] Automated updates

### Recommended (Nice to Have)

- [ ] Service mesh (mTLS)
- [ ] Policy engine (OPA/Kyverno)
- [ ] Image signing
- [ ] SBOM generation
- [ ] Advanced threat detection

## Regular Review Schedule

- **Daily**: Check security alerts and audit logs
- **Weekly**: Review access logs and RBAC permissions
- **Monthly**: Vulnerability scans, patch management
- **Quarterly**: Security audits, compliance reviews
- **Annually**: Full security assessment, DR testing

## Additional Resources

- [Kubernetes Security Best Practices](https://kubernetes.io/docs/concepts/security/)
- [CIS Kubernetes Benchmark](https://www.cisecurity.org/benchmark/kubernetes)
- [OWASP Kubernetes Security Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Kubernetes_Security_Cheat_Sheet.html)
- [NSA Kubernetes Hardening Guide](https://www.nsa.gov/Press-Room/News-Highlights/Article/Article/2716980/nsa-cisa-release-kubernetes-hardening-guidance/)
