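The 10 most critical security risks for cloud-native applications and how to mitigate them.
The OWASP Cloud-Native Application Security Top 10 identifies the most prominent security risks for cloud-native applications running on platforms such as Kubernetes, Docker, and serverless architectures. It covers misconfiguration, supply chain risk, secrets management, and more across the entire cloud-native stack.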
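Misconfigured cloud services, containers, and orchestrators are a leading cause of cloud-native vulnerabilities. Examples include running containers as root, relying on default configurations, exposing the Kubernetes API server publicly, and leaving audit logging disabled on cloud resources.
Attackers can exploit misconfigurations to gain full cluster control, escape containers, reach cloud metadata services, or move laterally across the infrastructure. A single misconfigured S3 bucket or open Kubernetes dashboard can lead to a large-scale data breach.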
```dockerfile
# Running container as root with no resource limits
FROM ubuntu:latest
RUN apt-get update && apt-get install -y curl wget
COPY app /app
# No USER directive: runs as root!
CMD ["/app/server"]
```
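```dockerfile
# Minimal distroless base image running as an unprivileged user
FROM gcr.io/distroless/static:nonroot
# 65532 is the UID/GID of the "nonroot" user defined in distroless images
COPY --chown=65532:65532 app /app
USER 65532:65532
EXPOSE 8080
ENTRYPOINT ["/app/server"]
```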
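The two Dockerfiles above differ in ways that are easy to check automatically. The following is a minimal sketch of such a check, not a replacement for a real IaC scanner: `lint_dockerfile` is a hypothetical helper that flags a missing or root `USER` directive and base images that use a mutable tag. It assumes a simple Dockerfile and does not handle `ARG` substitution, line continuations, or `FROM` lines that reference an earlier build stage.

```python
def lint_dockerfile(text: str) -> list[str]:
    """Return findings for a Dockerfile passed in as a string (illustrative only)."""
    findings = []
    user = None  # last USER value seen in the final build stage
    for raw in text.splitlines():
        parts = raw.strip().split(None, 1)
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        instruction = parts[0].upper()
        args = parts[1].split() if len(parts) > 1 else []
        if instruction == "FROM":
            user = None  # every stage starts as the base image's default user
            # Ignore flags such as --platform=linux/amd64
            images = [a for a in args if not a.startswith("--")]
            if not images:
                continue
            image = images[0]
            if image == "scratch" or "@" in image:
                continue  # empty base image, or image pinned by digest
            last_component = image.rsplit("/", 1)[-1]
            tag = last_component.split(":", 1)[1] if ":" in last_component else None
            if tag in (None, "latest"):
                findings.append(f"mutable base image tag: {image}")
        elif instruction == "USER" and args:
            user = args[0]
    if user is None:
        findings.append("no USER directive")
    elif user.split(":")[0] in ("root", "0"):
        findings.append("USER is root")
    return findings


if __name__ == "__main__":
    insecure = 'FROM ubuntu:latest\nCOPY app /app\nCMD ["/app/server"]\n'
    for finding in lint_dockerfile(insecure):
        print(finding)
```

A check like this could run in CI before the image is built. Note that a missing `USER` directive is only a hint: some base images, such as the distroless `nonroot` variants, already default to an unprivileged user.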
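Cloud-native applications receive input not only from traditional HTTP requests but also from cloud events (SQS, Pub/Sub, EventBridge), serverless triggers, and service-to-service communication. An injection flaw in any of these vectors can lead to command execution, data exfiltration, or privilege escalation.
Serverless functions triggered by cloud events may process untrusted data without validation, leading to OS command injection, NoSQL injection, or event-driven SSRF. Attackers can poison message queues or event buses to compromise downstream services.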
```python
# Lambda function processing S3 event without sanitizing filename
import os

def handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
    # Command injection via crafted filename!
    os.system(f"aws s3 cp s3://{bucket}/{key} /tmp/{key}")
```
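```python
# Lambda function that validates the object key and avoids the shell entirely
import os
import re
from urllib.parse import unquote_plus

import boto3

s3 = boto3.client('s3')  # Create the client once so warm invocations reuse it

def handler(event, context):
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    # Object keys in S3 event notifications are URL-encoded
    key = unquote_plus(record['object']['key'])

    # Validate key against allowlist pattern (fullmatch also rejects a trailing newline)
    if not re.fullmatch(r'[\w\-./]+', key):
        raise ValueError(f"Invalid S3 key: {key!r}")

    # Keep only the final path component so the download stays inside /tmp
    filename = os.path.basename(key)
    if filename in ('', '.', '..'):
        raise ValueError(f"Invalid S3 key: {key!r}")

    # Use SDK instead of shell commands
    s3.download_file(bucket, key, os.path.join('/tmp', filename))
```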
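The same allowlist approach applies to message queues and event buses, where a poisoned message can reach downstream services. The sketch below validates a queue message body before any business logic runs. The message schema (`order_id`, `quantity`), the `InvalidMessage` exception, and the `parse_order_message` function are hypothetical and exist only for illustration.

```python
import json
import re

# Hypothetical schema: an order ID made of safe characters plus a bounded quantity
ORDER_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")
ALLOWED_FIELDS = {"order_id", "quantity"}


class InvalidMessage(ValueError):
    """Raised when a queue message does not match the expected schema."""


def parse_order_message(body: str) -> dict:
    """Parse and strictly validate an untrusted message body."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise InvalidMessage("body is not valid JSON") from exc
    if not isinstance(data, dict):
        raise InvalidMessage("body must be a JSON object")

    unknown = set(data) - ALLOWED_FIELDS
    if unknown:
        raise InvalidMessage(f"unexpected fields: {sorted(unknown)}")

    order_id = data.get("order_id")
    # Rejects shell metacharacters and NoSQL operator objects such as {"$ne": null}
    if not isinstance(order_id, str) or not ORDER_ID_RE.fullmatch(order_id):
        raise InvalidMessage("order_id must be 1-64 characters of [A-Za-z0-9_-]")

    quantity = data.get("quantity")
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(quantity, bool) or not isinstance(quantity, int) or not 1 <= quantity <= 1000:
        raise InvalidMessage("quantity must be an integer between 1 and 1000")

    return {"order_id": order_id, "quantity": quantity}
```

Rejecting anything outside a narrow schema, instead of trying to strip dangerous characters, means a crafted value fails closed before it reaches a shell, a query, or an outbound request.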
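Cloud-native environments involve multiple identity layers: cloud IAM, Kubernetes RBAC, service mesh mTLS, and application-level authentication. A misconfigured or overly permissive policy at any layer can lead to lateral movement, privilege escalation, or unauthorized access to sensitive resources.
An over-privileged IAM role assigned to a pod can give a workload access to the entire cloud account. Without service-to-service authentication, any compromised pod can impersonate other services.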
```yaml
# Overly permissive ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: default-admin
subjects:
  - kind: ServiceAccount
    name: default        # Default SA, shared by all pods!
    namespace: default
roleRef:
  kind: ClusterRole
  name: cluster-admin    # Full cluster access!
  apiGroup: rbac.authorization.k8s.io
```
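```yaml
# Dedicated ServiceAccount with minimal Role
apiVersion: v1
kind: ServiceAccount
metadata:
  name: order-service
  namespace: production
  annotations:
    # IRSA: maps this ServiceAccount to a narrowly scoped IAM role (placeholder account ID)
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/order-svc
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: order-service-role
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
---
# Bind the Role to the ServiceAccount; without a binding the Role grants nothing
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: order-service-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: order-service
    namespace: production
roleRef:
  kind: Role
  name: order-service-role
  apiGroup: rbac.authorization.k8s.io
```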
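Bindings like the overly permissive one above can be detected before they reach a cluster. The sketch below assumes the manifests have already been parsed into Python dictionaries (for example by a YAML loader, which is not shown) and flags any `ClusterRoleBinding` that grants `cluster-admin` to a `default` ServiceAccount or to a broad built-in group. The function name `find_risky_bindings` and the exact rule set are illustrative assumptions.

```python
# Built-in Kubernetes groups that cover very large sets of identities
BROAD_GROUPS = {"system:authenticated", "system:unauthenticated", "system:serviceaccounts"}


def find_risky_bindings(manifests: list[dict]) -> list[str]:
    """Flag ClusterRoleBindings that hand cluster-admin to shared identities."""
    findings = []
    for doc in manifests:
        if doc.get("kind") != "ClusterRoleBinding":
            continue
        if (doc.get("roleRef") or {}).get("name") != "cluster-admin":
            continue
        binding = (doc.get("metadata") or {}).get("name", "<unnamed>")
        for subject in doc.get("subjects") or []:
            kind, name = subject.get("kind"), subject.get("name")
            if kind == "ServiceAccount" and name == "default":
                namespace = subject.get("namespace", "<none>")
                findings.append(
                    f"{binding}: cluster-admin bound to default ServiceAccount in {namespace}"
                )
            elif kind == "Group" and name in BROAD_GROUPS:
                findings.append(f"{binding}: cluster-admin bound to group {name}")
    return findings


if __name__ == "__main__":
    risky = {
        "kind": "ClusterRoleBinding",
        "metadata": {"name": "default-admin"},
        "subjects": [{"kind": "ServiceAccount", "name": "default", "namespace": "default"}],
        "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
    }
    for finding in find_risky_bindings([risky]):
        print(finding)
```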
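CI/CD pipelines are high-value targets because they hold write access to production environments. Compromising a build pipeline enables supply chain attacks, malicious code injection, and deployment of backdoored images. Insecure pipeline configurations, poisoned base images, and unsigned artifacts all contribute to this risk.
A compromised CI/CD pipeline can deploy malicious code to every environment, steal secrets stored in pipeline variables, or tamper with container images. SolarWinds-style attacks have demonstrated the catastrophic impact of a compromised supply chain.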
```yaml
# Insecure CI pipeline: unpinned actions, no image signing
name: Deploy
on: push
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@main  # Unpinned: could be compromised!
      - run: |
          docker build -t myregistry/myapp:latest .
          docker push myregistry/myapp:latest  # No signing!
          kubectl apply -f deploy.yaml  # No admission control!
```
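```yaml
# Hardened pipeline: pinned action, vulnerability scan, signed image
# Assumes cosign and trivy are already installed on the runner
name: Secure Deploy
on: push
permissions:
  contents: read   # Needed by actions/checkout
  id-token: write  # Needed for keyless (OIDC) signing with cosign
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29  # Pinned SHA
      - run: docker build -t myregistry/myapp:${{ github.sha }} .
      # Scan before publishing; a non-zero exit code fails the job
      - run: trivy image --exit-code 1 myregistry/myapp:${{ github.sha }}
      - run: |
          docker push myregistry/myapp:${{ github.sha }}
          cosign sign --yes myregistry/myapp:${{ github.sha }}  # Sign image
      - run: kubectl apply -f deploy.yaml  # Admission controller verifies signatures
```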
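Pinning actions to a commit SHA only helps if every workflow does it consistently. The following sketch scans workflow text for `uses:` references that are not pinned to a full 40-character commit SHA. It relies on a regular expression instead of a YAML parser, so treat it as an approximation; `unpinned_actions` is a hypothetical helper name.

```python
import re

# Matches "uses: owner/repo@ref" with an optional leading list dash
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action references that are not pinned to a full commit SHA."""
    unpinned = []
    for ref in USES_RE.findall(workflow_text):
        ref = ref.strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and container references are not checked here
        _, separator, version = ref.partition("@")
        if not separator or not FULL_SHA_RE.fullmatch(version):
            unpinned.append(ref)
    return unpinned


if __name__ == "__main__":
    workflow = """
    steps:
      - uses: actions/checkout@main  # Unpinned
      - uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29
    """
    print(unpinned_actions(workflow))
```

Branch names and version tags are both reported because either can be moved to point at different code after review, whereas a full commit SHA cannot.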
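Secrets such as API keys, database credentials, and TLS certificates are often hard-coded in source code, stored in plaintext ConfigMaps, or baked into container images. By default, Kubernetes Secrets are only base64-encoded, not encrypted, so they offer minimal protection unless encryption at rest is configured.
Exposed secrets give attackers direct access to databases, cloud accounts, and third-party services. Secrets in git history, container layers, or environment variables are easy to discover and exploit.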
```yaml
# Secrets in plaintext environment variables
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      env:
        - name: DB_PASSWORD
          value: "SuperSecret123!"  # Plaintext in manifest!
        - name: AWS_SECRET_KEY
          value: "AKIA..."  # Cloud credentials in YAML!
```
```yaml
# Use External Secrets Operator with a vault backend
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: my-app-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: my-app-secrets
  data:
    - secretKey: db-password
      remoteRef:
        key: secret/data/myapp
        property: db-password
```
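To make the point about base64 concrete, the sketch below shows that anyone who can read a Secret manifest can recover its values with a single standard-library call and no key. The manifest is a hypothetical example built in the script itself, and `decode_secret_data` is an illustrative helper.

```python
import base64


def decode_secret_data(secret: dict) -> dict[str, str]:
    """Decode every value in a Secret's data field; no key or password is needed."""
    return {
        name: base64.b64decode(value).decode("utf-8")
        for name, value in (secret.get("data") or {}).items()
    }


# Hypothetical Secret manifest, encoded the same way kubectl would encode it
secret_manifest = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "my-app-secrets"},
    "data": {"db-password": base64.b64encode(b"SuperSecret123!").decode("ascii")},
}

if __name__ == "__main__":
    print(secret_manifest["data"]["db-password"])  # looks opaque, but is only encoded
    print(decode_secret_data(secret_manifest))     # original value recovered
```

This is why the mitigations for this risk focus on encryption at rest, restrictive RBAC on Secret objects, and external secret stores: the encoding itself provides no confidentiality.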
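By default, Kubernetes allows all pods to communicate with each other without restriction. Missing or overly permissive network policies, security groups, and firewall rules enable lateral movement within the cluster and between cloud services.
Without network segmentation, a compromised pod can reach any other service in the cluster, connect directly to databases, or exfiltrate data to external endpoints. A flat network enlarges the blast radius of any breach.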
```yaml
# No NetworkPolicy: all pods can talk to everything
apiVersion: v1
kind: Namespace
metadata:
  name: production
# No NetworkPolicy resources defined
# All ingress and egress traffic is allowed by default
```
```yaml
# Default-deny all traffic, then allow only what's needed
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
# Because egress is also denied by default, the frontend pods additionally need
# an egress rule to api-server (and usually to DNS) before this traffic can flow
```
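The way these policies combine can be surprising, so a small model helps. The sketch below is a deliberately simplified evaluation of ingress rules, assuming a single namespace, `matchLabels` selectors only, and numeric ports. Policies are represented by their `spec` as Python dictionaries, and `ingress_allowed` is a hypothetical function, not part of any Kubernetes client.

```python
def selects(selector: dict, labels: dict) -> bool:
    """A selector matches when all its matchLabels are present; {} matches every pod."""
    return all(labels.get(k) == v for k, v in (selector.get("matchLabels") or {}).items())


def ingress_allowed(policies: list[dict], src: dict, dst: dict, port: int) -> bool:
    """Decide whether traffic from pod labels src to pod labels dst on port is allowed."""
    # Ingress is implied when policyTypes is omitted
    applicable = [
        p for p in policies
        if "Ingress" in p.get("policyTypes", ["Ingress"]) and selects(p["podSelector"], dst)
    ]
    if not applicable:
        return True  # no policy selects the destination: all ingress is allowed
    for policy in applicable:
        for rule in policy.get("ingress", []):
            peers, ports = rule.get("from"), rule.get("ports")
            peer_ok = peers is None or any(
                "podSelector" in peer and selects(peer["podSelector"], src) for peer in peers
            )
            port_ok = ports is None or any(p.get("port") == port for p in ports)
            if peer_ok and port_ok:
                return True  # policies are additive: one matching rule is enough
    return False


# The two policies above, reduced to their spec fields
DEFAULT_DENY = {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]}
ALLOW_FRONTEND = {
    "podSelector": {"matchLabels": {"app": "api-server"}},
    "ingress": [{
        "from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
        "ports": [{"port": 8080}],
    }],
}
```

Under this model, the frontend reaches `api-server` on port 8080, while any other pod, or the same pod on a different port, is denied. With an empty policy list every connection is allowed, which mirrors the flat-network default.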
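Cloud-native applications rely heavily on open-source base images, libraries, and Kubernetes operators. Running outdated or unpatched components with known CVEs exposes applications to well-documented vulnerabilities. A container image often contains hundreds of packages, any of which may be vulnerable.
Known vulnerabilities in base images (such as Log4Shell or OpenSSL flaws) can be exploited with publicly available tools. Attackers actively scan for containers running vulnerable versions. A single unpatched library can compromise an entire workload.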
```dockerfile
# Using outdated base image with known CVEs
# EOL version with known vulns
FROM node:14
COPY package.json .
# No audit, no lockfile verification
RUN npm install
COPY . .
CMD ["node", "server.js"]
```
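```dockerfile
# Use current, slim base image with vulnerability scanning
FROM node:22-slim AS build
WORKDIR /app
COPY package.json package-lock.json ./
# Reproducible install from the lockfile, without devDependencies
RUN npm ci --omit=dev
# Fail the build on high or critical vulnerabilities
RUN npm audit --omit=dev --audit-level=high
# Copy the application source after installing dependencies to keep layers cacheable
COPY . .

FROM gcr.io/distroless/nodejs22-debian12:nonroot
WORKDIR /app
COPY --from=build /app /app
# The distroless Node.js image uses node as its entrypoint
CMD ["/app/server.js"]
```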
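The severity gate that `--audit-level` applies can be reproduced in a few lines, which is useful when a pipeline needs to report findings as well as fail on them. The sketch below assumes per-severity counts are already available as a dictionary. That shape is modelled on the summary counts in `npm audit --json` output, but treat it as an assumption and check the output of your npm version. The counts used here are made up.

```python
# npm's severity levels, from least to most severe
SEVERITY_ORDER = ["info", "low", "moderate", "high", "critical"]


def blocking_findings(counts: dict[str, int], threshold: str = "high") -> dict[str, int]:
    """Return the non-zero counts at or above the threshold severity."""
    start = SEVERITY_ORDER.index(threshold)  # raises ValueError for unknown levels
    return {
        level: counts[level]
        for level in SEVERITY_ORDER[start:]
        if counts.get(level, 0) > 0
    }


if __name__ == "__main__":
    # Hypothetical scan result
    counts = {"info": 0, "low": 3, "moderate": 1, "high": 0, "critical": 2}
    blockers = blocking_findings(counts)
    if blockers:
        raise SystemExit(f"Build blocked by vulnerabilities: {blockers}")
```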
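Cloud-native environments make it easy to spin up resources but hard to keep track of them. Abandoned containers, forgotten namespaces, stale cloud resources, and shadow IT deployments create an unmanaged attack surface. Without a proper asset inventory, security teams cannot protect assets they do not know exist.
Forgotten or unmanaged resources often run outdated software, lack security monitoring, and hold stale credentials. Attackers use these neglected assets as entry points because they are unlikely to be monitored or patched.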
```hcl
# Resources created without tagging or lifecycle management
resource "aws_instance" "test_server" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t3.medium"
  # No tags: who owns this? What's it for?
  # No lifecycle policy: runs forever
}

resource "aws_s3_bucket" "temp_data" {
  bucket = "temp-data-2024"
  # No expiration, no access logging
}
```
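```hcl
# Tagged resources with an owner, an expiry date, and automatic object expiration
resource "aws_instance" "test_server" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t3.medium"

  tags = {
    Name        = "test-server"
    Owner       = "platform-team"
    Environment = "staging"
    ManagedBy   = "terraform"
    ExpiresAt   = "2026-04-30"
  }
}

resource "aws_s3_bucket" "temp_data" {
  bucket = "temp-data-2024"

  tags = {
    Owner     = "data-team"
    ManagedBy = "terraform"
  }
}

# AWS provider v4 and later manage lifecycle rules in a separate resource;
# the inline lifecycle_rule block on aws_s3_bucket is deprecated
resource "aws_s3_bucket_lifecycle_configuration" "temp_data" {
  bucket = aws_s3_bucket.temp_data.id

  rule {
    id     = "expire-temp-data"
    status = "Enabled"

    filter {}

    expiration {
      days = 90
    }
  }
}
```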
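Tags only help if something reads them. The sketch below audits a resource inventory for missing ownership tags and for resources whose `ExpiresAt` date has passed. The inventory format (a list of dictionaries with `id` and `tags`), the required tag set, and the `audit_resources` function are assumptions for illustration. In practice the inventory would come from a cloud provider API or from Terraform state.

```python
from datetime import date

# Assumed organisational policy: every resource needs an owner and a management source
REQUIRED_TAGS = {"Owner", "ManagedBy"}


def audit_resources(resources: list[dict], today: date) -> dict[str, list[str]]:
    """Map resource IDs to a list of inventory problems; clean resources are omitted."""
    findings = {}
    for resource in resources:
        tags = resource.get("tags") or {}
        issues = []
        missing = sorted(REQUIRED_TAGS - tags.keys())
        if missing:
            issues.append("missing tags: " + ", ".join(missing))
        expires = tags.get("ExpiresAt")
        if expires:
            try:
                if date.fromisoformat(expires) < today:
                    issues.append(f"expired on {expires}")
            except ValueError:
                issues.append(f"unparseable ExpiresAt: {expires}")
        if issues:
            findings[resource["id"]] = issues
    return findings


if __name__ == "__main__":
    inventory = [
        {"id": "test_server", "tags": {"Owner": "platform-team", "ManagedBy": "terraform",
                                       "ExpiresAt": "2026-04-30"}},
        {"id": "orphan", "tags": {}},
    ]
    for resource_id, issues in audit_resources(inventory, date.today()).items():
        print(resource_id, issues)
```

The report from a check like this can feed an automated cleanup job or a ticket to the owning team, depending on how much the tags are trusted.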
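Without proper resource quotas and limits, a single misbehaving or compromised workload can consume all available CPU, memory, or storage and deny service to other applications. Cryptojacking attacks are especially common in environments without resource limits.
Attackers can deploy cryptomining containers or other resource-intensive workloads that consume all cluster resources. Without limits, a fork bomb or memory leak in one pod can crash an entire node and affect every co-located workload.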
```yaml
# Pod with no resource limits
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: myapp:latest
      # No resources block: can consume unlimited CPU/memory!
```
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: myapp:1.2.3
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
        limits:
          cpu: "500m"
          memory: "512Mi"
---
# Namespace-level quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
```
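It helps to work out what a quota actually permits. The sketch below parses the CPU and memory quantities used above and computes how many copies of the pod fit inside the namespace quota, and which quota entry runs out first. The parsers cover only the suffixes that appear in this example (`m` for CPU; `Ki`, `Mi`, `Gi`, `Ti` for memory), and the function names are hypothetical.

```python
from decimal import Decimal

MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> int:
    """Convert a CPU quantity such as '500m' or '20' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' to bytes."""
    for suffix, factor in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a plain number is already in bytes


def max_replicas(pod: dict[str, str], quota: dict[str, str]) -> tuple[int, str]:
    """Return how many pods fit in the quota and the quota key that binds first."""
    fits = {}
    for key, hard_limit in quota.items():
        parse = parse_cpu if key.endswith(".cpu") else parse_memory
        fits[key] = parse(hard_limit) // parse(pod[key])
    binding = min(fits, key=fits.get)
    return fits[binding], binding


# Values taken from the Pod and ResourceQuota manifests above
POD = {"requests.cpu": "100m", "requests.memory": "128Mi",
       "limits.cpu": "500m", "limits.memory": "512Mi"}
QUOTA = {"requests.cpu": "10", "requests.memory": "20Gi",
         "limits.cpu": "20", "limits.memory": "40Gi"}

if __name__ == "__main__":
    print(max_replicas(POD, QUOTA))
```

For these manifests the result is 40 replicas, bounded by `limits.cpu` (20 cores divided by 500m per pod), even though the request quotas alone would admit 100 pods.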
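Cloud-native environments generate logs across multiple layers: application, container runtime, orchestrator, service mesh, and cloud platform. Without centralized logging, cross-layer correlation, and runtime threat detection, security incidents go undetected or are discovered too late.
Short-lived containers lose their logs on termination, destroying forensic evidence. Without Kubernetes audit logs and runtime monitoring, attackers can create backdoors, escalate privileges, or exfiltrate data without triggering any alerts.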
```yaml
# No audit policy, no log forwarding
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: myapp:latest
# Logs go to stdout only: lost when pod restarts
# No audit logging configured on the cluster
# No runtime security monitoring
```
```yaml
# Kubernetes audit policy for security events
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log only metadata for secrets and configmaps so that sensitive
  # payloads are never written to the audit log
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Capture full request and response bodies for RBAC changes
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["clusterroles", "clusterrolebindings"]
  - level: Metadata
    verbs: ["create", "delete", "patch"]
# Deploy Falco for runtime threat detection
# Forward logs to SIEM via Fluent Bit / Fluentd
```
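Collecting audit events is only useful if something inspects them. The sketch below reads audit events in JSON-lines form and reports write operations on cluster-wide RBAC objects, a common sign of privilege escalation. It uses the `verb`, `user.username`, and `objectRef` fields of the `audit.k8s.io/v1` event format. The function name `rbac_change_alerts`, the watched verbs, and the sample event are illustrative assumptions, and a production setup would normally do this in the SIEM.

```python
import json

WRITE_VERBS = {"create", "update", "patch", "delete"}
WATCHED_RESOURCES = {"clusterroles", "clusterrolebindings"}


def rbac_change_alerts(lines) -> list[str]:
    """Return one alert string per audit event that modifies cluster-wide RBAC."""
    alerts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if (
            ref.get("apiGroup") == "rbac.authorization.k8s.io"
            and ref.get("resource") in WATCHED_RESOURCES
            and event.get("verb") in WRITE_VERBS
        ):
            user = (event.get("user") or {}).get("username", "<unknown>")
            name = ref.get("name", "<unnamed>")
            alerts.append(f"{user} {event['verb']} {ref['resource']}/{name}")
    return alerts


if __name__ == "__main__":
    # Hypothetical audit event for demonstration
    sample = json.dumps({
        "kind": "Event",
        "apiVersion": "audit.k8s.io/v1",
        "verb": "create",
        "user": {"username": "system:serviceaccount:default:default"},
        "objectRef": {"apiGroup": "rbac.authorization.k8s.io",
                      "resource": "clusterrolebindings", "name": "default-admin"},
    })
    print(rbac_change_alerts([sample]))
```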
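| ID | Vulnerability | Severity | Key Mitigations |
|---|---|---|---|
| CNAS-1 | Insecure cloud, container, or orchestration configuration | Critical | Distroless images, Pod Security Standards, IaC scanning |
| CNAS-2 | Injection flaws (cloud events) | Critical | Input validation, SDK over shell, egress filtering |
| CNAS-3 | Improper authentication and authorization | Critical | Least-privilege RBAC, IRSA, mTLS |
| CNAS-4 | CI/CD pipeline and software supply chain flaws | High | Pinned SHAs, image signing, SLSA provenance |
| CNAS-5 | Insecure secrets storage | Critical | Vault or secrets manager, KMS encryption, secret scanning |
| CNAS-6 | Over-permissive network policies | High | Default deny, egress restrictions, Calico/Cilium |
| CNAS-7 | Components with known vulnerabilities | High | Image scanning, minimal base images, SBOM |
| CNAS-8 | Improper asset management | Medium | Resource tagging, automated cleanup, IaC |
| CNAS-9 | Inadequate compute resource quotas | Medium | Resource limits, ResourceQuotas, usage monitoring |
| CNAS-10 | Ineffective logging and monitoring | High | Audit logging, Falco/Sysdig, centralized SIEM |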