Incident response plan for Weblate

Scope and objectives

This IRP covers incidents impacting the confidentiality, integrity, or availability of Weblate-operated deployments.

Note

This plan is specifically designed for deployments operated by Weblate s.r.o. Other deployments need to adapt provider-specific and organizational steps to their own environment.

Roles and responsibilities

Incident Response Lead (IRL): Coordinates all phases of the response process.
System Administrator: Executes containment and recovery measures.
Security Officer: Evaluates security impact and regulatory consequences.
Data Protection Officer (DPO): Evaluates if personal data (PII) was compromised and manages mandatory GDPR notifications.
Communications Lead: Manages notifications to internal stakeholders and external parties if required.

Communication logistics

Internal communication:
- Primary channel is Signal for human-to-human coordination.
- Technical alerts remain outside of Signal to avoid noise.
External communication:
- E-mail is used to reach customers.
- Customer contact lists are maintained in several locations to ensure access during service outages.
Public disclosure:
- If an incident includes a Weblate product vulnerability, follow the product vulnerability reporting process and the Vulnerability disclosure policy in Vulnerability and incident handling.

Incident categories and severity

Incident activation

Declare an incident when an event is confirmed or strongly suspected to affect the confidentiality, integrity, or availability of the service beyond routine operational noise.
The Security Officer declares the incident, assigns the initial severity, and appoints the Incident Response Lead (IRL).
If the Security Officer is unavailable, any available senior operator may declare the incident and hand over ownership as soon as practical.
Reclassify the incident if the scope or impact changes during investigation.

Incident types

Category 1 – Unauthorized Access
Category 2 – Data Integrity Violation
Category 3 – Service Outage or Degradation
Category 4 – Misconfiguration or Deployment Error

Severity levels and SLA

Severity	Definition	Target acknowledgment	Target initial action
Critical	Total outage; Admin compromise; Active data breach; requires immediate containment.	< 30 minutes	< 4 hours
High	Core feature failure; PII leak of single user.	< 2 hours	12 hours
Medium	Performance degradation; Minor security issue.	1 business day	3 business days
Low	UI bugs; Staging issues; Non-security errors.	Best effort	Best effort
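The hour-based targets in the table can be turned into concrete clock deadlines once an incident is declared. The following is a minimal sketch, not part of Weblate itself; the names `SLA_TARGETS` and `sla_deadlines` are hypothetical, and the Medium and Low rows are left out because they are expressed in business days or as best effort rather than as fixed durations.

```python
from datetime import datetime, timedelta, timezone

# Hour-based targets copied from the severity table. Medium (business days)
# and Low (best effort) have no fixed clock duration, so they are omitted.
SLA_TARGETS = {
    "Critical": {
        "acknowledge": timedelta(minutes=30),
        "initial_action": timedelta(hours=4),
    },
    "High": {
        "acknowledge": timedelta(hours=2),
        "initial_action": timedelta(hours=12),
    },
}


def sla_deadlines(severity, declared_at):
    """Return clock deadlines for a severity, or None without hour-based targets."""
    targets = SLA_TARGETS.get(severity)
    if targets is None:
        return None
    return {name: declared_at + delta for name, delta in targets.items()}


# Hypothetical declaration time used only for illustration.
declared = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
deadlines = sla_deadlines("Critical", declared)
print(deadlines["acknowledge"].isoformat())
print(deadlines["initial_action"].isoformat())
```

Using timezone-aware timestamps avoids ambiguity when responders work across time zones.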

Incident response lifecycle

Preparation

Ensure daily backups of the PostgreSQL database and the data directory using Weblate's built-in backup with rotation, see Backing up and moving Weblate.
Ensure Weblate uses a properly configured reverse proxy (e.g., NGINX) with HTTPS (TLS 1.2+).
Enable 2FA for all admin-level accounts.
Keep the Weblate instance and its dependencies (Python, Django, Celery, database, etc.) up to date.
Integrate with SIEM systems using the GELF protocol for audit and application log forwarding.
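To illustrate what a forwarded log record looks like, here is a minimal sketch that builds a GELF 1.1 message by hand using only the Python standard library. It is not Weblate's own logging configuration; the function names, the example host name, the extra fields, and the UDP port 12201 are assumptions made for illustration.

```python
import json
import socket
import time


def build_gelf_message(host, short_message, level=6, **extra):
    """Build a GELF 1.1 payload; additional fields get the required "_" prefix."""
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog levels: 4 = warning, 6 = informational
    }
    for key, value in extra.items():
        # GELF reserves "_id", so callers should not pass an "id" field.
        message[f"_{key}"] = value
    return message


def send_gelf_udp(message, server, port=12201):
    """Send one uncompressed GELF message as a single UDP datagram."""
    payload = json.dumps(message).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))


# Hypothetical audit event; nothing is sent over the network here.
event = build_gelf_message(
    "weblate.example.com",
    "Failed login",
    level=4,
    username="alice",
    source="audit",
)
print(json.dumps(event, indent=2))
```

In practice a maintained logging handler is preferable to hand-rolled transport; the sketch only shows the shape of the payload a SIEM would receive.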

Identification

Monitor system and application logs (journalctl, reverse proxy logs, Weblate application and audit logs).
Analyze login events, webhook executions, and push/pull failures.
Configure alerting (via Prometheus, Zabbix, or SIEM) for multiple login failures, unexpected restarts, or irregular VCS actions.
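An alert rule for multiple login failures typically counts failures per account inside a sliding time window. The sketch below shows that idea under stated assumptions: the input is a time-sorted list of (timestamp, username) pairs already parsed from the logs, and the threshold of 5 failures in 10 minutes is a hypothetical value, not one defined by this plan.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def find_login_bursts(failures, threshold=5, window=timedelta(minutes=10)):
    """Return usernames with at least `threshold` failures inside any `window`.

    `failures` is an iterable of (timestamp, username) pairs sorted by time.
    """
    recent = defaultdict(deque)
    flagged = set()
    for timestamp, username in failures:
        events = recent[username]
        events.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while timestamp - events[0] > window:
            events.popleft()
        if len(events) >= threshold:
            flagged.add(username)
    return flagged


# Hypothetical events: five failures for one account within five minutes.
start = datetime(2025, 1, 6, 9, 0)
failures = [(start + timedelta(minutes=i), "alice") for i in range(5)]
failures.append((start + timedelta(minutes=2), "bob"))
failures.sort()
print(find_login_bursts(failures))
```

The same counting logic can be expressed as a rule in Prometheus, Zabbix, or a SIEM; the Python version is only meant to make the condition explicit.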

Containment

Create an incident record with a case ID and record timeline updates as actions are taken.
Coordinate human response in Signal and keep technical alerting in the existing monitoring systems.
For Category 1 or 2 incidents, create a manual Hetzner Cloud Snapshot before taking disruptive action when it is safe to do so.
- Name format: IRP-[CaseID]-[YYYYMMDD]-Evidence.
- These are separate from standard rotating backups and must be preserved for analysis.
Isolate the affected host or service as needed (for example by firewall rules or service isolation).
Disable external integrations (Git/webhooks) if they are part of the attack vector.
Suspend affected user accounts immediately.
Revoke or rotate affected administrative, API, VCS, and webhook credentials as applicable.
Preserve relevant evidence, including system logs, reverse proxy logs, Weblate application and audit logs, affected configuration state, and the list of impacted credentials or integrations.
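Generating the evidence snapshot name from the case ID and date avoids typos under pressure. This is a small sketch of the IRP-[CaseID]-[YYYYMMDD]-Evidence format; the helper name and the case ID value are hypothetical, and the plan does not prescribe how case IDs are structured.

```python
from datetime import date


def evidence_snapshot_name(case_id, taken_on):
    """Format a snapshot name as IRP-[CaseID]-[YYYYMMDD]-Evidence."""
    return f"IRP-{case_id}-{taken_on:%Y%m%d}-Evidence"


# Hypothetical case ID used only for illustration.
print(evidence_snapshot_name("2025-001", date(2025, 1, 6)))
```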

Eradication

Remove any unauthorized code or data.
Patch known vulnerabilities by upgrading Weblate or server components.
Validate binary and repository integrity using SHA-256 checksums or Git logs.
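Checksum validation can be sketched with Python's standard `hashlib` module. The helper names below are hypothetical, and the expected digest is assumed to come from a trusted source such as a published release checksum; the example computes it locally only to stay self-contained.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with an expected value, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demonstration on a temporary file standing in for a downloaded artifact.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"example artifact")
    artifact = handle.name
expected = hashlib.sha256(b"example artifact").hexdigest()
print(matches_checksum(artifact, expected))
os.remove(artifact)
```

Reading in chunks keeps memory use flat for large archives or backups.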

Recovery

Restore affected services or data from the latest known-good Weblate backups.
PII Assessment: DPO determines if the breach requires a 72-hour GDPR notification.
Reintroduce services in a phased approach.
Confirm the root cause has been removed or a compensating control is in place before restoring normal traffic.
Rotate affected credentials and verify integrity of the restored system, repositories, and configuration.
The Security Officer and IRL approve returning to normal operations.
Monitor logs and system behavior continuously for at least 72 hours post-recovery.

Post-incident review

Timeline: Hold a review meeting within 5 business days of incident closure.
Compile a full incident timeline and actions taken.
Perform Root Cause Analysis (RCA) and document it within 10 business days.
Update security policies and IRP documentation based on findings.
Review the effectiveness of detection and containment mechanisms.
Verify whether escalation, alerting, and external communication followed Vulnerability and incident handling as expected.
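The review deadlines are stated in business days, which are easy to miscount across weekends. The sketch below counts Monday to Friday only; it assumes both deadlines run from the incident closure date and ignores public holidays, neither of which the plan specifies. The helper name is hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start, days):
    """Return the date `days` business days after `start` (Monday to Friday only)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


# Hypothetical closure date (a Wednesday).
closed = date(2025, 1, 8)
print(add_business_days(closed, 5))   # latest date for the review meeting
print(add_business_days(closed, 10))  # latest date for the documented RCA
```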