Runbook Examples

Use these examples as starting points for creating your own runbooks.

Service Restart

A basic template for safely restarting services.

Title: Service Restart Procedure

Description: Standard procedure for restarting a service with proper checks.

Steps:

Step 1: Check Current Status
- Connect to the server
- Run: systemctl status [service-name]
- Note the current state

Step 2: Notify Team
- Post in team channel: "Restarting [service] on [server]"
- Wait for acknowledgment if needed

Step 3: Stop Service
- Run: sudo systemctl stop [service-name]
- Verify stopped: systemctl status [service-name]

Step 4: Wait for Connections
- Wait 30 seconds for existing connections to close
- Check for remaining processes

Step 5: Start Service
- Run: sudo systemctl start [service-name]
- Check status: systemctl status [service-name]

Step 6: Is Service Running?
- If "active (running)" - Continue to Step 7
- If failed - Go to Step 10

Step 7: Verify Functionality
- Test main endpoint
- Check logs for errors
- Confirm the service is working properly

Step 8: Notify Complete
- Update team channel: "Service restart complete"
- Close any related tickets

Step 10: Troubleshooting Failed Start
- Check logs for error messages
- Verify configuration files
- Check disk space and memory
- If still failing, escalate
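
Steps 3 through 6 above can be sketched as a small shell function. This is a minimal sketch, not a definitive implementation: the names sc, restart_service, and DRAIN_SECONDS are illustrative assumptions, and it assumes a systemd host where the operator may use sudo.

```shell
#!/bin/sh
# Sketch of Steps 3-6. sc(), restart_service() and DRAIN_SECONDS are
# illustrative names, not part of the runbook itself.

DRAIN_SECONDS="${DRAIN_SECONDS:-30}"

# Thin wrapper so the privileged call is defined in one place.
sc() {
  sudo systemctl "$@"
}

restart_service() {
  service="$1"

  sc stop "$service" || return 1   # Step 3: stop the service
  sleep "$DRAIN_SECONDS"           # Step 4: let existing connections close
  sc start "$service"              # Step 5: start the service

  # Step 6: is-active exits 0 only when the unit is active.
  if sc is-active --quiet "$service"; then
    echo "running: continue to Step 7"
  else
    echo "failed: go to Step 10"
    return 1
  fi
}
```

Calling restart_service with a unit name prints which branch of Step 6 applies, so the operator can continue with verification or troubleshooting.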

Database Maintenance

Template for routine database maintenance tasks.

Title: Database Maintenance

Description: Regular maintenance including vacuum and index checks.

Steps:

Step 1: Check Database Activity
- Connect to database
- Check for active queries
- If busy, go to Step 10 (Reschedule)

Step 2: Start Maintenance Mode
- Update status page
- Notify team

Step 3: Run Vacuum
- Execute: VACUUM ANALYZE;
- Monitor progress
- Note completion time

Step 4: Check Index Health
- Run bloat check query
- Identify indexes needing attention

Step 5: Need Reindex?
- If indexes bloated - Continue to Step 6
- If not - Skip to Step 7

Step 6: Reindex Tables
- For each bloated index, run: REINDEX INDEX [index_name];
- Track completion of each reindex

Step 7: Verify Performance
- Run sample queries
- Check execution times
- Compare to baseline

Step 8: Exit Maintenance
- Update status page
- Notify team that maintenance is complete
- Document any issues

Step 10: Reschedule Procedure
- Database too busy for maintenance
- Schedule for off-hours
- Notify team of delay
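
The decision in Step 1 (proceed or reschedule) could be scripted roughly as follows. This sketch assumes PostgreSQL and the psql client, with connection settings taken from the usual PG* environment variables. MAX_ACTIVE, active_query_count, and next_step are hypothetical names, and the threshold of 5 is only a placeholder.

```shell
#!/bin/sh
# Sketch of Step 1's decision, assuming PostgreSQL and the psql client.
# MAX_ACTIVE is a hypothetical threshold; pick one that fits your workload.

MAX_ACTIVE="${MAX_ACTIVE:-5}"

# Count queries currently executing, excluding this session.
active_query_count() {
  psql -X -At -c "SELECT count(*) FROM pg_stat_activity
                  WHERE state = 'active' AND pid <> pg_backend_pid();"
}

# Print the next step for a given number of active queries.
next_step() {
  if [ "$1" -gt "$MAX_ACTIVE" ]; then
    echo "Step 10: reschedule"
  else
    echo "Step 2: start maintenance mode"
  fi
}

# Usage: next_step "$(active_query_count)"
```

Keeping the threshold check separate from the query makes it possible to try the branch logic without a database connection.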

Deployment Rollback

Quick rollback when deployments have issues.

Title: Emergency Rollback

Description: Roll back to the previous version when the current deployment has problems.

Steps:

Step 1: Confirm Rollback Needed
- Verify issue is deployment-related
- Get approval if required
- Note the problem for review

Step 2: Identify Previous Version
- Check deployment history
- Find last known good version
- Note version number

Step 3: Stop Current Version
- Disable traffic to affected servers
- Stop application services
- Wait for requests to complete

Step 4: Deploy Previous Version
- Run deployment with old version
- Monitor deployment progress

Step 5: Deployment Successful?
- If success - Continue to Step 6
- If failed - Go to Step 10

Step 6: Start Services
- Start application services
- Enable traffic flow
- Monitor startup logs

Step 7: Verify Functionality
- Test critical endpoints
- Check error rates
- Monitor for 5 minutes

Step 8: Is Everything Stable?
- If yes - Continue to Step 9
- If no - Go to Step 15 (Escalate)

Step 9: Document and Notify
- Update incident ticket
- Notify team of rollback
- Schedule review

Step 10: Deployment Failed
- Check disk space
- Verify permissions
- Try manual deployment
- If still failing - Go to Step 15

Step 15: Escalate
- Contact senior engineer
- Provide all error details
- Stand by to assist
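
Step 2 (find the last known good version) depends on how your deployment history is stored. As one possible sketch, assume a hypothetical plain-text history file with one deployment per line in the form "date version status", oldest first. The file format and the function name last_good_version are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch of Step 2, assuming a hypothetical history file with lines like:
#   2024-05-08 v1.5.0 ok
# listed oldest first.

# $1: history file, $2: version currently deployed (skipped in the search).
# Prints the most recent version marked "ok"; exits 1 if there is none.
last_good_version() {
  awk -v current="$2" '
    $3 == "ok" && $2 != current { v = $2 }
    END { if (v != "") print v; else exit 1 }
  ' "$1"
}

# Usage: last_good_version deploy-history.log v1.7.0
```

A non-zero exit status means no earlier good version was found, which is a reasonable point to go to Step 15 (Escalate).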

SSL Certificate Renewal

Process for renewing certificates before expiration.

Title: SSL Certificate Renewal

Description: Renew SSL certificates before they expire.

Steps:

Step 1: Check Expiration
- Run: openssl x509 -enddate -noout -in /path/to/cert.pem
- Note expiration date
- Verify this is the correct certificate

Step 2: Generate CSR
- Create new private key if needed
- Generate CSR
- Verify CSR details

Step 3: Submit to Certificate Authority
- Log into CA portal
- Submit CSR
- Select validation method

Step 4: Complete Validation
- Follow CA's validation process
- Wait for approval

Step 5: Download New Certificate
- Download from CA portal
- Save certificate files
- Verify certificate details

Step 6: Install Certificate
- Back up the current certificate
- Copy new cert to server
- Update configuration files

Step 7: Restart Services
- Restart web server
- Check service status
- Monitor error logs

Step 8: Verify Installation
- Test with browser
- Check for security warnings
- Verify the correct certificate is displayed

Step 9: Cleanup
- Remove old certificate files
- Update documentation
- Set renewal reminder
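
Steps 1 and 2 can be sketched with the openssl command line. The function names, file names, and the RSA 2048-bit key choice below are illustrative assumptions. Your CA may require a different key type or additional subject fields.

```shell
#!/bin/sh
# Sketch of Step 1 (expiry check) and Step 2 (key and CSR).
# Function names and file paths are illustrative assumptions.

# Exit 0 if the certificate expires within the given number of days.
# Note: an unreadable certificate file is also reported as expiring.
expires_within_days() {
  cert="$1"
  days="$2"
  # -checkend exits 0 when the cert is still valid after N seconds,
  # so the result is inverted here.
  ! openssl x509 -checkend "$((days * 86400))" -noout -in "$cert" >/dev/null
}

# Create a new 2048-bit RSA key and a CSR for the given common name.
generate_csr() {
  cn="$1"; key="$2"; csr="$3"
  openssl req -new -newkey rsa:2048 -nodes \
    -keyout "$key" -out "$csr" -subj "/CN=$cn"
}

# Verify CSR details before submitting it:
#   openssl req -in example.csr -noout -text -verify
```

For example, `expires_within_days /path/to/cert.pem 30 && echo "renew now"` flags a certificate that has fewer than 30 days left.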

Customizing Examples

Replace Placeholders

Replace bracketed values with your actual:

- Service names
- Server addresses
- File paths
- Command arguments
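
If you keep a runbook as a plain-text template, the bracketed placeholders can be filled in with sed. This is a minimal sketch: fill_placeholders and the example values are hypothetical, and it only handles the [service-name], [service], and [server] placeholders used in the Service Restart example.

```shell
#!/bin/sh
# Sketch: fill bracketed placeholders in a runbook template read from stdin.
# Values containing "/" or "&" would need escaping before use with sed.

fill_placeholders() {
  service="$1"
  server="$2"
  # The opening bracket is escaped so sed treats it literally.
  sed -e "s/\[service-name]/$service/g" \
      -e "s/\[service]/$service/g" \
      -e "s/\[server]/$server/g"
}

# Usage: fill_placeholders nginx web-01 < restart-template.txt > restart-nginx.txt
```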

Adjust for Your Environment

- Add or remove steps as needed
- Change decision points
- Include your specific tools and commands

Test Before Publishing

- Execute the runbook in a test environment
- Verify each path works
- Get feedback from teammates
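
Part of verifying each path can be automated before a manual test run. The sketch below assumes the runbook is saved as a plain-text file in the format used on this page (headings such as "Step 10: ..." and bullet lines such as "- If failed - Go to Step 10"). It reports any step number that a bullet refers to but that has no matching heading. The function name check_step_targets is hypothetical, and grep -o is assumed to be available (it is in GNU, BSD, and BusyBox grep).

```shell
#!/bin/sh
# Sketch: check that every "Step N" referenced in a bullet line has a
# matching "Step N:" heading in the same runbook file.

check_step_targets() {
  file="$1"
  status=0
  for n in $(grep '^-' "$file" | grep -o 'Step [0-9][0-9]*' \
               | awk '{ print $2 }' | sort -un); do
    if ! grep -q "^Step $n:" "$file"; then
      echo "missing target: Step $n"
      status=1
    fi
  done
  return "$status"
}

# Usage: check_step_targets service-restart.txt
```

This only catches dangling references. It does not replace executing each branch in a test environment.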

Related

- Creating Runbooks: Build your own runbooks
- Decision Steps: Add conditional logic
- Runbooks Overview: Understanding runbooks
