Runbook Examples

Use these examples as starting points for creating your own runbooks.

Service Restart

A basic template for safely restarting services.

Title: Service Restart Procedure

Description: Standard procedure for restarting a service with proper checks.

Steps:

Step 1: Check Current Status
- Connect to the server
- Run: systemctl status [service-name]
- Note the current state

Step 2: Notify Team
- Post in team channel: "Restarting [service] on [server]"
- Wait for acknowledgment if needed

Step 3: Stop Service
- Run: sudo systemctl stop [service-name]
- Verify stopped: systemctl status [service-name]

Step 4: Wait for Connections
- Wait 30 seconds for existing connections to close
- Check that no processes from the service are still running

Step 5: Start Service
- Run: sudo systemctl start [service-name]
- Check status: systemctl status [service-name]

Step 6: Is Service Running?
- If "active (running)" - Continue to Step 7
- If failed - Go to Step 10

Step 7: Verify Functionality
- Test main endpoint
- Check logs for errors
- Confirm the service is working properly

Step 8: Notify Complete
- Update team channel: "Service restart complete"
- Close any related tickets

Step 10: Troubleshooting Failed Start
- Check logs for error messages
- Verify configuration files
- Check disk space and memory
- If still failing, escalate
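
The stop, wait, start, and status-check steps above (Steps 3 to 6) could be scripted roughly as follows. This is a minimal sketch rather than a definitive implementation: restart_service is a hypothetical helper name, the 30-second default comes from Step 4, and the script assumes a systemd host where the operator may run sudo.

```shell
# Sketch of Steps 3-6: stop, wait, start, then branch on the result.
# restart_service is a hypothetical helper, not part of any tool.
restart_service() {
    service="$1"
    drain_seconds="${2:-30}"   # Step 4 default: 30 seconds

    if [ -z "$service" ]; then
        echo "usage: restart_service SERVICE [DRAIN_SECONDS]" >&2
        return 2
    fi

    # Step 3: stop the service; give up if the stop itself fails.
    sudo systemctl stop "$service" || return 1

    # Step 4: give existing connections time to close.
    sleep "$drain_seconds"

    # Step 5: start the service.
    sudo systemctl start "$service"

    # Step 6: is-active exits 0 only when the unit is active.
    if systemctl is-active --quiet "$service"; then
        echo "$service is active: continue to Step 7"
        return 0
    fi

    echo "$service failed to start: go to Step 10" >&2
    # Step 10 starting point: show the most recent log lines for the unit.
    journalctl -u "$service" -n 50 --no-pager >&2
    return 1
}
```

Calling restart_service [service-name] after the team has been notified (Step 2) returns 0 on the Step 7 path and non-zero on the Step 10 path, so the result can drive the next step of a larger script.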

Database Maintenance

Template for routine database maintenance tasks.

Title: Database Maintenance

Description: Regular maintenance including vacuum and index checks.

Steps:

Step 1: Check Database Activity
- Connect to database
- Check for active queries
- If busy, go to Step 10 (Reschedule)

Step 2: Start Maintenance Mode
- Update status page
- Notify team

Step 3: Run Vacuum
- Execute: VACUUM ANALYZE;
- Monitor progress
- Note completion time

Step 4: Check Index Health
- Run bloat check query
- Identify indexes needing attention

Step 5: Need Reindex?
- If indexes bloated - Continue to Step 6
- If not - Skip to Step 7

Step 6: Reindex Tables
- For each bloated index, run: REINDEX INDEX [index_name];
- Track completion

Step 7: Verify Performance
- Run sample queries
- Check execution times
- Compare to baseline

Step 8: Exit Maintenance
- Update status page
- Notify team that maintenance is complete
- Document any issues

Step 10: Reschedule Procedure
- Database too busy for maintenance
- Schedule for off-hours
- Notify team of delay
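
The activity check in Step 1 could be scripted along these lines. This sketch assumes PostgreSQL (the VACUUM ANALYZE and REINDEX INDEX commands in this runbook are PostgreSQL syntax) and the psql client with working connection settings. The helper name db_is_busy and the default threshold of 5 active queries are hypothetical; pick a threshold that matches your own workload.

```shell
# Sketch of Step 1: decide whether the database is too busy for maintenance.
# db_is_busy and the default threshold are hypothetical.
# Exit status: 0 = busy (go to Step 10), 1 = not busy, 2 = query failed.
db_is_busy() {
    dbname="$1"
    max_active="${2:-5}"

    # Count sessions currently executing a query in this database,
    # excluding the session that runs the check itself.
    active=$(psql -d "$dbname" -At -c \
        "SELECT count(*) FROM pg_stat_activity
         WHERE state = 'active'
           AND datname = current_database()
           AND pid <> pg_backend_pid();") || return 2

    echo "active queries: $active"
    [ "$active" -gt "$max_active" ]
}
```

A status of 0 corresponds to the "If busy, go to Step 10" branch, 1 means maintenance can proceed to Step 2, and 2 means the check itself failed and should be investigated before continuing.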

Deployment Rollback

A quick rollback procedure for when a deployment causes problems.

Title: Emergency Rollback

Description: Roll back to the previous version when the current deployment has problems.

Steps:

Step 1: Confirm Rollback Needed
- Verify issue is deployment-related
- Get approval if required
- Note the problem for review

Step 2: Identify Previous Version
- Check deployment history
- Find last known good version
- Note version number

Step 3: Stop Current Version
- Disable traffic to affected servers
- Stop application services
- Wait for requests to complete

Step 4: Deploy Previous Version
- Run the deployment with the version identified in Step 2
- Monitor deployment progress

Step 5: Deployment Successful?
- If success - Continue to Step 6
- If failed - Go to Step 10

Step 6: Start Services
- Start application services
- Enable traffic flow
- Monitor startup logs

Step 7: Verify Functionality
- Test critical endpoints
- Check error rates
- Monitor for 5 minutes

Step 8: Is Everything Stable?
- If yes - Continue to Step 9
- If no - Go to Step 15 (Escalate)

Step 9: Document and Notify
- Update incident ticket
- Notify team of rollback
- Schedule review

Step 10: Deployment Failed
- Check disk space
- Verify permissions
- Try manual deployment
- If still failing - Go to Step 15

Step 15: Escalate
- Contact senior engineer
- Provide all error details
- Stand by to assist
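
The monitoring in Steps 7 and 8 could be automated with a small polling loop. The sketch below assumes the application exposes an HTTP health endpoint and that curl is installed. The helper name monitor_endpoint and the 10-second interval are hypothetical; the 300-second default mirrors the 5 minutes in Step 7. Elapsed time is approximate because it counts only the sleep intervals, not the time each request takes.

```shell
# Sketch of Steps 7-8: poll an endpoint, then decide whether it is stable.
# monitor_endpoint is a hypothetical helper.
monitor_endpoint() {
    url="$1"
    duration="${2:-300}"   # total monitoring time in seconds
    interval="${3:-10}"    # pause between checks in seconds
    elapsed=0
    failures=0

    if [ "$interval" -le 0 ]; then
        echo "interval must be positive" >&2
        return 2
    fi

    while [ "$elapsed" -lt "$duration" ]; do
        # -f makes curl exit non-zero on HTTP 4xx/5xx responses.
        if ! curl -fsS --max-time 5 -o /dev/null "$url"; then
            failures=$((failures + 1))
            echo "check failed after ${elapsed}s" >&2
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done

    if [ "$failures" -eq 0 ]; then
        echo "stable: continue to Step 9"
        return 0
    fi
    echo "$failures failed checks: go to Step 15 (Escalate)" >&2
    return 1
}
```

Treating any failed check as unstable is a deliberately strict choice for an emergency rollback; a team that tolerates occasional errors could compare failures against a threshold instead.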

SSL Certificate Renewal

Process for renewing certificates before expiration.

Title: SSL Certificate Renewal

Description: Renew SSL certificates before they expire.

Steps:

Step 1: Check Expiration
- Run: openssl x509 -enddate -noout -in /path/to/cert.pem
- Note expiration date
- Verify this is the correct certificate

Step 2: Generate CSR
- Create new private key if needed
- Generate CSR
- Verify CSR details

Step 3: Submit to Certificate Authority
- Log into CA portal
- Submit CSR
- Select validation method

Step 4: Complete Validation
- Follow CA's validation process
- Wait for approval

Step 5: Download New Certificate
- Download from CA portal
- Save certificate files
- Verify certificate details

Step 6: Install Certificate
- Back up the current certificate
- Copy new cert to server
- Update configuration files

Step 7: Restart Services
- Restart web server
- Check service status
- Monitor error logs

Step 8: Verify Installation
- Test with browser
- Check for security warnings
- Verify the correct certificate is displayed

Step 9: Cleanup
- Remove old certificate files
- Update documentation
- Set renewal reminder
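
The expiration check in Step 1 and the verification in Step 8 can both be done from the command line with openssl. The following is a sketch under stated assumptions: the helper names cert_expires_within and show_served_cert are hypothetical, the 30-day renewal window is an example value, and the second helper assumes the server is reachable over TLS on the given port.

```shell
# Sketch of Step 1: report whether a certificate file needs renewal.
# Exit status: 0 = expires within the window, 1 = still valid, 2 = unreadable.
cert_expires_within() {
    cert="$1"
    days="${2:-30}"   # example renewal window, adjust to your policy

    # Print the notAfter date, as in Step 1.
    openssl x509 -enddate -noout -in "$cert" || return 2

    # -checkend N exits 0 if the certificate is still valid N seconds
    # from now, and non-zero if it will have expired by then.
    if openssl x509 -checkend $((days * 86400)) -noout -in "$cert" >/dev/null; then
        echo "valid for more than $days days"
        return 1
    fi
    echo "expires within $days days: renew"
    return 0
}

# Sketch of Step 8: show the subject and expiry of the certificate
# that the server actually presents to clients.
show_served_cert() {
    host="$1"
    port="${2:-443}"
    openssl s_client -connect "$host:$port" -servername "$host" \
        </dev/null 2>/dev/null |
        openssl x509 -noout -subject -enddate
}
```

Running show_served_cert against the host after Step 7 complements the browser test: if the printed expiry date is still the old one, the service is likely serving the previous certificate.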

Customizing Examples

Replace Placeholders

Replace the bracketed values with your own:

  • Service names
  • Server addresses
  • File paths
  • Command arguments
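
If a runbook is kept as a text file, placeholder replacement can be done with sed. The sketch below is one possible approach, not a required workflow: fill_placeholders is a hypothetical helper that handles only the [service-name], [service], and [server] placeholders used in the Service Restart example, and it assumes the replacement values contain no "/" or "&" characters, which sed would interpret specially.

```shell
# Sketch: read a runbook template on stdin and fill in two values.
# fill_placeholders is a hypothetical helper.
fill_placeholders() {
    service="$1"
    server="$2"
    # The brackets are escaped so sed matches them literally.
    sed -e "s/\[service-name\]/$service/g" \
        -e "s/\[service\]/$service/g" \
        -e "s/\[server\]/$server/g"
}
```

For example, fill_placeholders nginx web-01 < template.txt prints the template with those two values filled in, where template.txt is a hypothetical file name.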

Adjust for Your Environment

  • Add or remove steps as needed
  • Change decision points
  • Include your specific tools and commands

Test Before Publishing

  • Execute the runbook in a test environment
  • Verify each path works
  • Get feedback from teammates