Developer Guide – Deployment¶
This section explains how to deploy the Pipeline Investigation Kit safely, and more importantly, why the deployment is structured this way.
The deployment strategy prioritizes:
- safety
- reversibility
- observability
Deployment Model¶
The project is deployed using AWS SAM (Serverless Application Model).
All infrastructure is defined in a single template.yaml, including:
- APIs
- Lambda functions
- DynamoDB tables
- S3 bucket
- SQS queue
- IAM roles
- event source mappings
There is no manual AWS setup required.
Environments¶
The system is environment-aware by design.
Typical environments:
- dev
- staging
- prod
Each environment gets:
- isolated DynamoDB tables
- isolated S3 bucket
- isolated SQS queue
- isolated API Gateway stage
This prevents cross-environment data leakage.
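One way to keep environments apart in practice is to derive the stack name and parameters from the environment name, so a deploy cannot silently target the wrong stack. The sketch below only builds the `sam deploy` argument list. The stack-name prefix `pipeline-kit` and the `Environment` template parameter are assumptions for illustration; only `EnableProcessor` comes from this guide. It also assumes the remaining deploy options are already saved from a guided run.

```python
import shlex

ALLOWED_ENVS = ("dev", "staging", "prod")


def deploy_command(env: str, enable_processor: bool = False) -> list[str]:
    """Build the argv for a `sam deploy` that targets exactly one environment."""
    if env not in ALLOWED_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    return [
        "sam", "deploy",
        # Hypothetical naming convention: one stack per environment.
        "--stack-name", f"pipeline-kit-{env}",
        "--parameter-overrides",
        # `Environment` is an assumed parameter name, not taken from the template.
        f"Environment={env}",
        # `EnableProcessor` is the parameter described in this guide.
        f"EnableProcessor={'true' if enable_processor else 'false'}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(deploy_command("dev")))
```

Printing the command rather than executing it keeps the helper side-effect free; pass the list to `subprocess.run` once you are satisfied with it.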
First-Time Deployment¶
Initial deployment should always be done with the processor disabled.
sam deploy --guided
When prompted for parameters:
EnableProcessor = false
Why?¶
Because:
- ingest and replay work safely without the processor
- once enabled, the processor consumes queue messages automatically
- enabling it too early removes your chance to inspect messages before they are consumed
You want to observe before acting.
Incremental Enablement Strategy¶
Deployment is intentionally split into phases.
Phase 1 – Ingest Only¶
Enabled:
- Ingest API
- DynamoDB metadata tables
- S3 raw storage
Disabled:
- Processor
- Event source mapping
This allows you to verify:
- API works
- deduplication works
- raw data is stored correctly
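The deduplication check can be scripted by sending the same event twice and comparing the two responses. This is a minimal sketch, not the kit's own test: `send_event` is a hypothetical callable that posts one event to the Ingest API and returns the parsed JSON response, and the `duplicate` response field is an assumption about the response shape.

```python
from typing import Callable


def check_deduplication(send_event: Callable[[dict], dict], event: dict) -> bool:
    """Send one event twice; pass only if the second send is flagged as a duplicate.

    The "duplicate" key is an assumed response field; adapt it to whatever
    the Ingest API actually returns for a repeated event.
    """
    first = send_event(event)
    second = send_event(event)
    first_is_new = not first.get("duplicate", False)
    second_is_duplicate = bool(second.get("duplicate", False))
    return first_is_new and second_is_duplicate
```

Injecting `send_event` keeps the check independent of the HTTP client and lets you run it against a stub before pointing it at a deployed stage.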
Phase 2 – Replay Enabled¶
Enabled:
- Replay API
- SQS queue
Still disabled:
- Processor
This allows:
- replay requests
- queue inspection
- message validation
You should manually inspect SQS messages at this stage.
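Manual inspection is easier with a small validator for message bodies. The sketch below checks that a body is a JSON object and carries a set of required fields. The field names `event_id` and `source` are placeholders, not the kit's actual message schema.

```python
import json

# Hypothetical required fields; replace with the fields your replay messages carry.
REQUIRED_FIELDS = ("event_id", "source")


def validate_message_body(body: str) -> list[str]:
    """Return the problems found in one SQS message body (empty list if none)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["payload is not a JSON object"]
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in payload]
```

Bodies can be fetched with `aws sqs receive-message`. Received messages are hidden for the visibility timeout and return to the queue if they are not deleted, so inspecting them this way does not consume them.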
Phase 3 – Processor Enabled¶
Only after verification:
sam deploy --parameter-overrides EnableProcessor=true
This:
- creates the Processor Lambda
- attaches the SQS event source mapping
- begins automatic consumption
At this point the system is fully live.
Safe Rollback Strategy¶
If something goes wrong:
Disable Processor Immediately¶
sam deploy --parameter-overrides EnableProcessor=false
This:
- removes the event source mapping
- stops processing
- preserves queue messages
No data is lost: messages stay in the queue until its retention period expires, so resume processing before then.
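To confirm that messages are still waiting after a rollback, you can read the queue's approximate counters. The sketch below takes an SQS client such as `boto3.client("sqs")` and a queue URL; the function name is illustrative, and the counts are approximate by design.

```python
def queue_depth(sqs_client, queue_url: str) -> dict[str, int]:
    """Return approximate visible and in-flight message counts for one queue."""
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=[
            "ApproximateNumberOfMessages",            # waiting to be received
            "ApproximateNumberOfMessagesNotVisible",  # received but not yet deleted
        ],
    )
    attrs = response["Attributes"]  # SQS returns attribute values as strings
    return {
        "visible": int(attrs["ApproximateNumberOfMessages"]),
        "in_flight": int(attrs["ApproximateNumberOfMessagesNotVisible"]),
    }
```

After the processor is disabled, `in_flight` should drop to zero once the visibility timeout passes, while `visible` should stay flat or grow as replays continue.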
Deployment Is Idempotent¶
You can safely re-run deploy commands.
SAM and CloudFormation ensure that:
- unchanged resources are not recreated
- data stores are preserved, as long as their definitions do not change in a way that forces replacement
- IAM roles are updated in place
Zero-Downtime Behavior¶
The system is designed so that:
- APIs remain available during deploy
- SQS buffers messages
- processor restarts are safe
Temporary delays do not cause data loss.
Common Deployment Mistakes¶
❌ Enabling Processor Too Early¶
Symptoms:
- messages disappear unexpectedly
- aggregates look incorrect
- debugging becomes difficult
Fix:
- disable processor
- inspect replay output
- re-enable
❌ Forgetting Environment Isolation¶
Symptoms:
- replay returns unexpected data
- mixed test and prod data
Fix:
- verify stack name
- verify API URL
- verify table names
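These checks can be partly automated if resource names embed the environment name. That naming convention is an assumption here, not something this guide guarantees. Under that assumption, the sketch below flags any stack name, URL, or table name that mentions a different environment than the one you expect.

```python
import re

ENVIRONMENTS = ("dev", "staging", "prod")


def foreign_resources(expected_env: str, names: list[str]) -> list[str]:
    """Return the names that mention an environment other than `expected_env`."""
    others = {env for env in ENVIRONMENTS if env != expected_env}
    flagged = []
    for name in names:
        # Split on anything that is not a letter or digit, so "prod" matches
        # as a whole token and words like "product" are not flagged.
        tokens = set(re.split(r"[^a-z0-9]+", name.lower()))
        if tokens & others:
            flagged.append(name)
    return flagged
```

An empty result does not prove isolation; it only rules out the most common mix-up of pointing a client at another environment's resources.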
❌ Deploying Without Observability¶
Always verify:
- CloudWatch logs exist
- metrics are emitted
- DRY_RUN works
If you can’t observe it, don’t deploy it.
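The log check can be scripted against CloudWatch Logs. By default, a Lambda function writes to the log group `/aws/lambda/<function-name>`, which is created on first invocation. The sketch below takes a client such as `boto3.client("logs")` and a list of function names, which you supply; it does not paginate, so treat it as a starting point.

```python
def missing_log_groups(logs_client, function_names: list[str]) -> list[str]:
    """Return the default Lambda log groups that do not exist yet."""
    missing = []
    for name in function_names:
        group = f"/aws/lambda/{name}"
        response = logs_client.describe_log_groups(logGroupNamePrefix=group)
        # The prefix search can return longer names, so compare exactly.
        found = {g["logGroupName"] for g in response.get("logGroups", [])}
        if group not in found:
            missing.append(group)
    return missing
```

A missing group may simply mean the function has never been invoked, so call each function once (for example with DRY_RUN) before reading the result as a failure.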
Recommended Deployment Checklist¶
Before enabling processor:
- Ingest API responds correctly
- Duplicate events are detected
- Raw S3 objects exist
- Replay returns expected events
- SQS messages look correct
- DRY_RUN works end-to-end
Only then enable processing.
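The checklist can be turned into an explicit gate, so the processor is only enabled when every item has been confirmed. This is a minimal sketch; the item keys are illustrative names for the six checks above, and how each result is obtained is up to you.

```python
# Illustrative keys, one per checklist item above.
CHECKLIST = (
    "ingest_api_responds",
    "duplicates_detected",
    "raw_s3_objects_exist",
    "replay_returns_expected_events",
    "sqs_messages_look_correct",
    "dry_run_works_end_to_end",
)


def failed_checks(results: dict[str, bool]) -> list[str]:
    """Return checklist items that are missing or not passing, in checklist order."""
    return [item for item in CHECKLIST if not results.get(item, False)]


def ready_to_enable_processor(results: dict[str, bool]) -> bool:
    """True only when every checklist item has been explicitly confirmed."""
    return not failed_checks(results)
```

Treating a missing item as a failure is deliberate: an unchecked item should block enablement just like a failed one.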
Next: Configuration¶
Next we’ll cover:
- environment variables
- execution modes
- DRY_RUN behavior
- tuning knobs
👉 Continue with Guide → Configuration