Developer Guide – Deployment

This section explains how to deploy the Pipeline Investigation Kit safely, and more importantly, why the deployment is structured this way.

The deployment strategy prioritizes:

  • safety
  • reversibility
  • observability

Deployment Model

The project is deployed using AWS SAM (Serverless Application Model).

All infrastructure is defined in a single template.yaml, including:

  • APIs
  • Lambda functions
  • DynamoDB tables
  • S3 bucket
  • SQS queue
  • IAM roles
  • event source mappings

No manual AWS setup is required beyond credentials that are allowed to deploy the stack.


Environments

The system is environment-aware by design.

Typical environments:

  • dev
  • staging
  • prod

Each environment gets:

  • isolated DynamoDB tables
  • isolated S3 bucket
  • isolated SQS queue
  • isolated API Gateway stage

This prevents cross-environment data leakage.
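
One way to keep environments apart is to derive every stack name from the environment name, so a deploy cannot target the wrong stack through a typo. The sketch below assumes a hypothetical naming scheme (`pipeline-investigation-kit-<env>`); the project's real stack names may differ.

```shell
# Hypothetical naming scheme: one CloudFormation stack per environment.
# The prefix "pipeline-investigation-kit" is an assumption, not a project constant.
stack_name_for() {
  case "$1" in
    dev|staging|prod)
      printf 'pipeline-investigation-kit-%s\n' "$1"
      ;;
    *)
      # Refuse unknown names so a typo cannot create a stray stack.
      printf 'unknown environment: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

# Example: sam deploy --stack-name "$(stack_name_for staging)"
```

Passing the result to `--stack-name` keeps each environment's resources in their own stack.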


First-Time Deployment

Always perform the initial deployment with the processor disabled:

sam deploy --guided

When prompted for parameters:

EnableProcessor = false

Why?

Because:

  • ingest and replay are safe to run without processing
  • the processor consumes messages automatically as soon as it is attached to the queue
  • enabling it too early removes the chance to inspect messages before they are consumed

You want to observe before acting.
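
For scripted or repeat deployments, the guided prompts can be replaced with explicit flags. This is a minimal sketch: the stack name is a placeholder, and `CAPABILITY_IAM` is an assumption (a template that names its IAM roles explicitly would need `CAPABILITY_NAMED_IAM` instead).

```shell
# Non-interactive equivalent of the guided first deploy.
# $1 = stack name (placeholder, choose your own). The processor stays disabled.
first_deploy() {
  sam deploy \
    --stack-name "$1" \
    --resolve-s3 \
    --capabilities CAPABILITY_IAM \
    --parameter-overrides EnableProcessor=false
}

# Example: first_deploy pik-dev
```

`--resolve-s3` lets SAM create and manage the bucket used for deployment artifacts, which is separate from the project's raw-data bucket.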


Incremental Enablement Strategy

Deployment is intentionally split into phases.

Phase 1 – Ingest Only

Enabled:

  • Ingest API
  • DynamoDB metadata tables
  • S3 raw storage

Disabled:

  • Processor
  • Event source mapping

This allows you to verify:

  • API works
  • deduplication works
  • raw data is stored correctly
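
A quick way to exercise deduplication is to send the same payload twice and compare the two responses. The sketch below makes no assumption about how the API signals a duplicate (status code, response field, or otherwise); it only prints both responses. The endpoint URL and payload shape in the example are placeholders.

```shell
# Post the same payload twice and print both responses for comparison.
# $1 = ingest endpoint URL (placeholder), $2 = JSON payload string.
post_twice() {
  for attempt in first second; do
    printf '%s: ' "$attempt"
    curl -sS -X POST \
      -H 'Content-Type: application/json' \
      -d "$2" \
      "$1"
    printf '\n'
  done
}

# Example (URL and payload are hypothetical):
# post_twice "https://example.invalid/dev/ingest" '{"event_id":"test-1"}'
```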

Phase 2 – Replay Enabled

Enabled:

  • Replay API
  • SQS queue

Still disabled:

  • Processor

This allows:

  • replay requests
  • queue inspection
  • message validation

You should manually inspect SQS messages at this stage.
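
Manual inspection can be done from the AWS CLI. The sketch below peeks at message bodies with a visibility timeout of zero, so the messages become visible again immediately and stay available for the processor later. The queue URL is whatever your stack created; nothing here is specific to this project.

```shell
# Peek at up to 10 message bodies without hiding them from later consumers.
# $1 = queue URL.
peek_queue() {
  aws sqs receive-message \
    --queue-url "$1" \
    --max-number-of-messages 10 \
    --visibility-timeout 0 \
    --wait-time-seconds 2 \
    --query 'Messages[].Body' \
    --output json
}

# Example: peek_queue "$QUEUE_URL"
```

Each receive still increments a message's receive count. If the queue has a dead-letter redrive policy with a low `maxReceiveCount`, repeated peeking could move messages to the dead-letter queue, so inspect sparingly.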


Phase 3 – Processor Enabled

Only after verification:

sam deploy --parameter-overrides EnableProcessor=true

This:

  • creates the Processor Lambda
  • attaches the SQS event source mapping
  • begins automatic consumption

At this point the system is fully live.
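
To confirm that the processor is attached, the state of its event source mapping can be queried. This sketch assumes the function has a single SQS mapping; the function name is a placeholder for the physical name of the deployed Processor Lambda.

```shell
# Print the state of the event source mapping(s) for a function.
# $1 = Lambda function name or ARN (placeholder).
processor_mapping_state() {
  aws lambda list-event-source-mappings \
    --function-name "$1" \
    --query 'EventSourceMappings[].State' \
    --output text
}

# Succeeds only when the single mapping reports "Enabled".
processor_is_live() {
  [ "$(processor_mapping_state "$1")" = "Enabled" ]
}

# Example: processor_is_live "$PROCESSOR_FUNCTION" && echo "consuming"
```

Right after a deploy the mapping may briefly report a transitional state such as `Creating` or `Enabling`, so a short retry can be worthwhile.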


Safe Rollback Strategy

If something goes wrong:

Disable Processor Immediately

sam deploy --parameter-overrides EnableProcessor=false

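This:

  • removes the event source mapping
  • stops processing
  • leaves unprocessed messages in the queue

No data is lost, provided processing resumes before the queue's message retention period expires (SQS defaults to 4 days and allows up to 14).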
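
One way to check that the backlog survived a rollback is to read the queue depth before and after disabling the processor. This is a sketch only: SQS reports an approximate count, so small differences between readings are expected.

```shell
# Print the approximate number of visible messages in a queue.
# $1 = queue URL.
queue_depth() {
  aws sqs get-queue-attributes \
    --queue-url "$1" \
    --attribute-names ApproximateNumberOfMessages \
    --query 'Attributes.ApproximateNumberOfMessages' \
    --output text
}

# Succeeds when the backlog did not shrink between two readings.
# $1 = depth before rollback, $2 = depth after rollback.
backlog_preserved() {
  [ "$2" -ge "$1" ]
}

# Example:
# before=$(queue_depth "$QUEUE_URL")
# sam deploy --parameter-overrides EnableProcessor=false
# backlog_preserved "$before" "$(queue_depth "$QUEUE_URL")"
```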


Deployment Is Idempotent

You can safely re-run deploy commands.

SAM and CloudFormation ensure that:

  • unchanged resources are not recreated
  • data stores are left in place unless a template change forces their replacement
  • IAM roles are updated safely
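
Before re-running a deploy, the pending changes can be previewed without applying them. The sketch below assumes the remaining settings (artifact bucket, capabilities, region) come from the `samconfig.toml` saved by the guided deploy; the stack name is a placeholder.

```shell
# Create and display a changeset without executing it.
# $1 = stack name (placeholder), $2 = true|false for EnableProcessor.
preview_deploy() {
  sam deploy \
    --stack-name "$1" \
    --no-execute-changeset \
    --parameter-overrides "EnableProcessor=$2"
}

# Example: preview_deploy pik-dev true
```

For automated pipelines, adding `--no-fail-on-empty-changeset` to a real deploy makes a re-run with no changes exit successfully.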

Zero-Downtime Behavior

The system is designed so that:

  • APIs remain available during deploy
  • SQS buffers messages
  • processor restarts are safe

Temporary delays do not cause data loss.


Common Deployment Mistakes

❌ Enabling Processor Too Early

Symptoms:

  • messages disappear from the queue before you can inspect them
  • aggregates look incorrect
  • debugging becomes difficult

Fix:

  • disable the processor
  • inspect the replay output
  • re-enable the processor once the output looks correct

❌ Forgetting Environment Isolation

Symptoms:

  • replay returns unexpected data
  • mixed test and prod data

Fix:

  • verify stack name
  • verify API URL
  • verify table names
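
These checks can be run against CloudFormation directly. The sketch below lists the DynamoDB tables that belong to a stack and prints the stack's outputs; whether the API URL appears among the outputs depends on the template, which is an assumption here. The stack name is a placeholder.

```shell
# List the physical names of the DynamoDB tables owned by a stack.
# $1 = stack name (placeholder).
stack_tables() {
  aws cloudformation list-stack-resources \
    --stack-name "$1" \
    --query "StackResourceSummaries[?ResourceType=='AWS::DynamoDB::Table'].PhysicalResourceId" \
    --output text
}

# Show the stack's outputs (the API URL, if the template exports one).
stack_outputs() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query 'Stacks[0].Outputs' \
    --output table
}

# Example: stack_tables pik-dev; stack_outputs pik-dev
```

If the table names or URL shown do not match the environment you meant to use, the client is pointed at the wrong stack.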

❌ Deploying Without Observability

Always verify:

  • CloudWatch logs exist
  • metrics are emitted
  • DRY_RUN works

If you can’t observe it, don’t deploy it.


Before enabling the processor, confirm that:

  • Ingest API responds correctly
  • Duplicate events are detected
  • Raw S3 objects exist
  • Replay returns expected events
  • SQS messages look correct
  • DRY_RUN works end-to-end

Only then enable processing.
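
The checklist above can be turned into a simple gate that refuses to enable processing while any check fails. This is a minimal sketch: the individual check commands in the example are hypothetical scripts you would supply yourself.

```shell
failures=0

# Run one check: $1 is a description, the remaining arguments are the command.
check() {
  desc=$1
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$desc"
  else
    printf 'FAIL  %s\n' "$desc"
    failures=$((failures + 1))
  fi
}

# Succeeds only when every check run so far has passed.
ready_to_enable() {
  [ "$failures" -eq 0 ]
}

# Example wiring (the ./checks/*.sh scripts are hypothetical):
# check "Ingest API responds correctly" ./checks/ingest.sh
# check "SQS messages look correct"     ./checks/queue.sh
# ready_to_enable && sam deploy --parameter-overrides EnableProcessor=true
```

Keeping the gate in the same script as the deploy command means the processor is only ever enabled after the checks have actually run.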


Next: Configuration

Next we’ll cover:

  • environment variables
  • execution modes
  • DRY_RUN behavior
  • tuning knobs