Skip to main content

ADR-0016: S3 As Backend for Persistent Data

Status:ACCEPTED
Date:2022-09-13
Author(s):Jannik Hollenbach jannik.hollenbach@iteratec.com, Sven Strittmatter sven.strittmatter@iteratec.com
info

This ADR should have been written prior to implementation. But we started documenting ADR later. This ADR has therefore been written retrospectively to record the decision made at that time.

Context

secureCodeBox needs to saves two things during the execution of a scan: the "raw results / output" of the scanner (e.g. the XML output of nmap) and the findings the parsers extract from it. In the v1 of the secureCodeBox these were stored in camunda variables which were in turn stored in the underlying database as blobs. As both the raw results and findings can get big (multiple megabytes) this requiered a lot of maintenance work to keep the database working correctly without overflowing.

Decision

In v2 we decided to shift the storage to store the raw results and findings as separate files in an S3 bucket.

S3 was chosen because:

  • it can handle large files without much problems
  • is relatively easy to use
  • is well supported by (most) cloud supporters
  • can be "substituted locally" by minio
  • can be used with presigned urls to easily give individual components temporary read or write access to specific files in the s3 bucket

Consequences

  • The secureCodeBox requires a S3 bucket locally in the cluster.
  • By default the operator sets up a local minio bucket unless access to an external bucket is configured.