ADR-0016: S3 As Backend for Persistent Data
Status: | ACCEPTED |
Date: | 2022-09-13 |
Author(s): | Jannik Hollenbach jannik.hollenbach@iteratec.com, Sven Strittmatter sven.strittmatter@iteratec.com |
This ADR should have been written prior to implementation. But we started documenting ADR later. This ADR has therefore been written retrospectively to record the decision made at that time.
Context
secureCodeBox needs to saves two things during the execution of a scan: the "raw results / output" of the scanner (e.g. the XML output of nmap) and the findings the parsers extract from it. In the v1 of the secureCodeBox these were stored in camunda variables which were in turn stored in the underlying database as blobs. As both the raw results and findings can get big (multiple megabytes) this requiered a lot of maintenance work to keep the database working correctly without overflowing.
Decision
In v2 we decided to shift the storage to store the raw results and findings as separate files in an S3 bucket.
S3 was chosen because:
- it can handle large files without much problems
- is relatively easy to use
- is well supported by (most) cloud supporters
- can be "substituted locally" by minio
- can be used with presigned urls to easily give individual components temporary read or write access to specific files in the s3 bucket
Consequences
- The secureCodeBox requires a S3 bucket locally in the cluster.
- By default the operator sets up a local minio bucket unless access to an external bucket is configured.