Add additional encryption options #27

Draft: wants to merge 2 commits into base: main

1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ The project is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
### Added

- Option `use_externally_managed_dataflow_sa` to be able to use pre-existing externally-managed service account as Dataflow worker service account. User is expected to apply and manage IAM permissions over external resources (e.g. Cloud KMS key or Secret version) outside of this module.
- Options `pubsub_kms_key_name` and `gcs_kms_key_name` to be able to add encryption of created resources with customer managed keys.

### Removed

2 changes: 2 additions & 0 deletions README.md
@@ -35,7 +35,9 @@ These deployment templates are provided as is, without warranty. See [Copyright
| <a name="input_dataflow_template_version"></a> [dataflow_template_version](#input_dataflow_template_version) | (Optional) Dataflow template release version (default 'latest'). Override this for version pinning e.g. '2021-08-02-00_RC00'. Must specify version only since template GCS path will be deduced automatically: 'gs://dataflow-templates/`version`/Cloud_PubSub_to_Splunk' | `string` |
| <a name="input_dataflow_worker_service_account"></a> [dataflow_worker_service_account](#input_dataflow_worker_service_account) | (Optional) Name of Dataflow worker service account to be created and used to execute job operations. In the default case of creating a new service account (`use_externally_managed_dataflow_sa=false`), this parameter must be 6-30 characters long, and match the regular expression [a-z]([-a-z0-9]*[a-z0-9]). If the parameter is empty, worker service account defaults to project's Compute Engine default service account. If using external service account (`use_externally_managed_dataflow_sa=true`), this parameter must be the full email address of the external service account. | `string` |
| <a name="input_deploy_replay_job"></a> [deploy_replay_job](#input_deploy_replay_job) | (Optional) Determines if replay pipeline should be deployed or not (default: `false`) | `bool` |
| <a name="input_gcs_kms_key_name"></a> [gcs_kms_key_name](#input_gcs_kms_key_name) | (Optional) The `id` of a Cloud KMS key that will be used to encrypt objects inserted into temporary bucket. User is responsible for permissions to this key for Cloud Storage Service Account. | `string` |
| <a name="input_primary_subnet_cidr"></a> [primary_subnet_cidr](#input_primary_subnet_cidr) | The CIDR Range of the primary subnet | `string` |
| <a name="input_pubsub_kms_key_name"></a> [pubsub_kms_key_name](#input_pubsub_kms_key_name) | (Optional) The resource name of the Cloud KMS CryptoKey to be used to protect access to messages published on created topics. Your project's PubSub service account (`service-{{PROJECT_NUMBER}}@gcp-sa-pubsub.iam.gserviceaccount.com`) must have `roles/cloudkms.cryptoKeyEncrypterDecrypter` to use this feature. The expected format is `projects/*/locations/*/keyRings/*/cryptoKeys/*`. | `string` |
| <a name="input_scoping_project"></a> [scoping_project](#input_scoping_project) | Cloud Monitoring scoping project ID to create dashboard under.<br>This assumes a pre-existing scoping project whose metrics scope contains the `project` where dataflow job is to be deployed.<br>See [Cloud Monitoring settings](https://cloud.google.com/monitoring/settings) for more details on scoping project.<br>If parameter is empty, scoping project defaults to value of `project` parameter above. | `string` |
| <a name="input_splunk_hec_token"></a> [splunk_hec_token](#input_splunk_hec_token) | (Optional) Splunk HEC token. Must be defined if `splunk_hec_token_source` is of type `PLAINTEXT` or `KMS`. | `string` |
| <a name="input_splunk_hec_token_kms_encryption_key"></a> [splunk_hec_token_kms_encryption_key](#input_splunk_hec_token_kms_encryption_key) | (Optional) The Cloud KMS key to decrypt the HEC token string. Required if `splunk_hec_token_source` is type of KMS (default: '') | `string` |
5 changes: 3 additions & 2 deletions main.tf
@@ -91,8 +91,9 @@ locals {
}

resource "google_pubsub_topic" "dataflow_input_pubsub_topic" {
-  project = var.project
-  name    = local.dataflow_input_topic_name
+  project      = var.project
+  name         = local.dataflow_input_topic_name
+  kms_key_name = var.pubsub_kms_key_name
Contributor:

Will `kms_key_name` be ignored if the user doesn't provide this input?

}
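
On the reviewer's question above: one way to avoid relying on how the provider treats an empty string is to convert it to `null`, which Terraform treats as the argument being omitted. A minimal sketch, assuming the variable keeps its `""` default; the resource label `example` and the topic name are hypothetical and not part of this PR:

```hcl
variable "pubsub_kms_key_name" {
  type    = string
  default = ""
}

resource "google_pubsub_topic" "example" {
  name = "example-topic" # hypothetical name

  # null means "argument not set", so the topic falls back to
  # Google-managed encryption when no key is supplied.
  kms_key_name = var.pubsub_kms_key_name == "" ? null : var.pubsub_kms_key_name
}
```

With this form, the behavior for an unset key no longer depends on the provider's handling of empty strings.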

resource "google_pubsub_subscription" "dataflow_input_pubsub_subscription" {
11 changes: 9 additions & 2 deletions pipeline.tf
@@ -14,8 +14,9 @@


resource "google_pubsub_topic" "dataflow_deadletter_pubsub_topic" {
-  project = var.project
-  name    = local.dataflow_output_deadletter_topic_name
+  project      = var.project
+  name         = local.dataflow_output_deadletter_topic_name
+  kms_key_name = var.pubsub_kms_key_name
}

resource "google_pubsub_subscription" "dataflow_deadletter_pubsub_sub" {
@@ -37,6 +38,12 @@ resource "google_storage_bucket" "dataflow_job_temp_bucket" {
name = local.dataflow_temporary_gcs_bucket_name
location = var.region
storage_class = "REGIONAL"
dynamic "encryption" {
Contributor:

I can see the intent behind this dynamic block, but it's not easy to read. Is there a better way to apply the condition here? Perhaps two bucket resources selected with a conditional `count`, one encrypted and the other not.

Also, what is driving this requirement, given that this is a temporary bucket?

Contributor Author:

For example, an organization policy may be in place that prevents you from creating any bucket without CMEK, regardless of its contents.

Contributor:

Thanks for clarifying what's driving this PR. Org policies requiring CMEK for other services (GCS, Pub/Sub or Dataflow) make sense.

for_each = (var.gcs_kms_key_name == "") ? [] : [1]
content {
default_kms_key_name = var.gcs_kms_key_name
}
}
}
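
For the readability concern raised in the thread above, one possible alternative keeps a single bucket resource and lets `compact()` drop the empty string, so the `encryption` block is generated only when a key is supplied. This is a sketch rather than the PR's implementation; the resource label, bucket name, and region are hypothetical:

```hcl
variable "gcs_kms_key_name" {
  type    = string
  default = ""
}

resource "google_storage_bucket" "example" {
  name     = "example-temp-bucket" # hypothetical name
  location = "us-central1"         # hypothetical region

  # compact() removes empty strings, leaving either [] (no block)
  # or a one-element list (exactly one encryption block).
  dynamic "encryption" {
    for_each = compact([var.gcs_kms_key_name])
    content {
      default_kms_key_name = encryption.value
    }
  }
}
```

Compared with two `count`-guarded bucket resources, this avoids switching between two resource addresses, which would destroy one bucket and create another when a key is added later.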

resource "google_storage_bucket_object" "dataflow_job_temp_object" {
4 changes: 4 additions & 0 deletions sample.tfvars
@@ -47,3 +47,7 @@ scoping_project = "[MY_PROJECT]"

# Replay job settings
deploy_replay_job = false

# Security parameters
pubsub_kms_key_name = "projects/[MY_PROJECT]/locations/[MY_REGION]/keyRings/[MY_KEYRING_NAME]/cryptoKeys/[MY_KEY_NAME]"
gcs_kms_key_name = "projects/[MY_PROJECT]/locations/[MY_REGION]/keyRings/[MY_KEYRING_NAME]/cryptoKeys/[MY_KEY_NAME]"
20 changes: 20 additions & 0 deletions variables.tf
@@ -196,3 +196,23 @@ variable "use_externally_managed_dataflow_sa" {
default = false
description = "(Optional) Determines if the worker service account provided by `dataflow_worker_service_account` variable should be created by this module (default) or is managed outside of the module. In the latter case, user is expected to apply and manage the service account IAM permissions over external resources (e.g. Cloud KMS key or Secret version) before running this module."
}

variable "pubsub_kms_key_name" {
type = string
description = "(Optional) The resource name of the Cloud KMS CryptoKey to be used to protect access to messages published on created topics. Your project's PubSub service account (`service-{{PROJECT_NUMBER}}@gcp-sa-pubsub.iam.gserviceaccount.com`) must have `roles/cloudkms.cryptoKeyEncrypterDecrypter` to use this feature. The expected format is `projects/*/locations/*/keyRings/*/cryptoKeys/*`."
default = ""
validation {
condition = can(regex("^projects\\/[^\\n\\r\\/]+\\/locations\\/[^\\n\\r\\/]+\\/keyRings\\/[^\\n\\r\\/]+\\/cryptoKeys\\/[^\\n\\r\\/]+$", var.pubsub_kms_key_name)) || var.pubsub_kms_key_name == ""
error_message = "Pub/Sub KMS key name must match: '^projects\\/[^\\n\\r\\/]+\\/locations\\/[^\\n\\r\\/]+\\/keyRings\\/[^\\n\\r\\/]+\\/cryptoKeys\\/[^\\n\\r\\/]+$' pattern."
}
}

variable "gcs_kms_key_name" {
type = string
description = "(Optional) The `id` of a Cloud KMS key that will be used to encrypt objects inserted into temporary bucket. User is responsible for permissions to this key for Cloud Storage Service Account."
Contributor:

It would be helpful to document the required permissions in the README, even though the module does not manage them. The same applies to the KMS key for the Pub/Sub topic.

default = ""
validation {
condition = can(regex("^projects\\/[^\\n\\r\\/]+\\/locations\\/[^\\n\\r\\/]+\\/keyRings\\/[^\\n\\r\\/]+\\/cryptoKeys\\/[^\\n\\r\\/]+$", var.gcs_kms_key_name)) || var.gcs_kms_key_name == ""
error_message = "Cloud Storage KMS key name must match: '^projects\\/[^\\n\\r\\/]+\\/locations\\/[^\\n\\r\\/]+\\/keyRings\\/[^\\n\\r\\/]+\\/cryptoKeys\\/[^\\n\\r\\/]+$' pattern."
}
}
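
The permissions mentioned in the review comment above are managed outside this module. As an illustration only, the grants could look like the following in a separate configuration. It assumes the Google provider's `google_project` and `google_storage_project_service_account` data sources and its `google_kms_crypto_key_iam_member` resource; the variable `kms_key_id` and all resource labels are hypothetical:

```hcl
variable "project" {
  type = string
}

variable "kms_key_id" {
  type        = string
  description = "Hypothetical: projects/*/locations/*/keyRings/*/cryptoKeys/*"
}

data "google_project" "this" {
  project_id = var.project
}

data "google_storage_project_service_account" "gcs" {
  project = var.project
}

# Pub/Sub service account, needed when pubsub_kms_key_name is set.
resource "google_kms_crypto_key_iam_member" "pubsub" {
  crypto_key_id = var.kms_key_id
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = "serviceAccount:service-${data.google_project.this.number}@gcp-sa-pubsub.iam.gserviceaccount.com"
}

# Cloud Storage service account, needed when gcs_kms_key_name is set.
resource "google_kms_crypto_key_iam_member" "gcs" {
  crypto_key_id = var.kms_key_id
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = "serviceAccount:${data.google_storage_project_service_account.gcs.email_address}"
}
```

Both grants must exist before the module creates the topics and the bucket, since resource creation fails if the service accounts cannot use the key.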