Airbyte Pod Intermittently Losing IRSA Permissions and Falling Back to Node IAM Role #53652
Comments
Having the same issue with Helm chart version 1.4.1.
Hey @talhermon, as you saw in my earlier comment, I had the same issue, but I managed to resolve it with the following storage configuration:

```yaml
global:
  storage:
    type: S3
    bucket:
      log: "<your bucket>"
      state: "<your bucket>"
      workloadOutput: "<your bucket>"
      activityPayload: "<your bucket>"
    s3:
      region: <your region>
      authenticationType: instanceProfile # This is important
```

I am now using Helm chart version 1.3.1.
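In case it helps anyone reproduce that setup, here is a minimal sketch of applying such a values file; the repository URL, release name, and namespace are assumptions, so adjust them to your environment:

```bash
# Assumed repo, release, and namespace names; adjust to your environment.
helm repo add airbyte https://airbytehq.github.io/helm-charts
helm repo update

# values.yaml contains the global.storage block shown above.
helm upgrade --install airbyte airbyte/airbyte \
  --namespace airbyte \
  --version 1.3.1 \
  --values values.yaml
```

As I understand it, `authenticationType: instanceProfile` tells Airbyte to skip explicit access keys and rely on whatever AWS credentials are available in the pod's environment, rather than on credentials configured in the chart.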
@jacoblElementor Happy to hear that you solved the issue. Unfortunately, I'm still facing the problem :(
Does the container restart or something? I've had it running for an hour with no issues so far, and I've already run some connections.
@jacoblElementor Once I restart the pod, it fixes the problem for a couple of hours.
I will monitor my deployment for a few more days to see if it reproduces. Also, what Kubernetes version are you running?
@jacoblElementor EKS 1.31.
cc @airbytehq/platform-deployments
We are using GKE and are experiencing a similar issue on the 1.4.0 Helm chart. We use GCS for the logs, which is possibly related; analogously, it would be a Workload Identity issue there instead of IRSA on AWS.
We are seeing these warnings in the server pod logs when planning with the Airbyte Terraform provider. Applying afterwards shows no errors in the server logs, but it fails to persist the state, with the provider message: "failure to invoke the API unknown status code returned: Status 504 upstream request timeout". I am commenting here because there is no such issue in the airbyte-terraform-provider repo, and this issue is the closest I have found.
/bump - This is still an ongoing issue affecting our production environment. Would appreciate any insights from the platform team on potential causes or workarounds. Happy to provide additional debugging information if needed.
@talhermon The issue has not recurred for me; I have had it up and running for a week. The block

```yaml
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::<account_id>:role/Airbyte-role"
```

is not under global, right? It should be at the root level of the yaml.
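To summarize how the two snippets in this thread fit together, here is a minimal sketch of a combined values.yaml for this setup; bucket names, the region, and the role ARN are placeholders, and the serviceAccount block sits at the root of the file rather than under global:

```yaml
# Sketch combining the snippets above; all values are placeholders.
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::<account_id>:role/Airbyte-role"

global:
  storage:
    type: S3
    bucket:
      log: "<your bucket>"
      state: "<your bucket>"
      workloadOutput: "<your bucket>"
      activityPayload: "<your bucket>"
    s3:
      region: <your region>
      authenticationType: instanceProfile # the workaround described above
```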
Same here with Helm 1.5.1.
Helm Chart Version
1.3.1
What step the error happened?
Other
Relevant information
I'm encountering an intermittent issue with an Airbyte pod running on EKS, which is configured to assume an AWS IAM role using IRSA. Every few days, the pod appears to lose the service account permissions and instead starts using the IAM role associated with the underlying EKS node. This results in permission-denied errors when attempting to access an S3 bucket.
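A quick way to check which identity the pod actually holds at a given moment is sketched below; the namespace and deployment names are assumptions, and the second command only works if the AWS CLI is present in the image:

```bash
# Verify that the IRSA webhook injected the web identity environment variables.
kubectl -n airbyte exec deploy/airbyte-server -- \
  env | grep -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY_TOKEN_FILE'

# If the AWS CLI is available in the image, confirm which role is being assumed.
kubectl -n airbyte exec deploy/airbyte-server -- aws sts get-caller-identity
```

If the environment variables are missing, or get-caller-identity reports the node role, the pod has fallen back to the node's instance profile, matching the behavior described above.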
Expected Behavior:
The Airbyte pod should consistently assume the IAM role associated with its service account via IRSA and retain the expected permissions throughout its lifecycle.
Relevant log output