EKS Issues

EKS creation fails with 'Cross-account pass role is not allowed.' in activity log

Reason

This error occurs when the AWS Cloud Account selected at the time of cluster creation is not the account in which the cluster role and the node pool roles were created.

Resolution

Delete the cluster and recreate it by choosing the right AWS Account.
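
Before recreating the cluster, it can help to confirm which roles belong to another account. The following is a minimal sketch, not part of gopaddle: it relies only on the standard ARN layout, where the fifth colon-separated field is the account ID. The function names, account IDs, and role names are hypothetical placeholders.

```python
from typing import List


def arn_account_id(arn: str) -> str:
    """Return the account ID field (fifth colon-separated field) of an ARN."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return parts[4]


def roles_outside_account(selected_account_id: str, role_arns: List[str]) -> List[str]:
    """List the role ARNs that do not belong to the selected account."""
    return [arn for arn in role_arns if arn_account_id(arn) != selected_account_id]


# Placeholder values for illustration only.
mismatched = roles_outside_account(
    "111122223333",
    [
        "arn:aws:iam::111122223333:role/example-eks-cluster-role",
        "arn:aws:iam::444455556666:role/example-eks-nodepool-role",
    ],
)
print(mismatched)
```

An empty list means every role lives in the selected account; any ARN that is printed points to a role created elsewhere.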

Application access endpoint is missing for application launched on EKS with ALB

Scenario

When an application is launched on AWS EKS with ALB using gopaddle, the access endpoint for the application does not show a valid URL.

On the application view page, the endpoint shows 'AWS Application Load Balancer' instead of a URL.

The application, the service, its replicas, and its containers are in a running state. However, the application Activities page shows an IngressWarning. Expanding the warning shows the error message "Failed build model due to WebIdentityErr: failed to retrieve credentials caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity status code: 403"

Reason

The ALB ARN used while creating the EKS cluster does not match the cluster details. To confirm this, open the cluster view page and, in the Kube Master section, note down the Cluster ID and the region.

Then, in the ALB Cloud Formation Template section, find the ARN in the Principal section of AmazonEKSLoadBalancerControllerRole and verify that its cluster ID and region match the values you noted.

A mismatch usually means that the wrong ALB Cloud Formation Template was uploaded through the gopaddle UI when the ALB controller was installed in the newly created EKS cluster.

Resolution

Currently gopaddle does not support updating the ARN. The cluster needs to be deleted and re-created. Once the new cluster is created, download the ALB template and make sure the right ALB template is uploaded while installing the ALB controller.
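
The comparison of the Principal ARN against the cluster details can also be done programmatically before uploading a template. The sketch below rests on two assumptions that the text above does not state: the Principal ARN is an EKS OIDC provider ARN of the form `arn:aws:iam::<account>:oidc-provider/oidc.eks.<region>.amazonaws.com/id/<ID>`, and the Cluster ID shown under Kube Master corresponds to the trailing `<ID>`. The function name and all values are hypothetical.

```python
import re
from typing import List

# Assumed shape of the Principal ARN (an EKS OIDC provider ARN).
OIDC_ARN = re.compile(
    r"^arn:[^:]+:iam::\d{12}:oidc-provider/"
    r"oidc\.eks\.(?P<region>[a-z0-9-]+)\.amazonaws\.com(?:\.cn)?"
    r"/id/(?P<id>[A-Za-z0-9]+)$"
)


def principal_mismatches(principal_arn: str, cluster_id: str, region: str) -> List[str]:
    """Return a list of human-readable mismatches; empty when everything matches."""
    match = OIDC_ARN.match(principal_arn)
    if match is None:
        return ["principal is not in the assumed OIDC provider ARN format"]
    problems = []
    if match.group("region") != region:
        problems.append(f"region is {match.group('region')}, expected {region}")
    if match.group("id").lower() != cluster_id.lower():
        problems.append(f"cluster ID is {match.group('id')}, expected {cluster_id}")
    return problems


# Placeholder values for illustration only.
arn = "arn:aws:iam::111122223333:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
print(principal_mismatches(arn, "EXAMPLED539D4633E53DE1B71EXAMPLE", "us-west-2"))
```

If the template in your environment uses a different ARN shape, adjust the pattern rather than relying on this one.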

Creating an EKS Cluster fails with 'Cannot create a VPC'

Scenario

Creating an EKS cluster through gopaddle fails with the error "Cannot create a VPC". The cluster moves to the Unknown status and the Activity Logs show the failure messages.

Solution

This failure can have several causes. To identify the exact cause, select the Stack Logs section and choose VPC Stack from the drop-down. In this scenario, the reason listed for CREATE_FAILED is "API: ec2:ModifySubnetAttribute You are not authorized to perform this operation."

This indicates that the IAM user used to register the corresponding AWS Cloud Account needs the ec2:ModifySubnetAttribute permission to update the subnets within the VPC. Once the IAM user has been granted this permission, create the cluster again from the gopaddle portal.
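
As an illustration, the missing permission could be granted with an IAM policy statement like the one generated below. This is a sketch only: the `Sid` is a hypothetical label, and `"Resource": "*"` is the broadest possible scope, which you may want to narrow to match your own security requirements.

```python
import json

# Minimal IAM policy document that allows the action named in the stack log.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowSubnetAttributeUpdates",  # hypothetical label
            "Effect": "Allow",
            "Action": ["ec2:ModifySubnetAttribute"],
            "Resource": "*",  # assumption: unscoped for simplicity
        }
    ],
}

# Print the JSON that can be attached to the IAM user as a policy.
print(json.dumps(policy, indent=2))
```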

Creating an EKS cluster fails with "The security token included in the request is invalid"

Scenario

Creating an EKS cluster through gopaddle fails with the error "The security token included in the request is invalid". The cluster moves to the Unknown status and the Activity Logs show the failure messages.

Solution

This error occurs when either the master ARN or the node pool ARN is incorrect. Recreate the cluster with valid ARNs.
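
A quick syntactic check can catch typos and truncated values before the cluster is recreated. The sketch below assumes the ARNs in question are IAM role ARNs of the form `arn:<partition>:iam::<12-digit account>:role/<name>`; the function name and sample values are hypothetical. A well-formed ARN can still refer to a role that does not exist, so this only rules out formatting mistakes.

```python
import re

# Assumed shape of an IAM role ARN, with an optional path before the role name.
ROLE_ARN = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@/-]+$",
    re.ASCII,
)


def looks_like_role_arn(value: str) -> bool:
    """Return True when the value has the shape of an IAM role ARN."""
    return ROLE_ARN.match(value.strip()) is not None


# Placeholder values for illustration only.
for candidate in (
    "arn:aws:iam::111122223333:role/example-eks-cluster-role",
    "arn:aws:iam::1111:role/example-eks-nodepool-role",  # account ID too short
):
    print(candidate, looks_like_role_arn(candidate))
```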

Node Pool is not created while creating an EKS Cluster

Scenario

While creating an EKS cluster, the cluster moves to the Running state but the node pool is not created.

Under Activity Logs, the GETTING_EKS_CLUSTER_KUBEVERSION event fails with a timeout message.

In the Cloud Account section, the Accessibility Check shows a Failed status.

This happens when the EKS cluster takes too long to respond with its Kubernetes version, for example because of network delays or because the EKS cluster is not yet in a ready state.

Solution

Click the Verify option to rerun the Accessibility Check. Once accessibility is verified, you can create a node pool under the Node Pool section.

Deleting a Node pool in EKS fails

Scenario

Deleting a node pool in an EKS cluster fails and the node pool moves to the "Failed" state, even though the nodes within the pool are deleted.

The Activity Log shows a failure message.

This happens when an application deployed on the EKS cluster is scheduled on the node pool that is being deleted. The network interfaces of the node pool are not deleted automatically. Because the security group depends on those network interfaces, the node pool deletion fails with a Dependency Violation error.

Solution

Deleting a node pool while it is in use can cause unpredictable application behavior. To complete the deletion:
  1. Detach and delete the network interfaces directly from the AWS console.

  2. Delete the node pool from the gopaddle console or from the AWS console.
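
Step 1 can also be scripted instead of using the AWS console. The following is a minimal sketch, assuming you know the ID of the security group that the leftover network interfaces still reference. The function name is hypothetical, and `ec2` is expected to be a boto3 EC2 client, for example `boto3.client("ec2")`, which requires boto3 and valid AWS credentials. Pagination is ignored for brevity. The function deletes resources, so review the list of interfaces before running it against a real account.

```python
def delete_leftover_enis(ec2, security_group_id):
    """Detach (if attached) and delete every network interface that uses the group."""
    response = ec2.describe_network_interfaces(
        Filters=[{"Name": "group-id", "Values": [security_group_id]}]
    )
    deleted = []
    for eni in response["NetworkInterfaces"]:
        eni_id = eni["NetworkInterfaceId"]
        attachment_id = eni.get("Attachment", {}).get("AttachmentId")
        if attachment_id:
            # Detaching is asynchronous, so wait until the interface is free.
            ec2.detach_network_interface(AttachmentId=attachment_id, Force=True)
            waiter = ec2.get_waiter("network_interface_available")
            waiter.wait(NetworkInterfaceIds=[eni_id])
        ec2.delete_network_interface(NetworkInterfaceId=eni_id)
        deleted.append(eni_id)
    return deleted
```

The function returns the IDs of the interfaces it deleted. Once it completes, continue with step 2.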