Channel: TAO Toolkit - NVIDIA Developer Forums

AWS EKS - Tao Toolkit Install - ERROR: kubernetes cluster unreachable

Please provide the following information when requesting support.

• Hardware: AWS

• Network Type - Want to use LPD/LPRNet

• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

I followed the steps here: Setup - NVIDIA Docs

I am doing this in a SageMaker Studio image terminal. I installed the NGC CLI first for step 1.

For step 3 in the Deployment section: Setup - NVIDIA Docs

  • I don't add anything to the optional API params.

For step 4:

  • I customised the set-up.sh file (attached) to remove sudo and make other minor changes, such as hardcoding the terraform location.
    setup-no-sudo (1).txt (14.3 KB)

When I run the .sh file to do the install, here are my inputs:
(image removed)

I used ssh-keygen to generate the SSH public key.
For the API chart values, the file is empty:
image

I am met with this error (the one in the title): ERROR: kubernetes cluster unreachable

What is the issue here? I’ve attached the main.tf file
main_tf.txt (4.8 KB)

In AWS, I see the cluster active with 1 node. In S3, I see a folder in the S3 bucket with the cluster and config.
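Since the cluster shows up in AWS but the installer reports it as unreachable, one way to narrow this down is to check whether the same terminal can talk to the cluster outside the setup script. The following is only a minimal sketch under stated assumptions: it assumes the AWS CLI and kubectl are on the PATH, and `CLUSTER_NAME` and `REGION` are hypothetical placeholders, not values taken from this setup. It refreshes the kubeconfig with `aws eks update-kubeconfig` and then runs two read-only kubectl commands.

```python
import shutil
import subprocess

# Hypothetical placeholders: replace with the actual EKS cluster name and region.
CLUSTER_NAME = "my-tao-cluster"
REGION = "us-west-2"


def build_checks(cluster_name, region):
    """Return the diagnostic commands as argument lists, in the order to run them."""
    return [
        # Write or refresh the kubeconfig entry for the EKS cluster.
        ["aws", "eks", "update-kubeconfig", "--name", cluster_name, "--region", region],
        # Ask the API server for its endpoints; fails if the cluster is unreachable.
        ["kubectl", "cluster-info"],
        # List the worker nodes that have joined the cluster.
        ["kubectl", "get", "nodes"],
    ]


def run_checks(checks, timeout=60):
    """Run each command and collect (command, exit code or None, output text)."""
    results = []
    for cmd in checks:
        if shutil.which(cmd[0]) is None:
            results.append((cmd, None, f"{cmd[0]} not found on PATH"))
            continue
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
            results.append((cmd, proc.returncode, (proc.stdout + proc.stderr).strip()))
        except subprocess.TimeoutExpired:
            results.append((cmd, None, f"timed out after {timeout} seconds"))
    return results


if __name__ == "__main__":
    for cmd, code, output in run_checks(build_checks(CLUSTER_NAME, REGION)):
        print("$", " ".join(cmd), "-> exit code:", code)
        print(output)
```

If the kubectl commands fail here as well, the problem is more likely in credentials, kubeconfig, or network access from the terminal than in the TAO chart itself; if they succeed, the setup script may be looking at a different kubeconfig than the one this terminal uses.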

Can someone please provide some visibility into what's going on? I could not get the Python wheels TAO installation to work on SageMaker Studio, and this isn't working either. I'm keen to try out your models and TAO Toolkit, but may need to move to something else.

4 posts - 2 participants