Channel: TAO Toolkit - NVIDIA Developer Forums

Centerpose_synthetic quickstart notebook error in sample dataset: KeyError: 'plane_center'


Please provide the following information when requesting support.

  • Hardware: GeForce RTX 4090 Laptop GPU
  • Software: Ubuntu 22.04
  • Network Type: centerpose_fan from centerpose_synth quickstart notebook, no changes made
  • TLT Version (Please run “tlt info --verbose” and share “docker_tag” here): tlt is not installed locally when using this notebook; the Docker images are Isaac Sim 4.0.0 and nvidia/tao/tao-toolkit 5.5.0 (dataset, deploy, and model)
  • Training spec file: default spec file from notebook train_synthetic.yaml:
results_dir: /results

dataset:
  train_data: /data/results/images
  val_data: /data/results/images
  num_classes: 1
  batch_size: 4
  workers: 8
  category: "pallet"
  num_symmetry: 1
  max_objs: 10

train:
  num_gpus: 1
  validation_interval: 20
  checkpoint_interval: ${train.validation_interval}
  num_epochs: 40
  clip_grad_val: 100.0
  seed: 317
  pretrained_model_path: /results/pretrained_models/centerpose_vtrainable_fan_small/centerpose_trainable_FAN_small.pth
  precision: "fp32"

  optim:
    lr: 6e-05
    lr_steps: [90, 120]

model:
  down_ratio: 4
  use_pretrained: False
  backbone:
    model_type: fan_small
    pretrained_backbone_path: /results/pretrained_models/centerpose_vtrainable_fan_small/centerpose_trainable_FAN_small.pth
  • Tao Mounts file:
{
    "Mounts": [
        {
            "source": "/home/mb/tao-experiments",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/home/mb/tao-experiments/data/centerpose",
            "destination": "/data"
        },
        {
            "source": "/home/mb/tao_tutorials/notebooks/tao_launcher_starter_kit/centerpose/specs",
            "destination": "/specs"
        },
        {
            "source": "/home/mb/tao-experiments/centerpose/results",
            "destination": "/results"
        }
    ],
    "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
        },
        "user": "1000:1000",
        "network": "host"
    }
}

  • How to reproduce the issue?

Run the latest centerpose_synthetic notebook. The failing step is:

print("For multi-GPU, change train.num_gpus in train.yaml based on your machine.")
# If you face an out-of-memory issue, you may reduce the batch size in the spec file by passing dataset.batch_size=2
!tao model centerpose train \
          -e $SPECS_DIR/train_synthetic.yaml \
          results_dir=$RESULTS_DIR/

With the default dataset, this step throws the following error:

Error executing job with overrides: ['results_dir=/results/']
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/core/decorators/workflow.py", line 69, in _func
    raise e
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/core/decorators/workflow.py", line 48, in _func
    runner(cfg, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/cv/centerpose/scripts/train.py", line 84, in main
    run_experiment(experiment_config=cfg,
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/cv/centerpose/scripts/train.py", line 70, in run_experiment
    trainer.fit(pt_model, dm, ckpt_path=resume_ckpt)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 543, in fit
    call._call_and_handle_interrupt(
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 579, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 986, in _run
    results = self._run_stage()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1030, in _run_stage
    self._run_sanity_check()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1059, in _run_sanity_check
    val_loop.run()
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/utilities.py", line 182, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/evaluation_loop.py", line 135, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx, dataloader_iter)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/evaluation_loop.py", line 396, in _evaluation_step
    output = call._call_strategy_hook(trainer, hook_name, *step_args)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 309, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/strategies/strategy.py", line 412, in validation_step
    return self.lightning_module.validation_step(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/cv/centerpose/model/pl_centerpose_model.py", line 136, in validation_step
    self.val_cp_evaluator.evaluate(final_output, batch)
  File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_pytorch/cv/centerpose/utils/centerpose_evaluator.py", line 279, in evaluate
    center = np.asarray(anns['AR_data']['plane_center'])
KeyError: 'plane_center'
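
The traceback shows the evaluator reading `anns['AR_data']['plane_center']` during the validation sanity check, so one way to narrow this down might be to check which annotation files actually carry that key. The following is only a rough sketch: it assumes the annotations are JSON files stored under the dataset directory, which I have not confirmed, and `find_missing_plane_center` is a hypothetical helper name, not part of TAO.

```python
import json
from pathlib import Path


def find_missing_plane_center(data_dir):
    """Return JSON files under data_dir that lack AR_data.plane_center.

    Assumption: annotations are JSON files stored under the dataset
    directory. Files that cannot be parsed are reported as well.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []

    missing = []
    for json_path in sorted(root.rglob("*.json")):
        try:
            with open(json_path, "r", encoding="utf-8") as f:
                anns = json.load(f)
        except (OSError, json.JSONDecodeError):
            # Unreadable or malformed file: flag it for manual inspection.
            missing.append(json_path)
            continue

        ar_data = anns.get("AR_data") if isinstance(anns, dict) else None
        if not isinstance(ar_data, dict) or "plane_center" not in ar_data:
            missing.append(json_path)
    return missing


if __name__ == "__main__":
    # Host-side path derived from the mounts file above, where /data maps to
    # /home/mb/tao-experiments/data/centerpose (adjust for your setup).
    data_dir = "/home/mb/tao-experiments/data/centerpose/results/images"
    bad = find_missing_plane_center(data_dir)
    print(f"{len(bad)} annotation file(s) without AR_data.plane_center")
    for path in bad[:10]:
        print(path)
```

If every file is reported, the generated dataset probably never writes `plane_center`; if only some are, a few frames may be the culprit.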

What should be modified so that the default notebook runs successfully?

Best regards
