arena submit mpijob

Submit an MPIJob as a training job.

Synopsis

arena submit mpijob [flags]
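
As an illustration of the synopsis, the sketch below assembles a minimal invocation using only flags listed under Options. The job name, the image, and the `ARENA_RUN` switch are hypothetical placeholders, not values from this reference. The synopsis does not show where the training command itself goes, so the sketch leaves it out. The script only prints the command unless `ARENA_RUN=1` is set.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
JOB_NAME="mpi-demo"
IMAGE="example.com/mpi-training:latest"

# Build the command as positional parameters so each argument stays intact.
set -- arena submit mpijob \
  --name "$JOB_NAME" \
  --image "$IMAGE" \
  --workers 2 \
  --gpus 1

# Print the command for review.
printf '%s\n' "$*"

# Run it only when explicitly requested (ARENA_RUN is an assumed switch).
if [ "${ARENA_RUN:-0}" = "1" ]; then
  "$@"
fi
```

Printing before running makes it easy to check the flag values against the option list first.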

Options

  -a, --annotation stringArray     the annotations
      --cpu string                 the CPU resource to request for the training, e.g. 1 for one core.
  -d, --data stringArray           specify the datasource to mount to the job, like <name_of_datasource>:<mount_point_on_job>
      --data-dir stringArray       the data directory. Specifying /data mounts host path /data at container path /data
  -e, --env stringArray            the environment variables
      --gpus int                   the number of GPUs each worker uses for the training.
  -h, --help                       help for mpijob
      --image string               the Docker image name of the training job
      --logdir string              the training logs directory (default "/training_logs")
      --memory string              the memory resource to request for the training, e.g. 1Gi.
      --name string                override name
      --rdma                       enable RDMA
      --retry int                  the number of retries.
      --sync-image string          the Docker image used to sync the code
      --sync-mode string           the sync mode; supported values: rsync, hdfs, git
      --sync-source string         the sync source; for rsync, e.g. 10.88.29.56::backup/data/logoRecoTrain.zip; for git, e.g. https://github.com/kubeflow/tf-operator.git
      --tensorboard                enable tensorboard
      --tensorboard-image string   the docker image for tensorboard (default "registry.cn-zhangjiakou.aliyuncs.com/tensorflow-samples/tensorflow:1.12.0-devel")
      --workers int                the worker number to run the distributed training. (default 1)
      --working-dir string         the working directory into which the code is extracted. When a sync mode is used, the code is placed in $workingDir/code (default "/root")
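
The code-sync, data, and TensorBoard options above can be combined as in the following sketch. The job name, image, datasource name, and the `ARENA_RUN` switch are hypothetical; the git URL is the sample value from the `--sync-source` description. The script prints the command and the directory where the synced code is expected, and runs nothing unless `ARENA_RUN=1` is set.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
JOB_NAME="mpi-sync-demo"
IMAGE="example.com/mpi-training:latest"
DATASOURCE="training-data"   # assumed name of an existing datasource
WORKING_DIR="/root"          # matches the documented default

# Sample git source taken from the --sync-source description.
REPO="https://github.com/kubeflow/tf-operator.git"

set -- arena submit mpijob \
  --name "$JOB_NAME" \
  --image "$IMAGE" \
  --workers 2 \
  --sync-mode git \
  --sync-source "$REPO" \
  --working-dir "$WORKING_DIR" \
  --data "$DATASOURCE:/data" \
  --tensorboard \
  --logdir /training_logs

# Per the --working-dir description, synced code lands in $workingDir/code.
CODE_DIR="$WORKING_DIR/code"

printf '%s\n' "$*"
printf 'synced code expected in: %s\n' "$CODE_DIR"

# Run only when explicitly requested (ARENA_RUN is an assumed switch).
if [ "${ARENA_RUN:-0}" = "1" ]; then
  "$@"
fi
```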

Options inherited from parent commands

      --arena-namespace string   The namespace of the arena system services, such as tf-operator (default "arena-system")
      --config string            Path to a kube config. Only required if out-of-cluster
      --loglevel string          Set the logging level. One of: debug|info|warn|error (default "info")
  -n, --namespace string         the namespace of the job (default "default")
      --pprof                    enable CPU profiling
      --trace                    enable trace
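
The inherited options can be passed alongside the command's own flags. The sketch below targets a non-default namespace with debug logging and an explicit kubeconfig for out-of-cluster use. The namespace, job name, image, kubeconfig path, and the `ARENA_RUN` switch are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
NAMESPACE="ml-team"
KUBECONFIG_PATH="$HOME/.kube/config"   # assumed location; only needed out-of-cluster

set -- arena submit mpijob \
  --namespace "$NAMESPACE" \
  --loglevel debug \
  --config "$KUBECONFIG_PATH" \
  --name mpi-demo \
  --image example.com/mpi-training:latest

# Print the command for review.
printf '%s\n' "$*"

# Run only when explicitly requested (ARENA_RUN is an assumed switch).
if [ "${ARENA_RUN:-0}" = "1" ]; then
  "$@"
fi
```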

SEE ALSO

arena submit

Auto generated by spf13/cobra on 24-Apr-2019