arena submit standalonejob
Submit a StandaloneJob as a training job. This command will be deprecated soon; use tfjob instead.
Synopsis
Submit a StandaloneJob as a training job. This command will be deprecated soon; use tfjob instead.
arena submit standalonejob [flags]
Options
-a, --annotation stringArray the annotations
--cpu string the CPU resource to use for the training, like 1 for 1 core.
-d, --data stringArray specify the datasource to mount to the job, like <name_of_datasource>:<mount_point_on_job>
--data-dir stringArray the data directory. Specifying /data mounts the hostPath /data into the container at /data
-e, --env stringArray the environment variables
--gpus int the GPU count of each worker to run the training.
-h, --help help for standalonejob
--image string the Docker image name of the training job
--memory string the memory resource to use for the training, like 1Gi.
--name string override name
--rdma enable RDMA
--retry int the number of times to retry.
--sync-image string the Docker image used to sync the source code
--sync-mode string the sync mode; supported values: rsync, hdfs, git
--sync-source string sync-source: for rsync, it's like 10.88.29.56::backup/data/logoRecoTrain.zip; for git, it's like https://github.com/kubeflow/tf-operator.git
--workers int the number of workers to run the distributed training. (default 1)
--working-dir string the working directory to extract the code into. When a sync mode is used, $workingDir/code contains the code (default "/root")
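Example
The flags above can be combined as in the following minimal sketch. The job name, image, and resource values are hypothetical placeholders, not values taken from this reference; the sync source is the git URL quoted in the --sync-source description. The script only prints the command (a dry run), so it can be inspected before anything is submitted.

```shell
# Collect the command and its flags as positional parameters (POSIX sh).
# demo-job and registry.example.com/demo/train:latest are hypothetical.
set -- arena submit standalonejob \
  --name=demo-job \
  --workers=2 \
  --gpus=1 \
  --cpu=4 \
  --memory=8Gi \
  --image=registry.example.com/demo/train:latest \
  --sync-mode=git \
  --sync-source=https://github.com/kubeflow/tf-operator.git \
  --working-dir=/root

# Dry run: print the assembled command instead of executing it.
echo "$*"

# To submit for real, replace the echo line with: "$@"
```

With --sync-mode=git and --working-dir=/root, the description of --working-dir implies the synced code ends up under /root/code.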
Options inherited from parent commands
--arena-namespace string The namespace of arena system service, like tf-operator (default "arena-system")
--config string Path to a kube config. Only required if out-of-cluster
--loglevel string Set the logging level. One of: debug|info|warn|error (default "info")
-n, --namespace string the namespace of the job (default "default")
--pprof enable CPU profiling
--trace enable trace
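Example
A minimal sketch of how the inherited options might be combined with the command's own flags, assuming a hypothetical namespace team-a and a hypothetical job name and image. As a dry run, it prints the command rather than executing it.

```shell
# Inherited flags (--namespace, --loglevel) alongside command flags.
# team-a, demo-job and the image name are hypothetical placeholders.
set -- arena submit standalonejob \
  --namespace=team-a \
  --loglevel=debug \
  --name=demo-job \
  --image=registry.example.com/demo/train:latest

# Dry run: print the assembled command instead of executing it.
echo "$*"

# To submit for real, replace the echo line with: "$@"
```

Omitting --namespace and --loglevel falls back to the documented defaults, "default" and "info".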
SEE ALSO
- arena submit - Submit a job.