
EngineAI 1. Start/Resume Training

news Source: original 2025/6/15 5:16:26

I will tidy this up later; for the latest version, see the Docs.

Start/Resume Training

Args

--exp_name EXP_NAME: Experiment name.
--sub_exp_name SUB_EXP_NAME: Name of the sub-experiment to run or load, default is default.
--run_name RUN_NAME: Name of the run, default is current time %Y-%m-%d_%H-%M-%S.
--log_root LOG_ROOT: Path of the log root, default is engineai_rl_workspace/logs/{exp_name}/{sub_exp_name}.
--load_run LOAD_RUN: Name of the run to load when resume=True, default is -1. If -1: will load the last run.
--checkpoint CHECKPOINT: Saved model checkpoint number, default is -1. If -1: will load the last checkpoint.
--resume: Resume training from a checkpoint.
--run_exist: Run training from an existing run with its config.json.
--debug: In debug mode, no logs will be saved.
--num_envs NUM_ENVS: Number of environments to create.
--seed SEED: Random seed.
--max_iterations MAX_ITERATIONS: Maximum number of training iterations.
--logger LOGGER: Logger module to use. Choice: tensorboard, wandb, neptune.
--upload_model: Upload models to wandb or neptune.
--sim_device SIM_DEVICE: Device used by the simulator (cpu, gpu, cuda:0, cuda:1, etc.), default is cuda:0.
--rl_device RL_DEVICE: Device used by the RL algorithm (cpu, gpu, cuda:0, cuda:1, etc.), default is cuda:0.
--video: Record video during training. Headless mode also works.
--record_length RECORD_LENGTH: The number of steps to record for videos, default is 200.
--record_interval RECORD_INTERVAL: The number of steps between video recordings.
--fps FPS: The fps of recorded videos, default is 50.
--frame_size FRAME_SIZE: The size of recorded frame, default is (1280, 720).
--camera_offset CAMERA_OFFSET: The offset of the video filming camera, default is (0, -2, 0).
--camera_rotation CAMERA_ROTATION: The rotation of the video filming camera, default is (0, 0, 90).
--env_idx_record ENV_IDX_RECORD: The env idx to record, default is 0.
--actor_idx_record ACTOR_IDX_RECORD: The actor idx to record, default is 0.
--rigid_body_idx_record RIGID_BODY_IDX_RECORD: The rigid_body idx to record, default is 0.
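
The documented defaults for --run_name and --log_root can be reproduced in a few lines. This is only a minimal sketch, not the project's actual code: the helper names default_run_name and default_log_root are hypothetical, and only the time format and the path layout come from the argument list above.

```python
from datetime import datetime
from pathlib import Path

RUN_NAME_FORMAT = "%Y-%m-%d_%H-%M-%S"  # format documented for --run_name


def default_run_name(now=None):
    """Return a run name built from the current time (hypothetical helper)."""
    now = now or datetime.now()
    return now.strftime(RUN_NAME_FORMAT)


def default_log_root(exp_name, sub_exp_name="default"):
    """Return the documented default log root (hypothetical helper)."""
    return Path("engineai_rl_workspace") / "logs" / exp_name / sub_exp_name


print(default_run_name(datetime(2025, 6, 3, 12, 0, 0)))  # 2025-06-03_12-00-00
print(default_log_root("pm01_rough_ppo").as_posix())  # engineai_rl_workspace/logs/pm01_rough_ppo/default
```

Because the run name is a zero-padded timestamp, run names sort chronologically when compared as plain strings.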

Examples

From Scratch

The files required to resume a run are saved with it, so resume and play still work even after the code has changed.

# basic

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo

# headless

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless

# use specific logger

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --logger wandb

# run with params overridden

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --num_envs 4096 --max_iterations 30000 --seed 1

Video Recording

# default setting

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --video

# custom setting

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --video --record_length 500 --record_interval 100 --fps 100 --frame_size=1920,1080 --camera_offset=-2,0,0 --camera_rotation=0,0,90 --env_idx_record 1 --actor_idx_record 1 --rigid_body_idx_record
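
The tuple-valued video flags take comma-separated values such as --frame_size=1920,1080. The sketch below shows one way such values could be parsed with argparse, and how long a clip lasts for a given --record_length and --fps. It is an illustration only: int_tuple and clip_seconds are hypothetical helpers, the real script may parse these flags differently (for example as floats), and the duration assumes one frame is captured per step.

```python
import argparse


def int_tuple(text):
    """Parse a comma-separated string such as '1920,1080' into a tuple of ints."""
    return tuple(int(part) for part in text.strip("()").split(","))


def clip_seconds(record_length, fps):
    """Length of one clip in seconds, assuming one frame is captured per step."""
    return record_length / fps


# Only a subset of the documented flags, with the documented defaults.
parser = argparse.ArgumentParser()
parser.add_argument("--frame_size", type=int_tuple, default=(1280, 720))
parser.add_argument("--camera_offset", type=int_tuple, default=(0, -2, 0))
parser.add_argument("--record_length", type=int, default=200)
parser.add_argument("--fps", type=int, default=50)

args = parser.parse_args(
    ["--frame_size=1920,1080", "--camera_offset=-2,0,0", "--record_length", "500", "--fps", "100"]
)
print(args.frame_size, args.camera_offset)  # (1920, 1080) (-2, 0, 0)
print(clip_seconds(args.record_length, args.fps))  # 5.0
print(clip_seconds(200, 50))  # 4.0 (documented defaults)
```

In plain argparse, writing the value with = (as in --camera_offset=-2,0,0) keeps a value that starts with a minus sign from being mistaken for another flag.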

From `.json` Config from Scratch

A config is saved for each run. To start a new run from a modified .json config of an old run, create a new folder, copy the old config into it, modify the config, and start training from it.

The Algos config files will be converted to .py config files, and used for training.

# from a default sub_exp_name

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --run_exist --load_run 2025-06-03_12-00-00

# from a specific sub_exp_name

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --run_exist --sub_exp_name fixed_std --load_run 2025-06-03_12-00-00
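
The copy-and-modify step described above could be scripted roughly as follows. This is a sketch under assumptions: it supposes that each run folder holds a config.json directly inside it, the function clone_run_config is hypothetical, and the config keys in the demo are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def clone_run_config(old_run_dir, new_run_dir, overrides):
    """Copy config.json from an old run into a new run folder, applying top-level overrides."""
    old_config = json.loads((Path(old_run_dir) / "config.json").read_text())
    new_config = {**old_config, **overrides}
    new_run_dir = Path(new_run_dir)
    new_run_dir.mkdir(parents=True, exist_ok=False)  # refuse to overwrite an existing run
    (new_run_dir / "config.json").write_text(json.dumps(new_config, indent=2))
    return new_config


# Demo in a temporary directory with made-up config keys.
with tempfile.TemporaryDirectory() as tmp:
    old_run = Path(tmp) / "2025-06-03_12-00-00"
    old_run.mkdir()
    (old_run / "config.json").write_text(json.dumps({"seed": 1, "num_envs": 4096}))
    new_run = Path(tmp) / "2025-06-04_09-00-00"
    result = clone_run_config(old_run, new_run, {"seed": 2})
    print(result)  # {'seed': 2, 'num_envs': 4096}
```

The new folder name would then be the value passed to --load_run together with --run_exist, assuming the folder sits under the log root of the chosen experiment.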

Using a Specific Logger (Tensorboard, Wandb, Neptune)

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --logger wandb

Resume a Run

# resume from default log root

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --resume --load_run 2025-06-03_12-00-00

# resume from a specific log root

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --resume --log_root ~/server/engineai_rl_workspace/logs/pm01_rough_ppo/default --load_run 2025-06-03_12-00-00

# resume from a specific checkpoint

python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --resume --load_run 2025-06-03_12-00-00 --checkpoint 30000
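
When --load_run or --checkpoint is left at -1, the last run or the last checkpoint is loaded. The sketch below illustrates how that resolution might work. It is not the project's implementation: both functions are hypothetical, and the checkpoint file naming model_<N>.pt is an assumption made only for this example.

```python
import re

CHECKPOINT_PATTERN = re.compile(r"model_(\d+)\.pt")  # assumed file naming, not confirmed


def resolve_run(run_names, load_run=-1):
    """Pick the run to load; -1 means the most recent timestamp-named run."""
    if load_run in (-1, "-1"):
        if not run_names:
            raise FileNotFoundError("no runs found")
        return max(run_names)  # %Y-%m-%d_%H-%M-%S names sort chronologically as strings
    if load_run not in run_names:
        raise FileNotFoundError(f"run {load_run} not found")
    return load_run


def resolve_checkpoint(file_names, checkpoint=-1):
    """Pick the checkpoint number; -1 means the highest number present."""
    numbers = []
    for name in file_names:
        match = CHECKPOINT_PATTERN.fullmatch(name)
        if match:
            numbers.append(int(match.group(1)))
    if not numbers:
        raise FileNotFoundError("no checkpoints found")
    if checkpoint == -1:
        return max(numbers)  # compare as integers, not as strings
    if checkpoint not in numbers:
        raise FileNotFoundError(f"checkpoint {checkpoint} not found")
    return checkpoint


runs = ["2025-06-03_12-00-00", "2025-05-30_08-15-00", "2025-06-03_09-30-00"]
files = ["model_10000.pt", "model_30000.pt", "model_9000.pt", "config.json"]
print(resolve_run(runs))  # 2025-06-03_12-00-00
print(resolve_checkpoint(files))  # 30000
print(resolve_checkpoint(files, 9000))  # 9000
```

Checkpoint numbers are compared as integers because a string comparison would rank model_9000.pt above model_30000.pt.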

Debug Mode

In debug mode, training does not save log files, so the log directory stays clean.

# debug mode 
python engineai_rl_workspace/scripts/train.py --exp_name pm01_rough_ppo --headless --debug
