2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_setup.py:_flush():76] Current SDK version is 0.17.2
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_setup.py:_flush():76] Configure stats pid to 121
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_setup.py:_flush():76] Loading settings from /run/determined/workdir/.config/wandb/settings
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_setup.py:_flush():76] Loading settings from /run/determined/workdir/wandb/settings
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_setup.py:_flush():76] Loading settings from environment variables: {'api_key': '***REDACTED***', 'mode': 'offline'}
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program_relpath': 'openrlhf/train_simconf.py', 'program_abspath': '/run/determined/workdir/openrlhf/train_simconf.py', 'program': '/run/determined/workdir/openrlhf/train_simconf.py'}
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_init.py:_log_setup():520] Logging user logs to /home/shuai/output/rlhf/sim_conf_zephyr_004/wandb/offline-run-20241016_060247-vm9845jc/logs/debug.log
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_init.py:_log_setup():521] Logging internal logs to /home/shuai/output/rlhf/sim_conf_zephyr_004/wandb/offline-run-20241016_060247-vm9845jc/logs/debug-internal.log
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_init.py:init():560] calling init triggers
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_init.py:init():567] wandb.init called with sweep_config: {} config: {'prompt_data': None, 'prompt_data_probs': '1.0', 'pretrain_data': '/home/shuai/dataset/CONQORD_dataset/conqord_step3_data', 'pretrain_data_probs': '1.0', 'pretrain': '/home/shuai/output/rlhf/sim_conf_sft_zephyr_003_merged', 'save_path': '/home/shuai/output/rlhf/sim_conf_zephyr_004', 'save_steps': 200, 'logging_steps': 1, 'eval_steps': -1, 'ckpt_path': './ckpt/checkpoints_ppo', 'max_ckpt_num': 1, 'max_ckpt_mem': 1000, 'num_episodes': 1, 'prompt_max_len': 384, 'generate_max_len': 384, 'max_len': None, 'max_samples': 100000, 'max_norm': 1.0, 'l2': 0.1, 'advantage_clip': 0.8, 'lambd': 0.95, 'gamma': 1, 'micro_train_batch_size': 8, 'train_batch_size': 256, 'load_checkpoint': '/home/shuai/output/rlhf/sim_conf_zephyr_004/_actor', 'normalize_reward': False, 'top_p': 0.95, 'top_k': 60, 'temperature': 1.0, 'num_return_sequences': 2, 'seed': 42, 'num_workers': 4, 'local_rank': 0, 'zero_stage': 2, 'gradient_checkpointing': True, 'bf16': True, 'fp16': False, 'actor_learning_rate': 1e-06, 'kl_target': None, 'enable_ema': False, 'zpg': 1, 'adam_offload': True, 'actor_init_on_gpu': True, 'flash_attn': True, 'policy_loss_coef': 1.0, 'ptx_loss_coef': 0.0, 'aux_loss_coef': 0, 'grad_accum_dtype': None, 'disable_trace_cache': False, 'load_in_4bit': False, 'load_in_8bit': False, 'lora_rank': 64, 'lora_alpha': 64, 'target_modules': 'all-linear', 'lora_dropout': 0, 'gradient_checkpointing_use_reentrant': False, 'fast_tokenizer': False, 'head_prefix': 'value_head', 'input_key': None, 'output_key': None, 'input_template': 'Human: {}\nAssistant: ', 'apply_chat_template': False, 'use_wandb': 'd9e0bd2b23cec57a1fb22c56be041fe6a8c76a1a', 'wandb_org': None, 'wandb_group': None, 'wandb_project': 'SimpleConfAlign', 'wandb_run_name': 'sim_conf_zephyr_004', 'is_rollout': 1, 'sample_wise_baseline': 1, 'sample_batch_baseline': 0, 'is_train_on_input': 0, 'bert_model_type': '/home/shuai/pretrained/google-bert/bert-large-uncased', 'bert_idf': 1, 'bert_fscore': 0, 'idf_dict_file': '/home/shuai/dataset/bert-large-uncased-conqord_step3_87k_idf.pkl', 'rescale_with_baseline': 0, 'adding_baseline': 1, 'score_coef': 1.0, 'adding_baseline_seperate': 1, 'conf_sample_wise_align': 1, 'conf_reward_alpha': 0.5}
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_init.py:init():610] starting backend
2024-10-16 06:02:47,658 INFO MainThread:121 [wandb_init.py:init():614] setting up manager
2024-10-16 06:02:47,660 INFO MainThread:121 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2024-10-16 06:02:47,661 INFO MainThread:121 [wandb_init.py:init():622] backend started and connected
2024-10-16 06:02:47,665 INFO MainThread:121 [wandb_init.py:init():711] updated telemetry
2024-10-16 06:02:47,665 INFO MainThread:121 [wandb_init.py:init():744] communicating run to backend with 90.0 second timeout
2024-10-16 06:02:47,673 INFO MainThread:121 [wandb_init.py:init():795] starting run threads in backend
2024-10-16 06:02:50,384 INFO MainThread:121 [wandb_run.py:_console_start():2380] atexit reg
2024-10-16 06:02:50,384 INFO MainThread:121 [wandb_run.py:_redirect():2235] redirect: wrap_raw
2024-10-16 06:02:50,384 INFO MainThread:121 [wandb_run.py:_redirect():2300] Wrapping output streams.
2024-10-16 06:02:50,384 INFO MainThread:121 [wandb_run.py:_redirect():2325] Redirects installed.
2024-10-16 06:02:50,385 INFO MainThread:121 [wandb_init.py:init():838] run started, returning control to user process
2024-10-17 18:23:24,143 WARNING MsgRouterThr:121 [router.py:message_loop():77] message_loop has been closed