SCOPE-RL is an open-source Python software package for implementing the end-to-end offline reinforcement learning (offline RL) procedure, from data collection to offline policy learning, off-policy ...
Abstract: With the wide application of UAV formations in the engineering field, circular UAV formations in electromagnetic silence scenarios have gradually become a research focus due to their fast ...
Figure 1. FIPO vs. baselines on AIME 2024. FIPO shows that pure RL training alone can outperform reproduced pure-RL baselines such as DAPO and DeepSeek-R1-Zero-32B, surpass o1-mini, and produce ...
Abstract: With the advancement of marine resource development and protection activities, tasks such as full geomorphology mapping and underwater area sweeping in turbid water pose serious challenges ...