帅 申 / North Automatic Control Technology Institute
俊慧 李 / North Automatic Control Technology Institute
经龙 牛 / North Automatic Control Technology Institute
蕃衍 薛 / North Automatic Control Technology Institute
晓红 武 / North Automatic Control Technology Institute
洪涛 王 / North Automatic Control Technology Institute
In research on intelligent warfare, cross-domain combat has emerged as an effective means of system disruption and is regarded as a preferred mode of operation for achieving combat objectives. This study focuses on the reconnaissance-strike task scenario in cross-domain combat and addresses task prediction and allocation, together with motion planning and management, in complex and uncertain environments. It first describes the complexity of task decision-making and the uncertainty of the battlefield environment from two perspectives: the parameterization of the battlefield environment model and typical swarm reconnaissance-strike tasks. It then designs a generalized state space, reward function, action space, and policy network, incorporating diverse rewards closely tied to the reconnaissance-strike task. The action output adopts a subject-verb-object structure so that complex operations can be represented more accurately. The policy network uses an encoder, temporal aggregation, attention mechanism, and decoder architecture to integrate feature information. The problem is solved with a deep reinforcement learning method based on proximal policy optimization (PPO-DRL). Finally, simulation experiments validate the feasibility and effectiveness of decision-making for cross-domain combat entities performing reconnaissance-strike tasks under complex and uncertain conditions, and demonstrate intelligent task prediction and allocation as well as motion planning and management.
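For readers unfamiliar with proximal policy optimization, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It illustrates only the generic PPO update signal, not the state space, rewards, or policy network of this study. The function name `ppo_clip_objective`, the clipping range `clip_eps = 0.2`, and the sample inputs are assumptions chosen for illustration.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    clip_eps is an assumed hyperparameter, not a value from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policies.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)

# Hypothetical sample: one action became more likely, one less likely.
value = ppo_clip_objective([0.1, -0.3], [0.0, 0.0], [1.0, -0.5])
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the usual reason PPO is chosen for training stability.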