Scientific Reports

D3O-IIoT: deep reinforcement learning-driven dynamic deception orchestration for industrial IoT security.

Usman Wushishi, Altaf Hussain, Muhammad Imran Khalid, Nasir Hussain, Mona Jamjoom, Zahid Ullah

Published: 2025
DOI: 10.1038/s41598-025-33426-4

Abstract

Open Access

Industrial Internet of Things (IIoT) systems face mounting cyber threats that exploit the resource constraints and operational vulnerabilities of industrial environments. Current intrusion detection schemes rely on static or passive defenses that do not adapt to evolving attacks. This paper presents D3O-IIoT, a deep reinforcement learning framework that dynamically orchestrates deception techniques, including honeypot deployment, moving target defense, fake telemetry injection, and node isolation, on the basis of real-time threat monitoring. The defense problem is formulated as a Markov Decision Process in which a Dueling Deep Q-Network agent maximizes a multi-objective reward that balances attack mitigation, deception engagement, false positive control, and resource cost. Experiments on three IIoT datasets (CIC-IIoT2025, WUSTL-IIoT2021, TON-IoT) show that D3O-IIoT achieves a 13.7% attack mitigation rate with a 0.3% false alarm rate, an improvement of 293-767% over baselines (p < 0.0001). Cross-dataset validation confirms generalization (97.7% and 77.8% retention on TON-IoT and WUSTL-IIoT, respectively). Ablation results show that false positive control is the most critical reward component (51.4% degradation upon removal), and sensitivity analysis indicates 46.1% tunability through adjustment of the risk threshold. The learned policy favors isolation (71.2%) for confirmed threats and honeypots (15.4%) for reconnaissance, with a latency of 2.07 ms that permits real-time deployment. D3O-IIoT advances IIoT cybersecurity by replacing fixed, rule-based defenses with dynamic, learning-based deception orchestration that balances multiple practical goals under resource-constrained conditions.
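The formulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names the four reward terms and the Dueling Deep Q-Network but gives neither weights nor network details, so the `RewardWeights` values, the linear form of the reward, and the four-element `ACTIONS` tuple are assumptions made for illustration only. The `dueling_q_values` function uses the standard dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A).

```python
from dataclasses import dataclass

# Hypothetical action set, taken from the four techniques the abstract lists.
ACTIONS = ("honeypot", "moving_target_defense", "fake_telemetry", "isolate_node")


@dataclass(frozen=True)
class RewardWeights:
    """Illustrative weights only; the abstract does not report the real values."""
    mitigation: float = 1.0
    engagement: float = 0.5
    false_positive: float = 2.0
    resource_cost: float = 0.1


def multi_objective_reward(mitigated, engaged, false_positive, cost,
                           w=RewardWeights()):
    """Weighted sum (assumed form): reward mitigation and attacker engagement,
    penalize false positives and resource cost."""
    return (w.mitigation * mitigated
            + w.engagement * engaged
            - w.false_positive * false_positive
            - w.resource_cost * cost)


def dueling_q_values(state_value, advantages):
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def greedy_action(q_values):
    """Return the name of the action with the highest Q-value."""
    best = max(range(len(q_values)), key=q_values.__getitem__)
    return ACTIONS[best]
```

In a full agent, `state_value` and `advantages` would be the outputs of the two network streams for the current threat state; here they are plain numbers so the aggregation step can be inspected in isolation.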
