Authors: Sanket Shah, Arunesh Sinha, Pradeep Varakantham, Andrew Perrault, Milind Tambe
Keywords:
Abstract: Large-scale screening for potential threats with limited resources and capacity is a problem of interest at airports, seaports, and other ports of entry. Adversaries can observe screening procedures and arrive at a time when there will be gaps in screening due to limited resource capacities. To capture this game between ports and adversaries, this problem has been previously represented as a Stackelberg game, referred to as a Threat Screening Game (TSG). Given the significant complexity associated with solving TSGs and the uncertainty in arrivals of customers, existing work has assumed that screenees arrive and are allocated security resources at the beginning of the time-window. In practice, screenees such as airport passengers arrive in bursts correlated with flight time and are not bound by fixed time-windows. To address this, we propose an online threat screening model in which the screening strategy is determined adaptively as a passenger arrives while satisfying a hard bound on the acceptable risk of not screening a threat. To solve the online problem, we first reformulate it as a Markov Decision Process (MDP) in which the hard bound on risk translates to a constraint on the action space, and then solve the resultant MDP using Deep Reinforcement Learning (DRL). To this end, we provide a novel way to efficiently enforce linear inequality constraints on the action output in DRL. We show that our solution allows us to significantly reduce screenee wait time without compromising on the risk.
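
To make the idea of a linear inequality constraint on an action concrete, the following is a minimal sketch and not the method proposed in the paper, which the abstract does not describe in detail. It swaps in a simple, generic technique: given a raw action and a known feasible point, it moves along the segment between them and stops at the last point that still satisfies A x <= b. The function name `pull_into_feasible_set`, the two screening resources, their miss probabilities (0.1 and 0.4), and the risk bound (0.3) are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pull_into_feasible_set(
    raw: Sequence[float],
    interior: Sequence[float],
    A: Sequence[Sequence[float]],
    b: Sequence[float],
) -> List[float]:
    """Return the point on the segment interior -> raw that is closest to
    `raw` while satisfying every row of A x <= b.

    `interior` must itself be feasible. This is a generic illustration,
    not the constraint-enforcement scheme from the paper.
    """
    direction = [r - p for r, p in zip(raw, interior)]
    alpha = 1.0  # fraction of the way from interior to raw
    for row, bound in zip(A, b):
        slack = bound - dot(row, interior)
        if slack < 0:
            raise ValueError("interior point must satisfy A x <= b")
        growth = dot(row, direction)
        if growth > slack:  # this row would be violated at alpha = 1
            alpha = min(alpha, slack / growth)
    return [p + alpha * d for p, d in zip(interior, direction)]


# Hypothetical action: x = (fraction sent to a strong resource,
#                           fraction sent to a weak resource).
# Assumed miss probabilities: 0.1 (strong), 0.4 (weak), 1.0 (unscreened).
# Risk = 0.1*xs + 0.4*xw + (1 - xs - xw) <= 0.3
#   <=>  -0.9*xs - 0.6*xw <= -0.7
A = [
    [-0.9, -0.6],  # hard bound on risk
    [1.0, 1.0],    # fractions sum to at most 1
    [-1.0, 0.0],   # xs >= 0
    [0.0, -1.0],   # xw >= 0
]
b = [-0.7, 1.0, 0.0, 0.0]

interior = [0.8, 0.1]    # known feasible action (risk 0.22)
raw_action = [0.2, 0.3]  # e.g. an unconstrained policy output (risk 0.64)

safe_action = pull_into_feasible_set(raw_action, interior, A, b)
risk = 1.0 + dot(A[0], safe_action)  # recover the risk from the first row
print(safe_action, risk)
```

In this toy setting the infeasible raw action is pulled back until its risk sits exactly at the bound, while an action that is already feasible is returned essentially unchanged. Constraining the action in this way keeps every action executed by the agent within the risk bound, which is the property the abstract requires of the online screening strategy.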