Ship Itinerary Predictability
Patricia H. Carter, (Naval Surface Warfare Center), firstname.lastname@example.org
Understanding the nature of ship itineraries is a first step towards modeling commercial ship behavior to support Maritime Domain Awareness. The goal is to determine patterns of individual and collective commercial vessel behavior, to establish normal behavior patterns and to accurately detect anomalous behavior. Methods to produce measures of anomaly will facilitate risk evaluation of vessels of interest in tactical situations.
A core model of ship behavior that is amenable to mathematical and statistical analysis is the ship itinerary. This model can accommodate additional data about the ship, the ports, and the local and global situation. A further abstraction of the ship itinerary is the sequence of ports visited, and this sequence is the model investigated in this paper. The scope of global shipping traffic is vast: more than 100,000 ships and several thousand ports. The predictability of ship itinerary patterns varies enormously.
The first question to ask is how predictable and regular individual ship itineraries are. In other words, what are the normal patterns? The kinds of patterns one might expect include itineraries that are 1) periodic sequences, 2) sequences that contain periodic subsequences with insertions and deletions, 3) sequences with multiple periodic subsequences, and 4) sequences with disordered and non-repeating behavior. Sequence processing is a well-developed discipline, as it is fundamental in computer science, text processing and bioinformatics. There are many different measures of periodicity and recurrence that can be used to quantify how close a sequence is to exhibiting regular, repeating behavior. These can be used to explore whether the behavior can be clustered into types or whether there is a continuous spectrum of behavior.
The predictability of port sequences on a ship's itinerary can be investigated by assigning a measure of anomaly to port visits. For each ship, given a sequence of past port visits, one predicts how likely each of a number of candidate ports is to be the next port visited; here a weighted all-gram profile approach is used, based on the principle of maximizing the recurrence properties of the sequence. An anomaly score is then determined by comparing the port actually visited with its prediction weight.
The distribution of anomaly scores that occur over a set of ship itineraries is indicative of the predictability of port sequences.
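The prediction and scoring steps described above can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme or the exact form of the score, so the details here are assumptions made for illustration only: every earlier occurrence of the current trailing n-gram votes for the port that followed it, the vote is weighted by n so that longer matches count more, and the anomaly score is one minus the normalized weight of the port actually visited. The names predict_next, anomaly_score and max_n are hypothetical.

```python
from collections import defaultdict


def predict_next(history, max_n=4):
    """Return {port: weight} for the next port; weights sum to 1.

    Illustrative all-gram vote: for n = 1..max_n, find earlier occurrences
    of the last n ports and credit the port that followed each occurrence.
    Returns an empty dict when no trailing n-gram has occurred before.
    """
    votes = defaultdict(float)
    for n in range(1, min(max_n, len(history) - 1) + 1):
        context = tuple(history[-n:])
        # Only positions with a known follower; this excludes the suffix itself.
        for i in range(len(history) - n):
            if tuple(history[i:i + n]) == context:
                # Assumed weighting: a match of length n contributes weight n.
                votes[history[i + n]] += float(n)
    total = sum(votes.values())
    return {port: w / total for port, w in votes.items()} if total else {}


def anomaly_score(history, actual_port, max_n=4):
    """Assumed score: 1 minus the prediction weight of the port actually visited."""
    return 1.0 - predict_next(history, max_n).get(actual_port, 0.0)


if __name__ == "__main__":
    itinerary = ["A", "B", "C", "A", "B", "C", "A", "B"]
    print(predict_next(itinerary))
    print(anomaly_score(itinerary, "C"), anomaly_score(itinerary, "D"))
```

Under these assumptions, a strictly periodic itinerary yields a score near 0 for the expected port and a score of 1 for a port never seen in that context. Collecting the scores over every visit of every ship gives the kind of distribution referred to above.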
This approach will be evaluated by applying it to a data set consisting of a little over a year's schedules from about 2000 commercial ships visiting about 700 ports, for a total of 137,000 port visits.
Ongoing and future work addresses the more difficult problem of modeling global behavior. The collective patterns of behavior are the product of a system that is self-organizing rather than hierarchically determined, and the emerging patterns are both driven and constrained by a myriad of forces and factors. An appropriate starting point would be a dynamic graph model. A limiting factor in the analysis of collective patterns and the development of appropriate models is the difficulty and expense of assembling a sufficient amount of data, because most of the data is owned by private entities. Understanding collective commercial vessel behavior and its trends is critical in the construction of a global maritime situational picture.
The dynamic behavior captured by port sequence analysis lies between the micro-scale behavior involved in near-real-time ship tracking and the macro-scale behavior of the global commercial shipping system. This paper explores the utility of various sequence analysis methods applied to port sequence data. These methods are appropriate and useful for meso-scale behavior, but different tools will be required for the micro- and macro-scale behavior of commercial shipping traffic.
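As one example of the kind of sequence measure referred to above, the following sketch computes a simple lag-based recurrence rate for a port sequence. It is not taken from the paper: the measure (the fraction of positions whose port reappears exactly p visits later) and the names recurrence_at_lag and best_period are assumptions chosen only to show how "closeness to periodic" might be quantified.

```python
def recurrence_at_lag(ports, lag):
    """Fraction of positions i with ports[i] == ports[i + lag]."""
    pairs = len(ports) - lag
    if lag < 1 or pairs < 1:
        return 0.0
    matches = sum(1 for i in range(pairs) if ports[i] == ports[i + lag])
    return matches / pairs


def best_period(ports, max_lag=None):
    """Return (lag, score) for the lag with the highest recurrence rate.

    Ties are resolved in favor of the smallest lag. Returns (0, 0.0) when
    the sequence is too short to compare any pair of visits.
    """
    if max_lag is None:
        max_lag = len(ports) // 2
    lags = range(1, max_lag + 1)
    if len(ports) < 2 or not lags:
        return (0, 0.0)
    # max() keeps the first maximal element, i.e. the smallest lag on ties.
    lag = max(lags, key=lambda p: recurrence_at_lag(ports, p))
    return (lag, recurrence_at_lag(ports, lag))


if __name__ == "__main__":
    regular = ["A", "B", "C"] * 3
    with_insertion = ["A", "B", "C", "A", "B", "X", "C", "A", "B", "C"]
    print(best_period(regular))
    print(best_period(with_insertion))
```

A strictly periodic itinerary scores 1.0 at its period, while insertions or deletions lower the score. A plain lag comparison of this kind is sensitive to such shifts, so an alignment-based measure would likely be needed for itineraries of the second and third pattern types.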