Quantitative Methods in Defense and National Security 2012
Abstracts and Presentations

Application of Scientometric Methods to Identify Emerging Technologies
Robert K. Abercrombie and Bob G. Schlicher, Oak Ridge National Laboratory

This work examines a scientometric model that tracks the emergence of an identified technology from initial discovery (via original scientific and conference literature), through critical discoveries (via original scientific and conference literature and patents), through the Technology Readiness Levels (TRLs), and ultimately on to commercial application. During the period of innovation and technology transfer, the impact of scholarly works, patents, and on-line web news sources is identified. As trends develop, currency of citations, collaboration indicators, and on-line news patterns are identified. Four distinct, searchable on-line networked sources (i.e., scholarly publications and citations, patents, news archives, and on-line mapping networks) are combined into one collective network (a dataset for analysis of relations). This established network becomes the basis from which to quickly analyze the temporal flow of activity (searchable events) for the example subject domain we investigated.

Statistics and Optimization Methods For Known Cyber Vulnerability Maintenance
Theodore T. Allen, Anthony Afful-Dadzie, Kimiebi Akah

Cyber security ranks among the top issues relevant to national security. It is estimated that over 90% of successful cyber attacks exploit known vulnerabilities. Therefore, in practice, the critical issue is less developing patches and more effective patch deployment or cyber maintenance. We review 67 articles relevant to statistics and optimization and cyber vulnerabilities. Next, we describe preliminary results for charting cyber vulnerabilities based on three real-world organizations using residual control charts. We also propose the application of Markov decision processes with extensions to address issues including limited data for selecting recommended maintenance operations for specific hosts.

Subject Matter Expert Refined Topic Models
Theodore T. Allen, Hui Xiong

It has been estimated that over 80% of collected data is unsupervised, consisting of either text or images. These “documents” (including images) generally include more than a single topic, and the relevant topics of interest overlap, i.e., they use the same words or pixels. Topic models, including Latent Dirichlet Allocation, are argued to be by far the most relevant generic approach available for identifying meaningful topics or clusters and assigning parts of documents to these topics. (Of course, less generic approaches based on many expert rules and tailored to specific cases have advantages over topic models.) Yet, when applying topic models, the user might like the ability to edit the topic definitions using subject matter expert (SME) knowledge. We invented a way to do this that is based on Bayes' theorem but can be viewed as an example of a new type of statistics in which ordinary low-level data is combined with high-level data. We illustrate using both unsupervised text and image corpora, including Consumer Reports Toyota Camry user reviews, Technometrics abstracts, and hybrid laser welding images.

Data Farming and the Exploration of Inter-agency, Inter-disciplinary, and International “What If?” Questions
Steve Anderson and Larry Triola, Naval Surface Warfare Center; Gary Horne and Ted Meyer, Naval Postgraduate School

Data farming uses simulation modeling, high performance computing, and analysis to examine questions of interest with large possibility spaces. This methodology allows for the examination of whole landscapes of potential outcomes and provides the capability of executing enough experiments so that outliers might be captured and examined for insights. This capability may be quite informative when used to examine the plethora of “What If?” questions that result when examining potential scenarios that our forces may face in the uncertain world of the future. Many of these scenarios most certainly will be challenging and solutions may depend on interagency and international collaboration as well as the need for inter-disciplinary scientific inquiry preceding these events. In this paper we describe data farming and illustrate it in the context of application to questions inherent in military decision-making as we consider alternate future scenarios.

Stability and Instability: Applying Agent Based Modeling to Predict Political Outcomes of the Arab Spring
Amir Bagherpour, Claremont Graduate University

2011 was a seminal year in the history of the Middle East and North Africa (MENA). In what is popularly referred to as the Arab Spring, the region has experienced a wave of revolutions and instability. The events of 2011 can be classified into three broad categories: uprisings that have resulted in the overthrow of standing regimes, uprisings that have failed to overthrow standing regimes, and states that have not experienced popular revolts. In the first category, Libya, Egypt, Yemen, and Tunisia have all experienced uprisings resulting in the respective departures of Muamar Gaddafi, Hosni Mubarak, Ali Abdullah Saleh, and Zine Al Abidine Ben Ali. In contrast, Syria and Bahrain have experienced uprisings that have not resulted in the toppling of their regimes thus far. Finally, countries such as Saudi Arabia and Iran have experienced none of the instability observed elsewhere during 2011.

Throughout 2011, I, along with a team of political scientists, tracked the evolution of the Arab Spring and predicted the outcomes using an agent-based model that incorporates game theory and expected utility. The model platform is called Senturion (VV&A by DARPA, DIA, and JWAC). In addition to the agent-based model, I also apply Monte Carlo analysis to simulate volatility and uncertainty in the political outcomes.

Sure independence screening for high-dimensional feature space: A frame theoretic analysis
Waheed Bajwa, Rutgers University

Linear (regression) models appear nowadays in myriad areas, such as genomics, proteomics, tumor classification, network monitoring, hyperspectral imaging, and computer tomography. One of the most fundamental problems in linear models is that of variable selection: determining a small subset of variables (e.g., genes) that are responsible for the majority (or all) of the variation in the response (e.g., the malignancy of a tumor). The two major challenges associated with variable selection in high-dimensional settings are accuracy and computational cost. Recently, Fan and Lv introduced the concept of ‘sure independence screening’ (SIS) for paring down the number of relevant variables to roughly the number of samples. In particular, SIS, followed by efficient variable selection procedures like the lasso and the Dantzig selector, has been shown to reduce the computational cost associated with accurate, high-dimensional variable selection. The SIS analysis of Fan and Lv, however, is limited to variables that are jointly Gaussian, a debatable assumption in a number of real-world applications. In this talk, we will take a frame-theoretic approach to the analysis of SIS and provide a comprehensive understanding of both the limitations and the advantages of SIS in high-dimensional settings, without making a priori assumptions about the statistical distribution of variables.
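
As a rough illustration of the screening step described above, the following sketch ranks variables by the absolute sample correlation between each column and the response and keeps as many variables as there are samples. It is a minimal sketch of correlation-based screening, not the frame-theoretic analysis of the talk. The function names, the toy data, and the choice to keep exactly n variables are assumptions made for illustration only.

```python
import math
import random

def marginal_correlations(X, y):
    """Absolute sample correlation between each column of X and the response y."""
    n = len(y)
    y_mean = sum(y) / n
    y_c = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_c))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        c_c = [v - c_mean for v in col]
        c_norm = math.sqrt(sum(v * v for v in c_c))
        dot = sum(a * b for a, b in zip(c_c, y_c))
        ok = c_norm > 0 and y_norm > 0
        scores.append(abs(dot) / (c_norm * y_norm) if ok else 0.0)
    return scores

def sis_screen(X, y, keep):
    """Return the indices of the `keep` columns with the largest marginal correlation."""
    scores = marginal_correlations(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])

# Hypothetical toy problem: 50 samples, 200 variables, only columns 5 and 17 matter.
rng = random.Random(0)
n, p = 50, 200
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[5] - 3.0 * row[17] + 0.1 * rng.gauss(0, 1) for row in X]

selected = sis_screen(X, y, keep=n)  # pare p = 200 variables down to n = 50
print(5 in selected, 17 in selected)
```

A refined selector such as the lasso would then be run on the surviving columns only, which is where the computational saving comes from.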

Polynomial Embeddings for Quadratic Algebraic Equations
Radu V. Balan, University of Maryland

In this talk we analyze the problem of vector reconstruction from magnitudes of frame coefficients. We show how the system of quadratic equations can be embedded in a set of higher order polynomial equations. In particular we analyze the linear independence of the system of homogeneous polynomial equations. We obtain characterizations for the range of tensor frame analysis operator that provide necessary and sufficient conditions for linear independence. Such polynomial embeddings provide closed form solutions for the problem of reconstruction from magnitudes of frame coefficients, which in turn can be used to construct unbiased estimators in the presence of noise.

Large-Scale Systems of Systems Modeling and Simulation Using Graph Theory
Santiago Balestrini-Robinson, Georgia Institute of Technology

The advent of capability-based assessments expanded the tradeoff space that analysts must evaluate, moving away from system-specific trades, to architecting-level trades. This expanded the options, but increased the number of alternatives to be considered. This research proposes a graph theory-based analysis to allow decision makers to trade multiple disparate architectures and compare their relative capabilities in order to defensibly justify the decisions for down-selecting to a family or subset of candidate architectures. The author will then demonstrate how the same analysis can assist decision makers in identifying the critical elements within the architecture whose behavior must be modeled in higher detail in order to minimize modeling effort and maximize modeling accuracy.

The underlying foundation that enables the field of abstract network theory is that the behavior of the network can be systematically influenced by its structure more or less independently of the details of the agents and their interactions. This has been consistently demonstrated in the literature and forms the basis for the analysis techniques described in this work. The method presented in this paper shows how to create representative functional networks from standard DoDAF products in order to (1) obtain relative assessments of the candidate architectures and (2) guide the higher-fidelity modeling of the candidate architectures. The techniques rely on spectral graph theory, in particular algorithms to compute the Perron-Frobenius Eigenvector (PFE) and its associated eigenvalue, as well as the Fiedler Vector of the Laplacian of the functional graph. The PFE is a measure of how many cycles a directed network has, and in this case is correlated with the functional cycles that indicate the level of capability a given architecture can achieve. The Fiedler Vector is associated with the dynamic contribution of each node in the graph, and therefore errors in the behavior of said nodes are more likely to produce deviations in the behavior of the networked system. The validity of the first claim will be demonstrated by comparing the results from the network analysis with a constructive simulation of the same engagement developed using a spatially and temporally explicit agent-based framework. The second claim will be demonstrated by inducing errors in a Random Boolean Network according to different node-ranking schemes in order to show that when errors are induced in inverse proportion to the Fiedler Vector ranking, the overall error of the network is reduced.
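
The two spectral quantities named above can be sketched on toy graphs as follows. This is an assumption-laden illustration rather than the author's implementation: the graphs are hypothetical, the right eigenvector of the adjacency matrix is used for the PFE, and both vectors are found by plain power iteration (with a shift and a projection for the Fiedler vector), which suits only small, well-behaved examples.

```python
import math

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def perron_pair(A, iters=500):
    """Dominant eigenvalue and eigenvector of a nonnegative, primitive matrix
    (strongly connected, aperiodic directed graph) by power iteration."""
    v = normalize([1.0] * len(A))
    for _ in range(iters):
        v = normalize(matvec(A, v))
    lam = sum(a * b for a, b in zip(matvec(A, v), v))  # v has unit norm
    return lam, v

def fiedler_vector(adj, iters=2000):
    """Eigenvector for the second-smallest eigenvalue of L = D - A for a
    connected, undirected graph without self-loops. Power iteration is run on
    shift*I - L while the constant vector is projected out at every step."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    shift = 2.0 * max(deg)  # upper bound on the largest Laplacian eigenvalue
    M = [[adj[i][j] if i != j else shift - deg[i] for j in range(n)]
         for i in range(n)]
    v = [i - (n - 1) / 2.0 for i in range(n)]  # zero-mean start (an assumption)
    for _ in range(iters):
        mean = sum(v) / n
        v = normalize(matvec(M, [x - mean for x in v]))
    return v

# Hypothetical directed graph with edges 0->1, 0->2, 1->2, 2->0 (two cycles).
directed = [[0, 1, 1],
            [0, 0, 1],
            [1, 0, 0]]
lam, pfe = perron_pair(directed)

# Hypothetical undirected path graph 0-1-2-3.
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
fiedler = fiedler_vector(path)
print(lam, pfe, fiedler)
```

For graphs of realistic size, a dedicated sparse eigensolver would be the usual choice; the power iteration here only keeps the sketch dependency-free.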

Modeling Situational Dynamics of Maritime Piracy
Brandon Behlendorf, University of Maryland

Despite the considerable attention paid by the international community to maritime piracy, little empirical evidence evaluates the efficacy of situational measures to prevent and deter pirates. Using a comprehensive database on global piracy incidents from 2005-2010, we address three important components of piracy dynamics. First, we examine the situational dynamics between pirate adversaries and commercial shipping crews, constructing individual piracy scripts detailing the action-response mechanisms within a piracy incident. Logistic regression models highlight the role that individual crew actions can play in preventing pirates from successfully boarding ships. Second, we highlight spatial and temporal changes in situational dynamics of piracy incidents during the study period, finding considerable variation in threat response by commercial shipping crew members. Finally, we couple the analysis with a spatially-enabled model of piracy intervention, examining the effect of the Internationally Recommended Transit Corridor on the dynamics and outcomes of piracy actions within the Gulf of Aden.

Orthogonal matching pursuit and number theoretic sparsity equations
John J. Benedetto, University of Maryland

A new technique is introduced with the goal of optimal transform-based image compression with regard to sparsity as it affects speed of transmission and efficiency of storage. Sequences u of prime length are constructed with the properties that they are CAZAC (constant amplitude zero-autocorrelation) and that their discrete narrow-band ambiguity functions have minimal uniform upper bound over their domain off dc. This is an important advantage with regard to decoupling issues that arise in the time-Doppler domain. The proof requires Weil’s proof of the Riemann hypothesis for finite fields, and the CAZAC property leads to diagonal covariance matrices, which are often useful. Among the test cases upon which we report, there is a natural Gabor frame associated with u. In this case, the mutual coherence of the associated matrix G is also optimally bounded. Thus, by standard methods highlighted in compressive sensing, OMP can be used to construct sparse exact solutions of the matrix equation, G(x) = b, from a large class of possible solutions.
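
To make the CAZAC property concrete, the sketch below builds a Zadoff-Chu sequence of prime length, a standard CAZAC family that is swapped in here for illustration and is not the construction of the talk. It checks the two defining properties numerically: unit modulus, and zero periodic autocorrelation at every nonzero shift. The ambiguity-function bound discussed above is not examined. The function names and the length 7 are assumptions.

```python
import cmath

def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence of odd length; `root` must be coprime to `length`."""
    return [cmath.exp(-1j * cmath.pi * root * k * (k + 1) / length)
            for k in range(length)]

def periodic_autocorrelation(u, shift):
    """Sum over k of u[k + shift] * conj(u[k]), with indices taken modulo len(u)."""
    n = len(u)
    return sum(u[(k + shift) % n] * u[k].conjugate() for k in range(n))

u = zadoff_chu(7)

# Constant amplitude: every entry has modulus 1.
amplitudes = [abs(x) for x in u]

# Zero autocorrelation: all nonzero periodic shifts correlate to (numerically) zero.
off_peak = [abs(periodic_autocorrelation(u, m)) for m in range(1, len(u))]

print(max(abs(a - 1.0) for a in amplitudes), max(off_peak))
```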

Vehicle vs. Task-based UAV Control Paradigms
Luca Bertuccelli, United Technologies Research Center

Current Unmanned Aircraft Systems (UASs) require that a pilot, in addition to one or more sensor operators, interact with the unmanned aerial vehicle (UAV) flight control system in order to maintain safe control. A common concept in future systems places a single human operator in control of one or more UAVs, presuming that the tasks will not overload the operator. In most of these future visions, the operator still interacts with each vehicle to dictate flight and sensor envelopes. In this presentation we will juxtapose this notion of “vehicle-based” control, in which the human operator sets targets and waypoints for individual vehicles, with “task-based” control, in which the operator is able to specify a set of tasks for a team of vehicles to complete. Under task-based control, an automated planner selects the vehicle best suited to perform a given task and executes a path planning algorithm, resulting in reduced workload for the operator. By allowing operators to control tasks as opposed to vehicles, we propose that a single operator can supervise many more UAVs. Efforts to represent vehicle-based versus task-based control paradigms in a discrete event simulation model for operator workload will be presented, as well as experimental results measuring the performance and workload of human operators under both control paradigms for various groupings of vehicles.

A Hidden Markov Model Variant for Sequence Classification
Sam Blasiak

Sequence classification is central to many practical problems within machine learning. Distance metrics between arbitrary pairs of sequences can be hard to define because sequences can vary in length and the information contained in the order of sequence elements is lost when standard metrics such as Euclidean distance are applied. We present a scheme that employs a Hidden Markov Model variant to produce a set of fixed-length description vectors from a set of sequences. We then define three inference algorithms, a Baum-Welch variant, a Gibbs Sampling algorithm, and a variational algorithm, to infer model parameters. Finally, we show experimentally that the fixed-length representation produced by these inference methods is useful for classifying sequences of amino acids into structural classes.

Robust distributed processing with random fusion frames
Bernhard Bodmann, University of Houston

This talk discusses the performance of random fusion frames for the distributed processing of data which is robust against failure of nodes. The problem is the synthesis of components of a high-dimensional signal which have been processed in lower dimensional subspaces, with a fraction of the processing nodes failing. The effect of failure amounts to setting the component of a signal in this subspace to zero. We focus on asymptotics in which the dimensions of the signal space and of the lower dimensional components have a fixed ratio while both grow to infinity. We discuss the performance of passive error correction as well as non-linear recovery based on techniques from compressed sensing. This is joint work with Pankaj Singh.

Old and New Equipment Design, Development, Testing, and Fielding (Panel)
Carolyn Carroll, STAT TECH Inc.
Emerging multi-modal dynamically reconfigurable, adaptive hardware-software components and systems
Frederica Darema, USAF AFMC AFOSR
Laura Freeman, Institute for Defense Analyses
Current Model of Design, Development, Test, Field
George Dimitoglou, Hood College
Testing Dynamically Reconfigurable Systems

Historically, new programs have been developed following a framework of requirements determination, design, development, system test, and operational test. The old model was used for most, if not all, of the systems currently in use. In the last 20 years, variations such as prototyping as a part of requirements determination or design and development have appeared. New systems are being built with software and hardware that are dynamically reconfigurable. Initially because of demands for spectrum and bandwidth, and later possibly because of operational requirements, communication networks are also likely to be reconfigured on the fly. Still other examples exist of systems and platforms that may not be the same at any two points in time. The objective of this panel discussion is to briefly review the current well-tried, successful model and then to focus on these new dynamic systems and sets of systems and what else it may take to develop, field, test, operate, monitor, and maintain these systems.

Low Cost Sensor Alternatives for Harbor and Littoral Areas (Panel)
Carolyn Carroll, STAT TECH Inc.
Hadis Dashtestani, The University of the District of Columbia
Sensor planning and optimum sensor locations based on information functions for underwater acoustic communications
Dairo Abayomi and Nikola Jovic, The University of the District of Columbia
Acoustic Communication Modem
Suresh Regmi, The University of the District of Columbia
Analysis of advance methods for decisions-making on target detection with underwater sensor network in the context of Blind Doppler Shift Estimation and Detection
Paul Cotae, The University of the District of Columbia
Research on underwater sensor networks
Roland Kamdem, The University of the District of Columbia
Threshold Based Stochastic Resonance for the Binary-Input Ternary-Output Discrete Memoryless Channels

Harbor and littoral areas can be characterized as target rich and potentially high risk areas of interest to port and waterway authorities. Lower cost monitoring is now possible because of less expensive electronics, modern positioning and sensing technology, reliable, accurate, low cost low BW, low EM profile sensors, detectors, range finders, locators and reporting methods. Not only are there lower cost alternatives to old systems, these new components and systems can operate within the current systems in spite of being lower in cost to build and operate.

Spectral Tetris Fusion Frame Constructions
Peter G. Casazza, University of Missouri

Fusion frames were designed and are being developed for a variety of military applications including sensor networks, distributed processing, packet transport of data and much more. The theory has suffered from a severe shortage of algorithms for the construction of fusion frames. A recent significant advance was made in this direction with the introduction of spectral tetris constructions. We will look at several new serious modifications of spectral tetris which now allow this method to construct much broader classes of fusion frames including, for the first time, fusion frames with weights. It is known that spectral tetris does not construct all fusion frames, but it was never clear why it sometimes works and sometimes fails. We will take care of this problem by giving necessary and sufficient conditions for spectral tetris to work.

An Update on Two Recent CNSTAT Studies
Michael Cohen, American Institutes for Research

There have been two recent studies on defense system development that have been conducted by study panels under the auspices of the Committee on National Statistics of the National Research Council and which were funded by the Director of Operational Test and Evaluation and by the Undersecretary of Defense for Acquisition, Technology, and Logistics. The first study panel, the Panel on Industrial Methods for the Effective Test and Development of Defense Systems, completed its work with the release of its final report in early November, 2011. The goal of this study was to examine engineering practices, for both software and hardware systems, that have been found to be effective for commercial system development, and which could and should be applied to the development of defense systems. A second study, on the examination of reliability growth methods, including design for reliability, reliability growth testing, and reliability growth modeling, attempts to address the fact that a large fraction of recent defense systems have failed to achieve their required level of reliability, with serious consequences in terms of utility of those systems and their greatly increased life cycle costs. The final report for this study is in development. For the first study, we will outline the major findings and recommendations of the panel’s final report. For the second study, we will provide an indication of what was communicated to the panel during its September workshop and we will also provide the themes that will structure the panel’s final report, due out later this year.

Multi-dimensional shearlets and applications
Wojciech Czaja, University of Maryland

We present a new multi-dimensional transformation which contains directional components and which generalizes the concepts of the composite wavelet and shearlet transforms. This is achieved by exploiting representations of the extended metaplectic group. We present the properties of this new family of transformations. In particular, we prove an analog of the Calderon admissibility condition for shearlet reproducing functions. We also provide simple constructions of families of such reproducing functions. In addition, we analyze applications of 2-D shearlet constructions to modeling surface tension energy.

Modular, Multi-fidelity Framework for Sea Defense Simulation
Rebecca Douglas, Cengiz Akinli, Georgia Institute of Technology

The defense of naval assets against ballistic missile threats is a novel problem which must address a number of factors not present in the defense of territory and land-based assets. The engagement and decision spaces in naval defense are generally compressed in both space and time, and the mobility of the defended assets precludes the practical implementation of a defense system with an entirely static design. Parametric systems-of-systems analysis allows for the end-game performance to be evaluated over a large design space that includes asset capabilities, placement, and firing doctrine. One critical aspect of systems-of-systems analysis is that there is increased modeling complexity which can significantly affect computational cost and modeling effort. The author describes a modular sea defense analysis environment that gives the analyst more options to address the traditional speed-fidelity trade in a manner that is optimal for the problem at hand but also provides a unique end-game tradeoff capability for “what-if” games across the entire modeling space.

Probabilistic community detection on networks
Jim Ferry, Metron, Inc.

Community detection is an important analysis capability for network data. Most approaches to this problem involve algorithms that return a single, “hard call” community structure: a specification of which nodes belong to which communities, but with no assessment of confidence for these choices. This talk presents a probabilistic approach to community detection. A highly efficient method is derived for approximating the probability that a pair of nodes belongs to the same community. These pairwise co-membership probabilities are then used to enable a variety of network-analysis capabilities, including the visualization of hierarchical and overlapping community structure, and improved, scalable layout algorithms. These capabilities are demonstrated in the tool IGNITE (Inter-Group Network Inference and Tracking Engine). Finally, the principled handling of community structure probabilities allows the detection problem to be generalized to tracking time-varying communities in time-varying network data. The result is a Bayesian filter for community structure which may be thought of as a “Kalman filter for networks”.

Compressive Depth Acquisition Cameras: Principles and Demonstrations
Vivek K. Goyal, Massachusetts Institute of Technology

LIDAR systems and time-of-flight cameras use the time elapsed between transmitting a pulse and receiving a reflected response, along with scanning by the illumination source or a 2D sensor array, to acquire depth maps. We introduce depth map acquisition with high spatial and range resolution using a single, omnidirectional, time-resolved photodetector and no scanning components. This opens up possibilities for 3D sensing in compact and mobile devices.

Spatial resolution in our framework is rooted in patterned illumination or patterned reception. In contrast to compressive photography, the information of interest – scene depths – is nonlinearly mixed in the measured data. The depth map construction uses parametric signal modeling to achieve fine depth resolution and to essentially linearize the inverse problem from which spatial resolution is recovered. We have demonstrated depth map reconstruction for both near and medium-range scenes, with and without the presence of a partially-transmissive occluder.

Our compressive depth acquisition camera (CoDAC) framework is an example of broader research themes of exploiting time resolution in optical imaging and identifying and exploiting structure in inverse problems.

Compressive Sensing of Hyperspectral Images
John B. Greer, National Geospatial-Intelligence Agency

The emerging field of Compressive Sensing (CS) provides a new way to capture data by shifting the heaviest burden of data collection from the sensor to the computer on the user-end. This new means of sensing requires fewer measurements for a given amount of information than existing sensors. We investigate the efficacy of CS for capturing HyperSpectral Imagery (HSI) remotely. We also introduce a new family of algorithms for constructing HSI from CS measurements. These algorithms combine spatial Total Variation (TV) with smoothing in the spectral dimension. We examine models for three different CS sensors: the Coded Aperture Snapshot Spectral Imager-Single Disperser (CASSI-SD) [Wagadarikar et al.] and Dual Disperser (CASSI-DD) [Gehm et al.] cameras, and a random sensing model closer to CS theory, but not necessarily implementable with existing technology. We simulate the capture of remotely sensed images by applying the sensor forward models to well-known HSI scenes: an AVIRIS image of Cuprite, Nevada and the HYMAP Urban image. To measure the accuracy of the CS models, we compare the scenes constructed with our new algorithm to the original AVIRIS and HYMAP cubes. The results demonstrate the possibility of accurately sensing HSI remotely with significantly fewer measurements than standard hyperspectral cameras.

Approaching Dimension-Free Construction of Nearly Orthogonal Latin Hypercubes
Alejandro S. Hernandez, Thomas W. Lucas, and Matthew Carlyle, Naval Postgraduate School

We present a new method for constructing nearly orthogonal Latin hypercubes that greatly expands their availability to experimenters. Latin hypercube designs have proven useful for exploring complex, high-dimensional computational models, but can be plagued with unacceptable correlations among input variables. Methodologies to build these efficient designs can have strict limitations on the feasible number of experimental runs and variables. To overcome these restrictions, we develop and incorporate a mixed integer program in a process that generates Latin hypercubes with little or no correlation among their columns for most any determinate run-variable combination, including fully saturated designs. Our algorithm shows great resilience, quickly adapting to changing experimental conditions, as well as augmenting existing designs. Moreover, we can construct many designs for a specified number of runs and factors, thereby providing experimenters with a choice of several designs.
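
The correlation problem described above can be seen in a small sketch that generates random Latin hypercubes and keeps the one whose worst pairwise column correlation is smallest. This naive random search is a stand-in for illustration only and is not the mixed integer program of the abstract. The function names, the 17-run, 4-factor size, and the number of tries are all assumptions.

```python
import math
import random

def random_latin_hypercube(runs, factors, rng):
    """Column-major design: each column is a random permutation of 0..runs-1."""
    columns = []
    for _ in range(factors):
        col = list(range(runs))
        rng.shuffle(col)
        columns.append(col)
    return columns

def correlation(a, b):
    """Pearson correlation between two equal-length columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def max_abs_correlation(columns):
    """Largest absolute correlation over all distinct pairs of columns."""
    k = len(columns)
    return max(abs(correlation(columns[i], columns[j]))
               for i in range(k) for j in range(i + 1, k))

def best_of_random(runs, factors, tries, seed=0):
    """Keep the candidate design with the smallest worst-case correlation."""
    rng = random.Random(seed)
    best, best_score = None, float("inf")
    for _ in range(tries):
        candidate = random_latin_hypercube(runs, factors, rng)
        score = max_abs_correlation(candidate)
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score

design, score = best_of_random(runs=17, factors=4, tries=200)
print(score)
```

Random search of this kind degrades quickly as the number of factors approaches the number of runs, which is the regime where an optimization-based construction matters most.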

Testing Dynamically Configurable and Adaptive Systems: A Robotics View
George Dimitoglou, Hood College


Model-based Integration of Heterogeneous Simulations
Gabor Karsai, Himanshu Neema, Harmon Nine, Vanderbilt University

Simulation-based evaluation of systems and plans is an essential tool for designers and planners. However, often heterogeneous, specialized simulation tools and models need to be integrated in an overarching framework such that meaningful results can be generated. For instance, in a C2 setting, various domains, including vehicle dynamics, communication networks, computing nodes, and human organizations and decision making processes need to be co-simulated. There is no single simulation package that can perform all of the above, but there are well-developed simulation models for the individual domains. The challenge is the semantically correct composition of models in an effective way. The talk will present a model-based approach developed in the course of the Command and Control Wind Tunnel (C2WT) project and further developed during the Course of Action Simulation (CASIM) project that is based on the precise modeling of simulation interactions and the mapping between those, as well as operational sequences that drive the simulation execution framework. The integration models are used to automatically generate the interfaces and translation processes, and configure an instance of the C2WT to execute the simulations in a coordinated manner. The talk will highlight the technical issues of simulation composition and present our solutions.

Large Scale Text Analytics
Bill Ladd, Recorded Future

The types, amount, rate, and sources of unstructured text data are all increasing dramatically and overwhelming the ability to interpret it. Text analytic approaches can be used to organize this unstructured data into structured data around entities, events, and time and make it available for a variety of analyses with both operational and long-range implications. This talk will provide a description of a text analytic approach and a number of relevant examples.

Multi-Modeling for Course of Action Simulation
Alexander H. Levis, Abbas K. Zaidi, and Lt Gen Robert Elder, USAF (ret), George Mason University

No single model can capture the complexities of Course of Action (COA) evaluation. Three inter-operating models are used to model the mission objectives, develop alternative Courses of Action, and then simulate the execution of these COAs on the C2 Wind Tunnel to determine measures of performance and measures of effectiveness. The three models used are: (a) Timed Influence Nets (TINs), as implemented in the software application Pythia. The TIN models represent the causal relationships between the actionable events that are candidates for inclusion in a COA and the mission objectives. These Bayesian-like networks include deterministic or stochastic time delays. Efficient algorithms determine optimal COAs under a variety of constraints. (b) Colored Petri Nets (CPNs), which provide an executable model of the mission so that performance measures can be obtained through computational experiments. The CPN models are linked to (c) communication network models expressed in OMNeT++ that model the environment that enables communication between the various entities defined in the Petri Net. The linking and resulting interoperation of the CPN and OMNeT++ models enable the computational study of the effects that cyber exploits can have on the probability of achieving the desired effects. The multi-modeling approach also enables the representation of multiple players (friendly, adversarial, or neutral). Results from a recent scenario will illustrate the approach.

Adaptive Adversary Modeling in Asymmetric Warfare using the Threat Plan Prediction (TPP) Model
Andy Loerch, Gregory Opas, George Mason University

Historically, Intelligence Preparation of the Battlefield (IPB) efforts have focused considerable effort on the characterization and analysis of possible courses of action available to Red forces in a region of interest. However, the unique and non-traditional character of the engagements that have dominated the first decade of this century highlights the need for IPB activities to consider the preferences of insurgent forces for certain courses of action over others. The Threat Plan Prediction (TPP) model, under development by George Mason University, addresses this need by providing an adaptive adversary characterization, including representation of the preferences of adversary groups in a region of interest with regard to specific types of asymmetric attack (IED, Sniping, etc.) at potential attack sites. The quantified preference set computed in this model is expressed as a site specific measure of the attractiveness of possible attack plans, based on a combination of factors that describe the environment within the region of interest, as well as characteristics of the threat groups, and the level of Blue presence in the region. In computing the measure of attack attractiveness, the presented methodology utilizes a novel integration of attack tree formulations with influence diagramming techniques. The influence diagram construct facilitates the amalgamation of geospatially-keyed datasets describing the region of interest, including human terrain elements (such as tribal structures), threat group value structures (based on elicitation of subject matter expertise), and the physical nature of the operating region (via analysis of available GIS datasets). 
These data elements, combined with characterization of the capabilities and resources of the adversary groups themselves, are used to prune a superset of possible attack options to a more manageable subset of possible actions, and calculate a measure of attractiveness for each of those attacks; this can be construed as a representation of the relative likelihood of specific types of attack at specific sites within the region of interest. Description of the model structure and interconnectivity, as well as results of an assessment of performance in predicting the attractiveness of specific, historical attack sites from OIF, will be presented.

Uncertainty Principles and Diffusion Processes on Graphs
Yue Lu, Harvard University

The spectral theory of graphs provides a bridge between classical signal processing and the nascent field of graph signal processing. In this talk, I will present a spectral graph analogy to Heisenberg’s celebrated uncertainty principle. Just as the classical result provides a tradeoff between signal localization in time and frequency, the result presented in the talk provides a fundamental tradeoff between a signal’s localization on a graph and in its spectral domain.

Using the eigenvectors of the graph Laplacian as a surrogate Fourier basis, we propose quantitative definitions of graph and spectral “spreads” and provide a complete characterization of the feasibility region of these two quantities. In particular, the lower boundary of the region, referred to as the uncertainty curve, is shown to be attained by eigenvectors associated with the smallest eigenvalues of an affine family of matrices. The convexity of the uncertainty curve allows it to be found to within ϵ by a fast approximation algorithm requiring O(ϵ^(-1/2)) sparse eigenvalue evaluations. We derive closed-form expressions for the uncertainty curves for some special classes of graphs, and develop an accurate analytical approximation for the expected uncertainty curve of Erdős-Rényi random graphs. These theoretical results are validated by numerical experiments, which also reveal an intriguing connection between diffusion processes on graphs and the uncertainty bounds. (Joint work with Ameya Agaskar.)
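
As a rough illustration of the quantities discussed above (not the authors' code), the sketch below builds the Laplacian of a small path graph, uses its eigenvectors as a surrogate Fourier basis, and evaluates two spread measures for two signals. The specific formulas are assumptions made for this example, since the abstract does not state its definitions: the spectral spread is taken as x^T L x / ||x||^2, and the graph spread as the energy of x weighted by squared geodesic distance from a chosen center vertex. The unnormalized Laplacian, the path graph, and all function names are likewise hypothetical choices.

```python
import numpy as np

def path_laplacian(n):
    """Combinatorial Laplacian L = D - A of an unweighted path graph on n vertices."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def spectral_spread(L, x):
    """Assumed definition: Rayleigh quotient of the Laplacian."""
    return float(x @ L @ x) / float(x @ x)

def graph_spread(dist_from_center, x):
    """Assumed definition: signal energy weighted by squared distance from the center."""
    return float(np.sum(dist_from_center ** 2 * x ** 2)) / float(x @ x)

n = 8
L = path_laplacian(n)
eigvals, eigvecs = np.linalg.eigh(L)   # columns of eigvecs act as the Fourier basis

center = 0
dist = np.abs(np.arange(n) - center).astype(float)  # geodesic distance on a path

delta = np.zeros(n)
delta[center] = 1.0       # perfectly localized on the graph
smooth = eigvecs[:, 0]    # constant eigenvector (eigenvalue 0), localized in the spectrum

for name, x in [("delta", delta), ("smooth", smooth)]:
    print(name, graph_spread(dist, x), spectral_spread(L, x))
```

The impulse has zero graph spread but nonzero spectral spread, while the constant eigenvector shows the opposite behavior, which is the kind of tradeoff the uncertainty curve quantifies.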

Multiscale geometric methods for noisy point clouds in high dimensions
Mauro Maggioni, Duke University

We discuss techniques for analyzing at different scales the geometry of intrinsically low-dimensional point clouds perturbed by high-dimensional noise. We first show how such techniques may be used to estimate the intrinsic dimension, approximate tangent planes, and certain stable notions of curvatures of data sets. We then introduce a novel geometric multiscale transform, based on what we call geometric multi-resolution analysis, that leads to efficient approximation schemes for point clouds, as well as new dictionary learning methods and density estimation for data sets. Applications to anomaly detection, hyperspectral images, and compressive sampling will be discussed.

Inferential Variability in Topic Models
David Marchette, Naval Surface Warfare Center

Topic models for text analysis, particularly Latent Dirichlet Allocation (LDA) and its variants, are a very popular approach that in some circles has usurped the previous tool of choice, Latent Semantic Analysis (LSA). Unlike LSA, LDA is a statistical approach (although as we will see there is a statistical variant of LSA that could be viewed as the predecessor of LDA), which involves fitting a Bayesian model to the data: topics are drawn from a Dirichlet distribution, and documents are a mixture (Dirichlet) over the topics. The mechanics of performing this fit produce some counter-intuitive properties, which we will discuss, and these properties must either be mitigated or used to advantage for the inference task. We discuss these issues, provide some insight into why they happen and what can be done about them, and make some recommendations on ways to utilize them for superior inference.
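
The generative model described above can be sketched in a few lines. This is a minimal simulation of the standard LDA generative process for a single document, not the fitting procedure the talk analyzes; the vocabulary size, number of topics, document length, and the hyperparameter values `alpha` and `beta` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_topics, vocab_size, doc_len = 3, 10, 50
alpha, beta = 0.5, 0.5   # hypothetical Dirichlet hyperparameters

# Each topic is a distribution over the vocabulary, drawn from a Dirichlet.
topics = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)

# Each document has its own mixture over topics, also Dirichlet distributed.
theta = rng.dirichlet(np.full(n_topics, alpha))

# For every word position: pick a topic from the mixture, then a word from that topic.
assignments = rng.choice(n_topics, size=doc_len, p=theta)
words = np.array([rng.choice(vocab_size, p=topics[k]) for k in assignments])

print(theta)
print(words)
```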

Waivered Recruits: An Evaluation of their Performance and Attrition Risk
Lauren Malone, Center for Naval Analyses

The Office of the Secretary of Defense-Accession Policy asked CNA to identify how the services can minimize the risk of misconduct separation and early attrition among waivered recruits.

If the Services can identify recruit characteristics associated with these, and other, negative outcomes, they can use them as an additional screening mechanism.

In this study, we obtain Service-level waiver and personnel data for FY99-FY08 from the Defense Manpower Data Center. We first characterize the demographic and military characteristics of waivered recruits and then examine whether any of these characteristics are associated with lower risk of misconduct separation or early attrition for waivered recruits. Performance measures include 6-, 24-, and 48-month attrition, as well as the likelihood of being a fast promoter to E5. Overall, we find that waivered recruits are not inherently risky and often perform better than Tier II/III recruits. There are, however, still ways in which the Services could minimize the “riskiness” of the waivered population. For example, some waiver combinations are more likely to lead to early attrition; additional screening or mentoring of these recruits could potentially decrease their attrition risk. Waivers will continue to be essential to manning the AVF. We highlight in this research which waiver groups are in need of better management. In addition, we present a “report card” for each Service in terms of promotion to E5 and 48-month attrition, and identify which waiver groups appear to have been sufficiently screened and which have not. Most importantly, we have presented the Services with a way to think about the risks associated with waivered recruits – it is now important for the Services to determine which risks are most relevant for them and which metrics matter most.

Improving Multi-Target Tracking via Particle Filters
Vasileios Maroulas, University of Tennessee

We present a novel approach for improving particle filters for multi-target tracking with a nonlinear observation model. The suggested approach is based on drift homotopy for stochastic differential equations. Drift homotopy is used to design a Markov Chain Monte Carlo step which is appended to the particle filter and aims to bring the particle filter samples closer to the observations while at the same time respecting the target dynamics. The numerical results show that the suggested approach can significantly improve the performance of a particle filter. The talk is based on joint work with Panos Stinis.

Mathematical Properties of System Readiness Levels
Eileen McConkie, Naval Surface Warfare Center

Systems Engineers need quantifiable metrics for measuring the readiness of a system. The recently developed system readiness level (SRL) is such a metric. SRL is a function of technology readiness level (TRL) and integration technology readiness level (IRL). The mathematical operations used to define this function have some inherent properties. Four desired mathematical properties of SRL models are developed from these inherent properties and properties suggested from a review of the literature. Matrix algebra, matric scaled and tropical algebra are discussed in the literature as possible mathematical operations. These mathematical operations are reviewed to determine if they meet the desired properties. Tropical algebra (TA) is found to inherently meet these desired properties; therefore an SRL model using TA is introduced. A simple notional fire control system is used for a preliminary comparison of the TA model to two existing SRL models. Future research will be conducted to further refine this TA model and to conduct a case study using real systems to compare and contrast the TA SRL model to existing SRL models.
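
For readers unfamiliar with tropical algebra, the sketch below shows a tropical matrix-vector product in the min-plus convention, where "addition" is min and "multiplication" is ordinary addition. How the talk's SRL model actually combines IRL and TRL values (including whether it uses min-plus or max-plus, and any scaling) is not stated in the abstract, so the convention, the function name, and the numbers below are purely illustrative assumptions.

```python
import numpy as np

def minplus_matvec(A, x):
    """Tropical (min-plus) product: y[i] = min over j of (A[i, j] + x[j])."""
    return np.min(A + x[np.newaxis, :], axis=1)

# Hypothetical 3-component system. Entry [i, j] is a notional integration level
# between components i and j; np.inf marks "no integration", which is the
# additive identity of the min-plus semiring.
irl = np.array([
    [9.0, 5.0, np.inf],
    [5.0, 9.0, 7.0],
    [np.inf, 7.0, 9.0],
])
trl = np.array([6.0, 8.0, 4.0])   # notional technology levels per component

srl = minplus_matvec(irl, trl)
print(srl)
```

In this convention each output entry is driven by the weakest (lowest-scoring) integration and technology pairing connected to that component.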

Case Study: Improving Analysis with the Joint Strike Fighter Program Office Integrated Training Center (ITC) Model
Mary L. McDonald, Stephen C. Upton, Thomas W. Lucas, and Susan M. Sanchez, Naval Postgraduate School

As the Joint Strike Fighter (JSF) program moves toward an operational capability, the Joint Program Office (JPO) and the services are preparing resources and making plans for pilot and maintenance training. An important component of this preparation is the Integrated Training Center (ITC) model. ITC simulates pilot training and estimates time to train (TTT) and resource utilization, based on hundreds of input parameters. We demonstrate how effective design and analysis of computer experiments lead to substantially better analysis and insights than can be obtained by simply running a small number of hand-crafted alternatives. We will additionally show how performing large-scale experimentation can support model verification and validation and lead to the discovery of critical factors, interesting threshold values, and robust alternatives.

Additional Evidence of the Effectiveness of SDIP
Molly F. McIntosh, Center for Naval Analyses

Military Pay and Compensation Branch (N130) asked CNA to expand upon a 2010 study of whether Sea Duty Incentive Pay (SDIP) increases manning at sea. In that study, CNA found that SDIP was a cost-effective tool for inducing voluntary sea duty extensions among sailors in eligible ratings and paygrades. In this study, we return to that analysis and use a more appropriate model and set of variable definitions. We continue to find that SDIP is effective at increasing sea duty among eligible ratings and paygrades. In addition, we find that SDIP is even more cost-effective than previously thought.

Full Spark Frames
Dustin G. Mixon, Princeton University

Many applications require deterministic frames with the property that every size-M subcollection of the M-dimensional frame elements is a spanning set. Such frames are called full spark frames, and in this talk, we discuss recent results regarding their construction and verification. We also show how to use full spark frames to perform phaseless reconstruction in the noiseless case.
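
The defining property above can be checked directly, if expensively, by brute force. The sketch below is a naive verifier written for illustration, not one of the constructions or verification results from the talk; it tests every size-M subcollection, so its cost grows combinatorially with the number of frame elements. The function name and tolerance are assumptions.

```python
import itertools
import numpy as np

def is_full_spark(F, tol=1e-9):
    """F is M x N with frame elements as columns.

    Returns True when every choice of M columns spans the M-dimensional space.
    """
    M, N = F.shape
    if N < M:
        return False
    for cols in itertools.combinations(range(N), M):
        if np.linalg.matrix_rank(F[:, list(cols)], tol=tol) < M:
            return False
    return True

# A Vandermonde matrix with distinct nodes is a classic full spark example:
# every 3 x 3 submatrix is itself an invertible Vandermonde matrix.
vandermonde = np.vander([1.0, 2.0, 3.0, 4.0, 5.0], 3, increasing=True).T

# A repeated column breaks the property.
repeated = np.array([[1.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

print(is_full_spark(vandermonde), is_full_spark(repeated))
```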

Information Theory and Thermodynamic Exergy Methods Applied to Network-Centric Mine Countermeasure Operations
Nick Molino, Georgia Institute of Technology

The Navy's recent focus on network-centric distributed systems has led to systems of systems (SoS) which oftentimes include disparate technologies and capabilities, e.g., SoS composed of various aerial, surface and underwater unmanned and manned systems. Accordingly, it becomes increasingly difficult to compare and gauge the performance contribution that each system will have on the overall network-centric SoS. The Navy has proposed the need for innovative means to evaluate assimilation, modeling, and simulation methods as applied to the development and use of distributed and autonomous ocean systems, as outlined in the Naval S&T Strategic Plan (2009 update). Therefore, this work focuses on developing novel methods to design and analyze system-of-systems network-centric architectures through modeling and simulation. The advantages of employing unmanned systems for mine countermeasures, anti-submarine warfare, hull inspections, reconnaissance missions and various other naval operations are obvious. However, a systematic way of choosing the appropriate vehicles, technologies, concepts of operations, and network capabilities for each mission, based on the physics of the problem, is imperative. A systematic, physics-based approach is offered here. We propose the application and cultivation of two figures of merit: the work potential of a given asset/technology and the degree of situational awareness of the entire SoS. The relevance of thermodynamic work potential extends from recent developments in the design and analysis of aerospace vehicles. Based on the idea that every vehicle or technology must consume work potential in some form in order to operate, researchers at Georgia Tech were able to show that a generalized theory of vehicle design could be formulated. The applicability of this work potential theory (which is driven largely by the second law of thermodynamics) for mine countermeasures systems is demonstrated.
The aggregation of work potential usages and losses will represent how well a network-centric SoS performs. A figure of merit, borrowed from information theory, based on the degree of situational awareness of a SoS is presented. Information entropy quantifies the amount of uncertainty attributed to a process or task (e.g., searching an area). Presented here are preliminary results on the applicability of these figures of merit to mine countermeasures modeling and simulation environments. The end goal of this project is to provide a framework for evaluating SoS based on their work potential usage/loss and information entropy minimization.
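
To make the information-entropy figure of merit concrete, the sketch below computes Shannon entropy for a belief over where a single mine might be located. The four-cell search area and the probabilities are hypothetical, and the project's actual situational-awareness metric may be defined differently.

```python
import math

def shannon_entropy(probabilities):
    """Entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical belief about which of four cells contains the mine.
before_search = [0.25, 0.25, 0.25, 0.25]

# After two cells are searched and cleared, the belief concentrates on the rest.
after_search = [0.5, 0.5, 0.0, 0.0]

print(shannon_entropy(before_search), shannon_entropy(after_search))
```

Under these assumptions, clearing half of the area removes one bit of uncertainty, which is the sense in which a search task can be scored by entropy reduction.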

Comparing Mitigations Under Uncertainty in Simulated Influenza Outbreak
Leslie M. Moore, Dennis Powell, Jeanne Fair, Rene LeClaire, Los Alamos National Laboratory
Michael Wilson, Sandia National Laboratories

Consequences of an influenza outbreak and effects of factors determining mitigation strategies were studied by computer simulation using statistical design and analysis of experiments methods. Twelve input variables that specify differing degrees of disease mitigation were selected to be evaluated within the context of nine uncertain inputs that characterize influenza expression, e.g., reproductive number, case mortality rate, duration of infective period. Variables were characterized with probability distributions based on assessments of low, high, and typical values from expert opinion and literature review. An approach to experiment design called dispersion array based experiment was developed that incorporates orthogonal array based Latin hypercube sampling. The simulation experiment consisted of 128 combinations of mitigation factors each evaluated over 16 distinct disease characterizations and allowed a quick assessment of simulation results, particularly mitigation efficacy. Sensitivity evaluation of the mitigation factors showed effectiveness against the range of influenza expressions with little stress on the health care system.
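
The sketch below shows plain Latin hypercube sampling, the building block named above, for 16 runs over nine inputs scaled to the unit interval. It is a simplified stand-in: it is not orthogonal array based, it is not the dispersion array design developed in this work, and the mapping from unit-interval values to the expert-elicited distributions is omitted. The function name and seed are assumptions.

```python
import numpy as np

def latin_hypercube(n_runs, n_factors, rng):
    """Plain Latin hypercube sample on [0, 1)^n_factors.

    Each factor's range is cut into n_runs equal strata and every stratum
    is sampled exactly once, with strata shuffled independently per factor.
    """
    jitter = rng.random((n_runs, n_factors))
    strata = np.column_stack([rng.permutation(n_runs) for _ in range(n_factors)])
    return (strata + jitter) / n_runs

rng = np.random.default_rng(1)
design = latin_hypercube(16, 9, rng)   # 16 disease characterizations, 9 uncertain inputs
print(design.shape)
```

Each column could then be pushed through the inverse CDF of the corresponding input's distribution to obtain values such as reproductive number or case mortality rate.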

Uncertainty Quantification for Carbon Capture Simulation
Leslie Moore, Sham Bhat, Joanne Wendelberger, Los Alamos National Laboratory
David Mebane, National Energy Technology Laboratory

The Carbon Capture Simulation Initiative (CCSI) is developing tools to accelerate identification of reliable and affordable processes for carbon capture from coal-fired power plants using simulation scalable to commercial level use with reduced physical testing. The effort includes implementation of tools for uncertainty quantification (UQ) critical to simulation-based analysis due to the need to understand and manage complex processes and economic impact of incorporation of carbon capture systems in current and future commercial operations. UQ tools include input sensitivity analysis, calibration of input parameters, construction of surrogate models, and propagation of uncertainty. Here we illustrate UQ use to study a solid sorbent process for carbon capture, using preliminary models and thermogravimetric analysis (TGA) data from the National Energy Technology Laboratory (NETL).

Nonlinear signal processing in impulsive noise environments
John Nolan, American University

Standard signal processing techniques perform poorly in the presence of impulsive, heavy tailed noise. We describe recent advances in nonlinear signal processing, which can perform significantly better than linear filters.
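
A small example of the contrast drawn above: a moving average (linear) and a moving median (a classic nonlinear filter) applied to a constant signal corrupted by one large impulse. The median filter is chosen here only as a familiar illustration and is not necessarily among the methods covered in the talk; the window width and signal are hypothetical.

```python
import numpy as np

def moving_average(x, width):
    """Linear filter: mean over a centered window (odd width, edge padded)."""
    padded = np.pad(x, width // 2, mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def moving_median(x, width):
    """Nonlinear filter: median over the same centered window."""
    padded = np.pad(x, width // 2, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, width)
    return np.median(windows, axis=1)

signal = np.ones(21)
signal[10] = 100.0   # a single impulsive outlier

averaged = moving_average(signal, 5)
medianed = moving_median(signal, 5)
print(averaged[8:13])
print(medianed[8:13])
```

The average smears the impulse across the whole window, while the median ignores it entirely.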

Models for Offender Target Location Selection with Explicit Dependency Structures
Mike O’Leary and Jeremiah Tucker, Towson University

The geographic profiling problem is to estimate the location of the home base for a serial offender from knowledge of the various crime site locations. One approach to this problem is to hypothesize that there is a probability distribution P(x|z) that provides the probability density that the offender commits a crime at the location x given a home base at the location z. Then, given an appropriate model P, the locations of a series of crimes x1, x2, ..., xn, and the assumption that the crime sites are identically distributed independent events, inferences can be drawn on the location of the home base z using a range of techniques, including Bayesian analysis.

However, the evidence suggests that crime sites are not selected independently. We shall present a new class of mixture models for offender target location selection behavior that explicitly allow for non-independence. These models will be compared with data on residential burglary, non-residential burglary, and bank robberies in Baltimore County. Implications for the larger geographic profiling problem will be discussed.

Event based community detection for networks
Patrick O'Neil and Michael D. Porter, GeoEye Analytics

Network community detection is often based on the tie-strength between vertices derived from internal network activity. An alternative approach is to consider communities based on the vertices' relationship to a set of exogenous events (e.g. suspicious activity, attacks, or crimes). This talk presents a methodology for discovering cooperating communities, clusters of vertices that participated in the same event(s), for covert networks. Because event participants are often hidden in covert networks, we propose “Event Participation Detection”, a method for predicting which nodes were involved in each event. By assuming that network members behave differently when involved in an event, this approach estimates participation based on how anomalous the local network structure of a vertex is around the event times. Clustering the vertices in an event participation network, constructed with edge weights based on the estimated co-participation, is used to discover the event based communities.

Coordinating Complementary Waveforms in MIMO Radar
Ali Pezeshki, Colorado State University

We consider a MIMO radar consisting of multiple collocated transmit/receive elements, where each transmit/receive pair operates at a different subcarrier frequency. Each transmit element is assumed to be waveform agile, meaning that it can select its waveform across time from a waveform library on a pulse by pulse basis. We consider a waveform library consisting of simply two component waveforms. The component waveforms are Golay complementary and are obtained by phase coding a narrow pulse with a pair of Golay complementary sequences. We show that by properly sequencing Golay complementary waveforms across time and frequency we can annihilate the range sidelobes of the radar point-spread function inside a desired Doppler interval around the zero-Doppler axis. This enables us to extract weak targets, which are located near stronger reflectors.
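
The property that makes Golay pairs attractive can be verified numerically: the aperiodic autocorrelations of the two sequences sum to an impulse, so their sidelobes cancel at zero Doppler. The sketch below builds a pair with the standard concatenation recursion and checks this; it does not model Doppler or the time-frequency sequencing proposed in the talk, and the function names are assumptions.

```python
import numpy as np

def golay_pair(n_doublings):
    """Binary Golay complementary pair of length 2**n_doublings.

    Uses the standard recursion a' = [a, b], b' = [a, -b], starting from a = b = [1].
    """
    a = np.array([1])
    b = np.array([1])
    for _ in range(n_doublings):
        a, b = np.concatenate([a, b]), np.concatenate([a, -b])
    return a, b

def autocorrelation(s):
    """Aperiodic autocorrelation at all lags from -(N-1) to N-1."""
    return np.correlate(s, s, mode="full")

a, b = golay_pair(3)                      # two sequences of length 8
combined = autocorrelation(a) + autocorrelation(b)
print(combined)                           # impulse of height 2N at zero lag
```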

The Navy Officer Lateral Transfer Process and Retention: A Matched Analysis
Jane Kleyman Pinelis, Center for Naval Analyses

Managing the officer corps well depends on accurately estimating officer losses from the Navy. Many factors affect loss rates, including the ability of officers to transfer laterally from one community to another. In order to successfully lateral transfer, officers must apply and be approved by a lateral transfer board. We examine the loss rates of officers who applied for lateral transfer but were disapproved and compare them with the loss rates of those who applied and were approved.

Whether the officer is “best and fully qualified” is part of the board decision, and data show that there are large differences on observed characteristics between officers approved and denied by the board. To suggest a link between the approval decision and leaving the Navy, we need to properly adjust for these differences.

First, we use logistic regression to model the loss of lateral transfer applicants from the active-duty Navy after the lateral transfer board as a function of the board approval decision, controlling for the effect of their military characteristics and demographics. Our findings support the hypothesis that officers who were disapproved for lateral transfer have a higher 36-month loss rate than those officers who were approved. Although we show the link between the board result and an officer’s decision to leave the Navy, regression analysis is not sufficient to assert causality. We thus supplement our analysis with a method often used in observational studies to search for causality: propensity score matching.

We divide officers into strata using full matching on the estimated probability of board approval (propensity score). We ensure that each stratum has at least one accepted and at least one rejected officer, while using a formal, recently developed covariate balance diagnostic to guide the choice of the matching structure. Our resulting stratification balances all observed variables important to the outcome (e.g. officers’ gender, promotion information, etc.) between the two groups in a way that we might expect to see had the officers been randomly selected for acceptance or rejection. By recovering the “hidden” randomized experiment from this observational study, we link the board results to the officers’ decisions to leave the Navy in a causal way. Our final analysis results are slightly different from those based on logistic regression, although the main conclusions about variables important to the probability of leaving remain the same.

Weighted kernel density for predicting the location of the next event in a series
Michael D. Porter, GeoEye Analytics, and Brian J. Reich, NCSU

One aspect of tactical crime or terrorism analysis is predicting the location of the next event in a series. The objective of this talk is to present a methodology to identify the optimal parameters and test the performance of temporally weighted kernel density models for predicting the location of the next event in a criminal or terrorist event series. By viewing the event series as a realization from a space-time point process, the next event prediction models can be related to estimating a conditional spatial density function. We use temporal weights that indicate how much influence past events have toward predicting future event locations and can also incorporate uncertainty in the event timing. Results from a set of crime series in Baltimore County, MD indicate that performance can vary greatly by crime type, a little by series length, and is fairly robust to choice of bandwidth.
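
A temporally weighted kernel density estimate can be sketched as follows. The exponential decay of weights with event age, the isotropic Gaussian kernel, and every parameter value are assumptions chosen for illustration; the talk's actual weighting scheme and its treatment of uncertain event times are not reproduced here.

```python
import numpy as np

def weighted_kde(query, events, times, t_now, bandwidth, decay):
    """Density at each query point from past events, favoring recent ones.

    query: (Q, 2) locations, events: (N, 2) locations, times: (N,) event times.
    Weights decay exponentially with event age (an assumed form) and sum to one.
    """
    weights = np.exp(-decay * (t_now - times))
    weights = weights / weights.sum()
    diff = query[:, None, :] - events[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    kernel = np.exp(-sq_dist / (2 * bandwidth ** 2)) / (2 * np.pi * bandwidth ** 2)
    return kernel @ weights

events = np.array([[0.0, 0.0], [10.0, 10.0]])
times = np.array([0.0, 9.0])              # the second event is much more recent
density = weighted_kde(events, events, times, t_now=10.0, bandwidth=1.0, decay=0.5)
unweighted = weighted_kde(events, events, times, t_now=10.0, bandwidth=1.0, decay=0.0)
print(density, unweighted)
```

Setting `decay` to zero recovers an ordinary kernel density estimate, which gives a baseline when tuning the temporal weight.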

Real Time Decision Support Using Simultaneous Parallel Distributed DEVS Simulations
Walter Powell, George Mason University, Robert Coop, RTSync Corp, Bernard Zeigler, GMU and RTSync Corp

Many models used in decision support are presented as deterministic and neither propagate nor display any information on the variability of a model's set of outputs. Without information on the variability of a model's output, the single set of outputs may be given inappropriate weight in the final decision. In order to best analyze the potential impact that is represented by the output of a model, or series of models, the decision maker must be informed of the possible variation in that model's output. Informing the decision maker of that variation will allow him/her to judge the relative impact of best case, nominal case, and worst case outcomes. Information on the variability of the set of outputs of a network of deterministic models can be generated using discrete event simulation of individual models using random perturbation of model inputs and parameters within specified ranges. The large number of simulations necessary to generate data needed to provide information on the variability of the models' outputs can be completed in near real time by running multiple simulations simultaneously with individual model simulations distributed across multiple platforms. In this paper, we report on implementation of near real time variability analysis using the Discrete Event System Specification (DEVS) modeling and simulation platform. This implementation is part of a real-time data-driven system for disaster management command and control being developed for the Department of Homeland Security.

Modeling and Detection of Sudden Spurts in Activity Profile of Terrorist Groups
Vasanthan Raghavan, Aram Galstyan, and Alexander Tartakovsky, University of Southern California

The main focus of this work is on developing models for the activity profile of a terrorist group, detecting sudden spurts and downfalls in this profile and, in general, tracking it over a period of time. Towards this goal, a d-state hidden Markov model (HMM) that captures the strength/capabilities of the group and thus its activity profile is developed. The simplest setting of d = 2 corresponds to the case where the strength is coarsely quantized as Active and Inactive, respectively. Two strategies for spurt detection and tracking are developed here: a model-independent strategy that uses the exponential weighted moving-average (EWMA) filter to track the strength of the group as measured by the number of attacks perpetrated by it, and a state estimation strategy that exploits the underlying HMM structure. The EWMA strategy is robust to modeling uncertainties and errors, and tracks persistent changes (changes that last for a sufficiently long duration) in the strength of the group. On the other hand, the state estimation strategy tracks even non-persistent changes that last only for a short duration at the cost of learning the underlying model. Case-studies with real terrorism data from open-source databases are provided to illustrate the performance of the two strategies.
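
The model-independent strategy can be illustrated with a basic EWMA filter over attack counts. The smoothing constant, the alarm threshold, and the counts below are hypothetical, and the rule "flag the first period whose smoothed value exceeds a threshold" is an assumed detection rule rather than the one used in this work.

```python
def ewma(counts, lam):
    """Exponentially weighted moving average: s_t = lam * x_t + (1 - lam) * s_{t-1}."""
    smoothed = []
    level = float(counts[0])
    for c in counts:
        level = lam * c + (1.0 - lam) * level
        smoothed.append(level)
    return smoothed

attacks_per_period = [0, 0, 0, 0, 8, 8, 8]   # hypothetical counts with a sustained spurt
smoothed = ewma(attacks_per_period, lam=0.5)

threshold = 5.0                               # hypothetical alarm level
first_alarm = next((i for i, s in enumerate(smoothed) if s > threshold), None)
print(smoothed, first_alarm)
```

The alarm fires one period after the spurt begins, which reflects the filter's emphasis on persistent rather than momentary changes.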

Data Farming: Designing and Conducting Large-scale Simulation Experiments
Susan M. Sanchez, Paul J. Sanchez, and Thomas W. Lucas, Naval Postgraduate School

Computer models are integral to modern scientific research, national defense, industry and manufacturing, and public policy debates. These computer models tend to be extremely complex, often with thousands of factors and many sources of uncertainty. Understanding the impact of these factors and their intricate interactions on model outcomes requires efficient, high-dimensional design of experiments (DOE), but all too often large-scale simulation models continue to be explored in ad hoc ways. This suggests that more modelers and analysts need to be aware of the power of experimental design, especially the recent breakthroughs in large-scale experimental designs. In this talk, we review a portfolio of designs that have been developed and successfully used by NPS's SEED Center for Data Farming to support decision-makers in defense and homeland security. These include single-stage designs appropriate for hundreds of factors, as well as sequential approaches that can be used to screen thousands of factors. We end with a few parting thoughts about the future of experimental design.

Distributed Threat Localization via Sparsity-Cognizant Matrix Decomposition
Ioannis D. Schizas, University of Texas at Arlington

Wireless sensor networks facilitate the collection and processing of data in a variety of different environments, including large structures, industrial facilities, battlefields and so on. In such settings threats that are generated from, e.g., malicious attacks or structural defects, can be unpredictable both spatially and temporally. Thus, it is essential to develop pertinent algorithms that can be implemented in sensor networks and have the ability to locate such threats. To this end, a novel sparsity-aware matrix decomposition scheme is developed to determine in a distributed fashion which sensors acquire informative data about potential threats. The proposed framework employs norm-one regularization to decompose the sensor data covariance matrix into sparse factors. The resulting sparsity-cognizant algorithm is used to determine the support (nonzero entries) of the sparse covariance factors, and subsequently identify the threat-informative sensors. A centralized minimization formulation is given first. Then, using the notion of missing covariance entries, we obtain an optimization framework that allows distributed estimation of the unknown sparse factors. The corresponding optimization problems are tackled via simple coordinate descent iterations. Different from existing approaches, the novel utilization of the sparsity in the data covariance matrix allows the distributed identification of threat-informative sensors, without the need of knowing the parameters of the underlying data model.

An Application of Differentially Private Linear Mixed Modeling
Matthew J. Schneider and John Abowd, Cornell University


An Update on b-Privy Analysis
Jeffrey Solka, Naval Surface Warfare Center

This talk will provide an update on some of our recent efforts in b-privy analysis for pre-emergence identification. Our analysis pipeline will be discussed all of the way from data ingestion to entity disambiguation to quantitative analysis and including web data analysis. The web data analysis portion of the talk will be the subject of an additional presentation at the conference.

Analysis of Colored Petri Net-based Systems Models
Alan Thomas, Naval Surface Warfare Center

Current methods for testing software-intensive systems are ad hoc, labor-intensive, and costly. They lack a mathematically formal basis and are unable to address non-functional requirements (e.g. safety). Formal methods are needed to model software-intensive systems and to derive test cases from system models. This presentation discusses the use of Colored Petri Nets to model systems. State spaces are calculated from the models. Analysis of state space graphs is employed to aid in the selection of test cases.
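
State space calculation can be sketched for an ordinary place/transition net, a deliberate simplification of Colored Petri Nets in which tokens carry no data values. The breadth-first search below enumerates every reachable marking and the transitions between them; the three-place net, its transition names, and the function names are hypothetical, and the search terminates only for bounded nets.

```python
from collections import deque

def enabled(marking, pre):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[place] >= need for place, need in pre.items())

def fire(marking, pre, post):
    """Consume tokens from input places and produce tokens on output places."""
    result = dict(marking)
    for place, need in pre.items():
        result[place] -= need
    for place, gain in post.items():
        result[place] = result.get(place, 0) + gain
    return result

def state_space(initial, transitions):
    """Breadth-first enumeration of reachable markings and labelled edges."""
    key = lambda m: tuple(sorted(m.items()))
    seen = {key(initial)}
    edges = []
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for name, (pre, post) in transitions.items():
            if enabled(marking, pre):
                nxt = fire(marking, pre, post)
                edges.append((key(marking), name, key(nxt)))
                if key(nxt) not in seen:
                    seen.add(key(nxt))
                    queue.append(nxt)
    return seen, edges

# Hypothetical net: two jobs move from idle to busy to done.
transitions = {
    "start": ({"idle": 1}, {"busy": 1}),
    "finish": ({"busy": 1}, {"done": 1}),
}
initial = {"idle": 2, "busy": 0, "done": 0}
markings, edges = state_space(initial, transitions)
print(len(markings), len(edges))
```

Each edge of the resulting graph is a candidate test step, and paths through the graph correspond to candidate test sequences.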

Uncertainty Quantification Methods and Software for Engineering Systems
Charles Tong, Lawrence Livermore National Laboratory

Uncertainty quantification (UQ) is increasingly recognized as a key component in modeling and simulation-based science and engineering. There are, however, many UQ challenges for large-scale applications including expensive model evaluation, large parameter space with complex correlations, highly nonlinear models, diverse data sources for calibration, etc. This talk describes a software package called PSUADE that provides many techniques for comprehensive analysis of uncertainties in engineering systems. We will show how these methods can be applied to several fossil energy applications.

Measures of Statistical Output for Stochastic Simulations
Andrew Turner, Georgia Institute of Technology

Modeling and Simulation (M&S) is an important tool used for the analysis of complex problems ranging from manufacturing plants to social dynamics. Complex M&S environments require complex analysis techniques, oftentimes fulfilled by the Response Surface Method (RSM). An important part of the RSM is the creation of meta-models. M&S environments that are stochastic in nature present many problems in the creation of meta-models, one of which is determining what statistical metrics should be measured and fitted. Oftentimes, the mean is the only statistical measure fitted. The argument will be made that the statistical metrics that should be tracked are the mean and various quantiles. This argument will be made by presenting four statistical metrics (mean, variance, binomial proportions, and quantiles) and examining the predictive accuracy of the various metrics' confidence intervals for a range of distributions and sample sizes. Finally, heuristics will be presented for the minimum number of repetitions required for the four statistical metrics and for the number of repetitions required for meta-model fits of said metrics.
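
To make the link between quantile confidence intervals and the required number of repetitions concrete, here is a minimal sketch using textbook constructions rather than the presenter's heuristics. It assumes a normal-approximation interval for the mean and a distribution-free, order-statistic interval for a quantile of a continuous output; all function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(samples, conf=0.95):
    """Normal-approximation interval for the mean (a t-interval suits small n better)."""
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)
    half = z * stdev(samples) / math.sqrt(len(samples))
    return mean(samples) - half, mean(samples) + half


def binom_cdf(k, n, p):
    """P(B <= k) for B ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def quantile_ci_ranks(n, p, conf=0.95):
    """1-based ranks (l, u) so that [x_(l), x_(u)] covers the p-quantile.

    Distribution-free for continuous outputs: coverage is
    P(l <= B <= u - 1) with B ~ Binomial(n, p), and is at least conf.
    Returns None when n repetitions are too few for such an interval.
    """
    alpha = 1.0 - conf
    lower = None
    for l in range(1, n + 1):  # largest l with P(B < l) <= alpha/2
        if binom_cdf(l - 1, n, p) <= alpha / 2.0:
            lower = l
        else:
            break
    upper = None
    for u in range(1, n + 1):  # smallest u with P(B >= u) <= alpha/2
        if 1.0 - binom_cdf(u - 1, n, p) <= alpha / 2.0:
            upper = u
            break
    if lower is None or upper is None:
        return None
    return lower, upper


def min_repetitions(p, conf=0.95, n_max=10000):
    """Smallest n for which a two-sided order-statistic interval exists."""
    for n in range(1, n_max + 1):
        if quantile_ci_ranks(n, p, conf) is not None:
            return n
    return None


if __name__ == "__main__":
    print(mean_ci([1.0, 2.0, 3.0, 4.0, 5.0]))
    print(quantile_ci_ranks(100, 0.5))
    print(min_repetitions(0.5))
```

Under these assumptions, a 95% interval for the median needs at least 6 repetitions, and more extreme quantiles need more, which is one reason a repetition count chosen with only the mean in mind may be too small for quantile meta-models.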

Agent-Based Approach for Modeling Education
Joanne Wendelberger, Los Alamos National Laboratory

An agent-based modeling approach has been proposed for studying educational infrastructure. Using simulation approaches originally developed for transportation and other infrastructure studies, a new effort is underway to use agent-based modeling to understand educational systems with a focus on the behavior of individuals and their interactions with others. This modeling effort will incorporate educational theory as well as actual school district data, including longitudinal data collected on individual students over time. The goal of this work is to develop a model with predictive capability and quantified uncertainty to assess the impact of various factors on educational outcomes.