As a starting point for an explanation of a scientific theory, it is useful to define fundamental terms, to state and explain critical assumptions, and to establish—or limit—the scope of the discussion that follows. The definitions and explanations that follow are generally consistent with usage in the military and analytical communities, and with definitions that have been formulated for its work by The Military Conflict Institute. However, I have in some instances modified or restated these to conform to my own ideas and usage. [Dupuy, Understanding Combat, 63]
The basic elements of his theory of combat are:
Definition of Military Combat
The Hierarchy of Combat
The Conceptual Components of Combat
The Scope of Theory
Definition of a Theory of Combat
The cover of SPI’s monster wargame, The Campaign For North Africa: The Desert War 1940-43 [SPI]
[This post was originally published on 22 September 2017.]
Even as board gaming appears to be enjoying a resurgence in the age of ubiquitous computer gaming, it appears, sadly, that table-top wargaming continues its long, slow decline in popularity from its 1970s-80s heyday. Pockets of enthusiasm remain, however, and there is new advocacy for wargaming as a method of professional military education.
Luke Winkie has written an ode to that bygone era through a look at the legacy of The Campaign For North Africa: The Desert War 1940-43, a so-called “monster” wargame created by designer Richard Berg and published by Simulations Publications, Inc. (SPI) in 1979. It is a representation of the entire North African theater of war at the company/battalion level, played on five maps which extend over 10 feet and include 70 charts and tables. The rule book encompasses three volumes. There are over 1,600 cardboard counter playing pieces. As befits the real conflict, the game places a major emphasis on managing logistics and supply, which can either enable or inhibit combat options. The rule book recommends that each side consist of five players: an overall commander, a battlefield commander, an air power commander, one dedicated to managing rear area activities, and one devoted to overseeing logistics.
The game map. [BoardGameGeek]
Given that completing a full game requires an estimated 1,500 hours, actually playing The Campaign For North Africa is something that would appeal only to committed, die-hard wargame enthusiasts (known as grognards, Napoleonic-era slang for “grumblers,” or veteran soldiers). As the game blurb suggests, the infamous monster wargames were an effort to appeal to a desire for a “super detailed, intensive simulation specially designed for maximum realism,” or as realistic as war on a tabletop can be, anyway. Berg admitted that he intentionally designed the game to be “wretched excess.”
Although The Campaign For North Africa was never popular, it did acquire a distinct notoriety not entirely confined to those of us nostalgic for board wargaming’s illustriously nerdy past. It retains a dedicated fanbase. Winkie’s article describes the recent efforts of Jake, a 16-year-old Minnesotan who, unable to afford a second-hand copy of the game priced at $400, printed out the maps and rule book for himself. He and a dedicated group of friends intend to complete a game before Jake heads off to college in two years. Berg himself harbors few romantic sentiments about wargaming or his past work, having sold his own last copy of the game several years ago because a “whole bunch of dollars seemed to be [a] more worthwhile thing to have.” The greatness of SPI’s game offerings has been tempered by the realization that the company died for its business sins.
However, some folks of a certain age relate more to Jake’s youthful enthusiasm and to the love of structure and complexity embodied in The Campaign For North Africa’s depth of detail. These elements led many of us on to a scholarly study of war and warfare. Some of us may have discovered the work of Trevor Dupuy in an advertisement for Numbers, Predictions and War: Using History to Evaluate Combat Factors and Predict the Outcome of Battles in the pages of SPI’s legendary Strategy & Tactics magazine, way back in the day.
The December 2018 issue of Phalanx, a periodical journal published by The Military Operations Research Society (MORS), contains an article by Jonathan K. Alt, Christopher Morey, and Larry Larimer, entitled “Perspectives on Combat Modeling” (the article is paywalled, but limited public access is available via JSTOR).
Their article was written partly as a critical rebuttal to a TDI blog post originally published in April 2017, which discussed an issue of which the combat modeling and simulation community has long been aware but slow to address, known as the “Base of Sand” problem.
In short, because so little is empirically known about the real-world structures of combat processes and the interactions of these processes, modelers have been forced to rely on the judgement of subject matter experts (SMEs) to fill in the blanks. No one really knows if the blend of empirical data and SME judgement accurately represents combat because the modeling community has been reluctant to test its models against data on real world experience, a process known as validation.
TDI President Chris Lawrence subsequently published a series of blog posts responding to the specific comments and criticisms leveled by Alt, Morey, and Larimer.
How are combat models and simulations tested to see if they portray real-world combat accurately? Are they actually tested?
How can we know if combat simulations adhere to strict standards established by the DoD regarding validation? Perhaps the validation reports can be released for peer review.
Some claim that models of complex combat behavior cannot really be tested against real-world operational experience, but this has already been done. Several times.
If only the “physics-based aspects” of combat models are empirically tested, do those models reliably represent real-world combat with humans or only the interactions of weapons systems?
Is real-world historical operational combat experience useful only for demonstrating the capabilities of combat models, or is it something the models should be able to reliably replicate?
If a Subject Matter Expert (SME) can be substituted for a proper combat model validation effort, then could not a SME simply be substituted for the model? Should not all models be considered expert judgement quantified?
Persuading the military operations research community of the importance of research on real-world combat experience in modeling has been an uphill battle with a long history.
Hopefully this is my last post on the subject (but I suspect not, as I expect a public response from the three TRADOC authors). This is in response to the article in the December 2018 issue of the Phalanx by Alt, Morey and Larimer (see Part 1, Part 2, Part 3, Part 4, Part 5, Part 6). The issue here is the “Base of Sand” problem, which is what the original blog post that “inspired” their article was about:
While the first paragraph of their article addressed this blog post and they reference Paul Davis’ 1992 Base of Sand paper in their footnotes (but not John Stockfisch’s paper, which is an equally valid criticism), they then do not discuss the “Base of Sand” problem further. They do not actually state whether this is a problem or not a problem. I gather by this notable omission that in fact they do understand that it is a problem, but being employees of TRADOC they are limited as to what they can publicly say. I am not.
I do address the “Base of Sand” problem in my book War by Numbers, Chapter 18. It has also been addressed in a few other posts on this blog. We are critics because we do not see significant improvement in the industry. In some cases, we are seeing regression.
In the end, I think the best solution for the DOD modeling and simulation community is not to “circle the wagons” and defend what they are currently doing, but instead acknowledge the limitations and problems they have and undertake a corrective action program. This corrective action program would involve: 1) Properly addressing how to measure and quantify certain aspects of combat (for example: Breakpoints) and 2) Validating these aspects and the combat models these aspects are part of by using real-world combat data. This would be an iterative process, as you develop and then test the model, then further develop it, and then test it again. This moves us forward. It is a more valuable approach than just “circling the wagons.” As these models and simulations are being used to analyze processes that may or may not make us fight better, and may or may not save American service members’ lives, I think it is important enough to do right. That is what we need to be focused on, not squabbling over a blog post (or seven).
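To make the "develop, test, develop again" loop concrete, here is a minimal sketch of what a single validation pass might compute. It is only an illustration: the function name `validation_summary`, the choice of error measures, and the numbers are all assumptions made for this example, not any method used by TDI, CAA, or TRADOC, and the figures are placeholders rather than real combat data.

```python
from statistics import mean


def validation_summary(cases):
    """Compare model predictions with historical outcomes.

    `cases` is a list of (predicted, actual) casualty figures, one pair
    per historical engagement, with every actual value greater than zero.
    Returns the mean signed error (bias) and the mean absolute
    percentage error (mape).
    """
    errors = [predicted - actual for predicted, actual in cases]
    pct_errors = [abs(predicted - actual) / actual for predicted, actual in cases]
    return {"bias": mean(errors), "mape": mean(pct_errors)}


# Hypothetical placeholder figures, NOT real combat data.
cases = [(120, 100), (80, 100), (150, 100)]
summary = validation_summary(cases)
print(summary)
```

In an iterative program, a summary like this would be recomputed after each revision of the model, against the same historical cases, to see whether the changes actually brought the output closer to what happened.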
On the first page (page 28) in the third column they make the statement that:
Models of complex systems, especially those that incorporate human behavior, such as that demonstrated in combat, do not often lend themselves to empirical validation of output measures, such as attrition.
Really? Why can’t you? In fact, isn’t that exactly the model you should be validating?
More to the point, people have validated attrition models. Let me list a few cases (this list is not exhaustive):
1. Done by Center for Army Analysis (CAA) for the CEM (Concepts Evaluation Model) using Ardennes Campaign Simulation Study (ARCAS) data. Take a look at this study done for Stochastic CEM (STOCEM):
2. Done in 2005 by The Dupuy Institute for six different casualty estimation methodologies as part of Casualty Estimation Methodologies Studies. This was work done for the Army Medical Department and funded by DUSA (OR). It is listed here as report CE-1:
3. Done in 2006 by The Dupuy Institute for the TNDM (Tactical Numerical Deterministic Model) using Corps and Division-level data. This effort was funded by Boeing, not the U.S. government. This is discussed in depth in Chapter 19 of my book War by Numbers (pages 299-324) where we show 20 charts from such an effort. Let me show you one from page 315:
So, this is something that multiple people have done on multiple occasions. It is not so difficult that The Dupuy Institute was not able to do it. TRADOC is an organization with around 38,000 military and civilian employees, plus who knows how many contractors. I think this is something they could also do if they had the desire.
On the first page (page 28) top of the third column they make the rather declarative statement that:
The combat simulations used by military operations research and analysis agencies adhere to strict standards established by the DoD regarding verification, validation and accreditation (Department of Defense, 2009).
Now, I have not reviewed what has been done on verification, validation and accreditation since 2009, but I did do a few fairly exhaustive reviews before then. One such review is written up in depth in The International TNDM Newsletter. It is Volume 1, No. 4 (February 1997). You can find it here:
The newsletter includes a letter dated 21 January 1997 from the Scientific Advisor to the CG (Commanding General) at TRADOC (Training and Doctrine Command). This is the same organization that the three gentlemen who wrote the article in the Phalanx work for. The Scientific Advisor sent a letter out to multiple commands to try to flag the issue of validation (letter is on page 6 of the newsletter). My understanding is that he received few responses (I saw only one, it was from Leavenworth). After that, I gather there was no further action taken. This was a while back, so maybe everything has changed, as I gather they are claiming with that declarative statement. I doubt it.
The issue to me is validation. Verification is often done. Actual validations are a lot rarer. In 1997, this was my list of combat models in the industry that had been validated (the list is on page 7 of the newsletter):
1. Atlas (using 1940 Campaign in the West)
2. Vector (using undocumented turning runs)
3. QJM (by HERO using WWII and Middle-East data)
4. CEM (by CAA using Ardennes Data Base)
5. SIMNET/JANUS (by IDA using 73 Easting data)
Now, in 2005 we did a report on Casualty Estimation Methodologies (it is listed as report CE-1). We reviewed the listing of validation efforts, and from 1997 to 2005…nothing new had been done (except for a battalion-level validation we had done for the TNDM). So am I now to believe that since 2009, they have actively and aggressively pursued validation? Especially as most of this time was in a period of severely declining budgets, I doubt it. One of the arguments against validation made in meetings I attended in 1987 was that they did not have the time or budget to spend on validating. The budget during the Cold War was luxurious by today’s standards.
If there have been meaningful validations done, I would love to see the validation reports. The proof is in the pudding… send me the validation reports that will resolve all doubts.
The Military Operations Research Society (MORS) publishes a periodical journal called the Phalanx. In the December 2018 issue was an article that referenced one of our blog posts. This took us by surprise. We only found out about it thanks to one of the viewers of this blog. We are not members of MORS. The article is paywalled and cannot be easily accessed if you are not a member.
It is titled “Perspectives on Combat Modeling” (page 28) and is written by Jonathan K. Alt, U.S. Army TRADOC Analysis Center, Monterey, CA.; Christopher Morey, PhD, Training and Doctrine Command Analysis Center, Ft. Leavenworth, Kansas; and Larry Larimer, Training and Doctrine Command Analysis Center, White Sands, New Mexico. I am not familiar with any of these three gentlemen.
The blog post that appears to be generating this article is this one:
Simply by coincidence, Shawn Woodford recently re-posted this in January. It was originally published on 10 April 2017 and was written by Shawn.
The opening two sentences of the article in the Phalanx read:
Periodically, within the Department of Defense (DoD) analytic community, questions will arise regarding the validity of the combat models and simulations used to support analysis. Many attempts (sic) to resurrect the argument that models, simulations, and wargames “are built on the thin foundation of empirical knowledge about the phenomenon of combat.” (Woodford, 2017).
It is nice to be acknowledged, although in this case, it appears that we are being acknowledged because they disagree with what we are saying.
Probably the word that gets my attention is “resurrect.” It is an interesting word, that implies that this is an old argument that has somehow or the other been put to bed. Granted it is an old argument. On the other hand, it has not been put to bed. If a problem has been identified and not corrected, then it is still a problem. Age has nothing to do with it.
On the other hand, maybe they are using the word “resurrect” because recent developments in modeling and validation have changed the environment significantly enough that these arguments no longer apply. If so, I would be interested in what those changes are. The last time I checked, the modeling and simulation industry was using many of the same models they had used for decades. In some cases, they were going back to using simpler hex-games for their modeling and wargaming efforts. We have blogged a couple of times about these efforts. So, in the world of modeling, unless there have been earthshaking and universal changes made in the last five years that have completely revamped the landscape, the decades-old problems still apply to the decades-old models and simulations.
More to come (this is the first of at least 7 posts on this subject).
Source: David A. Shlapak and Michael Johnson. Reinforcing Deterrence on NATO’s Eastern Flank: Wargaming the Defense of the Baltics. Santa Monica, CA: RAND Corporation, 2016.
[UPDATE] We had several readers recommend games they have used or would be suitable for simulating Multi-Domain Battle and Operations (MDB/MDO) concepts. These include several classic campaign-level board wargames:
Chris Lawrence recently looked at C-WAM and found that it uses a lot of traditional board wargaming elements, including methodologies for determining combat results, casualties, and breakpoints that have been found unable to replicate real-world outcomes (aka “The Base of Sand” problem).
What other wargames, models, and simulations are there being used out there? Are there any commercial wargames incorporating MDB/MDO elements into their gameplay? What methodologies are being used to portray MDB/MDO effects?
A great deal of importance has been placed on the knowledge derived from these activities. As the U.S. Army Training and Doctrine Command recently stated,
Concept analysis informed by joint and multinational learning events…will yield the capabilities required of multi-domain battle. Resulting doctrine, organization, training, materiel, leadership, personnel and facilities solutions will increase the capacity and capability of the future force while incorporating new formations and organizations.
There is, however, a problem afflicting the Defense Department’s wargames, of which the military operations research and models and simulations communities have long been aware, but have been slow to address: their models are built on a thin foundation of empirical knowledge about the phenomenon of combat. None have proven the ability to replicate real-world battle experience. This is known as the “base of sand” problem.
A Brief History of The Base of Sand
All combat models and simulations are abstracted theories of how combat works. Combat modeling in the United States began in the early 1950s as an extension of military operations research that began during World War II. Early model designers did not have a large base of empirical combat data from which to derive their models. Although a start had been made during World War II and the Korean War to collect real-world battlefield data from observation and military unit records, an effort that provided useful initial insights, no systematic effort has ever been made to identify and assemble such information. In the absence of extensive empirical combat data, model designers turned instead to concepts of combat drawn from official military doctrine (usually of uncertain provenance), subject matter expertise, historians and theorists, the physical sciences, or their own best guesses.
As the U.S. government’s interest in scientific management methods blossomed in the late 1950s and 1960s, the Defense Department’s support for operations research and use of combat modeling in planning and analysis grew as well. By the early 1970s, it became evident that basic research on combat had not kept pace. A survey of existing combat models by Martin Shubik and Garry Brewer for RAND in 1972 concluded that
Basic research and knowledge is lacking. The majority of the MSGs [models, simulations and games] sampled are living off a very slender intellectual investment in fundamental knowledge…. [T]he need for basic research is so critical that if no other funding were available we would favor a plan to reduce by a significant proportion all current expenditures for MSGs and to use the saving for basic research.
The [Defense Department] is becoming critically dependent on combat models (including simulations and war games)—even more dependent than in the past. There is considerable activity to improve model interoperability and capabilities for distributed war gaming. In contrast to this interest in model-related technology, there has been far too little interest in the substance of the models and the validity of the lessons learned from using them. In our view, the DoD does not appreciate that in many cases the models are built on a base of sand…
[T]he DoD’s approach in developing and using combat models, including simulations and war games, is fatally flawed—so flawed that it cannot be corrected with anything less than structural changes in management and concept. [Original emphasis]
As a remedy, the authors recommended that the Defense Department create an office to stimulate a national military science program. This Office of Military Science would promote and sponsor basic research on war and warfare while still relying on the military services and other agencies for most research and analysis.
Davis and Blumenthal initially drafted their white paper before the 1991 Gulf War, but the performance of the Defense Department’s models and simulations in that conflict underscored the very problems they described. Defense Department wargames during initial planning for the conflict reportedly predicted tens of thousands of U.S. combat casualties. These simulations were said to have led to major changes in U.S. Central Command’s operational plan. When the casualty estimates leaked, they caused great public consternation and inevitable Congressional hearings.
The Defense Department’s current generation of models and simulations harbor the same weaknesses as the ones in use in the 1990s. Some are new iterations of old models with updated graphics and code, but using the same theoretical assumptions about combat. In most cases, no one other than the designers knows exactly what data and concepts the models are based upon. This practice is known in the technology world as black boxing. While black boxing may be an essential business practice in the competitive world of government consulting, it makes independently evaluating the validity of combat models and simulations nearly impossible. This should be of major concern because many models and simulations in use today contain known flaws.
Others, such as the Joint Conflict And Tactical Simulation (JCATS), MAGTF Tactical Warfare System (MTWS), and Warfighters’ Simulation (WARSIM) adjudicate ground combat using probability of hit/probability of kill (pH/pK) algorithms. Corps Battle Simulation (CBS) uses pH/pK for direct fire attrition and a modified version of Lanchester for indirect fire. While these probabilities are developed from real-world weapon system proving ground data, their application in the models is combined with inputs from subjective sources, such as outputs from other combat models, which are likely not based on real-world data. Multiplying an empirically-derived figure by a judgement-based coefficient results in a judgement-based estimate, which might be accurate or it might not. No one really knows.
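The arithmetic behind that last point can be shown with a short sketch. It assumes, purely for illustration, that a proving-ground kill probability is known within a narrow range while the judgement-based coefficient applied to it is only known within a wide one. The function names and every number below are hypothetical and are not taken from JCATS, MTWS, WARSIM, CBS, or any other model.

```python
def product_interval(a, b):
    """Bounds on x * y when x lies in interval a and y in interval b.

    Both intervals are (low, high) tuples of non-negative numbers, so the
    product is smallest at the two lows and largest at the two highs.
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return (a_lo * b_lo, a_hi * b_hi)


def relative_spread(interval):
    """Width of the interval divided by its midpoint."""
    lo, hi = interval
    return (hi - lo) / ((lo + hi) / 2)


# Hypothetical illustrative values, NOT real weapon-system data.
proving_ground_pk = (0.58, 0.62)   # empirically measured, narrow range
sme_coefficient = (0.2, 0.8)       # judgement-based, wide range

battlefield_pk = product_interval(proving_ground_pk, sme_coefficient)

print("empirical spread: ", round(relative_spread(proving_ground_pk), 3))
print("judgement spread: ", round(relative_spread(sme_coefficient), 3))
print("product interval: ", battlefield_pk)
print("product spread:   ", round(relative_spread(battlefield_pk), 3))
```

Under these assumed ranges, the uncertainty of the product is governed almost entirely by the judgement-based coefficient. The precision of the proving-ground measurement does little to narrow the final estimate.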
This state of affairs seems remarkable given the enormous stakes that are being placed on the output of the Defense Department’s modeling and simulation activities. After decades of neglect, remedying this would require a dedicated commitment to sustained basic research on the military science of combat and warfare, with no promise of a tangible short-term return on investment. Yet, as Biddle pointed out, “With so much at stake, we surely must do better.”
[NOTE: The attrition methodologies used in CBS and WARSIM have been corrected since this post was originally published per comments provided by their developers.]
[This piece was originally posted on 13 July 2016.]
Trevor Dupuy’s article cited in my previous post, “Combat Data and the 3:1 Rule,” was the final salvo in a roaring, multi-year debate between two highly regarded members of the U.S. strategic and security studies academic communities, political scientist John Mearsheimer and military analyst/polymath Joshua Epstein. Carried out primarily in the pages of the academic journal International Security, Epstein and Mearsheimer argued the validity of the 3-1 rule and other analytical models with respect to the NATO/Warsaw Pact military balance in Europe in the 1980s. Epstein cited Dupuy’s empirical research in support of his criticism of Mearsheimer’s reliance on the 3-1 rule. In turn, Mearsheimer questioned Dupuy’s data and conclusions to refute Epstein. Dupuy’s article defended his research and pointed out the errors in Mearsheimer’s assertions. With the publication of Dupuy’s rebuttal, the International Security editors called a time out on the debate thread.
These debates played a prominent role in the “renaissance of security studies” because they brought together scholars with different theoretical, methodological, and professional backgrounds to push forward a cohesive line of research that had clear implications for the conduct of contemporary defense policy. Just as importantly, the debate forced scholars to engage broader, fundamental issues. Is “military power” something that can be studied using static measures like force ratios, or does it require a more dynamic analysis? How should analysts evaluate the role of doctrine, or politics, or military strategy in determining the appropriate “balance”? What role should formal modeling play in formulating defense policy? What is the place for empirical analysis, and what are the strengths and limitations of existing data?[1]
It is well worth the time to revisit the contributions to the 1980s debate. I have included a bibliography below that is not exhaustive, but is a place to start. The collapse of the Soviet Union and the end of the Cold War diminished the intensity of the debates, which simmered through the 1990s and then were obscured during the counterterrorism/counterinsurgency conflicts of the post-9/11 era. It is possible that the challenges posed by China and Russia amidst the ongoing “hybrid” conflict in Syria and Iraq may revive interest in interrogating the bases of military analyses in the U.S. and the West. It is a discussion that is long overdue and potentially quite illuminating.