Mystics & Statistics

A blog on quantitative historical analysis hosted by The Dupuy Institute

Validation by Use

Sacrobosco, Tractatus de Sphaera (1550 AD)

Another argument I have heard over the decades is that models are validated by use. Apparently the argument is that these models have been used for so long, and so many people have worked with their outputs, that they must be fine. I saw this argument made in writing by a senior army official in 1997, in response to a letter addressing validation that we encouraged TRADOC to send out:

See: http://www.dupuyinstitute.org/pdf/v1n4.pdf

I doubt that there is any regulation discussing “validation by use,” and I doubt anyone has ever defended the idea in a published paper. Still, it is an argument that I have heard far more than once or twice.

Now, part of the problem is that some of these models have been around for decades. The core of some of the models used by CAA, such as COSAGE, first came into existence in 1969. They are using a 50-year-old, albeit updated, model to simulate modern warfare. My father worked with this model. RAND’s JICM (Joint Integrated Contingency Model) dates back to the 1980s, so it is at least 30 years old. The irony is that some people argue that one should not use historical warfare examples to validate models of modern warfare, yet these models now have a considerable legacy of their own.

From a practical point of view, this means that the people who originally designed and developed these models have long since retired. In many cases, the people who intimately knew their inner workings have also retired and have not really been replaced. Some of these models have become “black boxes” whose users do not really know the details of how they calculate their results. Suddenly, validation by use seems like a reasonable argument: the models pre-date the analysts, who assume there is some validity to them because people have been using them. They simply inherited the model. Why question it?

Illustration by Bartolomeu Velho, 1568 AD

China and Russia Defeat the USA

A couple of recent articles on the latest wargaming effort done by RAND:

https://www.americanthinker.com/blog/2019/03/rand_corp_wargames_us_loses_to_combined_russiachina_forces.html

The opening line states: “The RAND Corporation’s annual ‘Red on Blue’ wargame simulation found that the United States would be a loser in a conventional confrontation with Russia and China.”

A few other quotes:

  1. “Blue gets its ass handed to it.”
  2. “…the U.S. forces ‘suffer heavy losses in one scenario after another and still can’t stop Russia or China from overrunning U.S. allies in the Baltics or Taiwan.’”

Also see: https://www.asiatimes.com/2019/03/article/did-rand-get-it-right-in-its-war-game-exercise/

A few quotes from that article:

  1. “The US and NATO are unable to stop an attack in the Balkans by the Russians….”
  2. “…and the United States and its allies are unable to prevent the takeover of Taiwan by China.”

The articles do not state what simulations were used to wargame this. The second article references this RAND study (RAND Report), but my quick perusal of it did not identify what simulations were used. A search on the words “model” and “wargame” produced nothing. The words “simulation” and “gaming” lead to the following:

  1.  “It draws on research, analysis, and gaming that the RAND Corporation has done in recent years, incorporating the efforts of strategists, regional specialists, experts in both conventional and irregular military operations, and those skilled in the use of combat simulation tools.”
  2. “Money, time, and talent must therefore be allocated not only to the development and procurement of new equipment and infrastructure, but also to concept development, gaming and analysis, field experimentation, and exploratory joint force exercises.”

Anyhow, I am curious as to what wargames they were using (JICM – Joint Integrated Contingency Model?). I was not able to find out with a cursory search.

Dupuy’s Verities: The Effects of Firepower in Combat

A German artillery barrage falling on Allied trenches, probably during the Second Battle of Ypres in 1915, during the First World War. [Wikimedia]

The eleventh of Trevor Dupuy’s Timeless Verities of Combat is:

Firepower kills, disrupts, suppresses, and causes dispersion.

From Understanding War (1987):

It is doubtful if any of the people who are today writing on the effect of technology on warfare would consciously disagree with this statement. Yet, many of them tend to ignore the impact of firepower on dispersion, and as a consequence they have come to believe that the more lethal the firepower, the more deaths, disruption, and suppression it will cause. In fact, as weapons have become more lethal intrinsically, their casualty-causing capability has either declined or remained about the same because of greater dispersion of targets. Personnel and tank loss rates of the 1973 Arab-Israeli War, for example, were quite similar to those of intensive battles of World War II and the casualty rates in both of these wars were less than in World War I. (p. 7)

Research and analysis of real-world historical combat data by Dupuy and TDI has identified at least four distinct combat effects of firepower: infliction of casualties (lethality), disruption, suppression, and dispersion. All of them were found to be heavily influenced—if not determined—by moral (human) factors.

Again, I have written extensively on this blog about Dupuy’s theory of the historical relationship between weapon lethality, dispersion on the battlefield, and the decline in average daily combat casualty rates. TDI President Chris Lawrence has done further work on the subject as well.

TDI Friday Read: Lethality, Dispersion, And Mass On Future Battlefields

Human Factors In Warfare: Dispersion

Human Factors In Warfare: Suppression

There appears to be a fundamental difference in interpretation of the combat effects of firepower between Dupuy’s emphasis on the primacy of human factors and Defense Department models that account only for the “physics-based” casualty-inflicting capabilities of weapons systems. While U.S. Army combat doctrine accounts for the interaction of firepower and human behavior on the battlefield, it has no clear method for assessing or even fully identifying the effects of such factors on combat outcomes.
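To make the dispersion argument concrete, here is a minimal sketch in Python. It is my own illustration, not TDI’s or any Defense Department model’s actual formulas, and all names and numbers in it are invented: a purely “physics-based” estimate scales with weapon lethality alone, while the adjusted estimate discounts lethality by how widely the targets are dispersed.

```python
# Minimal sketch of the lethality-versus-dispersion argument above.
# All function names and numbers are hypothetical illustrations, not
# the actual formulas of TDI or any Defense Department model.

def physics_based_casualties(lethality: float, exposure_hours: float) -> float:
    """Casualties predicted from weapon lethality alone."""
    return lethality * exposure_hours

def dispersion_adjusted_casualties(lethality: float, exposure_hours: float,
                                   dispersion: float) -> float:
    """The same estimate, discounted for target dispersion.

    `dispersion` is a relative index (1.0 = baseline era); as it grows,
    fewer targets sit inside any one weapon's effective area.
    """
    return lethality * exposure_hours / dispersion

# Invented eras: weapons become 100x more lethal while targets become
# 100x more dispersed, so the adjusted casualty estimate stays flat.
for era, lethality, dispersion in [("early era", 1.0, 1.0),
                                   ("later era", 100.0, 100.0)]:
    print(era,
          physics_based_casualties(lethality, 24.0),
          dispersion_adjusted_casualties(lethality, 24.0, dispersion))
```

Under these invented numbers, the “physics-based” estimate explodes a hundredfold while the dispersion-adjusted estimate does not move, which is the pattern Dupuy saw in the historical casualty data.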

NYMAS in Manhattan on Friday, 26 April

I will be presenting my book War by Numbers at the New York Military Affairs Symposium (NYMAS) on Friday, 26 April, at the Soldiers Sailors Club in New York City. The announcement is here: http://www.nymas.org/

The format is that I talk for an hour or so, and then take questions for the next 45 minutes.

On Friday the 12th, there is a presentation of “The Grosse Importance of Kleine Krieg: Logistics, Operations, and ‘Little Wars’ in the late 17th Century Low Countries” by John Stapleton of the U.S. Military Academy.

Face Validation

The phrase “face validation” shows up in our blog post earlier this week on Combat Adjudication. It is a phrase I have heard many times over the decades, sometimes from very established Operations Researchers (OR). So what does it mean?

Well, it is discussed in the Department of the Army Pamphlet 5-11: Verification, Validation and Accreditation of Army Models and Simulations: Pamphlet 5-11

Their first mention of it is on page 34: “SMEs [Subject Matter Experts] or other recognized individuals in the field of inquiry. The process by which experts compare M&S [Modeling and Simulation] structure and M&S output to their estimation of the real world is called face validation, peer review, or independent review.”

On page 35 they go on to state: “RDA [Research, Development, and Acquisition]….The validation method typically chosen for this category of M&S is face validation.”

And on page 36, under Technical Methods: “Face validation. This is the process of determining whether an M&S, on the surface, seems reasonable to personnel who are knowledgeable about the system or phenomena under study. This method applies the knowledge and understanding of experts in the field and is subject to their biases. It can produce a consensus of the community if the number and breadth of experience of the experts represent the key commands and agencies. Face validation is a point of departure to determine courses of action for more comprehensive validation efforts.” [I put the last part in bold]

Page 36: “Functional decomposition (sometimes known as piecewise validation)….When used in conjunction with face validation of the overall M&S results, functional decomposition is extremely useful in reconfirming previous validation of recently modified portions of the M&S.”

I have not done a survey of all Army, Air Force, Navy, Marine Corps, Coast Guard, or Department of Defense (DOD) regulations. This one is enough.

So, “face validation” is asking one or more knowledgeable (or more senior) people if the model looks good. I guess it really depends on who the expert is and how deeply they look into it. I have never seen a “face validation” report (validation reports are also pretty rare).

Whose “faces” do they use? Are they outside, independent people or people inside the organization (or the model designer himself)? I am something of an expert, yet I have never been asked. I do happen to be one of the more experienced model validation people out there, having managed or directly created six or more validation databases and having conducted five validation-like exercises. When you consider that most people have not done one, should I be a “face” they contact? Or is this process often just to “sprinkle holy water” on the model and be done with it?

In the end, I gather that for practical purposes the process of face validation is this: if a group of people think the model is good, then it is good. In my opinion, “face validation” is often just an argument that allows people to explain away or simply dismiss the need for any rigorous analysis of the model. The pamphlet does note that “Face validation is a point of departure to determine courses of action for more comprehensive validation efforts.” How often have we seen the subsequent comprehensive validation effort? Very, very rarely. It appears that “face validation” is the end point.
Is this really part of the scientific method?

A Time for Crumpets

In 1985, Charles MacDonald published A Time for Trumpets, one of the better books on the Battle of the Bulge (and there are actually a lot of good works on this battle). In it he recounts a story of why the German Panzer Lehr Division, commanded by General Fritz Bayerlein, was held up for the better part of a day during the battle for Bastogne. To quote:

For all Bayerlein’s concern about that armored force, he himself was at the point of directing less than full attention to conduct of the battle. In a wood outside Mageret, his troops had found a platoon from an American field hospital, and among the staff, a “young, blonde, and beautiful” American nurse attracted Bayerlein’s attention. Through much of December 19, he “dallied” with the nurse, who “held him spellbound.” [page 295]

Apparently MacDonald’s book was not the only source of this story: http://theminiaturespage.com/boards/msg.mv?id=186079

Now, I don’t know if “dallied” means that they were having tea and crumpets or were involved in something more intimate. The story apparently comes from Bayerlein himself, so something probably happened, but exactly what is not known. He was relieved of command after the failed offensive.

Fritz Bayerlein, March 1944 (Source: Bundesarchiv, Bild 146-1978-033-02/Dinstueler/CC-BY-SA 3.0)

When we met with Charles MacDonald in 1989, I asked him about this story. He recounted that he had recently been at a U.S. veterans gathering, talking to some other people, when a lady came up to him and told him that she knew the nurse in the story. MacDonald said he would get back to her…but then could not locate her later. So an opportunity to confirm and get more details of the story was lost (to history). But it does sort of confirm that there is some basis to Bayerlein’s story.

Now, this discussion with MacDonald is from memory, but I believe the authors Jay Karamales, Richard Anderson, and possibly Curt Johnson were also at that dinner, and they may remember the conversation (differently?).

Anyhow, A Time for Strumpets Trumpets is a book worth reading.

Combat Adjudication

As I stated in a previous post, I am not aware of any major validation efforts done in the last 25 years other than what we have done. Still, there is one other effort that needs to be mentioned. It is described in a 2017 report: Using Combat Adjudication to Aid in Training for Campaign Planning.pdf

I gather this was work by J-7 of the Joint Staff to develop Joint Training Tools (JTT) using the Combat Adjudication Service (CAS) model. There are a few lines in the report that warm my heart:

  1. “It [JTT] is based on and expanded from Dupuy’s Quantified Judgement Method of Analysis (QJMA) and Tactical Deterministic Model.”
  2. “The CAS design used Dupuy’s data tables in whole or in part (e.g. terrain, weather, water obstacles, and advance rates).”
  3. “Non-combat power variables describing the combat environment and other situational information are listed in Table 1, and are a subset of variables (Dupuy, 1985).”
  4. “The authors would like to acknowledge COL Trevor N. Dupuy for getting Michael Robel interested in combat modeling in 1979.”

Now, there is a section labeled verification and validation. Let me quote from that:

CAS results have been “Face validated” against the following use cases:

    1. The 3:1 rule: the rule of thumb postulating that an attacking force must have at least three times the combat power of the defending force to be successful.
    2. 1st (US) Infantry Division versus 26th (IQ) Infantry Division during Desert Storm
    3. The Battle of 73 Easting: 2nd ACR versus elements of the Iraqi Republican Guards
    4. 3rd (US) Infantry Division’s first five days of combat during Operation Iraqi Freedom (OIF)

Each engagement is conducted with several different terrain and weather conditions, varying strength percentages and progresses from a ground only engagement to multi-service engagements to test the effect of CASP [Close Air Support] and interdiction on the ground campaign. Several shortcomings have been detected, but thus far ground and CASP match historical results. However, modeling of air interdiction could not be validated.

So, this is a face validation based upon a rule of thumb and three historical cases. Still, this is more than what I have seen anyone else do in the last 25 years.
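For what it is worth, this kind of face validation is easy to picture as a simple test harness: run the model against a handful of cases and check whether its adjudications match the known outcomes. The sketch below is my own illustration of the idea, not the actual CAS/JTT implementation; the model stand-in and all numbers are invented.

```python
# Illustrative face-validation harness, in the spirit of the use cases
# above. The model stand-in and all numbers are invented; this is not
# the actual CAS/JTT code.

def model_predicts_attacker_wins(attacker_power: float,
                                 defender_power: float) -> bool:
    """Stand-in for a combat model's win/lose adjudication, here just
    the crude 3:1 rule of thumb."""
    return attacker_power / defender_power >= 3.0

# Each case: (name, attacker power, defender power, recorded outcome).
cases = [
    ("attacker at 3:1", 300.0, 100.0, True),
    ("attacker at 2:1", 200.0, 100.0, False),
    ("attacker at 4:1", 400.0, 100.0, True),
]

for name, attacker, defender, historical_win in cases:
    predicted = model_predicts_attacker_wins(attacker, defender)
    verdict = "matches" if predicted == historical_win else "DIVERGES FROM"
    print(f"{name}: model {verdict} the recorded outcome")
```

A real harness would, of course, feed in the actual engagement data for Desert Storm, 73 Easting, and OIF and compare losses and advance rates, not just a win/lose call.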

Dupuy’s Verities: Surprise

The Death of Paulus Aemilius at the Battle of Cannae by John Trumbull (1773). [Wikimedia]

The tenth of Trevor Dupuy’s Timeless Verities of Combat is:

Surprise substantially enhances combat power.

From Understanding War (1987):

Achieving surprise in combat has always been important. It is perhaps more important today than ever. Quantitative analysis of historical combat shows that surprise has increased the combat power of military forces in those engagements in which it was achieved. Surprise has proven to be the greatest of all combat multipliers. It may be the most important of the Principles of War; it is at least as important as Mass and Maneuver.

I have already written quite a bit on Dupuy’s conceptualization of surprise, so I won’t go into it in detail here. These previous posts provide a summary:

The Combat Value of Surprise

Human Factors In Warfare: Surprise

Dupuy’s analysis focused on how surprise influenced combat power by enhancing the mobility and reducing the vulnerability of the side with surprise, and by increasing the vulnerability of the side that was surprised. In 2004, TDI undertook a study for the late Andy Marshall’s Office of the Secretary of Defense/Net Assessment to measure the historical combat value of situational awareness (more knowledge by one side than the other) and informational advantage (better knowledge by one side than the other) and how each of these factors related to surprise in combat. Chris Lawrence detailed this research and its conclusions in chapters 10 and 11 in his 2017 book, War by Numbers: Understanding Conventional Combat.

In general, the study found that both superior situational awareness and better information enhanced combat power, though perhaps not quite as much as inferred from the relevant literature. It also confirmed that surprise conferred an even greater combat power benefit, above and beyond that provided by battlefield awareness or informational advantages. It also suggested that the primary benefit of a situational or knowledge advantage in combat was not in achieving surprise over an enemy, but in preventing an opponent from achieving surprise itself.

These results, though quite suggestive, were tentative, and more research is necessary. However, no follow-on studies on this subject have been funded to date.
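As a closing illustration, here is a minimal sketch of what treating surprise as a combat multiplier means in a QJM-style calculation. The multiplier value is a placeholder I invented for illustration, not one of Dupuy’s published factors.

```python
# Minimal sketch of surprise as a combat power multiplier. The 1.5
# multiplier is an invented placeholder, not Dupuy's published value.

def combat_power(strength: float, surprise_multiplier: float = 1.0) -> float:
    """Force strength scaled by a surprise multiplier (1.0 = no surprise)."""
    return strength * surprise_multiplier

defender = combat_power(200.0)                             # stronger, but surprised
attacker_no_surprise = combat_power(100.0)                 # weaker, no surprise
attacker_surprise = combat_power(100.0, surprise_multiplier=1.5)

print(f"ratio without surprise: {attacker_no_surprise / defender:.2f}")
print(f"ratio with surprise:    {attacker_surprise / defender:.2f}")
```

The point of the construction is that surprise scales the surprising side’s entire combat power, so a nominally weaker force can substantially close the odds.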

NYMAS on 26 April

I will be presenting my book War by Numbers at the New York Military Affairs Symposium (NYMAS) on Friday, 26 April, at the Soldiers Sailors Club in New York City. On Friday the 12th, there is a presentation of “The Grosse Importance of Kleine Krieg: Logistics, Operations, and ‘Little Wars’ in the late 17th Century Low Countries” by John Stapleton of the U.S. Military Academy. The announcement is here: http://www.nymas.org/