
Forecasting the Iraqi Insurgency

[This piece was originally posted on 27 June 2016.]

Previous posts have detailed casualty estimates by Trevor Dupuy or The Dupuy Institute (TDI) for the 1990-91 Gulf War and the 1995 intervention in Bosnia. Today I will detail TDI’s 2004 forecast for U.S. casualties in the Iraqi insurgency that began in 2003.

In April 2004, as simultaneous Sunni and Shi’a uprisings dramatically expanded the nascent insurgency in Iraq, the U.S. Army Center for Army Analysis (CAA) accepted an unsolicited proposal from TDI President and Executive Director Christopher Lawrence to estimate likely American casualties in the conflict. A four-month contract was finalized in August.

The methodology TDI adopted for the estimate was a comparative case study analysis based on a major data collection effort on insurgencies. Twenty-eight cases were selected for analysis based on five criteria:

  1. The conflict had to be post-World War II to facilitate data collection;
  2. It had to have lasted more than a year (as was already the case in Iraq);
  3. It had to be a developed nation intervening in a developing nation;
  4. The intervening nation had to have provided military forces to support or establish an indigenous government; and
  5. There had to be an indigenous guerilla movement (although it could have received outside help).

Extensive data was collected from these 28 cases, including the following ten factors used in the estimate:

  • Country Area
  • Orderliness
  • Population
  • Intervening force size
  • Border Length
  • Insurgency force size
  • Outside support
  • Casualty rate
  • Political concept
  • Force ratios

Initial analysis compared this data to insurgency outcomes, which revealed some startlingly clear patterns suggesting cause and effect relationships. From this analysis, TDI drew the following conclusions:

  • It is difficult to control large countries.
  • It is difficult to control large populations.
  • It is difficult to control an extended land border.
  • Limited outside support does not doom an insurgency.
  • “Disorderly” insurgencies are very intractable and are often successful.
  • Insurgencies with large intervening third-party counterinsurgent forces (above 95,000) often succeed.
  • Higher combat intensities do not doom an insurgency.

In all, TDI assessed that the Iraqi insurgency fell into the worst category in nine of the ten factors analyzed. The outcome would hinge on one fundamental question: was the U.S. facing a regional, factional insurgency in Iraq or a widespread anti-intervention insurgency? Based on the data, if the insurgency was factional or regional, it would fail. If it became a nationalist revolt against a foreign power, it would succeed.

Based on the data and its analytical conclusions, TDI provided CAA with an initial estimate in December 2004, and a final version in January 2005:

  • Insurgent force strength is probably between 20,000–60,000.
  • This is a major insurgency.
    • It is of medium intensity.
  • It is a regional or factionalized insurgency and must remain that way.
  • U.S. commitment can be expected to be relatively steady throughout this insurgency and will not be quickly replaced by indigenous forces.
  • It will last around 10 or so years.
  • It may cost the U.S. 5,000 to 10,000 killed.
    • It may be higher.
    • This assumes no major new problems in the Shiite majority areas.

When TDI made its estimate in December 2004, the conflict had already lasted 21 months, and U.S. casualties were 1,335 killed, 1,038 of them in combat.

When U.S. forces withdrew from Iraq in December 2011, the war had gone on for 105 months (8.7 years), and U.S. casualties had risen to 4,485 fatalities—3,436 in combat. The United Kingdom lost 180 troops killed and Coalition allies lost 139. There were at least 468 contractor deaths from a mix of nationalities. The Iraqi Army and police suffered at least 10,125 deaths. Total counterinsurgent fatalities numbered at least 15,397.

As of this date, the conflict in Iraq that began in 2003 remains ongoing.

NOTES

Christopher A. Lawrence, America’s Modern Wars: Understanding Iraq, Afghanistan and Vietnam (Philadelphia, PA: Casemate, 2015) pp. 11-31; Appendix I.

Comparing Force Ratios to Casualty Exchange Ratios

“American Marines in Belleau Wood (1918)” by Georges Scott [Wikipedia]

Comparing Force Ratios to Casualty Exchange Ratios
Christopher A. Lawrence

[The article below is reprinted from the Summer 2009 edition of The International TNDM Newsletter.]

There are three versions of the rule relating force ratios to casualty exchange ratios, such as the three-to-one rule (3-to-1 rule) as it applies to casualties. The earliest version of the rule as it relates to casualties that we have been able to find comes from the 1958 version of the U.S. Army Maneuver Control manual, which states: “When opposing forces are in contact, casualties are assessed in inverse ratio to combat power. For friendly forces advancing with a combat power superiority of 5 to 1, losses to friendly forces will be about 1/5 of those suffered by the opposing force.”[1]

The RAND version of the rule (1992) states: “the famous ‘3:1 rule,’ according to which the attacker and defender suffer equal fractional loss rates at a 3:1 force ratio [if] the battle is in mixed terrain and the defender enjoys ‘prepared’ defenses…”[2]

Finally, there is a version of the rule, dating from the 1967 Maneuver Control manual, that applies only to armor and shows:

As the RAND construct also applies to equipment losses, this formulation is directly comparable to it.

Therefore, we have three basic versions of the 3-to-1 rule as it applies to casualties and/or equipment losses. First, there is a rule that states that there is an even fractional loss ratio at 3-to-1 (the RAND version). Second, there is a rule that states that at 3-to-1, the attacker will suffer one-third the losses of the defender. And third, there is a rule that states that at 3-to-1, the attacker and defender will suffer the same losses. These versions are highly contradictory, with the attacker suffering either three times the losses of the defender, the same losses as the defender, or one-third the losses of the defender.
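To keep the three competing versions straight, the minimal sketch below (Python, purely illustrative) expresses each as the attacker-to-defender casualty exchange ratio it predicts; the 1967 armor rule is entered only as the discrete points quoted later in this article, since it is presented there as a table rather than a formula.

```python
# Illustration only: the three 3-to-1 rule variants described above, expressed
# as the attacker-to-defender casualty exchange ratio each one predicts.

def rand_1992(force_ratio):
    # Equal fractional loss rates: the attacker's absolute losses scale with
    # its numerical advantage, so at 3-to-1 it loses ~3 times what the defender loses.
    return force_ratio

def fm_105_5_1958(force_ratio):
    # Casualties assessed in inverse ratio to combat power: at 3-to-1 the
    # attacker loses ~1/3 of what the defender loses.
    return 1.0 / force_ratio

# FM 105-5 (1967), armor only: given as discrete points, not a formula.
fm_105_5_1967_armor = {3: 1.0, 4: 0.5, 5: 1.0 / 3.0}

for fr in (3, 4, 5):
    print(fr, rand_1992(fr), round(fm_105_5_1958(fr), 2), round(fm_105_5_1967_armor[fr], 2))
```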

Therefore, what we will examine here is the relationship between force ratios and exchange ratios. In this case, we will first look at The Dupuy Institute’s Battles Database (BaDB), which covers 243 battles from 1600 to 1900. We will chart on the y-axis the force ratio, measured by a count of the number of people in the forces each side deployed for battle. The force ratio is the number of attackers divided by the number of defenders. On the x-axis is the exchange ratio, measured by a count of the number of people on each side who were killed, wounded, missing, or captured during that battle; it does not include disease and non-battle injuries. Again, it is calculated by dividing the total attacker casualties by the total defender casualties. The results are provided below:

As can be seen, there are a few extreme outliers among these 243 data points. The most extreme, the Battle of Tippermuir (1 Sep 1644), in which a Royalist force under Montrose routed an attack by Scottish Covenanter militia, causing about 3,000 casualties to the Scots in exchange for a single (allegedly self-inflicted) casualty to the Royalists, was removed from the chart. This 3,000-to-1 loss ratio was deemed too great an outlier to be of value in the analysis.
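For readers who want to reproduce this kind of scattergram from similar data, here is a minimal sketch; the file and column names are hypothetical, not the actual BaDB schema.

```python
# Sketch of the force ratio vs. casualty exchange ratio scattergram described
# above. Column names are hypothetical; the actual BaDB export may differ.
import pandas as pd
import matplotlib.pyplot as plt

battles = pd.read_csv("badb_battles.csv")  # hypothetical file of the 243 battles

# Force ratio: attacker strength divided by defender strength.
battles["force_ratio"] = battles["attacker_strength"] / battles["defender_strength"]
# Exchange ratio: attacker casualties divided by defender casualties.
battles["exchange_ratio"] = battles["attacker_casualties"] / battles["defender_casualties"]

# To look inside the clump of points, filter to ratios at or below 20-to-1
# (and later 6-to-1), as the article does.
plt.scatter(battles["exchange_ratio"], battles["force_ratio"], s=12)
plt.xlabel("Casualty exchange ratio (attacker/defender)")
plt.ylabel("Force ratio (attacker/defender)")
plt.title("BaDB 1600-1900: force ratio vs. casualty exchange ratio")
plt.show()
```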

As it is, the vast majority of cases are clumped down into the corner of the graph with only a few scattered data points outside of that clumping. If one did try to establish some form of curvilinear relationship, one would end up drawing a hyperbola. It is worthwhile to look inside that clump of data to see what it shows. Therefore, we will look at the graph truncated so as to show only force ratios at or below 20-to-1 and exchange ratios at or below 20-to-1.

Again, the data remains clustered in one corner with the outlying data points again pointing to a hyperbola as the only real fitting curvilinear relationship. Let’s look a little deeper into the data by truncating the data at 6-to-1 for both force ratios and exchange ratios. As can be seen, if the RAND version of the 3-to-1 rule is correct, then the data should show at a 3-to-1 force ratio a 3-to-1 casualty exchange ratio. There is only one data point that comes close to this out of the 243 points we examined.

If the FM 105-5 version of the rule as it applies to armor is correct, then the data should show that at a 3-to-1 force ratio there is a 1-to-1 casualty exchange ratio, at a 4-to-1 force ratio a 1-to-2 casualty exchange ratio, and at a 5-to-1 force ratio a 1-to-3 casualty exchange ratio. Of course, there is no armor in these pre-WW I engagements, but again, no such exchange pattern appears.

If the 1958 version of the FM 105-5 rule as it applies to casualties is correct, then the data should show that at a 3-to-1 force ratio there is a 0.33-to-1 casualty exchange ratio, at a 4-to-1 force ratio a 0.25-to-1 casualty exchange ratio, and at a 5-to-1 force ratio a 0.20-to-1 casualty exchange ratio. As can be seen, there is not much indication of this pattern, or for that matter of any of the three patterns.

Still, such a construct may not be relevant to data before 1900. For example, Lanchester claimed in 1914, in Chapter V, “The Principle of Concentration,” of his book Aircraft in Warfare, that there is greater advantage to be gained in modern warfare from concentration of fire.[3] Therefore, we will tap our more modern Division-Level Engagement Database (DLEDB) of 675 engagements, of which 628 have force ratios and exchange ratios calculated for them. These 628 cases are then placed on a scattergram to see if we can detect any similar patterns.

Even though this data covers from 1904 to 1991, with the vast majority of the data coming from engagements after 1940, one again sees the same pattern as with the data from 1600-1900. If there is a curvilinear relationship, it is again a hyperbola. As before, it is useful to look into the mass of data clustered into the corner by truncating the force and exchange ratios at 20-to-1. This produces the following:

Again, one sees the data clustered in the corner, with any curvilinear relationship again being a hyperbola. A look at the data further truncated to a 10-to-1 force or exchange ratio does not yield anything more revealing.

And, if this data is truncated to show only 5-to-1 force ratio and exchange ratios, one again sees:

Again, this data appears to be mostly just noise, with no clear patterns here that support any of the three constructs. In the case of the RAND version of the 3-to-1 rule, there is again only one data point (out of 628) that is anywhere close to the crossover point (even fractional exchange rate) that RAND postulates. In fact, it almost looks like the data conspires to make sure it leaves a noticeable “hole” at that point. The other postulated versions of the 3-to-1 rules are also given no support in these charts.

Also of note is that the relationship between force ratios and exchange ratios does not appear to change significantly for combat during 1600-1900 when compared to the data from combat in 1904-1991. This does not provide much support for the intellectual construct developed by Lanchester to argue for his N-square law.
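For reference, and not derived from the data above, the construct in question is Lanchester’s “square law” for aimed fire, in its usual textbook form:

$$\frac{dA}{dt} = -\beta B, \qquad \frac{dB}{dt} = -\alpha A \quad\Longrightarrow\quad \alpha\left(A_0^2 - A^2\right) = \beta\left(B_0^2 - B^2\right)$$

where A and B are the opposing force strengths and α and β their per-capita attrition rates. Because fighting value grows with the square of numbers under these equations, Lanchester argued that concentration pays a disproportionate dividend in modern warfare; it is precisely this scaling that the charts above do not appear to support.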

While we can attempt to torture the data to find a better fit, or try to argue that the patterns are obscured by various factors that have not been considered, we do not believe that such a clear pattern and relationship exists. More advanced mathematical methods may show such a pattern, but to date such attempts have not ferreted out these alleged patterns. For example, we refer the reader to Janice Fain’s article on Lanchester equations, The Dupuy Institute’s Capture Rate Study, Phase I & II, or any number of other studies that have looked at Lanchester.[4]

The fundamental problem is that there does not appear to be a direct cause-and-effect relationship between force ratios and exchange ratios. It appears to be an indirect relationship in the sense that force ratios are one of several independent variables that determine the outcome of an engagement, and the nature of that outcome helps determine the casualties. As such, there is a more complex set of interrelationships that has not yet been fully explored in any study that we know of, although it is briefly addressed in our Capture Rate Study, Phase I & II.

NOTES

[1] FM 105-5, Maneuver Control (1958), 80.

[2] Patrick Allen, “Situational Force Scoring: Accounting for Combined Arms Effects in Aggregate Combat Models,” (N-3423-NA, The RAND Corporation, Santa Monica, CA, 1992), 20.

[3] F. W. Lanchester, Aircraft in Warfare: The Dawn of the Fourth Arm (Lanchester Press Incorporated, Sunnyvale, Calif., 1995), 46-60. One notes that Lanchester provided no data to support these claims, but relied upon an intellectual argument based upon a gross misunderstanding of ancient warfare.

[4] In particular, see page 73 of Janice B. Fain, “The Lanchester Equations and Historical Warfare: An Analysis of Sixty World War II Land Engagements,” Combat Data Subscription Service (HERO, Arlington, Va., Spring 1975).

Questioning The Validity Of The 3-1 Rule Of Combat

Canadian soldiers going “over the top” during an assault in the First World War. [History.com]
[This post was originally published on 1 December 2017.]

How many troops are needed to successfully attack or defend on the battlefield? There is a long-standing rule of thumb that holds that an attacker requires a 3-1 preponderance over a defender in combat in order to win. The aphorism is so widely accepted that few have questioned whether it is actually true or not.

Trevor Dupuy challenged the validity of the 3-1 rule on empirical grounds. He could find no historical substantiation to support it. In fact, his research on the question of force ratios suggested that there was a limit to the value of numerical preponderance on the battlefield.

TDI President Chris Lawrence has also challenged the 3-1 rule in his own work on the subject.

The validity of the 3-1 rule is no mere academic question. It underpins a great deal of U.S. military policy and warfighting doctrine. Yet, the only time the matter was seriously debated was in the 1980s with reference to the problem of defending Western Europe against the threat of Soviet military invasion.

It is probably long past due to seriously challenge the validity and usefulness of the 3-1 rule again.

Are There Only Three Ways of Assessing Military Power?

[This article was originally posted on 11 October 2016]

In 2004, military analyst and academic Stephen Biddle published Military Power: Explaining Victory and Defeat in Modern Battle, a book that addressed the fundamental question of what causes victory and defeat in battle. Biddle took to task the study of the conduct of war, which he asserted was based on “a weak foundation” of empirical knowledge. He surveyed the existing literature on the topic and determined that the plethora of theories of military success or failure fell into one of three analytical categories: numerical preponderance, technological superiority, or force employment.

Numerical preponderance theories explain victory or defeat in terms of material advantage, with the winners possessing greater numbers of troops, populations, economic production, or financial expenditures. Many of these involve gross comparisons of numbers, but some of the more sophisticated analyses involve calculations of force density, force-to-space ratios, or measurements of quality-adjusted “combat power.” Notions of threshold “rules of thumb,” such as the 3-1 rule, arise from this. These sorts of measurements form the basis for many theories of power in the study of international relations.

The next most influential means of assessment, according to Biddle, involve views on the primacy of technology. One school, systemic technology theory, looks at how technological advances shift balances within the international system. The best example of this is how the introduction of machine guns in the late 19th century shifted the advantage in combat to the defender, and the development of the tank in the early 20th century shifted it back to the attacker. Such measures are influential in international relations and political science scholarship.

The other school of technological determinacy is dyadic technology theory, which looks at relative advantages between states regardless of posture. This usually involves detailed comparisons of specific weapons systems, tanks, aircraft, infantry weapons, ships, missiles, etc., with the edge going to the more sophisticated and capable technology. The use of Lanchester theory in operations research and combat modeling is rooted in this thinking.

Biddle identified the third category of assessment as subjective assessments of force employment based on non-material factors including tactics, doctrine, skill, experience, morale or leadership. Analyses on these lines are the stock-in-trade of military staff work, military historians, and strategic studies scholars. However, international relations theorists largely ignore force employment and operations research combat modelers tend to treat it as a constant or omit it because they believe its effects cannot be measured.

The common weakness of all of these approaches, Biddle argued, is that “there are differing views, each intuitively plausible but none of which can be considered empirically proven.” For example, no one has yet been able to find empirical support substantiating the validity of the 3-1 rule or Lanchester theory. Biddle notes that the track record for predictions based on force employment analyses has also been “poor.” (To be fair, the problem of testing theory to see if it applies to the real world is not limited to assessments of military power; it afflicts security and strategic studies generally.)

So, is Biddle correct? Are there only three ways to assess military outcomes? Are they valid? Can we do better?

Response To “CEV Calculations in Italy, 1943”

German infantry defending against the allied landing at Anzio pass a damaged “Elefant” tank destroyer, March 1944. [Wikimedia/Bundesarchiv]

[The article below is reprinted from the August 1997 edition of The International TNDM Newsletter. It was written in response to an article by Mr. Zetterling originally published in the June 1997 edition of The International TNDM Newsletter.]

Response to Niklas Zetterling’s Article
by Christopher A. Lawrence

Mr. Zetterling is currently a professor at the Swedish War College and previously worked at the Swedish National Defense Research Establishment. As I have been having an ongoing dialogue with Prof. Zetterling on the Battle of Kursk, I have had the opportunity to witness his approach to researching historical data and the depth of his research. I would recommend that all of our readers take a look at his recent article in the Journal of Slavic Military Studies entitled “Loss Rates on the Eastern Front during World War II.” Mr. Zetterling does his German research directly from the Captured German Military Records by purchasing the rolls of microfilm from the US National Archives. He is using the same German data sources that we are. Let me attempt to address his comments section by section:

The Database on Italy 1943-44:

Unfortunately, the Italian combat data was one of the early HERO research projects, with the results first published in 1971. I do not know who worked on it nor the specifics of how it was done. There are references to the Captured German Records, but significantly, they only reference division files for these battles. While I have not had the time to review Prof. Zetterling’s review of the original research, I do know that some of our researchers have complained about parts of the Italian data. From what I’ve seen, it looks like the original HERO researchers didn’t look into the Corps and Army files, and simply assumed the attached Corps artillery strengths. Sloppy research is embarrassing, although it does occur, especially when working under severe financial constraints (for example, our Battalion-level Operations Database). If the research is sloppy or hurried, or done from secondary sources, then hopefully the errors are random, will effectively counterbalance each other, and will not change the results of the analysis. If the errors are all in one direction, then this will produce a biased result.

I have no basis to believe that Prof. Zetterling’s criticism is wrong, and have many reasons to believe that it is correct. Until I can take the time to go through the Corps and Army files, I intend to operate under the assumption that Prof. Zetterling’s corrections are good. At some point I will need to go back through the Italian Campaign data, correct it, and update the Land Warfare Database. I did compare Prof. Zetterling’s list of battles with what was declared to be the forces involved in the battle (according to the Combat Data Subscription Service), and they show the following attached artillery:

It is clear that the battles were based on the assumption that there was Corps-level German artillery present. A strength comparison between the two sides is displayed in the chart on the next page.

The Result Formula:

CEV is calculated from three factors. Therefore a consistent 20% error in casualties will result in something less than a 20% error in CEV. The mission effectiveness factor is indeed very “fuzzy,” and there is simply no systematic method or guidance in its application. Sometimes it is not based upon the assigned mission of the unit, but upon its perceived mission as interpreted by the analyst. But, while I have the same problems with the mission accomplishment scores as Mr. Zetterling, I do not have a good replacement. Considering the nature of warfare, I would hate to create CEVs without it. Of course, Trevor Dupuy was experimenting with creating CEVs just from casualty effectiveness, and by averaging his two CEV scores (CEVt and CEVI) he heavily weighted the CEV calculation for the TNDM towards measuring primarily casualty effectiveness (see the article in issue 5 of the Newsletter, “Numerical Adjustment of CEV Results: Averages and Means”). At this point, I would like to produce a new, single formula for CEV to replace the current two and their averaging methodology. I am open to suggestions for this.
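To illustrate why a consistent 20% casualty error dilutes to a smaller error in the composite score, here is a toy calculation; the equal-weight averaging of the three factors is an assumption made only for this sketch, not Dupuy’s actual result formula.

```python
# Illustration only: NOT Dupuy's actual CEV formula. Assume, purely for the
# sake of argument, that the three result factors (mission accomplishment,
# spatial effectiveness, casualty effectiveness) are averaged with equal weight.

def composite_score(mission, spatial, casualty):
    return (mission + spatial + casualty) / 3.0

baseline = composite_score(1.0, 1.0, 1.0)
perturbed = composite_score(1.0, 1.0, 1.2)  # a consistent 20% error in the casualty term

print((perturbed - baseline) / baseline)  # ~0.067, i.e. well under 20%
```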

Supply Situation:

The different ammunition usage rate of the German and US Armies is one of the reasons why adding a logistics module is high on my list of model corrections. This was discussed in Issue 2 of the Newsletter, “Developing a Logistics Model for the TNDM.” As Mr. Zetterling points out, “It is unlikely that an increase in artillery ammunition expenditure will result in a proportional increase in combat power. Rather it is more likely that there is some kind of diminished return with increased expenditure.” This parallels what I expressed in point 12 of that article: “It is suspected that this increase [in OLIs] will not be linear.”

The CEV does include “logistics.” So in effect, if one had a good logistics module, the difference in logistics would be accounted for, and the Germans (after logistics is taken into account) may indeed have a higher CEV.

General Problems with Non-Divisional Units and Tooth-to-Tail Ratio

Point taken. The engagements used to test the TNDM have been gathered over a period of more than 25 years, by different researchers and under different management. What is counted, when, and where does change from one group of engagements to the next. While I do think this has not had a significant effect on the model outcomes, it is “sloppy” and needs to be addressed.

The Effects of Defensive Posture

This is a very good point. If the budget was available, my first step in “redesigning” the TNDM would be to try to measure the effects of terrain on combat through the use of a large LWDB-type database and regression analysis. I have always felt that with enough engagements, one could produce reliable values for these figures based upon something other than judgement. Prof. Zetterling’s proposed methodology is also a good approach, easier to do, and more likely to get a conclusive result. I intend to add this to my list of model improvements.

Conclusions

There is one other problem with the Italian data that Prof. Zetterling did not address: the Germans and the Allies had different reporting systems for casualties. Quite simply, the Germans did not report as casualties those people who were lightly wounded, treated, and returned to duty from the divisional aid station. The United States and England did. This shows up when one compares the wounded-to-killed ratios of the various armies, with the Germans usually having in the range of 3 to 4 wounded for every one killed, while the Allies tend to have 4 to 5 wounded for every one killed. Basically, when comparing the two reports, the Germans “undercount” their casualties by around 17 to 20%. Therefore, one probably needs to use a multiplier of 20 to 25% to match the two casualty systems. This was not taken into account in any of the work HERO did.
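As a rough check of the arithmetic, the undercount and correction multiplier implied by the wounded-to-killed ratios cited above can be computed directly; the midpoint figures below are assumptions used only for illustration.

```python
# Rough check of the reporting-gap arithmetic described above. The 3.5 and 4.5
# wounded-per-killed figures are simply the midpoints of the ranges cited.
german_wia_per_kia = 3.5   # Germans: ~3-4 wounded per killed (lightly wounded excluded)
allied_wia_per_kia = 4.5   # Allies: ~4-5 wounded per killed (lightly wounded included)

german_reported_per_kia = 1 + german_wia_per_kia   # 4.5 reported casualties per KIA
allied_reported_per_kia = 1 + allied_wia_per_kia   # 5.5 reported casualties per KIA

undercount = 1 - german_reported_per_kia / allied_reported_per_kia  # about 0.18
multiplier = allied_reported_per_kia / german_reported_per_kia      # about 1.22

print(f"undercount = {undercount:.0%}, correction multiplier = {multiplier:.2f}")
```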

Because Trevor Dupuy used three factors for measuring his CEV, this error certainly resulted in a slightly higher CEV for the Germans than should have been the case, but not a 20% increase. As Prof. Zetterling points out, the correction of the count of artillery pieces should result in a higher CEV than Col. Dupuy calculated. Finally, if Col. Dupuy overrated the value of defensive terrain, then this may result in the German CEV being slightly lower.

As you may have noted in my list of improvements (Issue 2, “Planned Improvements to the TNDM”), I did list “revalidating” the model to the QJM Database. [NOTE: a summary of the QJM/TNDM validation efforts can be found here.] As part of that revalidation process, we would need to review the data used in the validation database first, account for the casualty differences in the reporting systems, and determine if the model indeed overrates the effect of terrain on defense.

CEV Calculations in Italy, 1943

Tip of the Avalanche by Keith Rocco. Soldiers from the U.S. 36th Infantry Division landing at Salerno, Italy, September 1943.

[The article below is reprinted from the June 1997 edition of The International TNDM Newsletter. Chris Lawrence’s response from the August 1997 edition of The International TNDM Newsletter will be posted on Friday.]

CEV Calculations in Italy, 1943
by Niklas Zetterling

Perhaps one of the most debated results of the TNDM (and its predecessors) is the conclusion that the German ground forces on average enjoyed a measurable qualitative superiority over their US and British opponents. This was largely the result of calculations on situations in Italy in 1943-44, even though further engagements have been added since the results were first presented. The calculated German superiority over the Red Army, despite the much smaller number of engagements, has not aroused as much opposition. Similarly, the calculated Israeli effectiveness superiority over its enemies seems to have surprised few.

However, there are objections to the calculations on the engagements in Italy 1943. These concern primarily the database, but there are also some questions to be raised against the way some of the calculations have been made, which may possibly have consequences for the TNDM.

Here it is suggested that the German CEV [combat effectiveness value] superiority was higher than originally calculated. There are a number of flaws in the original calculations, each of which will be discussed separately below. With the exception of one issue, all of them, if corrected, tend to give a higher German CEV.

The Database on Italy 1943-44

According to the database, the German divisions had considerable fire support from GHQ artillery units. This is the only possible conclusion from the fact that several pieces of the types 15cm gun, 17cm gun, 21cm gun, and 15cm and 21cm Nebelwerfer are included in the data for individual engagements. These types of guns were almost exclusively confined to GHQ units. An example from the database is the three engagements Port of Salerno, Amphitheater, and Sele-Calore Corridor. These take place simultaneously (9-11 September 1943) with the German 16th Pz Div on the Axis side in all of them (no other division is included in the battles). Judging from the manpower figures, it seems to have been assumed that the division participated with one quarter of its strength in each of the two former battles and half its strength in the latter. According to the database, the numbers of guns were:

  • 15cm gun: 28
  • 17cm gun: 12
  • 21cm gun: 12
  • 15cm NbW: 27
  • 21cm NbW: 21

This would indicate that the 16th Pz Div was supported by the equivalent of more than five non-divisional artillery battalions. For the German army this is a suspiciously high number; usually there was something more like one GHQ artillery battalion for each division, or even less. Research in the German Military Archives confirmed that the number of GHQ artillery units was far less than indicated in the HERO database. Among the useful documents found was a map showing the dispositions of 10th Army artillery units. This showed clearly that there was only one non-divisional artillery unit south of Rome at the time of the Salerno landings, the III/71 Nebelwerfer Battalion. The 557th Artillery Battalion (17cm gun) was also present, but it was included in the artillery regiment (33rd Artillery Regiment) of the 15th Panzergrenadier Division during the second half of 1943. Thus the number of German artillery pieces in these engagements is exaggerated to an extent that cannot be considered insignificant. Since OLI values for artillery usually constitute a significant share of the total OLI of a force in the TNDM, errors in artillery strength cannot be dismissed easily.

While the example above is but one, further archival research has shown that the same kind of error occurs in all the engagements in September and October 1943. It has not been possible to check the engagements later in 1943, but a pattern can be recognized. The ratio between the numbers of various types of GHQ artillery pieces does not change much from battle to battle. It seems that when the database was developed, the researchers worked with the assumption that the German corps and army organizations had organic artillery, and this assumption may have been used as a “rule of thumb.” This is wrong, however; only artillery staffs and command and control units were included in the corps and army organizations, not firing units. Consequently we have a systematic error, which cannot be corrected without changing the contents of the database. It is worth emphasizing that we are discussing an exaggeration of German artillery strength of about 100%, which certainly is significant. Comparing the available archival records with the database also reveals errors in numbers of tanks and antitank guns, but these are much smaller than the errors in artillery strength. Again, these errors always inflate the German strength in those engagements I have been able to check against archival records. These errors tend to inflate German numerical strength, which of course affects CEV calculations. But there are further objections to the CEV calculations.

The Result Formula

The “result formula” weighs together three factors: casualties inflicted, distance advanced, and mission accomplishment. It seems that the first two do not raise many objections, even though the relative weight of them may always be subject to argumentation.

The third factor, mission accomplishment, is more dubious, however. At first glance it may seem natural to include such a factor. After all, a combat unit is supposed to accomplish the missions given to it. However, whether a unit accomplishes its mission or not depends both on its own qualities and on the realism of the mission assigned. Thus the mission accomplishment factor may reflect the qualities of the combat unit as well as those of the higher HQs and the general strategic situation. The Rapido crossing by the U.S. 36th Infantry Division can serve as an example. The division did not accomplish its mission, but whether the mission was realistic, given the circumstances, is dubious. Similarly, many German units probably received unrealistic missions in many situations, particularly during the last two years of the war (when most of the engagements in the database were fought). A more extreme example of situations in which unrealistic missions were given is the battle in Belorussia, June-July 1944, where German units were regularly given impossible missions. Possibly it is a general trend that the side which is fighting at a strategic disadvantage is more prone to give its combat units unrealistic missions.

On the other hand, it is quite clear that the mission assigned may well affect both the casualty rates and advance rates. If, for example, the defender has a withdrawal mission, the advance may be greater than if the mission was to defend resolutely. This, however, need not necessarily be handled by including a mission factor in a result formula.

I have made some tentative runs with the TNDM, testing various CEV values to see which value produced an outcome in terms of casualties and ground gained as near as possible to the historical result. The results of these runs are very preliminary, but the tendency is that higher German CEVs produce outcomes closer to the historical results, particularly concerning combat.

Supply Situation

According to scattered information available in published literature, the U.S. artillery fired more shells per day per gun than did German artillery. In Normandy, US 155mm M1 howitzers fired 28.4 rounds per day during July, while August showed slightly lower consumption, 18 rounds per day. For the 105mm M2 howitzer the corresponding figures were 40.8 and 27.4. This can be compared to a German OKH study which, based on the experiences in Russia 1941-43, suggested that consumption of 105mm howitzer ammunition was about 13-22 rounds per gun per day, depending on the strength of the opposition encountered. For the 150mm howitzer the figures were 12-15.

While these figures should not be taken too seriously, as they are not from primary sources and they do also reflect the conditions in different theaters, they do at least indicate that it cannot be taken for granted that ammunition expenditure is proportional to the number of gun barrels. In fact there also exist further indications that Allied ammunition expenditure was greater than the German. Several German reports from Normandy indicate that they were astonished by the Allied ammunition expenditure.

It is unlikely that an increase in artillery ammunition expenditure will result in a proportional increase in combat power. Rather it is more likely that there is some kind of diminished return with increased expenditure.

General Problems with Non-Divisional Units

A division usually (but not necessarily) includes various support services, such as maintenance, supply, and medical services. Non-divisional combat units have to a greater extent to rely on corps and army for such support. This makes it complicated to include such units, since when entering, for example, the manpower strength and truck strength in the TNDM, it is difficult to assess their contribution to the overall numbers.

Furthermore, the amount of such forces is not equal on the German and Allied sides. In general the Allied divisional slice was far greater than the German. In Normandy the US forces on 25 July 1944 had 812,000 men on the Continent, while the number of divisions was 18 (including the 5th Armored, which was in the process of landing on the 25th). This gives a divisional slice of 45,000 men. By comparison the German 7th Army mustered 16 divisions and 231,000 men on 1 June 1944, giving a slice of 14,437 men per division. The main explanation for the difference is the non-divisional combat units and the logistical organization to support them. In general, non-divisional combat units are composed of powerful, but supply-consuming, types like armor, artillery, antitank and antiaircraft. Thus their contribution to combat power and strain on the logistical apparatus is considerable. However I do not believe that the supporting units’ manpower and vehicles have been included in TNDM calculations.

There are however further problems with non-divisional units. While the whereabouts of tank and tank destroyer units can usually be established with sufficient certainty, artillery can be much harder to pin down to a specific division engagement. This is of course a greater problem when the geographical extent of a battle is small.

Tooth-to-Tail Ratio

The lack of support units in non-divisional combat units was discussed above. One effect of this is to create a force with more OLI per man. This is the result of the unit’s “tail” belonging to some other part of the military organization.

In the TNDM there is a mobility formula, which tends to favor units with many weapons and vehicles compared to the number of men. This became apparent when I was performing a great number of TNDM runs on engagements between Swedish brigades and Soviet regiments. The Soviet regiments usually contained rather few men, but still had many AFVs, artillery tubes, AT weapons, etc. The mobility formula in the TNDM favors such units. However, I do not think this reflects any phenomenon in the real world. The Soviet penchant for lean combat units, with supply, maintenance, and other services provided by higher echelons, is not a more effective solution in general, but perhaps better suited to the particular constraints they were experiencing when forming units, training men, etc. In effect these services existed in the Soviet army too, but formally they were not part of the combat units.

This problem is to some extent reminiscent of how density is calculated (a problem discussed by Chris Lawrence in a recent issue of the Newsletter). It is comparatively easy to define the frontal limit of the deployment area of a force, and it is relatively easy to define the lateral limits too. It is, however, much more difficult to say where the rear limit of a force is located.

When entering forces in the TNDM a rear limit is, perhaps unintentionally, drawn. But if the combat unit includes support units, the rear limit is pushed farther back compared to a force whose combat units are well separated from support units.

To what extent this affects the CEV calculations is unclear. Using the original database values, the German forces are perhaps given too high combat strength when the great number of GHQ artillery units is included. On the other hand, if the GHQ artillery units are not included, the opposite may be true.

The Effects of Defensive Posture

The posture factors are difficult to analyze, since they alone do not portray the advantages of defensive position. Such effects are also included in terrain factors.

It seems that the numerical values for these factors were assigned on the basis of professional judgement. However, when the QJM was developed, it seems that the developers did not assume a German CEV superiority. Rather, the German CEV superiority seems to have been discovered later. It is possible that the professional judgement was about as wrong on the issue of posture effects as it was on CEV. Since the British and American forces were predominantly on the offensive, while the Germans mainly defended themselves, a German CEV superiority may, at least partly, be hidden in too high effects for defensive posture.

When using corrected input data on the 20 situations in Italy, September-October 1943, there is a tendency for the German CEV to be higher when they attack. Such a tendency is also discernible in the engagements presented in Hitler’s Last Gamble, Appendix H, even though the number of engagements in the latter case is very small.

As it stands now this is not really more than a hypothesis, since it will take an analysis of a greater number of engagements to confirm it. However, if such an analysis is done, it must be done using several sets of data. German and Allied attacks must be analyzed separately, and preferably the data would be separated further into sets for each relevant terrain type. Since the effects of the defensive posture are intertwined with terrain factors, it is very much possible that the factors may be correct for certain terrain types, while they are wrong for others. It may also be that the factors can be different for various opponents (due to differences in training, doctrine, etc.). It is also possible that the factors are different if the forces are predominantly composed of armor units or mainly of infantry.

One further problem with the effects of defensive position is that it is probably strongly affected by the density of forces. It is likely that the main effect of the density of forces is the inability to use effectively all the forces involved. Thus it may be that this factor will not influence the outcome except when the density is comparatively high. However, what can be regarded as “high” is probably much dependent on terrain, road net quality, and the cross-country mobility of the forces.

Conclusions

While the TNDM has been criticized here, it is also fitting to praise the model. The very fact that it can be criticized in this way is a testimony to its openness. In a sense a model is also a theory, and to use Popperian terminology, the TNDM is also very testable.

It should also be emphasized that the greatest errors are probably those in the database. As previously stated, I can only conclude safely that the data on the engagements in Italy in 1943 are wrong; later engagements have not yet been checked against archival documents. Overall the errors do not represent a dramatic change in the CEV values. Rather, the Germans seem to have (in Italy 1943) a superiority on the order of 1.4-1.5, compared to an original figure of 1.2-1.3.

During September and October 1943, almost all the German divisions in southern Italy were mechanized or parachute divisions. This may have contributed to a higher German CEV. Thus it is not certain that the conclusions arrived at here are valid for German forces in general, even though this factor should not be exaggerated, since many of the German divisions in Italy were either newly raised (e.g., 26th Panzer Division) or rebuilt after the Stalingrad disaster (16th Panzer Division plus 3rd and 29th Panzergrenadier Divisions) or the Tunisian debacle (15th Panzergrenadier Division).

The Third World War of 1985

Hackett

[This article was originally posted on 5 August 2016]

The seeming military resurgence of Vladimir Putin’s Russia has renewed concerns about the military balance between East and West in Europe. These concerns have evoked memories of the decades-long Cold War confrontation between NATO and the Warsaw Pact along the inner-German frontier. One of the most popular expressions of this conflict came in the form of a book titled The Third World War: August 1985, by British General Sir John Hackett. The book, a hypothetical account of a war between the Soviet Union, the United States, and assorted allies set in the near future, became an international best-seller.

Jeffrey H. Michaels, a Senior Lecturer in Defence Studies at the British Joint Services Command and Staff College, has published a detailed look at how Hackett and several senior NATO and diplomatic colleagues constructed the scenario portrayed in the book. Scenario construction is an important aspect of institutional war gaming. A war game will only be as useful as the assumptions that underpin it are valid. As Michaels points out,

Regrettably, far too many scenarios and models, whether developed by military organizations, political scientists, or fiction writers, tend to focus their attention on the battlefield and the clash of armies, navies, air forces, and especially their weapons systems.  By contrast, the broader context of the war – the reasons why hostilities erupted, the political and military objectives, the limits placed on military action, and so on – are given much less serious attention, often because they are viewed by the script-writers as a distraction from the main activity that occurs on the battlefield.

Modelers and war gamers always need to keep in mind the fundamental importance of context in designing their simulations.

It is quite easy to project how one weapon system might fare against another, but taken out of a broader strategic context, such a projection is practically meaningless (apart from its marketing value), or worse, misleading.  In this sense, even if less entertaining or exciting, the degree of realism of the political aspects of the scenario, particularly policymakers’ rationality and cost-benefit calculus, and the key decisions that are taken about going to war, the objectives being sought, the limits placed on military action, and the willingness to incur the risks of escalation, should receive more critical attention than the purely battlefield dimensions of the future conflict.

These are crucially important points to consider when deciding how to assess the outcomes of hypothetical scenarios.

Scoring Weapons And Aggregation In Trevor Dupuy’s Combat Models

[The article below is reprinted from the October 1997 edition of The International TNDM Newsletter.]

Consistent Scoring of Weapons and Aggregation of Forces:
The Cornerstone of Dupuy’s Quantitative Analysis of Historical Land Battles
by
James G. Taylor, PhD,
Dept. of Operations Research, Naval Postgraduate School

Introduction

Col. Trevor N. Dupuy was an American original, especially as regards the quantitative study of warfare. As with many prophets, he was not entirely appreciated in his own land, particularly its Military Operations Research (OR) community. However, after becoming rather familiar with the details of his mathematical modeling of ground combat based on historical data, I became aware of the basic scientific soundness of his approach. Unfortunately, his documentation of methodology was not always accepted by others, many of whom appeared to confuse lack of mathematical sophistication in his documentation with lack of scientific validity of his basic methodology.

The purpose of this brief paper is to review the salient points of Dupuy’s methodology from a system’s perspective, i.e., to view his methodology as a system, functioning as an organic whole to capture the essence of past combat experience (with an eye towards extrapolation into the future). The advantage of this perspective is that it immediately leads one to the conclusion that if one wants to use some functional relationship derived from Dupuy’s work, then one should use his methodologies for scoring weapons, aggregating forces, and adjusting for operational circumstances; since this consistency is the only guarantee of being able to reproduce historical results and to project them into the future.

Implications (of this system’s perspective on Dupuy’s work) for current DOD models will be discussed. In particular, the Military OR community has developed quantitative methods for imputing values to weapon systems based on their attrition capability against opposing forces and force interactions.[1] One such approach is the so-called antipotential-potential method[2] used in TACWAR[3] to score weapons. However, one should not expect such scores to provide valid casualty estimates when combined with historically derived functional relationships such as the so-called ATLAS casualty-rate curves[4] used in TACWAR, because a different “yard-stick” (i.e. measuring system for estimating the relative combat potential of opposing forces) was used to develop such a curve.

Overview of Dupuy’s Approach

This section briefly outlines the salient features of Dupuy’s approach to the quantitative analysis and modeling of ground combat as embodied in his Tactical Numerical Deterministic Model (TNDM) and its predecessor, the Quantified Judgment Model (QJM). The interested reader can find details in Dupuy [1979] (see also Dupuy [1985][5], [1987], [1990]). Here we will view Dupuy’s methodology from a system approach, which seeks to discern its various components and their interactions and to view these components as an organic whole. Essentially, Dupuy’s approach involves the development of functional relationships from historical combat data (see Fig. 1) and then using these functional relationships to model future combat (see Fig. 2).

At the heart of Dupuy’s method is the investigation of historical battles, comparing the relationship of inputs (as quantified by relative combat power, denoted as Pa/Pd for that of the attacker relative to that of the defender in Fig. 1; e.g. see Dupuy [1979, pp. 59-64]) to outputs (as quantified by extent of mission accomplishment, casualty effectiveness, and territorial effectiveness; see Fig. 2) (e.g. see Dupuy [1979, pp. 47-50]). The salient point is that within this scheme, the main input[6] (i.e. relative combat power) to a historical battle is a derived quantity. It is computed from formulas that involve three essential aspects: (1) the scoring of weapons (e.g. see Dupuy [1979, Chapter 2 and also Appendix A]), (2) the aggregation methodology for a force (e.g. see Dupuy [1979, pp. 43-46 and 202-203]), and (3) the situational-adjustment methodology for determining the relative combat power of opposing forces (e.g. see Dupuy [1979, pp. 46-47 and 203-204]). In the force-aggregation step the effects on weapons of Dupuy’s environmental variables and one operational variable (air superiority) are considered[7], while in the situation-adjustment step the effects on forces of his behavioral variables[8] (aggregated into a single factor called the relative combat effectiveness value (CEV)) and also the other operational variables are considered (Dupuy [1987, pp. 86-89]).

Figure 1.

Moreover, any functional relationships developed by Dupuy depend (unless shown otherwise) on his computational system for derived quantities, namely OLIs, force strengths, and relative combat power. Thus, Dupuy’s results depend in an essential manner on his overall computational system described immediately above. Consequently, any such functional relationship (e.g. a casualty-rate curve) directly or indirectly derived from Dupuy’s work should still use his computational methodology for determination of independent-variable values.

Fig. 1 also reveals another important aspect of Dupuy’s work, the development of reliable data on historical battles. Military judgment plays an essential role in this development of such historical data for a variety of reasons. Dupuy was essentially the only source of new secondary historical data developed from primary sources (see McQuie [1970] for further details). These primary sources are well known to be both incomplete and inconsistent, so that military judgment must be used to fill in the many gaps and reconcile observed inconsistencies. Moreover, military judgment also generates the working hypotheses for model development (e.g. identification of significant variables).

At the heart of Dupuy’s quantitative investigation of historical battles and subsequent model development is his own weapons-scoring methodology, which slowly evolved out of study efforts by the Historical Evaluation Research Organization (HERO) and its successor organizations (cf. HERO [1967] and compare with Dupuy [1979]). Early HERO [1967, pp. 7-8] work revealed that what one would today call weapons scores developed by other organizations were so poorly documented that HERO had to create its own methodology for developing the relative lethality of weapons, which eventually evolved into Dupuy’s Operational Lethality Indices (OLIs). Dupuy realized that his method was arbitrary (as indeed is its counterpart, called the operational definition, in formal scientific work), but felt that this would be ameliorated if the weapons-scoring methodology were consistently applied to historical battles. Unfortunately, this point is not clearly stated in Dupuy’s formal writings, although it was clearly (and compellingly) made by him in numerous briefings that this author heard over the years.

Figure 2.

In other words, from a system’s perspective, the functional relationships developed by Colonel Dupuy are part of his analysis system that includes this weapons-scoring methodology consistently applied (see Fig. 1 again). The derived functional relationships do not stand alone (unless further empirical analysis shows them to hold for any weapons-scoring methodology), but function in concert with computational procedures. Another essential part of this system is Dupuy’s aggregation methodology, which combines numbers, environmental circumstances, and weapons scores to compute the strength (S) of a military force. A key innovation by Colonel Dupuy [1979, pp. 202-203] was to use a nonlinear (more precisely, a piecewise-linear) model for certain elements of force strength. This innovation precluded the occurrence of military absurdities such as air firepower being fully substitutable for ground firepower, antitank weapons being fully effective when armor targets are lacking, etc. The final part of this computational system is Dupuy’s situational-adjustment methodology, which combines the effects of operational circumstances with force strengths to determine relative combat power, e.g. Pa/Pd.

To recapitulate, the determination of an Operational Lethality Index (OLI) for a weapon involves the combination of weapon lethality, quantified in terms of a Theoretical Lethality Index (TLI) (e.g. see Dupuy [1987, p. 84]), and troop dispersion[9] (e.g. see Dupuy [1987, pp. 84-85]). Weapons scores (i.e. the OLIs) are then combined with numbers (own side and enemy) and combat-environment factors to yield force strength. Six[10] different categories of weapons are aggregated, with nonlinear (i.e. piecewise-linear) models being used for the following three categories of weapons: antitank, air defense, and air firepower (i.e. close air support). Operational variables, e.g. mobility, posture, surprise, etc. (Dupuy [1987, p. 87]), and behavioral variables (quantified as a relative combat effectiveness value (CEV)) are then applied to force strength to determine a side’s combat-power potential.
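Viewed as a system, the chain just described can be sketched schematically as below; the factor values, category handling, and function forms are placeholders for illustration, not Dupuy’s published formulas or factor tables.

```python
# Schematic sketch of the computational chain described above:
# TLI -> OLI -> force strength (S) -> combat power (P) -> Pa/Pd.
# All values and factor names are placeholders, not the TNDM's tables.

def oli(tli, dispersion_factor):
    # Operational Lethality Index: theoretical lethality adjusted for troop dispersion.
    return tli / dispersion_factor

def force_strength(category_olis, environmental_factor):
    # Aggregate weapons scores into strength S, applying environmental effects.
    # (The TNDM treats antitank, air defense, and close air support with
    # piecewise-linear rules; that nonlinearity is omitted from this sketch.)
    return sum(category_olis.values()) * environmental_factor

def combat_power(strength, operational_factor, cev):
    # Situational adjustment: operational variables and relative combat
    # effectiveness (CEV) applied to force strength.
    return strength * operational_factor * cev

attacker = combat_power(force_strength({"infantry": 1200.0, "armor": 900.0, "artillery": 800.0}, 0.9),
                        operational_factor=1.1, cev=1.0)
defender = combat_power(force_strength({"infantry": 700.0, "armor": 400.0, "artillery": 500.0}, 1.0),
                        operational_factor=1.3, cev=1.2)

print("Pa/Pd =", round(attacker / defender, 2))
```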

Requirement for Consistent Scoring of Weapons, Force Aggregation, and Situational Adjustment for Operational Circumstances

The salient point to be gleaned from Figs. 1 and 2 is that the same (or at least consistent) weapons-scoring, aggregation, and situational-adjustment methodologies be used both for developing functional relationships and then for playing them to model future combat. The corresponding computational methods function as a system (organic whole) for determining relative combat power, e.g. Pa/Pd. For the development of functional relationships from historical data, a force ratio (relative combat power of the two opposing sides, e.g. attacker’s combat power divided by that of the defender, Pa/Pd) is computed (i.e. it is a derived quantity) as the independent variable, with observed combat outcome being the dependent variable. Thus, as discussed above, this force ratio depends on the methodologies for scoring weapons, aggregating force strengths, and adjusting a force’s combat power for the operational circumstances of the engagement. It is a priori not clear that different scoring, aggregation, and situational-adjustment methodologies will lead to similar derived values. If such different computational procedures were to be used, these derived values should be recomputed and the corresponding functional relationships rederived and replotted.

However, users of the Tactical Numerical Deterministic Model (TNDM) (or for that matter, its predecessor, the Quantified Judgment Model (QJM)) need not worry about this point because it was apparently meticulously observed by Colonel Dupuy in all his work. However, portions of his work have found their way into a surprisingly large number of DOD models (usually not explicitly acknowledged), but the context and range of validity of historical results have been largely ignored by others. The need for recalibration of the historical data and corresponding functional relationships has not been considered in applying Dupuy’s results for some important current DOD models.

Implications for Current DOD Models

A number of important current DOD models (namely, TACWAR and JICM, discussed below) make use of some of Dupuy’s historical results without recalibrating functional relationships such as loss rates and rates of advance as a function of some force ratio (e.g. Pa/Pd). As discussed above, it is not clear that such a procedure will capture the essence of past combat experience. Moreover, in calculating losses, Dupuy first determines personnel losses (expressed as a percent loss of personnel strength, i.e. the number of combatants on a side) and then calculates equipment losses as a function of this casualty rate (e.g. see Dupuy [1971, pp. 219-223], also [1990, Chapters 5 through 7][11]). These latter functional relationships are apparently not observed in the models discussed below. In fact, only Dupuy (going back to Dupuy [1979][12]) takes personnel losses to depend on a force ratio and other pertinent variables, with materiel losses being taken as derivative from this casualty rate.
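The two-step loss logic described here can be sketched as follows. The particular functional forms and constants are hypothetical stand-ins, not Dupuy’s published relationships; only the ordering — personnel casualty rate first, materiel losses derived from it — reflects the text.

```python
# Sketch of the two-step loss calculation described above: personnel losses
# first (as a fraction of personnel strength), equipment losses derived from
# that casualty rate. Both functional forms below are hypothetical.

def personnel_casualty_rate(force_ratio, base_rate=0.028):
    """Daily fractional personnel loss as a (made-up) declining function
    of the side's relative combat power."""
    return base_rate / max(force_ratio, 0.1)

def equipment_loss_fraction(casualty_rate, multiplier=1.5):
    """Materiel losses taken as derivative of the personnel casualty rate
    (hypothetical linear form)."""
    return multiplier * casualty_rate

personnel = 12_000            # personnel strength of the side being assessed
ratio = 1.8                   # Pa/Pd from the aggregation step
cas_rate = personnel_casualty_rate(ratio)
print(f"Personnel losses per day: {cas_rate * personnel:.0f}")
print(f"Equipment loss fraction per day: {equipment_loss_fraction(cas_rate):.3f}")
```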

For example, TACWAR determines personnel losses[13] by computing a force ratio and then consulting an appropriate casualty-rate curve (referred to as empirical data), much in the same fashion as ATLAS did[14]. However, such a force ratio is computed using a linear model with weapon values determined by the so-called antipotential-potential method[15]. Unfortunately, this procedure may not be consistent with how the empirical data (i.e. the casualty-rate curves) was developed. Further research is required to demonstrate that valid casualty estimates are obtained when different weapons-scoring, aggregation, and situational-adjustment methodologies are used to develop casualty-rate curves from historical data and then to assess losses in aggregated combat models. Furthermore, TACWAR does not use Dupuy’s model for equipment losses (see above), although it does purport, as just noted, to use “historical data” (e.g. see Kerlin et al. [1975, p. 22]) to compute personnel losses as a function (among other things) of a force ratio (given by a linear relationship), involving close air support values in a way never used by Dupuy. Although this force-ratio determination methodology has logical and mathematical merit, it is not the way that the historical data was developed.
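For readers unfamiliar with the antipotential-potential method, the sketch below illustrates the underlying eigenvalue idea (see Anderson [1974]): a weapon’s value is taken as proportional to the kill-rate-weighted value of the enemy weapons it can destroy, which reduces to an eigenvector computation. The kill-rate matrices and the normalization here are invented for illustration and are not drawn from TACWAR or IDA documentation.

```python
import numpy as np

# Eigenvalue sketch of antipotential-potential weapon scoring: a weapon's
# value is proportional to the kill-rate-weighted value of the enemy weapons
# it can destroy, so the values emerge as a dominant eigenvector.
# The kill-rate matrices below are invented for illustration.

# K_br[i, j]: rate at which blue weapon i kills red weapon j (and vice versa).
K_br = np.array([[0.02, 0.10],
                 [0.05, 0.01]])
K_rb = np.array([[0.03, 0.08],
                 [0.06, 0.02]])

# Blue values satisfy v_blue proportional to K_br @ K_rb @ v_blue:
# take the dominant eigenvector of the product matrix.
eigvals, eigvecs = np.linalg.eig(K_br @ K_rb)
v_blue = np.abs(eigvecs[:, np.argmax(eigvals.real)].real)
v_blue /= v_blue.max()                  # normalize for readability

# Red values then follow from the blue values through the kill rates.
v_red = K_rb @ v_blue
v_red /= v_red.max()

print("Blue weapon values:", np.round(v_blue, 3))
print("Red weapon values: ", np.round(v_red, 3))
```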

Moreover, RAND (Allen [1992]) has more recently developed what is called the situational force scoring (SFS) methodology for calculating force ratios in large-scale, aggregated-force combat situations to determine loss and movement rates. Here, SFS refers essentially to a force-aggregation and situational-adjustment methodology, which has many conceptual elements in common with Dupuy’s methodology (with the notable exception of extensive testing against historical data, and especially documentation of such efforts). SFS was originally developed for RSAS[16] and is today used in JICM[17]. It also apparently uses a weapon-scoring system developed at RAND[18]. It purports (with no documentation given [citation of unpublished work]) to be consistent with historical data (including the ATLAS casualty-rate curves) (Allen [1992, p. 41]), but again no consideration is given to recalibrating historical results for different weapons-scoring, force-aggregation, and situational-adjustment methodologies. SFS emphasizes adjusting force strengths according to the operational circumstances (the “situation”) of the engagement (including surprise), and it contains many innovative ideas (but in some major ways has little connection with the previous work of others[19]). The resulting model contains many more details than historical combat data would support. It is also a methodology that differs in many essential ways from that used previously by any investigator. In particular, it is doubtful that it develops force ratios in a manner consistent with Dupuy’s work.

Final Comments

Use of (sophisticated) mathematics for modeling past historical combat (and extrapolating it into the future for planning purposes) is no reason for ignoring Dupuy’s work. One would think that the current Military OR community would try to understand Dupuy’s work before trying to improve and extend it. In particular, Colonel Dupuy’s various computational procedures (including constants) must be considered as an organic whole (i.e. a system) supporting the development of functional relationships. If one ignores this computational system and simply tries to use some isolated aspect, the result may be interesting and even logically sound, but it probably lacks any scientific validity.

REFERENCES

P. Allen, “Situational Force Scoring: Accounting for Combined Arms Effects in Aggregate Combat Models,” N-3423-NA, The RAND Corporation, Santa Monica, CA, 1992.

L. B. Anderson, “A Briefing on Anti-Potential Potential (The Eigenvalue Method for Computing Weapon Values),” WP-2, Project 23-31, Institute for Defense Analyses, Arlington, VA, March 1974.

B. W. Bennett, et al., “RSAS 4.6 Summary,” N-3534-NA, The RAND Corporation, Santa Monica, CA, 1992.

B. W. Bennett, A. M. Bullock, D. B. Fox, C. M. Jones, J. Schrader, R. Weissler, and B. A. Wilson, “JICM 1.0 Summary,” MR-383-NA, The RAND Corporation, Santa Monica, CA, 1994.

P. K. Davis and J. A. Winnefeld, “The RAND Strategic Assessment Center: An Overview and Interim Conclusions About Utility and Development Options,” R-2945-DNA, The RAND Corporation, Santa Monica, CA, March 1983.

T. N. Dupuy, Numbers, Predictions and War: Using History to Evaluate Combat Factors and Predict the Outcome of Battles, The Bobbs-Merrill Company, Indianapolis/New York, 1979.

T. N. Dupuy, Numbers, Predictions and War, Revised Edition, HERO Books, Fairfax, VA, 1985.

T. N. Dupuy, Understanding War: History and Theory of Combat, Paragon House Publishers, New York, 1987.

T. N. Dupuy, Attrition: Forecasting Battle Casualties and Equipment Losses in Modern War, HERO Books, Fairfax, VA, 1990.

General Research Corporation (GRC), “A Hierarchy of Combat Analysis Models,” McLean, VA, January 1973.

Historical Evaluation and Research Organization (HERO), “Average Casualty Rates for War Games, Based on Historical Data,” 3 Volumes in 1, Dunn Loring, VA, February 1967.

E. P. Kerlin and R. H. Cole, “ATLAS: A Tactical, Logistical, and Air Simulation: Documentation and User’s Guide,” RAC-TP-338, Research Analysis Corporation, McLean, VA, April 1969 (AD 850 355).

E. P. Kerlin, L. A. Schmidt, A. J. Rolfe, M. J. Hutzler, and D. L. Moody, “The IDA Tactical Warfare Model: A Theater-Level Model of Conventional, Nuclear, and Chemical Warfare, Volume II: Detailed Description,” R-211, Institute for Defense Analyses, Arlington, VA, October 1975 (AD B009 692L).

R. McQuie, “Military History and Mathematical Analysis,” Military Review 50, No. 5, 8-17 (1970).

S.M. Robinson, “Shadow Prices for Measures of Effectiveness, I: Linear Model,” Operations Research 41, 518-535 (1993).

J. G. Taylor, Lanchester Models of Warfare, Vols. I & II, Operations Research Society of America, Alexandria, VA, 1983. (a)

J.G. Taylor, “A Lanchester-Type Aggregated-Force Model of Conventional Ground Combat,” Naval Research Logistics Quarterly 30, 237-260 (1983). (b)

NOTES

[1] For example, see Taylor [1983a, Section 7.18], which contains a number of examples. The basic references given there may be more accessible through Robinson [1993].

[2] This term was apparently coined by L. B. Anderson [1974] (see also Kerlin et al. [1975, Chapter I, Section D.3]).

[3] The Tactical Warfare (TACWAR) model is a theater-level, joint-warfare, computer-based combat model that is currently used for decision support by the Joint Staff and essentially all CINC staffs. It was originally developed by the Institute for Defense Analyses in the mid-1970s (see Kerlin et al. [1975]) under the name TACNUC, and it has been continually upgraded up to (and including) the present day.

[4] For example, see Kerlin and Cole [1969], GRC [1973, Fig. 6-6], or Taylor [1983b, Fig. 5] (also Taylor [1983a, Section 7.13]).

[5] The only apparent difference between Dupuy [1979] and Dupuy [1985] is the addition of an appendix (Appendix C “Modified Quantified Judgment Analysis of the Bekaa Valley Battle”) to the end of the latter (pp. 241-251). Hence, the page content is apparently the same for these two books for pp. 1-239.

[6] Technically speaking, one also has the engagement type and possibly several other descriptors (denoted in Fig. 1 as reduced list of operational circumstances) as other inputs to a historical battle.

[7] In Dupuy [1979, e.g. pp. 43-46] only environmental variables are mentioned, although basically the same formulas underlie both Dupuy [1979] and Dupuy [1987]. For simplicity, Fig. 1 and 2 follow this usage and employ the term “environmental circumstances.”

[8] In Dupuy [1979, e.g. pp. 46-47] only operational variables are mentioned, although basically the same formulas underlie both Dupuy [1979] and Dupuy [1987]. For simplicity, Fig. 1 and 2 follow this usage and employ the term “operational circumstances.”

[9] Chris Lawrence has kindly brought to my attention that since the same value for troop dispersion from an historical period (e.g. see Dupuy [1987, p. 84]) is used for both the attacker and also the defender, troop dispersion does not actually affect the determination of relative combat power Pa/Pd.

[10] Eight different weapon types are considered, with three being classified as infantry weapons (e.g. see Dupuy [1979, pp. 43-44], [1981, pp. 85-86]).

[11] Chris Lawrence has kindly informed me that Dupuy’s work on relating equipment losses to personnel losses goes back to the early 1970s and even earlier (e.g. see HERO [1966]). Moreover, Dupuy’s [1992] book Future Wars gives some additional empirical evidence concerning the dependence of equipment losses on casualty rates.

[12] But actually going back much earlier as pointed out in the previous footnote.

[13] See Kerlin et al. [1975, Chapter I, Section D.l].

[14] See Footnote 4 above.

[15] See Kerlin et al. [1975, Chapter I, Section D.3]; see also Footnotes 1 and 2 above.

[16] The RAND Strategy Assessment System (RSAS) is a multi-theater aggregated combat model developed at RAND in the early 1980s (for further details see Davis and Winnefeld [1983] and Bennett et al. [1992]). It evolved into the Joint Integrated Contingency Model (JICM), which is a post-Cold War redesign of the RSAS (starting in FY92).

[17] The Joint Integrated Contingency Model (JICM) is a game-structured computer-based combat model of major regional contingencies and higher-level conflicts, covering strategic mobility, regional conventional and nuclear warfare in multiple theaters, naval warfare, and strategic nuclear warfare (for further details, see Bennett et al. [1994]).

[18] RAND apparently replaced one weapon-scoring system by another (e.g. see Allen [1992, pp. 9, 15, and 87-89]) without making any other changes in their SFS system.

[19] For example, Dupuy’s early HERO work (e.g. see Dupuy [1967]), reworks of these results by the Research Analysis Corporation (RAC) (e.g. see RAC [1973, Fig. 6-6]), and Dupuy’s later work (e.g. see Dupuy [1979]) all considered daily fractional casualties for the attacker and for the defender as the basic casualty-outcome descriptors (see also Taylor [1983b]). However, RAND does not do this, but instead considers the defender’s loss rate and a casualty exchange ratio to be the basic casualty-production descriptors (Allen [1992, pp. 41-42]). The great value of using the former set of descriptors (i.e. attacker and defender fractional loss rates) is not only that casualty assessment is more straightforward (especially the development of functional relationships from historical data) but also that qualitative model behavior is readily deduced (see Taylor [1983b] for further details).

Abstraction and Aggregation in Wargame Modeling

[Image credit: IPMS/USA Reviews]

“All models are wrong, some models are useful.” – George Box

Models, no matter what their subjects, must always be an imperfect copy of the original. The term “model” inherently has this connotation. If the subject is reproduced exactly and precisely, then it is a duplicate, a replica, a clone, or a copy, but not a “model.” The most common dimension to be compromised is size, or more literally the three spatial dimensions of length, width and height. A good example of this is a scale model airplane, generally available in several ratios relative to the original, such as 1/144, 1/72 or 1/48 (which are, interestingly, all multiples of 12; there is also 1/100 for the more decimal-minded). A model airplane at 1/72 scale is 72 times smaller: take the length, width and height measurements of the real item, and divide each by 72 to get the model’s values.

If we take the real item’s weight and divide it by 72, however, we would not expect our model to weigh 72 times less. Even if the same or similar materials were used, weight scales with volume, so a 1/72 model of the same material would weigh roughly 72³ (about 373,000) times less. Generally, the model has a different purpose than replicating the subject’s functionality. It helps to model the subject’s qualities, or to mimic them in some useful way. In the case of the 1/72 plastic model of the F-15J fighter, this might be replicating the sight of a real F-15J, to satisfy the desire of a youth to look at an F-15J and imagine taking flight. Or it might be for pilots at a flight school to mimic air combat with models instead of actual aircraft.
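A quick worked example of that scaling point, using rounded, approximate F-15 figures (my own choices, not from any kit or game):

```python
# Scaling arithmetic for a 1/72 model: linear dimensions divide by 72, but
# volume (and hence weight, for the same material) divides by 72 cubed.
# The F-15 figures are rounded approximations chosen for illustration.

scale = 72
real_length_m = 19.4          # F-15 length, roughly
real_empty_weight_kg = 12_700 # F-15 empty weight, roughly

model_length_cm = real_length_m / scale * 100
same_material_weight_g = real_empty_weight_kg / scale**3 * 1000

print(f"Model length: {model_length_cm:.1f} cm")
print(f"Same-material weight: {same_material_weight_g:.0f} g  (72^3 = {scale**3:,})")
```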

The model aircraft is a simple physical object; once built, it does not change over time (unless you want to count dropping it and breaking it…). A real F-15J, however, is a dynamic physical object, which changes considerably over the course of its normal operation. It is loaded with fuel and ordnance, both of which have a huge effect on its weight, and thus on its performance characteristics. Also, it may be occupied by different crew members, whose experience and skills may vary considerably. These qualities of the unit need to be taken into account if the purpose of the model is to represent the aircraft. The classic example of this is a flight envelope model of an F-15A/C:
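In modeling terms, this is the difference between representing the static airframe and representing the unit’s state. The sketch below shows one hypothetical way such dynamic qualities might be carried; the weights and the effectiveness rule are rough, illustrative assumptions, not data from any flight manual or game.

```python
from dataclasses import dataclass

# Hypothetical unit-state model carrying the dynamic qualities mentioned
# above (fuel, ordnance, crew skill). All figures are rough, illustrative
# values, not flight-manual data.

@dataclass
class FighterState:
    empty_weight_kg: float = 12_700    # approximate F-15 empty weight
    fuel_kg: float = 6_100             # roughly full internal fuel
    ordnance_kg: float = 0.0
    crew_skill: float = 1.0            # 1.0 = nominal effectiveness

    @property
    def gross_weight_kg(self) -> float:
        return self.empty_weight_kg + self.fuel_kg + self.ordnance_kg

    def effective_factor(self, base_factor: float) -> float:
        """Scale a nominal combat factor by crew skill (a made-up rule)."""
        return base_factor * self.crew_skill

clean = FighterState()
loaded = FighterState(ordnance_kg=3_000, crew_skill=1.15)
print(clean.gross_weight_kg, loaded.gross_weight_kg)
print(loaded.effective_factor(13))     # e.g. an F-15J unit's nominal factor
```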

[Flight envelope chart of an F-15A/C, via Quora]

This flight envelope is itself a model: it represents the flight characteristics of the F-15 using two primary quantitative axes – altitude and speed (in Mach number) – plus throttle setting. Perhaps the most interesting thing about it is the realization that an F-15 slows down as it descends. Are these particular qualities of an F-15 required to model air combat involving such an aircraft?

How to Apply This Modeling Process to a Wargame?

The purpose of the war game is to model or represent the possible outcome of a real combat situation, played forward in the model at whatever pace and scale the designer has intended.

As mentioned previously, my colleague and I are playing Asian Fleet, a war game that covers several types of naval combat, including those involving air units, surface units and submarine units. This was published in 2007, and updated in 2010. We’ve selected a scenario that has only air units on either side. The premise of this scenario is quite simple:

The Chinese air force, in trying to prevent the United States from intervening in a Taiwan invasion, will carry out an attack on the SDF as well as the US military base on Okinawa. Forces around Shanghai consisting of state-of-the-art fighter bombers and long-range attack aircraft have been placed for the invasion of Taiwan, and an attack on Okinawa would be carried out with a portion of these forces. [Asian Fleet Scenario Book]

Of course, this game is a model of reality. The infinite geospatial and temporal possibilities of the space-time so familiar to us have been replaced by highly aggregated, discrete buckets, such as turns that may last for a day, or eight hours. Latitude, longitude and altitude are replaced with a two-dimensional hexagonal “honeycomb” surface. Hence, distance is no longer computed in miles or meters, but rather in “hexes”, each of which is about 50 nautical miles across. Aircraft are effectively aloft or on the ground, although a “high mission profile” will provide endurance benefits. Submarines are considered underwater, or may use “deep mode” when attempting to hide from sonar searches.
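The discretization is simple enough to show directly. In the sketch below, the 50-nautical-mile hex size comes from the game as described above, while the rounding rules and the eight-hour turn length used in the example are my assumptions rather than quotations of the Asian Fleet rules.

```python
import math

# Discretization sketch: continuous distance becomes a count of ~50 nm hexes,
# continuous time becomes turns. The 50 nm hex comes from the game; the
# rounding rules and eight-hour turn here are my own assumptions.

NM_PER_HEX = 50

def to_hexes(distance_nm):
    return math.ceil(distance_nm / NM_PER_HEX)

def to_turns(hours, hours_per_turn=8):
    return math.ceil(hours / hours_per_turn)

print(to_hexes(410))   # a 410 nm leg becomes 9 hexes
print(to_turns(30))    # 30 hours of operations becomes 4 eight-hour turns
```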

Maneuver units are represented by “counters” or virtual chits to be moved about the map as play progresses. Their level of aggregation varies: large and powerful ships and subs are represented individually, smaller surface units and weaker subs are grouped and represented by a single counter (a “flotilla”), and squadrons or regiments of aircraft are represented by a single counter. Depending upon the nation and the military branch, this may be as few as 3-5 aircraft in a maritime patrol aircraft (MPA) detachment (“recon” in this game), roughly 10-12 aircraft in a bomber unit, or 24 or even 72 aircraft in a fighter unit (“interceptor” in this game).

Enough Theory, What Happened?!

The Chinese Air Force mobilized its H6H bomber unit, escorted by large numbers of Flankers (J11 and Su-30MK2 fighters) from the Shanghai area, and headed east towards Okinawa. The US Air Force F-15Cs, supported by an airborne warning and control system (AWACS) aircraft, detected this inbound force, delayed engagement until their Japanese F-15J unit on combat air patrol (CAP) could support them, and then engaged the Chinese force about 50 miles from the AWACS orbits. In this game, air combat is broken down into two phases: long-range air-to-air (LRAA) combat (aka beyond visual range, BVR), and “regular” air combat, or within visual range (WVR) combat.

In BVR combat, only units marked as equipped with BVR capability may attack:

  • 2 x F-15C units have a factor of 32; scoring a hit in 5 out of 10 cases, or roughly 50%.
  • Su-30MK2 unit has a factor of 16; scoring a hit in 4 out of 10 cases, ~40%.

To these numbers a modifier of +2 applies when the attacker is supported by AWACS, so the odds of scoring a hit increase to roughly 70% for the F-15Cs … but in our example they miss, and the Chinese shot misses as well. Thus, the combat proceeds to WVR.
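Here is a small sketch of the BVR mechanic as I read it from the summary above: a unit scores a hit when a d10 roll comes in at or below its hit number, and AWACS support adds +2 to that number. This is my interpretation for illustration, not a verbatim transcription of the rules.

```python
import random

# BVR attack sketch: hit when a d10 roll is at or below the unit's hit
# number; AWACS support adds +2 to that number (my reading of the rule).

def bvr_attack(hit_number, awacs=False, rng=None):
    rng = rng or random.Random()
    target = hit_number + (2 if awacs else 0)
    return rng.randint(1, 10) <= target

rng = random.Random(2016)
# F-15Cs hit on 5 of 10 (about 70% with AWACS); the Su-30MK2 hits on 4 of 10.
f15_hit = bvr_attack(hit_number=5, awacs=True, rng=rng)
su30_hit = bvr_attack(hit_number=4, rng=rng)
print(f"F-15C BVR hit: {f15_hit}, Su-30MK2 BVR hit: {su30_hit}")
```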

In WVR combat, each opposing side sums their aerial combat factors:

  • 2 x F-15C (32) + F-15J (13) = 45
  • Su-30MK2 (15) + J11 (13) + H6H (1) = 29

These two numbers are then expressed as a ratio, attacker-to-defender (45:29), rounded down in favor of the defender (1:1), and then a ten-sided die (d10) is rolled to consult the Air-to-Air Combat Results Table, on the “CAP/AWACS Interception” line. The die was rolled, and a result of “0/0r” was achieved, which basically says that neither side takes losses, but the defender is turned back from the mission (“r” being code for “return to base”). Given the +2 modifier for the AWACS, the worst outcome for the Allies would be a mutual return-to-base result (“0r/0r”). The best outcome would be inflicting two “steps” of damage and sending the rest home (“0/2r”). A step of loss is about one half of an air unit, represented by flipping over the counter or chit and operating with the combat factors at about half strength.
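The WVR sequence can be sketched the same way: sum each side’s factors, reduce the ratio to a whole-number odds column in the defender’s favor, then roll a d10 against that column of the Combat Results Table. The tiny CRT below is a hypothetical stand-in containing only the outcomes mentioned above; the real “CAP/AWACS Interception” line has more entries, and the rounding helper reflects my reading of the rule.

```python
import random

# WVR resolution sketch: sum factors, reduce to a whole-number odds column in
# the defender's favor, then roll a d10 on that column of the CRT. The CRT
# entries below are a hypothetical stand-in limited to outcomes named above.

def odds_column(attacker, defender):
    if attacker >= defender:
        return f"{attacker // defender}:1"      # round down, favoring the defender
    return f"1:{-(-defender // attacker)}"      # round the defender's side up

attacker_sum = 32 + 13          # 2 x F-15C + F-15J
defender_sum = 15 + 13 + 1      # Su-30MK2 + J11 + H6H
column = odds_column(attacker_sum, defender_sum)    # 45 vs 29 -> "1:1"

# Hypothetical slice of the "CAP/AWACS Interception" line, keyed by d10 roll.
crt_cap_awacs_1to1 = {0: "0/0r", 5: "0r/0r", 9: "0/2r"}
roll = random.Random(7).randint(0, 9)
print(column, roll, crt_cap_awacs_1to1.get(roll, "0/0r"))
```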

To sum this up, as the Allied commander, my conclusion was that the Americans were hung-over or asleep for this engagement.

I am encouraged by some similarities between this game and the fantastic detail that TDI has just posted about the DACM model, here and here. Thus, I plan not only to dissect this Asian Fleet game (VGAF), but also to do a gap analysis between VGAF and DACM.