Response to Niklas Zetterling’s Article by Christopher A. Lawrence
Mr. Zetterling is currently a professor at the Swedish War College and previously worked at the Swedish National Defense Research Establishment. As I have been having an ongoing dialogue with Prof. Zetterling on the Battle of Kursk, I have had the opportunity to witness his approach to researching historical data and the depth of his research. I would recommend that all of our readers take a look at his recent article in the Journal of Slavic Military Studies entitled “Loss Rates on the Eastern Front during World War II.” Mr. Zetterling does his German research directly from the Captured German Military Records by purchasing the rolls of microfilm from the US National Archives. He is using the same German data sources that we are. Let me attempt to address his comments section by section:
The Database on Italy 1943-44:
Unfortunately, the Italian combat data was one of the early HERO research projects, with the results first published in 1971. I do not know who worked on it nor the specifics of how it was done. There are references to the Captured German Records, but significantly, they reference only division files for these battles. While I have not had the time to check Prof. Zetterling’s review against the original research, I do know that some of our researchers have complained about parts of the Italian data. From what I’ve seen, it looks like the original HERO researchers didn’t look into the Corps and Army files, and assumed what the attached Corps artillery strengths were. Sloppy research is embarrassing, although it does occur, especially when working under severe financial constraints (for example, our Battalion-level Operations Database). If the research is sloppy or hurried, or done from secondary sources, then hopefully the errors are random, will effectively counterbalance each other, and will not change the results of the analysis. If the errors are all in one direction, then they will produce a biased result.
I have no basis to believe that Prof. Zetterling’s criticism is wrong, and have many reasons to believe that it is correct. Until I can take the time to go through the Corps and Army files, I intend to operate under the assumption that Prof. Zetterling’s corrections are good. At some point I will need to go back through the Italian Campaign data, correct it, and update the Land Warfare Database. I did compare Prof. Zetterling’s list of battles with what was declared to be the forces involved in the battle (according to the Combat Data Subscription Service), and they show the following attached artillery:
It is clear that the battles were based on the assumption that there was Corps-level German artillery. A strength comparison between the two sides is displayed in the chart on the next page.
The Result Formula:
CEV is calculated from three factors. Therefore a consistent 20% error in casualties will result in something less than a 20% error in CEV. The mission effectiveness factor is indeed very “fuzzy,” and there is simply no systematic method or guidance for its application. Sometimes it is not based upon the assigned mission of the unit, but upon its perceived mission as interpreted by the analyst. But, while I have the same problems with the mission accomplishment scores as Mr. Zetterling, I do not have a good replacement. Considering the nature of warfare, I would hate to create CEVs without it. Of course, Trevor Dupuy was experimenting with creating CEVs just from casualty effectiveness, and by averaging his two CEV scores (CEVt and CEVl) he heavily weighted the CEV calculation for the TNDM towards measuring primarily casualty effectiveness (see the article in Issue 5 of the Newsletter, “Numerical Adjustment of CEV Results: Averages and Means”). At this point, I would like to produce a new, single formula for CEV to replace the current two and their averaging methodology. I am open to suggestions for this.
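For illustration, here is a minimal sketch of the two obvious ways of combining the two component CEV scores, the question the “Averages and Means” article turned on. The function names and the input values are hypothetical, not TNDM internals:

```python
# Illustrative only: the TNDM derives two CEV estimates and averages them.
# The component values below are hypothetical, not from any engagement.

def combined_cev_arithmetic(cev_t: float, cev_l: float) -> float:
    """Arithmetic mean of the two component CEV estimates."""
    return (cev_t + cev_l) / 2

def combined_cev_geometric(cev_t: float, cev_l: float) -> float:
    """Geometric mean, the alternative discussed in 'Averages and Means'."""
    return (cev_t * cev_l) ** 0.5

# A hypothetical engagement where the two component estimates disagree:
cev_t, cev_l = 1.8, 1.1
print(combined_cev_arithmetic(cev_t, cev_l))  # 1.45
print(combined_cev_geometric(cev_t, cev_l))   # ~1.407
```

Note that when the two estimates disagree, the geometric mean is always the lower of the two combined values, so the choice of averaging method itself shifts the final CEV.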
Supply Situation:
The different ammunition usage rates of the German and US Armies are one of the reasons why adding a logistics module is high on my list of model corrections. This was discussed in Issue 2 of the Newsletter, “Developing a Logistics Model for the TNDM.” As Mr. Zetterling points out, “It is unlikely that an increase in artillery ammunition expenditure will result in a proportional increase in combat power. Rather it is more likely that there is some kind of diminished return with increased expenditure.” This parallels what I expressed in point 12 of that article: “It is suspected that this increase [in OLIs] will not be linear.”
The CEV does include “logistics.” So in effect, if one had a good logistics module, the difference in logistics would be accounted for, and the Germans (after logistics is taken into account) may indeed have a higher CEV.
General Problems with Non-Divisional Units Tooth-to-Tail Ratio
Point taken. The engagements used to test the TNDM have been gathered over a period of more than 25 years, by different researchers and under different management. What is counted, when, and where does change from one group of engagements to the next. While I do think this has not had a significant effect on the model outcomes, it is “sloppy” and needs to be addressed.
The Effects of Defensive Posture
This is a very good point. If the budget was available, my first step in “redesigning” the TNDM would be to try to measure the effects of terrain on combat through the use of a large LWDB-type database and regression analysis. I have always felt that with enough engagements, one could produce reliable values for these figures based upon something other than judgement. Prof. Zetterling’s proposed methodology is also a good approach, easier to do, and more likely to get a conclusive result. I intend to add this to my list of model improvements.
Conclusions
There is one other problem with the Italian data that Prof. Zetterling did not address: the Germans and the Allies had different reporting systems for casualties. Quite simply, the Germans did not report as casualties those people who were lightly wounded, treated, and returned to duty from the divisional aid station. The United States and England did. This shows up when one compares the wounded-to-killed ratios of the various armies, with the Germans usually having in the range of 3 to 4 wounded for every one killed, while the Allies tend to have 4 to 5 wounded for every one killed. Basically, when comparing the two reports, the Germans “undercount” their casualties by around 17 to 20%. Therefore, one probably needs to use a multiplier of 20 to 25% to match the two casualty systems. This was not taken into account in any of the work HERO did.
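The arithmetic behind these percentages can be sketched as follows. This Python illustration uses the midpoints of the quoted ratio ranges, not figures from any specific engagement:

```python
# Sketch of the reporting-system arithmetic above. The wounded-to-killed
# ratios are midpoints of the ranges quoted in the text (3-4 German,
# 4-5 Allied); this is illustrative, not data from any engagement.

def total_casualties(killed: float, wounded_per_killed: float) -> float:
    """Killed plus wounded, for a given wounded-to-killed ratio."""
    return killed * (1 + wounded_per_killed)

killed = 100.0
german_reported = total_casualties(killed, 3.5)        # lightly wounded omitted
true_by_allied_standard = total_casualties(killed, 4.5)  # Allied-style count

undercount = 1 - german_reported / true_by_allied_standard
multiplier = true_by_allied_standard / german_reported - 1
print(f"undercount: {undercount:.0%}")         # 18%
print(f"multiplier needed: {multiplier:.0%}")  # 22%
```

Both results fall inside the ranges quoted above (an undercount of 17 to 20%, requiring a correction multiplier of 20 to 25%).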
Because Trevor Dupuy used three factors for measuring his CEV, this error certainly resulted in a slightly higher CEV for the Germans than should have been the case, but not a 20% increase. As Prof. Zetterling points out, the correction of the count of artillery pieces should result in a higher CEV than Col. Dupuy calculated. Finally, if Col. Dupuy overrated the value of defensive terrain, then this may result in the German CEV being slightly lower.
As you may have noted in my list of improvements (Issue 2, “Planned Improvements to the TNDM”), I did list “revalidating” to the QJM Database. [NOTE: a summary of the QJM/TNDM validation efforts can be found here.] As part of that revalidation process, we would need to review the data used in the validation data base first, account for the casualty differences in the reporting systems, and determine if the model indeed overrates the effect of terrain on defense.
Perhaps one of the most debated results of the TNDM (and its predecessors) is the conclusion that the German ground forces on average enjoyed a measurable qualitative superiority over their US and British opponents. This was largely the result of calculations on situations in Italy in 1943-44, even though further engagements have been added since the results were first presented. The calculated German superiority over the Red Army, despite the much smaller number of engagements, has not aroused as much opposition. Similarly, the calculated Israeli effectiveness superiority over its enemies seems to have surprised few.
However, there are objections to the calculations on the engagements in Italy 1943. These concern primarily the database, but there are also some questions to be raised against the way some of the calculations have been made, which may possibly have consequences for the TNDM.
Here it is suggested that the German CEV [combat effectiveness value] superiority was higher than originally calculated. There are a number of flaws in the original calculations, each of which will be discussed separately below. With the exception of one issue, all of them, if corrected, tend to give a higher German CEV.
The Database on Italy 1943-44
According to the database, the German divisions had considerable fire support from GHQ artillery units. This is the only possible conclusion from the fact that several pieces of the types 15cm gun, 17cm gun, 21cm gun, and 15cm and 21cm Nebelwerfer are included in the data for individual engagements. These types of guns were almost exclusively confined to GHQ units. An example from the database is the set of three engagements Port of Salerno, Amphitheater, and Sele-Calore Corridor. These take place simultaneously (9-11 September 1943) with the German 16th Pz Div on the Axis side in all of them (no other division is included in the battles). Judging from the manpower figures, it seems to have been assumed that the division participated with one quarter of its strength in each of the first two battles and half its strength in the latter. According to the database, the numbers of guns were:
15cm gun: 28
17cm gun: 12
21cm gun: 12
15cm NbW: 27
21cm NbW: 21
This would indicate that the 16th Pz Div was supported by the equivalent of more than five non-divisional artillery battalions. For the German army this is a suspiciously high number; usually there was something more like one GHQ artillery battalion for each division, or even less. Research in the German Military Archives confirmed that the number of GHQ artillery units was far less than indicated in the HERO database. Among the useful documents found was a map showing the dispositions of 10th Army artillery units. This showed clearly that there was only one non-divisional artillery unit south of Rome at the time of the Salerno landings, the III/71 Nebelwerfer Battalion. While the 557th Artillery Battalion (17cm gun) was also present, it was included in the artillery regiment (33rd Artillery Regiment) of 15th Panzergrenadier Division during the second half of 1943. Thus the number of German artillery pieces in these engagements is exaggerated to an extent that cannot be considered insignificant. Since OLI values for artillery usually constitute a significant share of the total OLI of a force in the TNDM, errors in artillery strength cannot be dismissed easily.
While the example above is but one, further archival research has shown that the same kind of error occurs in all the engagements in September and October 1943. It has not been possible to check the engagements later in 1943, but a pattern can be recognized. The ratio between the numbers of various types of GHQ artillery pieces does not change much from battle to battle. It seems that when the database was developed, the researchers worked with the assumption that the German corps and army organizations had organic artillery, and this assumption may have been used as a “rule of thumb.” This is wrong, however; only artillery staffs and command and control units were included in the corps and army organizations, not firing units. Consequently we have a systematic error, which cannot be corrected without changing the contents of the database. It is worth emphasizing that we are discussing an exaggeration of German artillery strength of about 100%, which certainly is significant. Comparing the available archival records with the database also reveals errors in the numbers of tanks and antitank guns, but these are much smaller than the errors in artillery strength. Again, in those engagements I have been able to check against archival records, these errors always inflate the German numerical strength, which of course affects CEV calculations. But there are further objections to the CEV calculations.
The Result Formula
The “result formula” weighs together three factors: casualties inflicted, distance advanced, and mission accomplishment. It seems that the first two do not raise many objections, even though their relative weights may always be subject to argument.
The third factor, mission accomplishment, is more dubious, however. At first glance it may seem natural to include such a factor. After all, a combat unit is supposed to accomplish the missions given to it. However, whether a unit accomplishes its mission or not depends both on its own qualities and on the realism of the mission assigned. Thus the mission accomplishment factor may reflect the qualities of the combat unit as well as those of the higher HQs and the general strategic situation. The Rapido crossing by the U.S. 36th Infantry Division can serve as an example. The division did not accomplish its mission, but whether the mission was realistic, given the circumstances, is dubious. Similarly, many German units probably received unrealistic missions in many situations, particularly during the last two years of the war (when most of the engagements in the database were fought). A more extreme example of situations in which unrealistic missions were given is the battle in Belorussia, June-July 1944, where German units were regularly given impossible missions. Possibly it is a general trend that the side fighting at a strategic disadvantage is more prone to give its combat units unrealistic missions.
On the other hand, it is quite clear that the mission assigned may well affect both casualty rates and advance rates. If, for example, the defender has a withdrawal mission, the attacker’s advance may be greater than if the mission were to defend resolutely. This does not, however, necessarily have to be handled by including a mission factor in the result formula.
I have made some tentative runs with the TNDM, testing with various CEV values to see which value produced an outcome in terms of casualties and ground gained as near as possible to the historical result. The results of these runs are very preliminary, but the tendency is that higher German CEVs produce more historical outcomes, particularly concerning combat.
Supply Situation
According to scattered information available in published literature, the U.S. artillery fired more shells per day per gun than did German artillery. In Normandy, US 155mm M1 howitzers fired 28.4 rounds per day during July, while August showed slightly lower consumption, 18 rounds per day. For the 105mm M2 howitzer the corresponding figures were 40.8 and 27.4. This can be compared to a German OKH study which, based on the experiences in Russia 1941-43, suggested that consumption of 105mm howitzer ammunition was about 13-22 rounds per gun per day, depending on the strength of the opposition encountered. For the 150mm howitzer the figures were 12-15.
While these figures should not be taken too seriously, as they are not from primary sources and they do also reflect the conditions in different theaters, they do at least indicate that it cannot be taken for granted that ammunition expenditure is proportional to the number of gun barrels. In fact there also exist further indications that Allied ammunition expenditure was greater than the German. Several German reports from Normandy indicate that they were astonished by the Allied ammunition expenditure.
It is unlikely that an increase in artillery ammunition expenditure will result in a proportional increase in combat power. Rather it is more likely that there is some kind of diminished return with increased expenditure.
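To make the diminishing-returns idea concrete, here is a purely illustrative sketch. The square-root shape is an assumption of mine for illustration only, not a fitted relationship and not anything in the TNDM:

```python
import math

# Purely illustrative: one simple functional form with diminishing returns.
# Neither the square-root shape nor the numbers come from the TNDM or from
# any historical fit.

def relative_effect(expenditure_ratio: float) -> float:
    """Combat-power effect of ammunition relative to a baseline of 1.0,
    assuming a square-root diminishing-returns curve."""
    return math.sqrt(expenditure_ratio)

# Under this assumption, doubling expenditure yields ~41% more effect,
# and quadrupling it only doubles the effect:
print(relative_effect(2.0))  # ~1.414
print(relative_effect(4.0))  # 2.0
```

Any concave curve would make the same qualitative point: the higher Allied rounds-per-gun figures cannot simply be multiplied into proportionally higher combat power.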
General Problems with Non-Divisional Units
A division usually (but not necessarily) includes various support services, such as maintenance, supply, and medical services. Non-divisional combat units have to rely to a greater extent on corps and army for such support. This makes it complicated to include such units, since when entering, for example, manpower strength and truck strength in the TNDM, it is difficult to assess their contribution to the overall numbers.
Furthermore, the amount of such forces was not equal on the German and Allied sides. In general the Allied divisional slice was far greater than the German. In Normandy the US forces on 25 July 1944 had 812,000 men on the Continent, while the number of divisions was 18 (including the 5th Armored, which was in the process of landing on the 25th). This gives a divisional slice of about 45,000 men. By comparison the German 7th Army mustered 16 divisions and 231,000 men on 1 June 1944, giving a slice of 14,437 men per division. The main explanation for the difference is the non-divisional combat units and the logistical organization to support them. In general, non-divisional combat units are composed of powerful but supply-consuming types like armor, artillery, antitank, and antiaircraft. Thus their contribution to combat power, and their strain on the logistical apparatus, are both considerable. However, I do not believe that the supporting units’ manpower and vehicles have been included in TNDM calculations.
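The divisional-slice arithmetic quoted above can be reproduced directly:

```python
# Reproduces the divisional-slice figures quoted in the text above.

def divisional_slice(men: int, divisions: int) -> float:
    """Men per division, counting all theater manpower against the divisions."""
    return men / divisions

# US forces on the Continent, 25 July 1944: 812,000 men, 18 divisions.
us_slice = divisional_slice(812_000, 18)
# German 7th Army, 1 June 1944: 231,000 men, 16 divisions.
german_slice = divisional_slice(231_000, 16)

print(round(us_slice))  # 45111 -- the text rounds this to "45,000"
print(german_slice)     # the text quotes 14,437 (exact value 14,437.5)
```

The roughly three-to-one difference in slice size is what carries the argument: most of the gap is non-divisional combat units and their logistical support.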
There are however further problems with non-divisional units. While the whereabouts of tank and tank destroyer units can usually be established with sufficient certainty, artillery can be much harder to pin down to a specific division engagement. This is of course a greater problem when the geographical extent of a battle is small.
Tooth-to-Tail Ratio
The lack of support units in non-divisional combat units was discussed above. One effect of this is to create a force with more OLI per man. This is the result of the unit’s “tail” belonging to some other part of the military organization.
In the TNDM there is a mobility formula which tends to favor units with many weapons and vehicles relative to the number of men. This became apparent when I was performing a great number of TNDM runs on engagements between Swedish brigades and Soviet regiments. The Soviet regiments usually contained rather few men, but still had many AFVs, artillery tubes, AT weapons, etc. The mobility formula in the TNDM favors such units. However, I do not think this reflects any phenomenon in the real world. The Soviet penchant for lean combat units, with supply, maintenance, and other services provided by higher echelons, is not a more effective solution in general, but perhaps one better suited to the particular constraints they were experiencing when forming units, training men, etc. In effect these services existed in the Soviet army too, just not formally within the combat units.
This problem is to some extent reminiscent of how density is calculated (a problem discussed by Chris Lawrence in a recent issue of the Newsletter). It is comparatively easy to define the frontal limit of the deployment area of a force, and it is relatively easy to define the lateral limits too. It is, however, much more difficult to say where the rear limit of a force is located.
When entering forces in the TNDM a rear limit is, perhaps unintentionally, drawn. But if the combat unit includes support units, the rear limit is pushed farther back compared to a force whose combat units are well separated from support units.
To what extent this affects the CEV calculations is unclear. Using the original database values, the German forces are perhaps given too high combat strength when the great number of GHQ artillery units is included. On the other hand, if the GHQ artillery units are not included, the opposite may be true.
The Effects of Defensive Posture
The posture factors are difficult to analyze, since they alone do not portray the advantages of defensive position. Such effects are also included in terrain factors.
It seems that the numerical values for these factors were assigned on the basis of professional judgement. However, when the QJM was developed, it seems that the developers did not assume the German CEV superiority. Rather, the German CEV superiority seems to have been discovered later. It is possible that the professional judgement was about as wrong on the issue of posture effects as it was on CEV. Since the British and American forces were predominantly on the offensive, while the Germans mainly defended themselves, a German CEV superiority may, at least partly, be hidden in too-high factors for defensive posture.
When using corrected input data on the 20 situations in Italy, September-October 1943, there is a tendency for the German CEV to be higher when they attack. Such a tendency is also discernible in the engagements presented in Hitler’s Last Gamble, Appendix H, even though the number of engagements in the latter case is very small.
As it stands now this is not really more than a hypothesis, since it will take an analysis of a greater number of engagements to confirm it. However, if such an analysis is done, it must be done using several sets of data. German and Allied attacks must be analyzed separately, and preferably the data would be separated further into sets for each relevant terrain type. Since the effects of the defensive posture are intertwined with terrain factors, it is very much possible that the factors may be correct for certain terrain types, while they are wrong for others. It may also be that the factors can be different for various opponents (due to differences in training, doctrine, etc.). It is also possible that the factors are different if the forces are predominantly composed of armor units or mainly of infantry.
One further problem with the effects of defensive position is that it is probably strongly affected by the density of forces. It is likely that the main effect of the density of forces is the inability to use effectively all the forces involved. Thus it may be that this factor will not influence the outcome except when the density is comparatively high. However, what can be regarded as “high” is probably much dependent on terrain, road net quality, and the cross-country mobility of the forces.
Conclusions
While the TNDM has been criticized here, it is also fitting to praise the model. The very fact that it can be criticized in this way is a testimony to its openness. In a sense a model is also a theory, and to use Popperian terminology, the TNDM is also very testable.
It should also be emphasized that the greatest errors are probably those in the database. As previously stated, I can only conclude safely that the data on the engagements in Italy in 1943 are wrong; later engagements have not yet been checked against archival documents. Overall the errors do not represent a dramatic change in the CEV values. Rather, the Germans seem to have (in Italy 1943) a superiority on the order of 1.4-1.5, compared to an original figure of 1.2-1.3.
During September and October 1943, almost all the German divisions in southern Italy were mechanized or parachute divisions. This may have contributed to a higher German CEV. Thus it is not certain that the conclusions arrived at here are valid for German forces in general, even though this factor should not be exaggerated, since many of the German divisions in Italy were either newly raised (e.g., 26th Panzer Division) or rebuilt after the Stalingrad disaster (16th Panzer Division plus 3rd and 29th Panzergrenadier Divisions) or the Tunisian debacle (15th Panzergrenadier Division).
This is like saying, “A team can’t score in football unless it has the ball.” Although subsequent verities stress the strength, value, and importance of defense, this should not obscure the essentiality of offensive action to ultimate combat success. Even in instances where a defensive strategy might conceivably assure a favorable war outcome—as was the case of the British against Napoleon, and as the Confederacy attempted in the American Civil War—selective employment of offensive tactics and operations is required if the strategic defender is to have any chance of final victory. [pp. 1-2]
Only offensive action achieves decisive results. Offensive action permits the commander to exploit the initiative and impose his will on the enemy. The defensive may be forced on the commander, but it should be deliberately adopted only as a temporary expedient while awaiting an opportunity for offensive action or for the purpose of economizing forces on a front where a decision is not sought. Even on the defensive the commander seeks every opportunity to seize the initiative and achieve decisive results by offensive action. [Original emphasis]
Interestingly enough, the offensive no longer retains its primary place in current Army doctrinal thought. The Army consigned its list of the principles of war to an appendix in the 2008 edition of FM 3-0 Operations and omitted them entirely from the 2017 revision. As the current edition of FM 3-0 Operations lays it out, the offensive is now placed on the same par as the defensive and stability operations:
Unified land operations are simultaneous offensive, defensive, and stability or defense support of civil authorities’ tasks to seize, retain, and exploit the initiative to shape the operational environment, prevent conflict, consolidate gains, and win our Nation’s wars as part of unified action (ADRP 3-0)…
At the heart of the Army’s operational concept is decisive action. Decisive action is the continuous, simultaneous combinations of offensive, defensive, and stability or defense support of civil authorities tasks (ADRP 3-0). During large-scale combat operations, commanders describe the combinations of offensive, defensive, and stability tasks in the concept of operations. As a single, unifying idea, decisive action provides direction for an entire operation. [p. I-16; original emphasis]
It is perhaps too easy to read too much into this change in emphasis. On the very next page, FM 3-0 describes offensive “tasks” thusly:
Offensive tasks are conducted to defeat and destroy enemy forces and seize terrain, resources, and population centers. Offensive tasks impose the commander’s will on the enemy. The offense is the most direct and sure means of seizing and exploiting the initiative to gain physical and cognitive advantages over an enemy. In the offense, the decisive operation is a sudden, shattering action that capitalizes on speed, surprise, and shock effect to achieve the operation’s purpose. If that operation does not destroy or defeat the enemy, operations continue until enemy forces disintegrate or retreat so they no longer pose a threat. Executing offensive tasks compels an enemy to react, creating or revealing additional weaknesses that an attacking force can exploit. [p. I-17]
The change in emphasis likely reflects recent U.S. military experience where decisive action has not yielded much in the way of decisive outcomes, as is mentioned in FM 3-0’s introduction. Joint force offensives in 2001 and 2003 “achieved rapid initial military success but no enduring political outcome, resulting in protracted counterinsurgency campaigns.” The Army now anticipates a future operating environment where joint forces can expect to “work together and with unified action partners to successfully prosecute operations short of conflict, prevail in large-scale combat operations, and consolidate gains to win enduring strategic outcomes” that are not necessarily predicated on offensive action alone. We may have to wait for the next edition of FM 3-0 to see if the Army has drawn valid conclusions from the recent past or not.
Consistent Scoring of Weapons and Aggregation of Forces: The Cornerstone of Dupuy’s Quantitative Analysis of Historical Land Battles
by James G. Taylor, PhD, Dept. of Operations Research, Naval Postgraduate School
Introduction
Col. Trevor N. Dupuy was an American original, especially as regards the quantitative study of warfare. As with many prophets, he was not entirely appreciated in his own land, particularly its Military Operations Research (OR) community. However, after becoming rather familiar with the details of his mathematical modeling of ground combat based on historical data, I became aware of the basic scientific soundness of his approach. Unfortunately, his documentation of methodology was not always accepted by others, many of whom appeared to confuse lack of mathematical sophistication in his documentation with lack of scientific validity of his basic methodology.
The purpose of this brief paper is to review the salient points of Dupuy’s methodology from a systems perspective, i.e., to view his methodology as a system, functioning as an organic whole to capture the essence of past combat experience (with an eye towards extrapolation into the future). The advantage of this perspective is that it immediately leads one to the conclusion that if one wants to use some functional relationship derived from Dupuy’s work, then one should use his methodologies for scoring weapons, aggregating forces, and adjusting for operational circumstances, since this consistency is the only guarantee of being able to reproduce historical results and to project them into the future.
Implications (of this systems perspective on Dupuy’s work) for current DOD models will be discussed. In particular, the Military OR community has developed quantitative methods for imputing values to weapon systems based on their attrition capability against opposing forces and force interactions.[1] One such approach is the so-called antipotential-potential method[2] used in TACWAR[3] to score weapons. However, one should not expect such scores to provide valid casualty estimates when combined with historically derived functional relationships such as the so-called ATLAS casualty-rate curves[4] used in TACWAR, because a different “yard-stick” (i.e., measuring system for estimating the relative combat potential of opposing forces) was used to develop such a curve.
Overview of Dupuy’s Approach
This section briefly outlines the salient features of Dupuy’s approach to the quantitative analysis and modeling of ground combat as embodied in his Tactical Numerical Deterministic Model (TNDM) and its predecessor, the Quantified Judgment Model (QJM). The interested reader can find details in Dupuy [1979] (see also Dupuy [1985][5], [1987], [1990]). Here we will view Dupuy’s methodology from a systems approach, which seeks to discern its various components and their interactions and to view these components as an organic whole. Essentially Dupuy’s approach involves the development of functional relationships from historical combat data (see Fig. 1) and then using these functional relationships to model future combat (see Fig. 2).
At the heart of Dupuy’s method is the investigation of historical battles, comparing the relationship of inputs (as quantified by relative combat power, denoted as Pa/Pd for that of the attacker relative to that of the defender in Fig. 1) (e.g. see Dupuy [1979, pp. 59-64]) to outputs (as quantified by extent of mission accomplishment, casualty effectiveness, and territorial effectiveness; see Fig. 2) (e.g. see Dupuy [1979, pp. 47-50]). The salient point is that within this scheme, the main input[6] (i.e. relative combat power) to a historical battle is a derived quantity. It is computed from formulas that involve three essential aspects: (1) the scoring of weapons (e.g. see Dupuy [1979, Chapter 2 and also Appendix A]), (2) aggregation methodology for a force (e.g. see Dupuy [1979, pp. 43-46 and 202-203]), and (3) situational-adjustment methodology for determining the relative combat power of opposing forces (e.g. see Dupuy [1979, pp. 46-47 and 203-204]). In the force-aggregation step the effects on weapons of Dupuy’s environmental variables and one operational variable (air superiority) are considered[7], while in the situation-adjustment step the effects on forces of his behavioral variables[8] (aggregated into a single factor called the relative combat effectiveness value (CEV)) and also the other operational variables are considered (Dupuy [1987, pp. 86-89]).
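Schematically, the three-step derivation described here can be written as follows. This is a shorthand rendering of the structure only, not Dupuy’s exact notation or his full factor lists:

```latex
% Schematic only: weapon scores (OLIs) are aggregated into force strength S,
% which is situationally adjusted into combat power P; the model input is
% the ratio of the two sides' combat power.
\[
  S \;=\; \Bigl(\sum_{i} \mathrm{OLI}_i\Bigr) \times (\text{environmental factors}),
\]
\[
  P \;=\; S \times (\text{operational factors}) \times \mathrm{CEV},
  \qquad
  \text{model input} \;=\; P_a / P_d .
\]
```

The point of writing it this way is that the derived input Pa/Pd inherits every upstream choice: change the weapon scoring, the aggregation, or the adjustment factors, and the same historical battle yields a different input value.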
Moreover, any functional relationships developed by Dupuy depend (unless shown otherwise) on his computational system for derived quantities, namely OLIs, force strengths, and relative combat power. Thus, Dupuy’s results depend in an essential manner on his overall computational system described immediately above. Consequently, any such functional relationship (e.g. casualty-rate curve) directly or indirectly derivative from Dupuy’s work should still use his computational methodology for determination of independent-variable values.
Fig. 1 also reveals another important aspect of Dupuy’s work: the development of reliable data on historical battles. Military judgment plays an essential role in the development of such historical data for a variety of reasons. Dupuy was essentially the only source of new secondary historical data developed from primary sources (see McQuie [1970] for further details). These primary sources are well known to be both incomplete and inconsistent, so that military judgment must be used to fill in the many gaps and reconcile observed inconsistencies. Moreover, military judgment also generates the working hypotheses for model development (e.g. identification of significant variables).
At the heart of Dupuy’s quantitative investigation of historical battles and subsequent model development is his own weapons-scoring methodology, which slowly evolved out of study efforts by the Historical Evaluation and Research Organization (HERO) and its successor organizations (cf. HERO [1967] and compare with Dupuy [1979]). Early HERO [1967, pp. 7-8] work revealed that what one would today call weapons scores developed by other organizations were so poorly documented that HERO had to create its own methodology for developing the relative lethality of weapons, which eventually evolved into Dupuy’s Operational Lethality Indices (OLIs). Dupuy realized that his method was arbitrary (as indeed is its counterpart, called the operational definition, in formal scientific work), but felt that this would be ameliorated if the weapons-scoring methodology were consistently applied to historical battles. Unfortunately, this point is not clearly stated in Dupuy’s formal writings, although it was clearly (and compellingly) made by him in numerous briefings that this author heard over the years.
In other words, from a systems perspective, the functional relationships developed by Colonel Dupuy are part of his analysis system, which includes this weapons-scoring methodology consistently applied (see Fig. 1 again). The derived functional relationships do not stand alone (unless further empirical analysis shows them to hold for any weapons-scoring methodology), but function in concert with computational procedures. Another essential part of this system is Dupuy’s aggregation methodology, which combines numbers, environmental circumstances, and weapons scores to compute the strength (S) of a military force. A key innovation by Colonel Dupuy [1979, pp. 202-203] was to use a nonlinear (more precisely, a piecewise-linear) model for certain elements of force strength. This innovation precluded the occurrence of military absurdities such as air firepower being fully substitutable for ground firepower, antitank weapons being fully effective when armor targets are lacking, etc. The final part of this computational system is Dupuy’s situational-adjustment methodology, which combines the effects of operational circumstances with force strengths to determine relative combat power, e.g. Pa/Pd.
To recapitulate, the determination of an Operational Lethality Index (OLI) for a weapon involves the combination of weapon lethality, quantified in terms of a Theoretical Lethality Index (TLI) (e.g. see Dupuy [1987, p. 84]), and troop dispersion[9] (e.g. see Dupuy [1987, pp. 84-85]). Weapons scores (i.e. the OLIs) are then combined with numbers (own side and enemy) and combat-environment factors to yield force strength. Six[10] different categories of weapons are aggregated, with nonlinear (i.e. piecewise-linear) models being used for the following three categories of weapons: antitank, air defense, and air firepower (i.e. close-air support). Operational variables, e.g. mobility, posture, surprise, etc. (Dupuy [1987, p. 87]), and behavioral variables (quantified as a relative combat effectiveness value (CEV)) are then applied to force strength to determine a side’s combat-power potential.
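The chain of computations just summarized can be made concrete with a short Python sketch. All weapon scores, counts, and factor values below are invented for illustration, and the simple linear sum stands in for Dupuy’s piecewise-linear aggregation; his published factor tables (Dupuy [1979, 1987]) must be consulted for any real calculation.

```python
# Illustrative sketch (not Dupuy's actual tables or constants) of the
# computational chain: OLI -> force strength -> situational adjustment -> Pa/Pd.

def oli(tli, dispersion_factor):
    """Operational Lethality Index: TLI reduced by the era's troop dispersion."""
    return tli / dispersion_factor

def force_strength(olis, counts, env_factor):
    """Aggregate one side's weapons into a strength S, degraded by an
    environmental factor. Dupuy's full model is piecewise-linear for the
    antitank, air-defense, and close-air-support categories; a plain
    linear sum is used here for brevity."""
    return env_factor * sum(n * s for n, s in zip(counts, olis))

def combat_power(strength, operational_factor, cev):
    """Situational adjustment: operational circumstances and the relative
    combat effectiveness value (CEV) are applied to force strength."""
    return strength * operational_factor * cev

# Invented weapon scores and counts for an attacker and a defender
rifle = oli(tli=600, dispersion_factor=60)
gun = oli(tli=3300, dispersion_factor=60)
Sa = force_strength([rifle, gun], [1200, 40], env_factor=0.9)
Sd = force_strength([rifle, gun], [800, 30], env_factor=0.9)
Pa = combat_power(Sa, operational_factor=1.0, cev=1.2)   # attacker
Pd = combat_power(Sd, operational_factor=1.3, cev=1.0)   # defender (posture bonus)
ratio = Pa / Pd
```

The point of the sketch is the ordering of the steps: environmental effects enter during aggregation, while operational and behavioral effects (the CEV) enter only at the situational-adjustment step.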
Requirement for Consistent Scoring of Weapons, Force Aggregation, and Situational Adjustment for Operational Circumstances
The salient point to be gleaned from Figs. 1 and 2 is that the same (or at least consistent) weapons-scoring, aggregation, and situational-adjustment methodologies must be used both for developing functional relationships and for playing them to model future combat. The corresponding computational methods function as a system (organic whole) for determining relative combat power, e.g. Pa/Pd. For the development of functional relationships from historical data, a force ratio (relative combat power of the two opposing sides, e.g. attacker’s combat power divided by that of the defender, Pa/Pd) is computed (i.e. it is a derived quantity) as the independent variable, with observed combat outcome being the dependent variable. Thus, as discussed above, this force ratio depends on the methodologies for scoring weapons, aggregating force strengths, and adjusting a force’s combat power for the operational circumstances of the engagement. It is a priori not clear that different scoring, aggregation, and situational-adjustment methodologies will lead to similar derived values. If such different computational procedures were to be used, these derived values should be recomputed and the corresponding functional relationships rederived and replotted.
However, users of the Tactical Numerical Deterministic Model (TNDM) (or, for that matter, its predecessor, the Quantified Judgment Model (QJM)) need not worry about this point, because it was apparently meticulously observed by Colonel Dupuy in all his work. Portions of his work, however, have found their way into a surprisingly large number of DOD models (usually not explicitly acknowledged), while the context and range of validity of his historical results have been largely ignored by others. The need for recalibration of the historical data and corresponding functional relationships has not been considered in applying Dupuy’s results to some important current DOD models.
Implications for Current DOD Models
A number of important current DOD models (namely, TACWAR and JICM, discussed below) make use of some of Dupuy’s historical results without recalibrating functional relationships such as loss rates and rates of advance as a function of some force ratio (e.g. Pa/Pd). As discussed above, it is not clear that such a procedure will capture the essence of past combat experience. Moreover, in calculating losses, Dupuy first determines personnel losses (expressed as a percent loss of personnel strength, i.e., number of combatants on a side) and then calculates equipment losses as a function of this casualty rate (e.g., see Dupuy [1971, pp. 219-223], also [1990, Chapters 5 through 7][11]). These latter functional relationships are apparently not observed in the models discussed below. In fact, only Dupuy (going back to Dupuy [1979])[12] takes personnel losses to depend on a force ratio and other pertinent variables, with materiel losses being taken as derivative from this casualty rate.
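The ordering just described, personnel losses assessed first and materiel losses derived from them, can be sketched as follows. The functional forms and constants are placeholders, not Dupuy’s published curves (see Dupuy, Attrition [1990] for the actual relationships).

```python
# Sketch of the ORDERING of Dupuy's loss calculation: a personnel casualty
# rate is assessed first (from the force ratio, among other variables), and
# equipment losses are then derived from that rate. Both functions below are
# invented placeholders standing in for Dupuy's empirical curves.

def personnel_casualty_rate(force_ratio, base_rate=0.028):
    """Daily fractional personnel losses for a side; a more favorable
    combat-power ratio lowers one's own loss rate in this placeholder."""
    return base_rate / force_ratio

def equipment_loss_rate(casualty_rate, multiplier=6.0):
    """Materiel (e.g. tank) loss rate taken as a multiple of the personnel
    casualty rate, reflecting the dependence Dupuy documents; the
    multiplier here is invented."""
    return min(1.0, multiplier * casualty_rate)

attacker_rate = personnel_casualty_rate(force_ratio=1.5)  # Pa/Pd = 1.5
tank_rate = equipment_loss_rate(attacker_rate)
```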
For example, TACWAR determines personnel losses[13] by computing a force ratio and then consulting an appropriate casualty-rate curve (referred to as empirical data), in much the same fashion as ATLAS did[14]. However, such a force ratio is computed using a linear model with weapon values determined by the so-called antipotential-potential method[15]. Unfortunately, this procedure may not be consistent with how the empirical data (i.e. the casualty-rate curves) was developed. Further research is required to demonstrate that valid casualty estimates are obtained when different weapon-scoring, aggregation, and situational-adjustment methodologies are used to develop casualty-rate curves from historical data and then to assess losses in aggregated combat models. Furthermore, TACWAR does not use Dupuy’s model for equipment losses (see above), although it does purport to use “historical data” (e.g., see Kerlin et al. [1975, p. 22]) to compute personnel losses as a function (among other things) of a force ratio (given by a linear relationship), involving close air support values in a way never used by Dupuy. Although TACWAR’s force-ratio determination methodology does have logical and mathematical merit, it is not the way that the historical data was developed.
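For readers unfamiliar with the antipotential-potential method, its underlying eigenvalue idea can be illustrated briefly: a weapon’s value is taken to be proportional to the (valued) killing it can accomplish, which leads to a dominant-eigenvector computation. The kill-rate matrices below are invented, and this sketch omits the refinements of Anderson [1974].

```python
# Eigenvalue idea behind the antipotential-potential method: blue weapon
# values v_B are proportional to K_B v_R (the valued kills they achieve),
# and red values v_R are proportional to K_R v_B, so v_B is a dominant
# eigenvector of the product K_B K_R. All kill rates here are invented.

def matvec(M, x):
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def dominant_eigvec(M, iters=200):
    """Power iteration on a nonnegative matrix; result normalized to sum to 1."""
    x = [1.0] * len(M)
    for _ in range(iters):
        x = matvec(M, x)
        s = sum(x)
        x = [x_i / s for x_i in x]
    return x

K_B = [[0.02, 0.10], [0.05, 0.01]]   # blue weapon i's kill rate vs red weapon j
K_R = [[0.03, 0.08], [0.06, 0.02]]   # red weapon i's kill rate vs blue weapon j

KBKR = [[sum(K_B[i][k] * K_R[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
v_B = dominant_eigvec(KBKR)          # relative values of the two blue weapons
```

With these invented matrices the first blue weapon comes out worth twice the second, purely because of what it kills and what its targets can in turn kill.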
Moreover, RAND (Allen [1992]) has more recently developed what is called the situational force scoring (SFS) methodology for calculating force ratios in large-scale, aggregated-force combat situations to determine loss and movement rates. Here, SFS refers essentially to a force-aggregation and situational-adjustment methodology, which has many conceptual elements in common with Dupuy’s methodology (except, most notably, extensive testing against historical data, especially documentation of such efforts). This SFS was originally developed for the RSAS[16] and is today used in the JICM[17]. It also apparently uses a weapon-scoring system developed at RAND[18]. It purports (no documentation given [citation of unpublished work]) to be consistent with historical data (including the ATLAS casualty-rate curves) (Allen [1992, p. 41]), but again no consideration is given to recalibration of historical results for different weapon-scoring, force-aggregation, and situational-adjustment methodologies. SFS emphasizes adjusting force strengths according to the operational circumstances (the “situation”) of the engagement (including surprise), with many innovative ideas (but in some major ways it has little connection with the previous work of others[19]). The resulting model contains many more details than historical combat data would support. It is also a methodology that differs in many essential ways from that used previously by any investigator. In particular, it is doubtful that it develops force ratios in a manner consistent with Dupuy’s work.
Final Comments
Use of (sophisticated) mathematics for modeling past historical combat (and extrapolating it into the future for planning purposes) is no reason for ignoring Dupuy’s work. One would think that the current Military OR community would try to understand Dupuy’s work before trying to improve and extend it. In particular, Colonel Dupuy’s various computational procedures (including constants) must be considered as an organic whole (i.e. a system) supporting the development of functional relationships. If one ignores this computational system and simply tries to use some isolated aspect, the result may be interesting and even logically sound, but it probably lacks any scientific validity.
REFERENCES
P. Allen, “Situational Force Scoring: Accounting for Combined Arms Effects in Aggregate Combat Models,” N-3423-NA, The RAND Corporation, Santa Monica, CA, 1992.
L. B. Anderson, “A Briefing on Anti-Potential Potential (The Eigen-value Method for Computing Weapon Values),” WP-2, Project 23-31, Institute for Defense Analyses, Arlington, VA, March 1974.
B. W. Bennett, et al., “RSAS 4.6 Summary,” N-3534-NA, The RAND Corporation, Santa Monica, CA, 1992.
B. W. Bennett, A. M. Bullock, D. B. Fox, C. M. Jones, J. Schrader, R. Weissler, and B. A. Wilson, “JICM 1.0 Summary,” MR-383-NA, The RAND Corporation, Santa Monica, CA, 1994.
P. K. Davis and J. A. Winnefeld, “The RAND Strategic Assessment Center: An Overview and Interim Conclusions About Utility and Development Options,” R-2945-DNA, The RAND Corporation, Santa Monica, CA, March 1983.
T.N. Dupuy, Numbers, Predictions and War: Using History to Evaluate Combat Factors and Predict the Outcome of Battles, The Bobbs-Merrill Company, Indianapolis/New York, 1979.
T.N. Dupuy, Numbers, Predictions and War, Revised Edition, HERO Books, Fairfax, VA, 1985.
T.N. Dupuy, Understanding War: History and Theory of Combat, Paragon House Publishers, New York, 1987.
T.N. Dupuy, Attrition: Forecasting Battle Casualties and Equipment Losses in Modern War, HERO Books, Fairfax, VA, 1990.
General Research Corporation (GRC), “A Hierarchy of Combat Analysis Models,” McLean, VA, January 1973.
Historical Evaluation and Research Organization (HERO), “Average Casualty Rates for War Games, Based on Historical Data,” 3 Volumes in 1, Dunn Loring, VA, February 1967.
E. P. Kerlin and R. H. Cole, “ATLAS: A Tactical, Logistical, and Air Simulation: Documentation and User’s Guide,” RAC-TP-338, Research Analysis Corporation, McLean, VA, April 1969 (AD 850 355).
E.P. Kerlin, L.A. Schmidt, A.J. Rolfe, M.J. Hutzler, and D.L. Moody, “The IDA Tactical Warfare Model: A Theater-Level Model of Conventional, Nuclear, and Chemical Warfare, Volume II: Detailed Description,” R-211, Institute for Defense Analyses, Arlington, VA, October 1975 (AD B009 692L).
R. McQuie, “Military History and Mathematical Analysis,” Military Review 50, No. 5, 8-17 (1970).
S.M. Robinson, “Shadow Prices for Measures of Effectiveness, I: Linear Model,” Operations Research 41, 518-535 (1993).
J.G. Taylor, Lanchester Models of Warfare, Vols. I & II, Operations Research Society of America, Alexandria, VA, 1983. (a)
J.G. Taylor, “A Lanchester-Type Aggregated-Force Model of Conventional Ground Combat,” Naval Research Logistics Quarterly 30, 237-260 (1983). (b)
NOTES
[1] For example, see Taylor [1983a, Section 7.18], which contains a number of examples. The basic references given there may be more accessible through Robinson [1993].
[2] This term was apparently coined by L.B. Anderson [1974] (see also Kerlin et al. [1975, Chapter I, Section D.3]).
[3] The Tactical Warfare (TACWAR) model is a theater-level, joint-warfare, computer-based combat model that is currently used for decision support by the Joint Staff and essentially all CINC staffs. It was originally developed by the Institute for Defense Analyses in the mid-1970s (see Kerlin et al. [1975]), was originally referred to as TACNUC, and has been continually upgraded up to (and including) the present day.
[4] For example, see Kerlin and Cole [1969], GRC [1973, Fig. 6-6], or Taylor [1983b, Fig. 5] (also Taylor [1983a, Section 7.13]).
[5] The only apparent difference between Dupuy [1979] and Dupuy [1985] is the addition of an appendix (Appendix C “Modified Quantified Judgment Analysis of the Bekaa Valley Battle”) to the end of the latter (pp. 241-251). Hence, the page content is apparently the same for these two books for pp. 1-239.
[6] Technically speaking, one also has the engagement type and possibly several other descriptors (denoted in Fig. 1 as reduced list of operational circumstances) as other inputs to a historical battle.
[7] In Dupuy [1979, e.g. pp. 43-46] only environmental variables are mentioned, although basically the same formulas underlie both Dupuy [1979] and Dupuy [1987]. For simplicity, Fig. 1 and 2 follow this usage and employ the term “environmental circumstances.”
[8] In Dupuy [1979, e.g. pp. 46-47] only operational variables are mentioned, although basically the same formulas underlie both Dupuy [1979] and Dupuy [1987]. For simplicity, Fig. 1 and 2 follow this usage and employ the term “operational circumstances.”
[9] Chris Lawrence has kindly brought to my attention that, since the same value for troop dispersion from a historical period (e.g. see Dupuy [1987, p. 84]) is used for both the attacker and the defender, troop dispersion does not actually affect the determination of relative combat power Pa/Pd.
[10] Eight different weapon types are considered, with three being classified as infantry weapons (e.g. see Dupuy [1979, pp. 43-44], [1987, pp. 85-86]).
[11] Chris Lawrence has kindly informed me that Dupuy’s work on relating equipment losses to personnel losses goes back to the early 1970s and even earlier (e.g. see HERO [1966]). Moreover, Dupuy’s [1992] book Future Wars gives some additional empirical evidence concerning the dependence of equipment losses on casualty rates.
[12] But actually going back much earlier as pointed out in the previous footnote.
[13] See Kerlin et al. [1975, Chapter I, Section D.l].
[14] See Footnote 4 above.
[15] See Kerlin et al. [1975, Chapter I, Section D.3]; see also Footnotes 1 and 2 above.
[16] The RAND Strategy Assessment System (RSAS) is a multi-theater aggregated combat model developed at RAND in the early 1980s (for further details see Davis and Winnefeld [1983] and Bennett et al. [1992]). It evolved into the Joint Integrated Contingency Model (JICM), which is a post-Cold War redesign of the RSAS (starting in FY92).
[17] The Joint Integrated Contingency Model (JICM) is a game-structured computer-based combat model of major regional contingencies and higher-level conflicts, covering strategic mobility, regional conventional and nuclear warfare in multiple theaters, naval warfare, and strategic nuclear warfare (for further details, see Bennett et al. [1994]).
[18] RAND apparently replaced one weapon-scoring system by another (e.g. see Allen [1992, pp. 9, 15, and 87-89]) without making any other changes in their SFS System.
[19] For example, Dupuy’s early HERO work (e.g. see Dupuy [1967]), reworks of these results by the Research Analysis Corporation (RAC) (e.g. see RAC [1973, Fig. 6-6]), and Dupuy’s later work (e.g. see Dupuy [1979]) all considered daily fractional casualties for the attacker and for the defender as the basic casualty-outcome descriptors (see also Taylor [1983b]). However, RAND does not do this, but considers the defender’s loss rate and a casualty exchange ratio as the basic casualty-production descriptors (Allen [1992, pp. 41-42]). The great value of using the former set of descriptors (i.e. attacker and defender fractional loss rates) is that not only is casualty assessment more straightforward (especially the development of functional relationships from historical data) but qualitative model behavior is also readily deduced (see Taylor [1983b] for further details).
[The article below is reprinted from History, Numbers And War: A HERO Journal, Vol. 1, No. 1, Spring 1977, pp. 34-52]
The Lanchester Equations and Historical Warfare: An Analysis of Sixty World War II Land Engagements
By Janice B. Fain
Background and Objectives
The method by which combat losses are computed is one of the most critical parts of any combat model. The Lanchester equations, which state that a unit’s combat losses depend on the size of its opponent, are widely used for this purpose.
In addition to their use in complex dynamic simulations of warfare, the Lanchester equations have also served as simple mathematical models. In fact, during the last decade or so there has been an explosion of theoretical developments based on them. By now their variations and modifications are numerous, and “Lanchester theory” has become almost a separate branch of applied mathematics. However, compared with the effort devoted to theoretical developments, there has been relatively little empirical testing of the basic thesis that combat losses are related to force sizes.
One of the first empirical studies of the Lanchester equations was Engel’s classic work on the Iwo Jima campaign in which he found a reasonable fit between computed and actual U.S. casualties (Note 1). Later studies were somewhat less supportive (Notes 2 and 3), but an investigation of Korean war battles showed that, when the simulated combat units were constrained to follow the tactics of their historical counterparts, casualties during combat could be predicted to within 1 to 13 percent (Note 4).
Taken together, these various studies suggest that, while the Lanchester equations may be poor descriptors of large battles extending over periods during which the forces were not constantly in combat, they may be adequate for predicting losses while the forces are actually engaged in fighting. The purpose of the work reported here is to investigate 60 carefully selected World War II engagements. Since the durations of these battles were short (typically two to three days), it was expected that the Lanchester equations would show a closer fit than was found in studies of larger battles. In particular, one of the objectives was to repeat, in part, Willard’s work on battles of the historical past (Note 3).
The Data Base
Probably the most nearly complete and accurate collection of combat data is the data on World War II compiled by the Historical Evaluation and Research Organization (HERO). From their data HERO analysts selected, for quantitative analysis, the following 60 engagements from four major Italian campaigns:
Salerno, 9-18 Sep 1943, 9 engagements
Volturno, 12 Oct-8 Dec 1943, 20 engagements
Anzio, 22 Jan-29 Feb 1944, 11 engagements
Rome, 14 May-4 June 1944, 20 engagements
The complete data base is described in a HERO report (Note 5). The work described here is not the first analysis of these data. Statistical analyses of weapon effectiveness and the testing of a combat model (the Quantified Judgment Method, QJM) have been carried out (Note 6). The work discussed here examines these engagements from the viewpoint of the Lanchester equations to consider the question: “Are casualties during combat related to the numbers of men in the opposing forces?”
The variables chosen for this analysis are shown in Table 1. The “winners” of the engagements were specified by HERO on the basis of casualties suffered, distance advanced, and subjective estimates of the percentage of the commander’s objective achieved. Variable 12, the Combat Power Ratio, is based on the Operational Lethality Indices (OLI) of the units (Note 7).
The general characteristics of the engagements are briefly described. Of the 60, there were 19 attacks by British forces, 28 by U.S. forces, and 13 by German forces. The attacker was successful in 34 cases; the defender, in 23; and the outcomes of 3 were ambiguous. With respect to terrain, 19 engagements occurred in flat terrain; 24 in rolling, or intermediate, terrain; and 17 in rugged, or difficult, terrain. Clear weather prevailed in 40 cases; 13 engagements were fought in light or intermittent rain; and 7 in medium or heavy rain. There were 28 spring and summer engagements and 32 fall and winter engagements.
Comparison of World War II Engagements With Historical Battles
Since one purpose of this work is to repeat, in part, Willard’s analysis, comparison of these World War II engagements with the historical battles (1618-1905) studied by him will be useful. Table 2 shows a comparison of the distribution of battles by type. Willard’s cases were divided into two categories: I. meeting engagements, and II. sieges, attacks on forts, and similar operations. HERO’s World War II engagements were divided into four types based on the posture of the defender: 1. delay, 2. hasty defense, 3. prepared position, and 4. fortified position. If postures 1 and 2 are considered very roughly equivalent to Willard’s category I, then in both data sets the division into the two gross categories is approximately even.
The distribution of engagements across force ratios, given in Table 3, indicates some differences. Willard’s engagements tend to cluster at the lower end of the scale (1-2) and at the higher end (4 and above), while the majority of the World War II engagements fall in the mid-range (1.5-4) (Note 8). The frequency with which the numerically inferior force achieved victory is shown in Table 4. It is seen that in neither data set are force ratios good predictors of success in battle (Note 9).
Results of the Analysis
Willard’s Correlation Analysis
There are two forms of the Lanchester equations. One represents the case in which firing units on both sides know the locations of their opponents and can shift their fire to a new target when a “kill” is achieved. This leads to the “square” law where the loss rate is proportional to the opponent’s size. The second form represents that situation in which only the general location of the opponent is known. This leads to the “linear” law in which the loss rate is proportional to the product of both force sizes.
As Willard points out, large battles are made up of many smaller fights. Some of these obey one law while others obey the other, so that the overall result should be a combination of the two. Starting with a general formulation of Lanchester’s equations, where g is the exponent of the target unit’s size (that is, g is 0 for the square law and 1 for the linear law), he derives the following linear equation:
log (nc/mc) = log E + g log (mo/no) (1)
where nc and mc are the casualties, E is related to the exchange ratio, and mo and no are the initial force sizes. Linear regression produces a value for g. However, instead of lying between 0 and 1, as expected, the g’s range from -.27 to -.87, with the majority lying around -.5. (Willard obtains several values for g by dividing his data base in various ways: by force ratio, by casualty ratio, by historical period, and so forth.) A negative g value is unpleasant. As Willard notes:
Military theorists should be disconcerted to find g < 0, for in this range the results seem to imply that if the Lanchester formulation is valid, the casualty-producing power of troops increases as they suffer casualties (Note 3).
From his results, Willard concludes that his analysis does not justify the use of Lanchester equations in large-scale situations (Note 10).
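The mechanics of Willard’s regression, equation (1), are easy to reproduce. The Python sketch below fits g and log E by ordinary least squares to synthetic engagements with a planted exponent (the historical data are not reproduced here); it shows only the computation, not Willard’s results.

```python
import math
import random

# Willard-style regression, log(nc/mc) = log E + g log(mo/no), fitted by
# ordinary least squares. The 60 "engagements" are synthetic, generated
# with a planted slope true_g and intercept true_logE plus noise.
random.seed(1)
true_g, true_logE = 0.7, 0.1
points = []
for _ in range(60):
    mo = random.uniform(5000, 40000)   # attacker initial strength
    no = random.uniform(5000, 40000)   # defender initial strength
    x = math.log(mo / no)
    y = true_logE + true_g * x + random.gauss(0, 0.05)  # log casualty ratio
    points.append((x, y))

# Closed-form OLS slope (g) and intercept (log E)
n = len(points)
sx = sum(x for x, _ in points)
sy = sum(y for _, y in points)
sxx = sum(x * x for x, _ in points)
sxy = sum(x * y for x, y in points)
g_hat = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # estimated exponent g
logE_hat = (sy - g_hat * sx) / n                    # estimated log E
```

With data generated by the model itself, the fit recovers the planted exponent; the striking feature of Willard’s result, and of the HERO replication above, is that real engagement data yield a negative g instead.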
Analysis of the World War II Engagements
Willard’s computations were repeated for the HERO data set. For these engagements, regression produced a value of -.594 for g (Note 11), in striking agreement with Willard’s results. Following his reasoning would lead to the conclusion that either the Lanchester equations do not represent these engagements, or that the casualty-producing power of forces increases as their size decreases.
However, since the Lanchester equations are so convenient analytically and their use is so widespread, it appeared worthwhile to reconsider this conclusion. In deriving equation (1), Willard used binomial expansions in which he retained only the leading terms. It seemed possible that the poor results might be due, in part, to this approximation. If the first two terms of these expansions are retained, the following equation results:
log (nc/mc) = log E + g log [(mo-mc)/(no-nc)] (2)
Repeating this regression on the basis of this equation leads to g = -.413 (Note 12), hardly an improvement over the initial results.
A second attempt was made to salvage this approach. Starting with raw OLI scores (Note 7), HERO analysts have computed “combat potentials” for both sides in these engagements, taking into account the operational factors of posture, vulnerability, and mobility; environmental factors like weather, season, and terrain; and (when the record warrants) psychological factors like troop training, morale, and the quality of leadership. Replacing the factor (mo/no) in Equation (1) by the combat power ratio produces the result g = .466 (Note 13).
While this is an apparent improvement in the value of g, it is achieved at the expense of somewhat distorting the Lanchester concept. It does preserve the functional form of the equations, but it requires a somewhat strange definition of “killing rates.”
Analysis Based on the Differential Lanchester Equations
Analysis of the type carried out by Willard appears to produce very poor results for these World War II engagements. Part of the reason for this is apparent from Figure 1, which shows the scatterplot of the dependent variable, log (nc/mc), against the independent variable, log (mo/no). It is clear that no straight line will fit these data very well, and one with a positive slope would not be much worse than the “best” line found by regression. To expect the exponent to account for the wide variation in these data seems unreasonable.
Here, a simpler approach will be taken. Rather than use the data to attempt to discriminate directly between the square and the linear laws, they will be used to estimate linear coefficients under each assumption in turn, starting with the differential formulation rather than the integrated equations used by Willard.
In their simplest differential form, the Lanchester equations may be written:
Square law: dA/dt = -kdD and dD/dt = -kaA (3)
Linear law: dA/dt = -k'dAD and dD/dt = -k'aAD (4)
where
A (D) is the size of the attacker (defender),
dA/dt (dD/dt) is the attacker’s (defender’s) loss rate,
ka, k'a (kd, k'd) are the attacker’s (defender’s) killing rates.
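A minimal Euler integration of these two forms, with invented killing rates, illustrates their qualitative behavior; under the square law with equal killing rates, the initially larger force is favored in the attrition exchange.

```python
# Euler integration of the square-law (3) and linear-law (4) forms above,
# with invented killing rates; the day is the basic time unit, as in the
# analysis, and strengths are clamped at zero.

def simulate(A, D, ka, kd, law="square", days=3, steps_per_day=100):
    """Return surviving attacker and defender strengths after `days` days."""
    dt = 1.0 / steps_per_day
    for _ in range(days * steps_per_day):
        if law == "square":          # loss rate proportional to opponent size
            dA, dD = -kd * D, -ka * A
        else:                        # linear law: proportional to the product
            dA, dD = -kd * A * D, -ka * A * D
        A, D = max(A + dA * dt, 0.0), max(D + dD * dt, 0.0)
    return A, D

# Equal killing rates: under the square law the larger side is favored
A_end, D_end = simulate(A=10000, D=8000, ka=0.05, kd=0.05, law="square")
```

For this symmetric square-law case the quantity kaA² − kdD² is conserved, which is the familiar “quadratic” advantage of numbers; no such advantage appears under the linear law.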
For this analysis, the day is taken as the basic time unit, and the loss rate per day is approximated by the casualties per day. Results of the linear regressions are given in Table 5. No conclusions should be drawn from the fact that the correlation coefficients are higher in the linear law case since this is expected for purely technical reasons (Note 14). A better picture of the relationships is again provided by the scatterplots in Figure 2. It is clear from these plots that, as in the case of the logarithmic forms, a single straight line will not fit the entire set of 60 engagements for either of the dependent variables.
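The mechanics of such fits can be sketched as follows: under the square law, a side’s daily casualties are regressed (through the origin) on the opposing force’s size; under the linear law, on the product of the two sizes. The engagements below are synthetic, not the HERO set, and serve only to show the computation.

```python
# Table 5-style killing-rate fits on synthetic engagement data: regress the
# defender's daily casualties on attacker size (square law) or on the
# product of the two sizes (linear law), with no intercept term.

def fit_rate(xs, ys):
    """Least-squares slope through the origin: k = sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# (attacker size, defender size, defender casualties per day) -- invented
data = [(12000, 9000, 610), (20000, 15000, 980), (8000, 7000, 420),
        (15000, 6000, 740), (25000, 18000, 1260), (10000, 12000, 530)]

ka_square = fit_rate([a for a, d, c in data], [c for a, d, c in data])
ka_linear = fit_rate([a * d for a, d, c in data], [c for a, d, c in data])
```

Note that the two estimates are not comparable in magnitude (one multiplies a size, the other a product of sizes), which is one reason the correlation coefficients in Table 5 cannot by themselves decide between the laws.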
To investigate ways in which the data set might profitably be subdivided for analysis, T-tests of the means of the dependent variable were made for several partitionings of the data set. The results, shown in Table 6, suggest that dividing the engagements by defense posture might prove worthwhile.
Results of the linear regressions by defense posture are shown in Table 7. For each posture, the equation that seemed to give a better fit to the data is underlined (Note 15). From this table, the following very tentative conclusions might be drawn:
In an attack on a fortified position, the attacker suffers casualties by the square law; the defender suffers casualties by the linear law. That is, the defender is aware of the attacker’s position, while the attacker knows only the general location of the defender. (This is similar to Deitchman’s guerrilla model; see Note 16.)
This situation is apparently reversed in the cases of attacks on prepared positions and hasty defenses.
Delaying situations seem to be treated better by the square law for both attacker and defender.
Table 8 summarizes the killing rates by defense posture. The defender has a much higher killing rate than the attacker (almost 3 to 1) in a fortified position. In a prepared position and hasty defense, the attacker appears to have the advantage. However, in a delaying action, the defender’s killing rate is again greater than the attacker’s (Note 17).
Figure 3 shows the scatterplots for these cases. Examination of these plots suggests that a tentative answer to the study question posed above might be: “Yes, casualties do appear to be related to the force sizes, but the relationship may not be a simple linear one.”
In several of these plots it appears that two or more functional forms may be involved. Consider, for example, the defender’s casualties as a function of the attacker’s initial strength in the case of a hasty defense. This plot is repeated in Figure 4, where the points appear to fit the curves sketched there. It would appear that there are at least two, possibly three, separate relationships. The individual engagements have also been identified on that plot, and it is interesting to note that on the curve marked (1), five of the seven attacks were made by Germans, four of them from the Salerno campaign. It would appear from this that German attacks are associated with higher-than-average defender casualties for the attacking force size. Since there are so few data points, this cannot be more than a hint or an interesting suggestion.
Future Research
This work suggests two conclusions that might have an impact on future lines of research on combat dynamics:
Tactics appear to be an important determinant of combat results. This conclusion, in itself, does not appear startling, at least not to the military. However, it does not always seem to have been the case that tactical questions have been considered seriously by analysts in their studies of the effects of varying force levels and force mixes.
Historical data of this type offer rich opportunities for studying the effects of tactics. For example, consideration of the narrative accounts of these battles might permit re-coding the engagements into a larger, more sensitive set of engagement categories. (It would, of course, then be highly desirable to add more engagements to the data set.)
While predictions of the future are always dangerous, I would nevertheless like to suggest what appears to be a possible trend. While military analysis of the past two decades has focused almost exclusively on the hardware of weapons systems, at least part of our future analysis will be devoted to the more behavioral aspects of combat.
Janice Bloom Fain, a Senior Associate of CACI, Inc., is a physicist whose special interests are in the applications of computer simulation techniques to industrial and military operations; she is the author of numerous reports and articles in this field. This paper was presented by Dr. Fain at the Military Operations Research Symposium at Fort Eustis, Virginia.
[5.] HERO, “A Study of the Relationship of Tactical Air Support Operations to Land Combat, Appendix B, Historical Data Base.” Historical Evaluation and Research Organization, report prepared for the Defense Operational Analysis Establishment, U.K.T.S.D., Contract D-4052 (1971).
[6.] T. N. Dupuy, The Quantified Judgment Method of Analysis of Historical Combat Data, HERO Monograph, (January 1973); HERO, “Statistical Inference in Analysis in Combat,” Annex F, Historical Data Research on Tactical Air Operations, prepared for Headquarters USAF, Assistant Chief of Staff for Studies and Analysis, Contract No. F-44620-70-C-0058 (1972).
[7.] The Operational Lethality Index (OLI) is a measure of weapon effectiveness developed by HERO.
[8.] Since Willard’s data did not indicate which side was the attacker, his force ratio is defined to be (larger force/smaller force). The HERO force ratio is (attacker/defender).
[9.] Since the criteria for success may have been rather different for the two sets of battles, this comparison may not be very meaningful.
[10.] This work includes more complex analysis in which the possibility that the two forces may be engaging in different types of combat is considered, leading to the use of two exponents rather than the single one. Stochastic combat processes are also treated.
[11.] Correlation coefficient = -.262; intercept = .00115; slope = -.594.
[12.] Correlation coefficient = -.184; intercept = .0539; slope = -.413.
[13.] Correlation coefficient = .303; intercept = -.638; slope = .466.
[14.] Correlation coefficients for the linear law are inflated with respect to the square law since the independent variable is a product of force sizes and, thus, has a higher variance than the single force size unit in the square law case.
[15.] This is a subjective judgment based on the following considerations: Since the correlation coefficient is inflated for the linear law, when it is lower the square law case is chosen. When the linear law correlation coefficient is higher, the case with the intercept closer to 0 is chosen.
[17.] As pointed out by Mr. Alan Washburn, who prepared a critique on this paper, when comparing numerical values of the square law and linear law killing rates, the differences in units must be considered. (See footnotes to Table 7).
After discussing with Chris the series of recent posts on the subject of breakpoints, it seemed appropriate to provide a better definition of exactly what a breakpoint is.
Dorothy Kneeland Clark was the first to define the notion of a breakpoint in her study, Casualties as a Measure of the Loss of Combat Effectiveness of an Infantry Battalion (Operations Research Office, The Johns Hopkins University: Baltimore, 1954). She found it was not quite as clear-cut as it seemed and the working definition she arrived at was based on discussions and the specific combat outcomes she found in her data set [pp 9-12].
DETERMINATION OF BREAKPOINT
The following definitions were developed out of many discussions. A unit is considered to have lost its combat effectiveness when it is unable to carry out its mission. The onset of this inability constitutes a breakpoint. A unit’s mission is the objective assigned in the current operations order or any other instructional directive, written or verbal. The objective may be, for example, to attack in order to take certain positions, or to defend certain positions.
How does one determine when a unit is unable to carry out its mission? The obvious indication is a change in operational directive: the unit is ordered to stop short of its original goal, to hold instead of attack, to withdraw instead of hold. But one or more extraneous elements may cause the issue of such orders:
(1) Some other unit taking part in the operation may have lost its combat effectiveness, and its predicament may force changes in the tactical plan. For example, the inability of one infantry battalion to take a hill may require that the two adjoining battalions be stopped to prevent exposing their flanks by advancing beyond it.
(2) A unit may have been assigned an objective on the basis of a G-2 estimate of enemy weakness which, as the action proceeds, proves to have been over-optimistic. The operations plan may, therefore, be revised before the unit has carried out its orders to the point of losing combat effectiveness.
(3) The commanding officer, for reasons quite apart from the tactical attrition, may change his operations plan. For instance, General Ridgway in May 1951 was obliged to cancel his plans for a major offensive north of the 38th parallel in Korea in obedience to top level orders dictated by political considerations.
(4) Even if the supposed combat effectiveness of the unit is the determining factor in the issuance of a revised operations order, a serious difficulty in evaluating the situation remains. The commanding officer’s decision is necessarily made on the basis of information available to him plus his estimate of his unit’s capacities. Either or both of these bases may be faulty. The order may belatedly recognize a collapse which has in fact occurred hours earlier, or a commanding officer may withdraw a unit which could hold for a much longer time.
It was usually not hard to discover when changes in orders resulted from conditions such as the first three listed above, but it proved extremely difficult to distinguish between revised orders based on a correct appraisal of the unit’s combat effectiveness and those issued in error. It was concluded that the formal order for a change in mission cannot be taken as a definitive indication of the breakpoint of a unit. It seemed necessary to go one step farther and search the records to learn what a given battalion did regardless of provisions in formal orders…
CATEGORIES OF BREAKPOINTS SELECTED
In the engagements studied the following categories of breakpoint were finally selected:
Category of Breakpoint (No. Analyzed)
I. Attack → rapid reorganization → attack (9)
II. Attack → defense (no longer able to attack without a few days of recuperation and reinforcement) (21)
III. Defense → withdrawal by order to a secondary line (13)
IV. Defense → collapse (5)
Disorganization and panic were taken as unquestionable evidence of loss of combat effectiveness. It appeared, however, that there were distinct degrees of magnitude in these experiences. In addition to the expected breakpoints at attack → defense and defense → collapse, a further category, I, seemed to be indicated to include situations in which an attacking battalion was “pinned down” or forced to withdraw in partial disorder but was able to reorganize in 4 to 24 hours and continue attacking successfully.
Category II includes (a) situations in which an attacking battalion was ordered into the defensive after severe fighting or temporary panic; (b) situations in which a battalion, after attacking successfully, failed to gain ground although still attempting to advance and was finally ordered into defense, the breakpoint being taken as occurring at the end of successful advance. In other words, the evident inability of the unit to fulfill its mission was used as the criterion for the breakpoint whether orders did or did not recognize its inability. Battalions after experiencing such a breakpoint might be able to recuperate in a few days to the point of renewing successful attack or might be able to continue for some time in defense.
The sample of breakpoints coming under category IV, defense → collapse, proved to be very small (5) and unduly weighted in that four of the examples came from the same engagement. It was, therefore, discarded as probably not representative of the universe of category IV breakpoints,* and another category (III) was added: situations in which battalions on the defense were ordered withdrawn to a quieter sector. Because only those instances were included in which the withdrawal orders appeared to have been dictated by the condition of the unit itself, it is believed that casualty levels for this category can be regarded as but slightly lower than those associated with defense → collapse.
In both categories II and III, “defense” represents an active situation in which the enemy is attacking aggressively.
* It had been expected that breakpoints in this category would be associated with very high losses. Such did not prove to be the case. In whatever way the data were approached, most of the casualty averages were only slightly higher than those associated with category II (attack → defense), although the spread in data was wider. It is believed that factors other than casualties, such as bad weather, difficult terrain, and heavy enemy artillery fire undoubtedly played major roles in bringing about the collapse in the four units taking part in the same engagement. Furthermore, the casualty figures for the four units themselves are in question because, as the situation deteriorated, many of the men developed severe cases of trench foot and combat exhaustion, but were not evacuated, as they would have been in a less desperate situation, and did not appear in the casualty records until they had made their way to the rear after their units had collapsed.
In 1987-1988, Trevor Dupuy and colleagues at Data Memory Systems, Inc. (DMSi), Janice Fain, Rich Anderson, Gay Hammerman, and Chuck Hawkins sought to create a broader, more generally applicable definition of breakpoints for the study Forced Changes of Combat Posture (DMSi, Fairfax, VA, 1988) [pp. I-2-3].
The combat posture of a military force is the immediate intention of its commander and troops toward the opposing enemy force, together with the preparations and deployment to carry out that intention. The chief combat postures are attack, defend, delay, and withdraw.
A change in combat posture (or posture change) is a shift from one posture to another, as, for example, from defend to attack or defend to withdraw. A posture change can be either voluntary or forced.
A forced posture change (FPC) is a change in combat posture by a military unit that is brought about, directly or indirectly, by enemy action. Forced posture changes are characteristically and almost always changes to a less aggressive posture. The most usual FPCs are from attack to defend and from defend to withdraw (or retrograde movement). A change from withdraw to combat ineffectiveness is also possible.
Breakpoint is a term sometimes used as synonymous with forced posture change, and sometimes used to mean the collapse of a unit into ineffectiveness or rout. The latter meaning is probably more common in general usage, while forced posture change is the more precise term for the subject of this study. However, for brevity and convenience, and because this study has been known informally since its inception as the “Breakpoints” study, the term breakpoint is sometimes used in this report. When it is used, it is synonymous with forced posture change.
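The DMSi definitions above lend themselves to a simple encoding: order the postures by aggressiveness, and a forced posture change is characteristically a shift down that ordering brought about by enemy action. The sketch below is an illustrative assumption of how that might be coded, not part of the DMSi study itself.

```python
# Sketch of the DMSi posture-change definitions. The numeric
# aggressiveness ordering and the classification logic are
# illustrative assumptions, not the study's formal criteria.

AGGRESSIVENESS = {"attack": 3, "defend": 2, "delay": 1, "withdraw": 0}

def classify_posture_change(before, after, caused_by_enemy):
    """Label a posture shift per the forced-posture-change (FPC) idea:
    a change to a less aggressive posture caused by enemy action."""
    if before == after:
        return "no change"
    less_aggressive = AGGRESSIVENESS[after] < AGGRESSIVENESS[before]
    if caused_by_enemy and less_aggressive:
        return "forced posture change (breakpoint)"
    return "voluntary posture change"

print(classify_posture_change("attack", "defend", caused_by_enemy=True))
print(classify_posture_change("defend", "attack", caused_by_enemy=False))
```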
Hopefully this will help clarify the previous discussions of breakpoints on the blog.
One of the least studied aspects of combat is battle termination. Why do units in combat stop attacking or defending? Shifts in combat posture (attack, defend, delay, withdrawal) are usually voluntary, directed by a commander, but they can also be involuntary, as a result of direct or indirect enemy action. Why do involuntary changes in combat posture, known as breakpoints, occur?
As Chris pointed out in a previous post, the topic of breakpoints has been addressed by only two known studies since 1954. Most existing military combat models and wargames address breakpoints in at least a cursory way, usually through some calculation based on personnel casualties. Both of the breakpoint studies suggest, however, that involuntary changes in posture are seldom related to casualties alone.
Current U.S. Army doctrine addresses changes in combat posture through discussions of culmination points in the attack, and transitions from attack to defense, defense to counterattack, and defense to retrograde. But these all pertain to voluntary changes, not breakpoints.
Army doctrinal literature has little to say about breakpoints, either in the context of friendly forces or potential enemy combatants. The little it does say relates to the effects of fire on enemy forces and is based on personnel and material attrition.
According to ADRP 1-02 Terms and Military Symbols, an enemy combat unit is considered suppressed after suffering 3% personnel casualties or material losses, neutralized by 10% losses, and destroyed upon sustaining 30% losses. The sources and methodology for deriving these figures are unknown, although these specific terms and numbers have been a part of Army doctrine for decades.
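The doctrinal thresholds quoted above amount to a simple step function over a unit’s loss fraction. The function below is only an illustration of that reading; the 3/10/30 percent figures are the ones from ADRP 1-02, everything else is assumption.

```python
# Effect-of-fire categories per the ADRP 1-02 figures quoted above:
# suppressed at 3% losses, neutralized at 10%, destroyed at 30%.
# The function itself is an illustrative reading, not doctrine.

def unit_status(losses, initial_strength):
    """Classify a unit by its fractional personnel/materiel losses."""
    fraction = losses / initial_strength
    if fraction >= 0.30:
        return "destroyed"
    if fraction >= 0.10:
        return "neutralized"
    if fraction >= 0.03:
        return "suppressed"
    return "effective"

# A 600-man battalion with 70 casualties (~11.7%) rates "neutralized":
print(unit_status(70, 600))
```

Note how coarse this is compared with the breakpoint findings above: a purely casualty-based threshold ignores the posture, mission, and circumstances that both breakpoint studies found decisive.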
The joint U.S. Army and U.S. Marine Corps vision of future land combat foresees battlefields that are highly lethal and demanding on human endurance. How will such a future operational environment affect combat performance? Past experience undoubtedly offers useful insights but there seems to be little interest in seeking out such knowledge.
The Dupuy Air Campaign Model
by Col. Joseph A. Bulger, Jr., USAF, Ret.
The Dupuy Institute, as part of the DACM [Dupuy Air Campaign Model] effort, created a draft model in a spreadsheet format to show how such a model would calculate attrition. Below are the actual printouts of the “interim methodology demonstration,” which shows the types of inputs, outputs, and equations used for the DACM. The spreadsheet was created by Col. Bulger, while many of the formulae were the work of Robert Shaw.
Air Model Historical Data Study by Col. Joseph A. Bulger, Jr., USAF, Ret.
The Air Model Historical Study (AMHS) was designed to lead to the development of an air campaign model for use by the Air Command and Staff College (ACSC). This model, never completed, became known as the Dupuy Air Campaign Model (DACM). It was a team effort led by Trevor N. Dupuy and included the active participation of Lt. Col. Joseph Bulger, Gen. Nicholas Krawciw, Chris Lawrence, Dave Bongard, Robert Schmaltz, Robert Shaw, Dr. James Taylor, John Kettelle, Dr. George Daoust and Louis Zocchi, among others. After Dupuy’s death, I took over as the project manager.
At the first meeting of the team Dupuy assembled for the study, it became clear that this effort would be a serious challenge. In his own style, Dupuy was careful to provide essential guidance while, at the same time, cultivating a broad investigative approach to the unique demands of modeling air combat. It would have been no surprise if the initial guidance established a focus on the analytical approach, level of aggregation, and overall philosophy of the QJM [Quantified Judgment Model] and TNDM [Tactical Numerical Deterministic Model]. It was clear that Trevor had no intention of steering the study into an air combat modeling methodology based directly on QJM/TNDM. To the contrary, he insisted on a rigorous derivation of the factors that would permit the final choice of model methodology.
At the time of Dupuy’s death in June 1995, the Air Model Historical Data Study had reached a point where a major decision was needed. The early months of the study had been devoted to developing a consensus among the TDI team members with respect to the factors that needed to be included in the model. The discussions tended to highlight three areas of particular interest—factors that had been included in models currently in use, the limitations of these models, and the need for new factors (and relationships) peculiar to the properties and dynamics of the air campaign. Team members formulated a family of relationships and factors, but the model architecture itself was not investigated beyond the surface considerations.
Despite substantial contributions from team members, including analytical demonstrations of selected factors and air combat relationships, no consensus had been achieved. On the contrary, there was a growing sense of need to abandon traditional modeling approaches in favor of a new application of the “Dupuy Method” based on a solid body of air combat data from WWII.
The Dupuy approach to modeling land combat relied heavily on the ratio of force strengths (largely determined by firepower as modified by other factors). After almost a year of investigations by the AMHS team, it was beginning to appear that air combat differed in a fundamental way from ground combat. The essence of the difference is that in air combat, the outcome of the maneuver battle for platform position must be determined before the firepower relationships may be brought to bear on the battle outcome.
At the time of Dupuy’s death, it was apparent that if the study contract was to yield a meaningful product, an immediate choice of analysis thrust was required. Shortly prior to Dupuy’s death, I and other members of the TDI team recommended that we adopt the overall approach, level of aggregation, and analytical complexity that had characterized Dupuy’s models of land combat. We also agreed on the time-sequenced predominance of the maneuver phase of air combat. When I was asked to take the analytical lead for the contract in Dupuy’s absence, I was reasonably confident that there was overall agreement.
In view of the time available to prepare a deliverable product, it was decided to prepare a model using the air combat data we had been evaluating up to that point—June 1995. Fortunately, Robert Shaw had developed a set of preliminary analysis relationships that could be used in an initial assessment of the maneuver/firepower relationship. In view of the analytical, logistic, contractual, and time factors discussed, we decided to complete the contract effort based on the following analytical thrust:
The contract deliverable would be based on the maneuver/firepower analysis approach as currently formulated in Robert Shaw’s performance equations;
A spreadsheet formulation of outcomes for selected (Battle of Britain) engagements would be presented to the customer in August 1995;
To the extent practical, a working model would be provided to the customer with suggestions for further development.
During the following six weeks, the demonstration model was constructed. The model (programmed for a Lotus 1-2-3 style spreadsheet formulation) was developed, mechanized, and demonstrated to ACSC in August 1995. The final report was delivered in September of 1995.
The working model demonstrated to ACSC in August 1995 suggests the following observations:
A substantial contribution to the understanding of air combat modeling has been achieved.
While relationships developed in the Dupuy Air Combat Model (DACM) are not fully mature, they are analytically significant.
The approach embodied in DACM derives its authenticity from the well-known “Dupuy Method,” thus ensuring its strong correlations with actual combat data.
Although demonstrated only for air combat in the Battle of Britain, the methodology is fully capable of incorporating modern technology contributions to sensor, command and control, and firepower performance.
The knowledge base, fundamental performance relationships, and methodology contributions embodied in DACM are worthy of further exploration. They await only the expression of interest and a relatively modest investment to extend the analysis methodology into modern air combat and the engagements anticipated for the 21st Century.
One final observation seems appropriate. The DACM demonstration provided to ACSC in August 1995 should not be dismissed as a perhaps interesting, but largely simplistic approach to air combat modeling. It is a significant contribution to the understanding of air combat relationships that will prevail in the 21st Century. The Dupuy Institute is convinced that further development of DACM makes eminent good sense. An exploitation of the maneuver and firepower relationships already demonstrated in DACM will provide a valid basis for modeling air combat with modern technology sensors, control mechanisms, and weapons. It is appropriate to include the Dupuy name in the title of this latest in a series of distinguished combat models. Trevor would be pleased.
“If we maintain our faith in God, love of freedom, and superior global airpower, the future [of the US] looks good.” — U.S. Air Force General Curtis E. LeMay (Commander, U.S. Strategic Command, 1948-1957)
Curtis LeMay was involved in the formation of RAND Corporation after World War II. RAND created several models to measure the dynamics of the US-China military balance over time. Since 1996, this has been computed for two scenarios, differing by range from mainland China: one over Taiwan and the other over the Spratly Islands. The model results for selected years can be seen in the graphic below.
The capabilities listed in the RAND study are interesting, notably that in the air superiority category rough parity exists as of 2017. Also, the ability to attack air bases has given an advantage to the Chinese forces.
Investigating the methodology used does not yield any precise quantitative modeling examples, as would be expected in a rigorous academic effort, although there is some mention of statistics, simulation and historical examples.
The analysis presented here necessarily simplifies a great number of conflict characteristics. The emphasis throughout is on developing and assessing metrics in each area that provide a sense of the level of difficulty faced by each side in achieving its objectives. Apart from practical limitations, selectivity is driven largely by the desire to make the work transparent and replicable. Moreover, given the complexities and uncertainties in modern warfare, one could make the case that it is better to capture a handful of important dynamics than to present the illusion of comprehensiveness and precision. All that said, the analysis is grounded in recognized conclusions from a variety of historical sources on modern warfare, from the air war over Korea and Vietnam to the naval conflict in the Falklands and SAM hunting in Kosovo and Iraq. [Emphasis added].
We coded most of the scorecards (nine out of ten) using a five-color stoplight scheme to denote major or minor U.S. advantage, a competitive situation, or major or minor Chinese advantage. Advantage, in this case, means that one side is able to achieve its primary objectives in an operationally relevant time frame while the other side would have trouble in doing so. [Footnote] For example, even if the U.S. military could clear the skies of Chinese escort fighters with minimal friendly losses, the air superiority scorecard could be coded as “Chinese advantage” if the United States cannot prevail while the invasion hangs in the balance. If U.S. forces cannot move on to focus on destroying attacking strike and bomber aircraft, they cannot contribute to the larger mission of protecting Taiwan.
All of the dynamic modeling methodology (which involved a mix of statistical analysis, Monte Carlo simulation, and modified Lanchester equations) is publicly available and widely used by specialists at U.S. and foreign civilian and military universities. [Emphasis added].
As TDI has contended before, the problem with using Lanchester’s equations is that, despite numerous efforts, no one has been able to demonstrate that they accurately represent real-world combat. So, even with statistics and simulation, how good are the results if they have relied on factors or force ratios with no relation to actual combat?
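The RAND passage names a mix of statistical analysis, Monte Carlo simulation, and modified Lanchester equations. To make concrete what is being critiqued here, below is a minimal stochastic Lanchester-square duel of the kind that underlies such models; every kill-rate coefficient in it is a bare assumption, which is exactly TDI’s point: without validation against real combat data, the outputs are only as good as those assumed coefficients.

```python
# Minimal Monte Carlo Lanchester-square duel: each side's expected
# kills per step are proportional to the opposing force's size.
# The kill rates are notional assumptions, not validated values.
import random

def duel(blue, red, blue_rate, red_rate, seed=0):
    rng = random.Random(seed)
    while blue > 0 and red > 0:
        # Per-shooter Bernoulli kills approximate aimed-fire attrition.
        blue_kills = sum(rng.random() < blue_rate for _ in range(blue))
        red_kills = sum(rng.random() < red_rate for _ in range(red))
        blue, red = max(blue - red_kills, 0), max(red - blue_kills, 0)
    return blue, red

# With equal (assumed) rates, the 100-strong side should nearly
# always beat the 80-strong side, per the square law.
wins = sum(duel(100, 80, 0.05, 0.05, seed=s)[0] > 0 for s in range(200))
print(f"Blue (100 vs 80, equal rates) wins {wins}/200 runs")
```

Changing either rate parameter swings the outcome distribution dramatically, which is why unvalidated coefficients make such results hard to trust.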
What about new capabilities?
As previously posted, the Kratos Mako Unmanned Combat Aerial Vehicle (UCAV), marketed as the “unmanned wingman,” has recently been cleared for export by the U.S. State Department. This vehicle is specifically oriented towards air-to-air combat and is stated to have unparalleled maneuverability, as it need not abide by limits imposed by human physiology. The Mako “offers fighter-like performance and is designed to function as a wingman to manned aircraft, as a force multiplier in contested airspace, or to be deployed independently or in groups of UASs. It is capable of carrying both weapons and sensor systems.” In addition, the Mako can be launched independently of a runway, as illustrated below. The price for these vehicles is $3 million each, dropping to $2 million each for an order of at least 100 units. Assuming a cost of $95 million for an F-35A, we can imagine a hypothetical combat scenario pitting two F-35As against 100 of these Mako UCAVs in a drone swarm, a great example of the famous phrase that quantity has a quality all its own.
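The cost comparison behind that scenario is easy to make explicit: at the quoted $2 million volume price, a 100-Mako swarm costs about the same as two F-35As at the assumed $95 million unit cost.

```python
# Cost arithmetic for the hypothetical scenario above, using the
# quoted $2M volume price per Mako and the assumed $95M per F-35A.

MAKO_VOLUME_PRICE = 2_000_000   # per unit for orders of 100+
F35A_COST = 95_000_000          # assumed unit cost

swarm_cost = 100 * MAKO_VOLUME_PRICE
f35_pair_cost = 2 * F35A_COST

print(f"100 Makos: ${swarm_cost:,}")                # $200,000,000
print(f"2 F-35As:  ${f35_pair_cost:,}")             # $190,000,000
print(f"Ratio: {swarm_cost / f35_pair_cost:.2f}")   # 1.05
```

So the drone swarm and the two-fighter flight are roughly cost-matched, which is what makes the quantity-versus-quality question interesting.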
How to evaluate the effects of these possible UCAV drone swarms?
In building up towards the analysis of all of these capabilities in a full theater, campaign-level conflict, some supplemental wargaming may be useful. One game that takes a good shot at modeling these dynamics is Asian Fleet. This is part of the venerable Fleet Series, published by Victory Games and designed by Joseph Balkoski to model modern (that is, Cold War) naval combat. This game system has been extended in recent years, originally by Command Magazine Japan, and then later by Technical Term Gaming Company.