Tag combat models

TDI Friday Read: The Lanchester Equations

Frederick W. Lanchester (1868-1946), British engineer and author of the Lanchester combat attrition equations. [Lanchester.com]

Today’s edition of TDI Friday Read addresses the Lanchester equations and their use in U.S. combat models and simulations. In 1916, British engineer Frederick W. Lanchester published a set of calculations he had derived for determining the results of attrition in combat. Lanchester intended them to be applied as an abstract conceptualization of aerial combat, stating that he did not believe they were applicable to ground combat.

Due to their elegant simplicity, U.S. military operations researchers nevertheless began incorporating the Lanchester equations into their land warfare computer combat models and simulations in the 1950s and 60s. The equations are the basis for many models and simulations used throughout the U.S. defense community today.
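For readers unfamiliar with them, the equations in their commonly cited "aimed fire" (square law) form can be sketched in a few lines of Python. This is an illustration only: the function names, force sizes, and attrition coefficients below are hypothetical and are not taken from any U.S. model or from Lanchester's own text.

```python
import math


def lanchester_square(a0, b0, alpha, beta, dt=0.001, t_max=1000.0):
    """Integrate dA/dt = -beta*B and dB/dt = -alpha*A with forward Euler.

    alpha is the rate at which each A unit destroys B units, and beta the
    rate at which each B unit destroys A units (both values are assumptions).
    Stops when one side reaches zero or t_max is exceeded.
    """
    a, b, t = float(a0), float(b0), 0.0
    while a > 0 and b > 0 and t < t_max:
        a_next = a - beta * b * dt
        b_next = b - alpha * a * dt
        a, b = max(a_next, 0.0), max(b_next, 0.0)
        t += dt
    return a, b, t


def square_law_survivors(a0, b0, alpha, beta):
    """Closed-form survivors, using the conserved quantity alpha*A^2 - beta*B^2."""
    q = alpha * a0 ** 2 - beta * b0 ** 2
    if q > 0:
        return math.sqrt(q / alpha), 0.0
    if q < 0:
        return 0.0, math.sqrt(-q / beta)
    return 0.0, 0.0


if __name__ == "__main__":
    # Hypothetical engagement: 1,000 vs. 800 with equal per-unit effectiveness.
    print(lanchester_square(1000, 800, 0.05, 0.05))
    print(square_law_survivors(1000, 800, 0.05, 0.05))
```

With equal effectiveness on both sides, the larger force in this sketch ends with roughly 600 survivors while the smaller force is eliminated. That is the "square law" effect of numbers, and it is exactly the kind of tidy result that has never been shown to match real-world combat.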

The problem with using Lanchester’s equations is that, despite numerous efforts, no one has been able to demonstrate that they accurately represent real-world combat.

Lanchester equations have been weighed….

Really…..Lanchester?

Trevor Dupuy was critical of combat models based on the Lanchester equations because they cannot account for the role behavioral and moral (i.e. human) factors play in combat.

Human Factors In Warfare: Interaction Of Variable Factors

He was also critical of models and simulations that had not been tested to see whether they could reliably represent real-world combat experience. In the modeling and simulation community, this sort of testing is known as validation.

Military History and Validation of Combat Models

The use of unvalidated concepts, like the Lanchester equations, and unvalidated combat models and simulations persists. Critics have dubbed this the “base of sand” problem, and it continues to affect not only models and simulations, but all abstract theories of combat, including those represented in military doctrine.

https://dupuyinstitute.dreamhosters.com/2017/04/10/wargaming-multi-domain-battle-the-base-of-sand-problem/

How Does the U.S. Army Calculate Combat Power? ¯\_(ツ)_/¯

The constituents of combat power as described in current U.S. military doctrine. [The Lightning Press]

One of the fundamental concepts of U.S. warfighting doctrine is combat power. The current U.S. Army definition is “the total means of destructive, constructive, and information capabilities that a military unit or formation can apply at a given time. (ADRP 3-0).” It is the construct commanders and staffs are taught to use to assess the relative effectiveness of combat forces and is woven deeply throughout all aspects of U.S. operational thinking.

To execute operations, commanders conceptualize capabilities in terms of combat power. Combat power has eight elements: leadership, information, mission command, movement and maneuver, intelligence, fires, sustainment, and protection. The Army collectively describes the last six elements as the warfighting functions. Commanders apply combat power through the warfighting functions using leadership and information. [ADP 3-0, Operations]

Yet, there is no formal method in U.S. doctrine for estimating combat power. The existing process is intentionally subjective and largely left up to judgment. This is problematic, given that assessing the relative combat power of friendly and opposing forces on the battlefield is the first step in Course of Action (COA) development, which is at the heart of the U.S. Military Decision-Making Process (MDMP). Estimates of combat power also figure heavily in determining the outcomes of wargames evaluating proposed COAs.

The Existing Process

The Army’s current approach to combat power estimation is outlined in Field Manual (FM) 6-0 Commander and Staff Organization and Operations (2014). Planners are instructed to “make a rough estimate of force ratios of maneuver units two levels below their echelon.” They are then directed to “compare friendly strengths against enemy weaknesses, and vice versa, for each element of combat power.” It is “by analyzing force ratios and determining and comparing each force’s strengths and weaknesses as a function of combat power” that planners gain insight into tactical and operational capabilities, perspectives, vulnerabilities, and required resources.

That is it. Planners are told that “although the process uses some numerical relationships, the estimate is largely subjective. Assessing combat power requires assessing both tangible and intangible factors, such as morale and levels of training.” There is no guidance as to how to determine force ratios [numbers of troops or weapons systems?]. Nor is there any description of how to relate force calculations to combat power. Should force strengths be used somehow to determine a combat power value? Who knows? No additional doctrinal or planning references are provided.

Planners then use these subjective combat power assessments as they shape potential COAs and test them through wargaming. Although explicitly warned not to “develop and recommend COAs based solely on mathematical analysis of force ratios,” they are invited at this stage to consult a table of “minimum historical planning ratios as a starting point.” The table is clearly derived from the ubiquitous 3-1 rule of combat. Contrary to what FM 6-0 claims, neither the 3-1 rule nor the table have a clear historical provenance or any sort of empirical substantiation. There is no proven validity to any of the values cited. It is not even clear whether the “historical planning ratios” apply to manpower, firepower, or combat power.

During this phase, planners are advised to account for “factors that are difficult to gauge, such as impact of past engagements, quality of leaders, morale, maintenance of equipment, and time in position. Levels of electronic warfare support, fire support, close air support, civilian support, and many other factors also affect arraying forces.” FM 6-0 offers no detail as to how these factors should be measured or applied, however.

FM 6-0 also addresses combat power assessment for stability and civil support operations through troop-to-task analysis. Force requirements are to be based on an estimate of troop density, a “ratio of security forces (including host-nation military and police forces as well as foreign counterinsurgents) to inhabitants.” The manual advises that “most density recommendations fall within a range of 20 to 25 counterinsurgents for every 1,000 residents in an area of operations. A ratio of twenty counterinsurgents per 1,000 residents is often considered the minimum troop density required for effective counterinsurgency operations.”

While FM 6-0 acknowledges that “as with any fixed ratio, such calculations strongly depend on the situation,” it does not mention that any references to force level requirements, tie-down ratios, or troop density were stripped from both Joint and Army counterinsurgency manuals in 2013 and 2014. Yet, this construct lingers on in official staff planning doctrine. (Recent research challenged the validity of the troop density construct but the Defense Department has yet to fund any follow-on work on the subject.)
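The troop density arithmetic quoted from FM 6-0 is simple enough to sketch directly. The function name and the population figure below are hypothetical, chosen only to show how the 20-to-25-per-1,000 range translates into a force requirement. The sketch says nothing about whether the ratio itself is valid.

```python
import math


def required_security_forces(population, per_thousand=20.0):
    """Security forces implied by a troop density ratio (forces per 1,000 residents)."""
    if population < 0 or per_thousand < 0:
        raise ValueError("population and ratio must be non-negative")
    return math.ceil(population * per_thousand / 1000.0)


if __name__ == "__main__":
    population = 1_500_000  # hypothetical area of operations
    low = required_security_forces(population, 20.0)
    high = required_security_forces(population, 25.0)
    print(f"{low:,} to {high:,} security forces")
```

For the assumed population of 1.5 million, the quoted range implies 30,000 to 37,500 security forces, counting host-nation military and police as well as foreign counterinsurgents.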

The Army Has Known About The Problem For A Long Time

The Army has tried several solutions to the problem of combat power estimation over the years. In the early 1970s, the U.S. Army Center for Army Analysis (CAA; known then as the U.S. Army Concepts & Analysis Agency) developed the Weighted Equipment Indices/Weighted Unit Value (WEI/WUV or “wee‑wuv”) methodology for calculating the relative firepower of different combat units. While WEI/WUVs were soon adopted throughout the Defense Department, the subjective nature of the method gradually led to its abandonment for official use.

In the 1980s and 1990s, the U.S. Army Command & General Staff College (CGSC) published the ST 100-9 and ST 100-3 student workbooks that contained tables of planning factors that became the informal basis for calculating combat power in staff practice. The STs were revised regularly and then adapted into spreadsheet format in the late 1990s. The 1999 iteration employed WEI/WUVs as the basis for calculating firepower scores used to estimate force ratios. CGSC stopped updating the STs in the early 2000s, as the Army focused on irregular warfare.

With the recently renewed focus on conventional conflict, Army staff planners are starting to realize that their planning factors are out of date. In an attempt to fill this gap, CGSC developed a new spreadsheet tool in 2012 called the Correlation of Forces (COF) calculator. It apparently drew upon analysis done by the U.S. Army Training and Doctrine Command Analysis Center (TRAC) in 2004 to establish new combat unit firepower scores. (TRAC’s methodology is not clear, but if it is based on this 2007 ISMOR presentation, the scores are derived from runs by an unspecified combat model modified by factors derived from the Army’s unit readiness methodology. If described accurately, this would not be an improvement over WEI/WUVs.)

The COF calculator continues to use the 3-1 force ratio tables. It also incorporates a table for estimating combat losses based on force ratios (this despite ample empirical historical analysis showing that there is no correlation between force ratios and casualty rates).

While the COF calculator is not yet an official doctrinal product, CGSC plans to add Marine Corps forces to it for use as a joint planning tool and to incorporate it into the Army’s Command Post of the Future (CPOF). TRAC is developing a stand-alone version for use by force developers.

The incorporation of unsubstantiated and unvalidated concepts into Army doctrine has been a long-standing problem. In 1976, Huba Wass de Czege, then an Army major, took both “loosely structured and unscientific analysis” based on intuition and experience and simple counts of gross numbers to task as insufficient “for a clear and rigorous understanding of combat power in a modern context.” He proposed replacing it with an analytical framework for analyzing combat power that accounted for both measurable and intangible factors. Adopting a scrupulous method and language would overcome the simplistic tactical analysis then being taught. While some of the essence of Wass de Czege’s approach has found its way into doctrinal thinking, his criticism of the lack of objective and thorough analysis continues to echo (here, here, and here, for example).

Despite dissatisfaction with the existing methods, little has changed. The problem with this should be self-evident, but I will give the U.S. Naval War College the final word here:

Fundamentally, all of our approaches to force-on-force analysis are underpinned by theories of combat that include both how combat works and what matters most in determining the outcomes of engagements, battles, campaigns, and wars. The various analytical methods we use can shed light on the performance of the force alternatives only to the extent our theories of combat are valid. If our theories are flawed, our analytical results are likely to be equally wrong.

TDI Friday Read: The Validity Of The 3-1 Rule Of Combat

Canadian soldiers going “over the top” during the First World War. [History.com]

Today’s edition of TDI Friday Read addresses the question of force ratios in combat. How many troops are needed to successfully attack or defend on the battlefield? There is a long-standing rule of thumb that holds that an attacker requires a 3-1 preponderance over a defender in combat in order to win. The aphorism is so widely accepted that few have questioned whether it is actually true or not.

Trevor Dupuy challenged the validity of the 3-1 rule on empirical grounds. He could find no historical substantiation to support it. In fact, his research on the question of force ratios suggested that there was a limit to the value of numerical preponderance on the battlefield.

Trevor Dupuy and the 3-1 Rule

Human Factors In Warfare: Diminishing Returns In Combat

TDI President Chris Lawrence has also challenged the 3-1 rule in his own work on the subject.

Force Ratios in Conventional Combat

The 3-to-1 Rule in Histories

Aussie OR

Comparing Force Ratios to Casualty Exchange Ratios

The validity of the 3-1 rule is no mere academic question. It underpins a great deal of U.S. military policy and warfighting doctrine. Yet, the only time the matter was seriously debated was in the 1980s with reference to the problem of defending Western Europe against the threat of Soviet military invasion.

The Great 3-1 Rule Debate

It is probably long past due to seriously challenge the validity and usefulness of the 3-1 rule again.

Command and Combat Effectiveness: The Case of the British 51st Highland Division

Soldiers of the British 51st Highland Division take cover in bocage in Normandy, 1944. [Daily Record (UK)]

While Trevor Dupuy’s concept of combat effectiveness has been considered controversial by some, he was hardly the only one to observe that throughout history, some military forces have fought more successfully on the battlefield than others. While the sources of victory and defeat in battle remain a fertile, yet understudied topic, there is a growing literature on the topic of military effectiveness in the fields of strategic and security studies.

Anthony King, a professor in War Studies at the University of Warwick, has published an outstanding article in the most recent edition of British Journal of Military History, “Why did 51st Highland Division Fail? A case-study in command and combat effectiveness.” In it, he examined military command and combat effectiveness through the experience of the British 51st Highland Division in the 1944 Normandy Campaign. Most usefully, King developed a definition of military command that clarifies its relationship to combat effectiveness: “The function of a commander is to maximise combat power by defining achievable missions and, then, orchestrating subordinates into a cohesive whole committed to mission accomplishment.”

Defining Military Command

In order to analyze the relationship between command and combat effectiveness, King sought to “define the concept of command and to specify its relationship to management and leadership.” The construct he developed drew upon the work of Peter Drucker, an Austrian-born American business consultant and writer who is considered by many to be “the founder of modern management.” From Drucker, King distilled a definition of the function and process of military command: “command always consists of three elements: mission definition, mission management and mission motivation.”

As King explained, “When command is understood in this way, its connection to combat effectiveness begins to become clear.”

[C]ommand is an institutional solution to an organizational problem; it generates cohesion in a formation. Specifically, by uniting decision-making authority in one person and one role, a large military force is able to unite subordinate units, whose troops are not co-present with each other and who, in most cases, do not know each other. Crucially, the combat effectiveness of a formation, as a formation, is substantially dependent upon the ability of its commander to synchronise its disparate efforts in order to generate collective effects. Skillful command has a galvanising influence on a military force; by orchestrating the activities of subordinate units and motivating troops, command is able to create a level of combat power, which supervenes the capabilities of each of the parts. A well-commanded force has properties, which exceed those of its constituent units, fighting alone.

It is through the orchestration, synchronization, and motivation of effort, King concluded, that “command and combat effectiveness are immediately connected. Command fuses a formation together and increases its determination to fulfil its missions.”

Assessing the Combat Effectiveness of the 51st Division

The rest of King’s article is a detailed assessment of the combat effectiveness of the 51st Highland Division in Normandy in June and July 1944 using this military command construct. Observers at the time noted a decline in the division’s combat performance, which had been graded quite highly in North Africa and Sicily. The one obvious difference was the replacement of Major General Douglas Wimberley with Major General Charles Bullen-Smith in August 1943. After concluding that the 51st Division was no longer battleworthy, the commander of the British 21st Army Group, General Bernard Montgomery, personally relieved Bullen-Smith in late July 1944.

In reviewing Bullen-Smith’s performance, King concluded that

Although a number of factors contributed to the struggles of the Highland Division in Normandy, there is little doubt that the shortcomings of its commander, Major General Charles Bullen-Smith, were the critical factor. Charles Bullen-Smith failed to fulfill the three essential functions required of a commander… Bullen-Smith’s inadequacies are highly suggestive of a direct relationship between command and combat effectiveness; they demonstrate how command can augment or undermine combat performance.

King’s approach to military studies once again demonstrates the relevance of multi-disciplinary analysis based on solid historical research. His military command model should prove to be a very useful tool for analyzing the elements of combat effectiveness and assessing combat power. Along with Dr. Jonathan Fennell’s work on measuring morale, among others, it appears that good progress is being made on the study of human factors in combat and military operations, at least in the British academic community (even if Tom Ricks thinks otherwise).

Validating Trevor Dupuy’s Combat Models

[The article below is reprinted from Winter 2010 edition of The International TNDM Newsletter.]

A Summation of QJM/TNDM Validation Efforts

By Christopher A. Lawrence

There have been six or seven different validation tests conducted of the QJM (Quantified Judgment Model) and the TNDM (Tactical Numerical Deterministic Model). As the changes to these two models are evolutionary in nature but do not fundamentally change the nature of the models, the whole series of validation tests across both models is worth noting. To date, this is the only model we are aware of that has been through multiple validations. We are not aware of any DOD [Department of Defense] combat model that has undergone more than one validation effort. Most of the DOD combat models in use have not undergone any validation.

The Two Original Validations of the QJM

After its initial development using a 60-engagement WWII database, the QJM was tested in 1973 by application of its relationships and factors to a validation database of 21 World War II engagements in Northwest Europe in 1944 and 1945. The original model proved to be 95% accurate in explaining the outcomes of these additional engagements. Overall accuracy in predicting the results of the 81 engagements in the developmental and validation databases was 93%.[1]

During the same period the QJM was converted from a static model that only predicted success or failure to one capable of also predicting attrition and movement. This was accomplished by adding variables and modifying factor values. The original QJM structure was not changed in this process. The addition of movement and attrition as outputs allowed the model to be used dynamically in successive “snapshot” iterations of the same engagement.

From 1973 to 1979 the QJM’s formulae, procedures, and variable factor values were tested against the results of all of the 52 significant engagements of the 1967 and 1973 Arab-Israeli Wars (19 from the former, 33 from the latter). The QJM was able to replicate all of those engagements with an accuracy of more than 90%.[2]

In 1979 the improved QJM was revalidated by application to 66 engagements. These included 35 from the original 81 engagements (the “development database”), and 31 new engagements. The new engagements included five from World War II and 26 from the 1973 Middle East War. This new validation test considered four outputs: success/failure, movement rates, personnel casualties, and tank losses. The QJM predicted success/failure correctly for about 85% of the engagements. It predicted movement rates with an error of 15% and personnel attrition with an error of 40% or less. While the error rate for tank losses was about 80%, it was discovered that the model consistently underestimated tank losses because input data included all kinds of armored vehicles, but output data losses included only numbers of tanks.[3]
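The kinds of scores reported above (a success/failure hit rate and a percentage error on numeric outputs) can be sketched as follows. The source does not say how HERO actually computed its error figures, so the mean absolute percentage error used here is an assumption, and the function names and sample data are invented purely for illustration.

```python
def outcome_accuracy(predicted, actual):
    """Fraction of engagements where the predicted winner matches the historical one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def mean_abs_pct_error(predicted, actual):
    """Mean absolute percentage error, skipping cases with a historical value of zero."""
    if len(predicted) != len(actual):
        raise ValueError("inputs must be of equal length")
    errors = [abs(p - a) / abs(a) for p, a in zip(predicted, actual) if a != 0]
    if not errors:
        raise ValueError("no non-zero historical values to compare against")
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Invented data for four hypothetical engagements.
    predicted_winner = ["attacker", "defender", "attacker", "attacker"]
    historical_winner = ["attacker", "defender", "defender", "attacker"]
    print(outcome_accuracy(predicted_winner, historical_winner))

    predicted_casualties = [120, 80, 200]
    historical_casualties = [100, 100, 250]
    print(mean_abs_pct_error(predicted_casualties, historical_casualties))
```

A metric like this is only meaningful when model outputs and historical records count the same things, which is the issue the tank-loss comparison above ran into.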

This completed the original validation efforts of the QJM. The data used for the validations, and parts of the results of the validation, were published, but no formal validation report was issued. The validation was conducted in-house by Colonel Dupuy’s organization, HERO [Historical Evaluation Research Organization]. The data used were mostly from division-level engagements, although they included some corps- and brigade-level actions. We count these as two separate validation efforts.

The Development of the TNDM and Desert Storm

In 1990 Col. Dupuy, with the collaborative assistance of Dr. James G. Taylor (author of Lanchester Models of Warfare [vol. 1] [vol. 2], published by the Operations Research Society of America, Arlington, Virginia, in 1983), introduced a significant modification: the representation of the passage of time in the model. Instead of resorting to successive “snapshots,” the introduction of Taylor’s differential equation technique permitted the representation of time as a continuous flow. While this new approach required substantial changes to the software, the relationship of the model to historical experience was unchanged.[4] This revision of the model also included the substitution of formulae for some of its tables so that there was a continuous flow of values across the individual points in the tables. It also included some adjustment to the values and tables in the QJM. Finally, it incorporated a revised OLI [Operational Lethality Index] calculation methodology for modern armor (mobile fighting machines) to take into account all the factors that influence modern tank warfare.[5] The model was reprogrammed in Turbo PASCAL (the original had been written in BASIC). The new model was called the TNDM (Tactical Numerical Deterministic Model).

Building on its foundation of historical validation and proven attrition methodology, in December 1990, HERO used the TNDM to predict the outcome of, and losses from, the impending Operation DESERT STORM.[6] It was the most accurate (lowest) public estimate of U.S. war casualties provided before the war. It differed from most other public estimates by an order of magnitude.

Also, in 1990, Trevor Dupuy published an abbreviated form of the TNDM in the book Attrition: Forecasting Battle Casualties and Equipment Losses in Modern War. A brief validation exercise using 12 battles from 1805 to 1973 was published in this book.[7] This version was used for creation of M-COAT[8] and was also separately tested by a student (Lieutenant Gozel) at the Naval Postgraduate School in 2000.[9] This version did not have the firepower scoring system, and as such neither M-COAT, Lieutenant Gozel’s test, nor Colonel Dupuy’s 12-battle validation included the OLI methodology that is in the primary version of the TNDM.

For counting purposes, I consider the Gulf War the third validation of the model. In the end, for any model, the proof is in the pudding. Can the model be used as a predictive tool or not? If not, then there is probably a fundamental flaw or two in the model. Still, the validation of the TNDM was somewhat second-hand, in the sense that the closely related previous model, the QJM, was validated in the 1970s to 200 World War II and 1967 and 1973 Arab-Israeli War battles, but the TNDM had not been. Clearly, something further needed to be done.

The Battalion-Level Validation of the TNDM

Under the guidance of Christopher A. Lawrence, The Dupuy Institute undertook a battalion-level validation of the TNDM in late 1996. This effort tested the model against 76 engagements from World War I, World War II, and the post-1945 world including Vietnam, the Arab-Israeli Wars, the Falklands War, Angola, Nicaragua, etc. This effort was thoroughly documented in The International TNDM Newsletter.[10] This effort was probably one of the more independent and better-documented validations of a casualty estimation methodology that has ever been conducted to date, in that:

  • The data was independently assembled (assembled for other purposes before the validation) by a number of different historians.
  • There were no calibration runs or adjustments made to the model before the test.
  • The data included a wide range of material from different conflicts and times (from 1918 to 1983).
  • The validation runs were conducted independently (Susan Rich conducted the validation runs, while Christopher A. Lawrence evaluated them).
  • The results of the validation were fully published.
  • The people conducting the validation were independent, in the sense that:

a) there was no contract, management, or agency requesting the validation;
b) none of the validators had previously been involved in designing the model, and had only very limited experience in using it; and
c) the original model designer was not able to oversee or influence the validation.[11]

The validation was not truly independent, as the model tested was a commercial product of The Dupuy Institute, and the person conducting the test was an employee of the Institute. On the other hand, this was an independent effort in the sense that the effort was employee-initiated and not requested or reviewed by the management of the Institute. Furthermore, the results were published.

The TNDM was also given a limited validation test back to its original WWII data around 1997 by Niklas Zetterling of the Swedish War College, who retested the model to about 15 or so Italian campaign engagements. This effort included a complete review of the historical data used for the validation back to their primary sources, and details were published in The International TNDM Newsletter.[12]

There has been one other effort to correlate outputs from QJM/TNDM-inspired formulae to historical data using the Ardennes and Kursk campaign-level (i.e., division-level) databases.[13] This effort did not use the complete model, but only selective pieces of it, and achieved various degrees of “goodness of fit.” While the model is hypothetically designed for use from squad level to army group level, to date no validation has been attempted below battalion level, or above division level. At this time, the TNDM also needs to be revalidated back to its original WWII and Arab-Israeli War data, as it has evolved since the original validation effort.

The Corps- and Division-level Validations of the TNDM

Having now done one extensive battalion-level validation of the model and published the results in our newsletters, Volume 1, issues 5 and 6, we were then presented an opportunity in 2006 to conduct two more validations of the model. These are discussed in depth in two articles of this issue of the newsletter.

These validations were again conducted using historical data, 24 days of corps-level combat and 25 cases of division-level combat drawn from the Battle of Kursk during 4-15 July 1943. They were conducted using an independently-researched data collection (although the research was conducted by The Dupuy Institute), using a different person to conduct the model runs (although that person was an employee of the Institute) and using another person to compile the results (also an employee of the Institute). To summarize the results of this validation (the historical figure is listed first followed by the predicted result):

There was one other effort that was done as part of work we did for the Army Medical Department (AMEDD). This is fully explained in our report Casualty Estimation Methodologies Study: The Interim Report dated 25 July 2005. In this case, we tested six different casualty estimation methodologies to 22 cases. These consisted of 12 division-level cases from the Italian Campaign (4 where the attack failed, 4 where the attacker advanced, and 4 where the defender was penetrated) and 10 cases from the Battle of Kursk (2 cases where the attack failed, 4 where the attacker advanced and 4 where the defender was penetrated). These 22 cases were randomly selected from our earlier 628-case version of the DLEDB (Division-level Engagement Database; it now has 752 cases). Again, the TNDM performed as well as or better than any of the other casualty estimation methodologies tested. As this validation effort was using the Italian engagements previously used for validation (although some had been revised due to additional research) and three of the Kursk engagements that were later used for our division-level validation, it is debatable whether one would want to call this a seventh validation effort. Still, it was done as above with one person assembling the historical data and another person conducting the model runs. This effort was conducted a year before the corps and division-level validation conducted above and influenced it to the extent that we chose a higher CEV (Combat Effectiveness Value) for the later validation. A CEV of 2.5 was used for the Soviets for this test, vice the CEV of 3.0 that was used for the later tests.

Summation

The QJM has been validated at least twice. The TNDM has been tested or validated at least four times, once to an upcoming, imminent war, once to battalion-level data from 1918 to 1989, once to division-level data from 1943 and once to corps-level data from 1943. These last four validation efforts have been published and described in depth. The model continues, regardless of which validation is examined, to accurately predict outcomes and make reasonable predictions of advance rates, loss rates and armor loss rates. This is regardless of level of combat (battalion, division or corps), historic period (WWI, WWII or modern), the situation of the combats, or the nationalities involved (American, German, Soviet, Israeli, various Arab armies, etc.). As the QJM, the model was effectively validated to around 200 World War II and 1967 and 1973 Arab-Israeli War battles. As the TNDM, the model was validated to 125 corps-, division-, and battalion-level engagements from 1918 to 1989 and used as a predictive model for the 1991 Gulf War. This is the most extensive and systematic validation effort yet done for any combat model. The model has been tested and re-tested. It has been tested across multiple levels of combat and in a wide range of environments. It has been tested where human factors are lopsided, and where human factors are roughly equal. It has been independently spot-checked several times by others outside of the Institute. It is hard to say what more can be done to establish its validity and accuracy.

NOTES

[1] It is unclear what these percentages, quoted from Dupuy in the TNDM General Theoretical Description, specify. We suspect it is a measurement of the model’s ability to predict winner and loser. No validation report based on this effort was ever published. Also, the validation figures seem to reflect the results after any corrections made to the model based upon these tests. It does appear that the division-level validation was “incremental.” We do not know if the earlier validation tests were tested back to the earlier data, but we have reason to suspect not.

[2] The original QJM validation data was first published in the Combat Data Subscription Service Supplement, vol. 1, no. 3 (Dunn Loring VA: HERO, Summer 1975). (HERO Report #50) That effort used data from 1943 through 1973.

[3] HERO published its QJM validation database in The QJM Data Base (3 volumes) Fairfax VA: HERO, 1985 (HERO Report #100).

[4] The Dupuy Institute, The Tactical Numerical Deterministic Model (TNDM): A General and Theoretical Description, McLean VA: The Dupuy Institute, October 1994.

[5] This had the unfortunate effect of undervaluing WWII-era armor by about 75% relative to other WWII weapons when modeling WWII engagements. This left The Dupuy Institute with the compromise methodology of using the old OLI method for calculating armor (Mobile Fighting Machines) when doing WWII engagements and using the new OLI method for calculating armor when doing modern engagements.

[6] Testimony of Col. T. N. Dupuy, USA, Ret, Before the House Armed Services Committee, 13 Dec 1990. The Dupuy Institute File I-30, “Iraqi Invasion of Kuwait.”

[7] Trevor N. Dupuy, Attrition: Forecasting Battle Casualties and Equipment Losses in Modern War (HERO Books, Fairfax, VA, 1990), 123-4.

[8] M-COAT is the Medical Course of Action Tool created by Major Bruce Shahbaz. It is a spreadsheet model based upon the elements of the TNDM provided in Dupuy’s Attrition (op. cit.) It used a scoring system derived from elsewhere in the U.S. Army. As such, it is a simplified form of the TNDM with a different weapon scoring system.

[9] See Gözel, Ramazan. “Fitting Firepower Score Models to the Battle of Kursk Data,” NPGS Thesis. Monterey CA: Naval Postgraduate School.

[10] Lawrence, Christopher A. “Validation of the TNDM at Battalion Level.” The International TNDM Newsletter, vol. 1, no. 2 (October 1996); Bongard, Dave “The 76 Battalion-Level Engagements.” The International TNDM Newsletter, vol. 1, no. 4 (February 1997); Lawrence, Christopher A. “The First Test of the TNDM Battalion-Level Validations: Predicting the Winner” and “The Second Test of the TNDM Battalion-Level Validations: Predicting Casualties,” The International TNDM Newsletter, vol. 1 no. 5 (April 1997); and Lawrence, Christopher A. “Use of Armor in the 76 Battalion-Level Engagements,” and “The Second Test of the Battalion-Level Validation: Predicting Casualties Final Scorecard.” The International TNDM Newsletter, vol. 1, no. 6 (June 1997).

[11] Trevor N. Dupuy passed away in July 1995, and the validation was conducted in 1996 and 1997.

[12] Zetterling, Niklas. “CEV Calculations in Italy, 1943,” The International TNDM Newsletter, vol. 1, no. 6. McLean VA: The Dupuy Institute, June 1997. See also Research Plan, The Dupuy Institute Report E-3, McLean VA: The Dupuy Institute, 7 Oct 1998.

[13] See Gözel, “Fitting Firepower Score Models to the Battle of Kursk Data.”

TDI Friday Read: Principles Of War & Verities Of Combat

[izquotes.com]

Trevor Dupuy distilled his research and analysis on combat into a series of verities, or what he believed were empirically-derived principles. He intended for his verities to complement the classic principles of war, a slightly variable list of maxims of unknown derivation and provenance, which describe the essence of warfare largely from the perspective of Western societies. These are summarized below.

What Is The Best List Of The Principles Of War?

The Timeless Verities of Combat

Trevor N. Dupuy’s Combat Attrition Verities

Trevor Dupuy’s Combat Advance Rate Verities

Military History and Validation of Combat Models

Soldiers from Britain’s Royal Artillery train in a “virtual world” during Exercise Steel Sabre, 2015 [Sgt Si Longworth RLC (Phot)/MOD]

Military History and Validation of Combat Models

A Presentation at MORS Mini-Symposium on Validation, 16 Oct 1990

By Trevor N. Dupuy

In the operations research community there is some confusion as to the respective meanings of the words “validation” and “verification.” My definition of validation is as follows:

“To confirm or prove that the output or outputs of a model are consistent with the real-world functioning or operation of the process, procedure, or activity which the model is intended to represent or replicate.”

In this paper the word “validation” with respect to combat models is assumed to mean assurance that a model realistically and reliably represents the real world of combat. Or, in other words, given a set of inputs which reflect the anticipated forces and weapons in a combat encounter between two opponents under a given set of circumstances, the model is validated if we can demonstrate that its outputs are likely to represent what would actually happen in a real-world encounter between these forces under those circumstances.

Thus, in this paper, the word “validation” has nothing to do with the correctness of computer code, or the apparent internal consistency or logic of relationships of model components, or with the soundness of the mathematical relationships or algorithms, or with satisfying the military judgment or experience of one individual.

True validation of combat models is not possible without testing them against modern historical combat experience. And so, in my opinion, a model is validated only when it will consistently replicate a number of military history battle outcomes in terms of: (a) Success-failure; (b) Attrition rates; and (c) Advance rates.

“Why,” you may ask, “use imprecise, doubtful, and outdated history to validate a modern, scientific process? Field tests, experiments, and field exercises can provide data that is often instrumented, and certainly more reliable than any historical data.”

I recognize that military history is imprecise; it is only an approximate, often biased and/or distorted, and frequently inconsistent reflection of what actually happened on historical battlefields. Records are contradictory. I also recognize that there is an element of chance or randomness in human combat which can produce different results in otherwise apparently identical circumstances. I further recognize that history is retrospective, telling us only what has happened in the past. It cannot predict, if only because combat in the future will be fought with different weapons and equipment than were used in historical combat.

Despite these undoubted problems, military history provides more, and more accurate information about the real world of combat, and how human beings behave and perform under varying circumstances of combat, than is possible to derive or compile from any other source. Despite some discrepancies, patterns are unmistakable and consistent. There is always a logical explanation for any individual deviations from the patterns. Historical examples that are inconsistent, or that are counter-intuitive, must be viewed with suspicion as possibly being poor or false history.

Of course absolute prediction of a future event is practically impossible, although not necessarily so theoretically. Any speculations which we make from tests or experiments must have some basis in terms of projections from past experience.

Training or demonstration exercises, proving ground tests, field experiments, all lack the one most pervasive and most important component of combat: Fear in a lethal environment. There is no way in peacetime, or non-battlefield, exercises, tests, or experiments to be sure that the results are consistent with what would have been the behavior or performance of individuals or units or formations facing hostile firepower on a real battlefield.

We know from the writings of the ancients (for instance Sun Tze, pronounced Sun Dzuh, and Thucydides) that have survived to this day that human nature has not changed since the dawn of history. The human factor (the way in which humans respond to stimuli or circumstances) is the most important basis for speculation and prediction. What about the “scientific” approach of those who insist that we can have no confidence in the accuracy or reliability of historical data, that it is therefore unscientific, and therefore that it should be ignored? These people insist that only “scientific” data should be used in modeling.

In fact, every model is based upon fundamental assumptions that are intuitive and unprovable. The first step in the creation of a model is a step away from scientific reality in seeking a basis for an unreal representation of a real phenomenon. I have shown that the unreality is perpetuated when we use other imitations of reality as the basis for representing reality. History is less than perfect, but to ignore it, and to use only data that is bound to be wrong, assures that we will not be able to represent human behavior in real combat.

At the risk of repetition, and even of protesting too much, let me assure you that I am well aware of the shortcomings of military history:

The record which is available to us, which is history, only approximately reflects what actually happened. It is incomplete. It is often biased, it is often distorted. Even when it is accurate, it may be reflecting chance rather than normal processes. It is neither precise nor consistent. But, it provides more, and more accurate, information on the real world of battle than is available from the most thoroughly documented field exercises, proving ground tests, or laboratory or field experiments.

Military history is imperfect. At best it reflects the actions and interactions of unpredictable human beings. We must always realize that a single historical example can be misleading for either of two reasons: (1) The data may be inaccurate, or (2) The data may be accurate, but untypical.

Nevertheless, history is indispensable. I repeat that the most pervasive characteristic of combat is fear in a lethal environment. For all of its imperfections, military history and only military history represents what happens under the environmental condition of fear.

Unfortunately, and somewhat unfairly, the reported findings of S.L.A. Marshall about human behavior in combat, which he reported in Men Against Fire, have been recently discounted by revisionist historians who assert that he never could have physically performed the research on which the book’s findings were supposedly based. This has raised doubts about Marshall’s assertion that 85% of infantry soldiers didn’t fire their weapons in combat in World War II. That dramatic and surprising assertion was first challenged in a New Zealand study which found, on the basis of painstaking interviews, that most New Zealanders fired their weapons in combat. Thus, either Americans were different from New Zealanders, or Marshall was wrong. And now American historians have demonstrated that Marshall had had neither the time nor the opportunity to conduct his battlefield interviews which he claimed were the basis for his findings.

I knew Marshall moderately well. I was fully as aware of his weaknesses as of his strengths. He was not a historian. I deplored the imprecision and lack of documentation in Men Against Fire. But the revisionist historians have underestimated the shrewd journalistic assessment capability of “SLAM” Marshall. His observations may not have been scientifically precise, but they were generally sound, and his assessment has been shared by many American infantry officers whose judgements I also respect. As to the New Zealand study, how many people will, after the war, admit that they didn’t fire their weapons?

Perhaps most important, however, in judging the assessments of SLAM Marshall, is a recent study by a highly-respected British operations research analyst, David Rowland. Using impeccable OR methods Rowland has demonstrated that Marshall’s assessment of the inefficient performance, or non-performance, of most soldiers in combat was essentially correct. An unclassified version of Rowland’s study, “Assessments of Combat Degradation,” appeared in the June 1986 issue of the Royal United Services Institution Journal.

Rowland was led to his investigations by the fact that soldier performance in field training exercises, using the British version of MILES technology, was not consistent with historical experience. Even after allowances for degradation from theoretical proving ground capability of weapons, defensive rifle fire almost invariably stopped any attack in these field trials. But history showed that attacks were often, in fact usually, successful. He therefore began a study in which he made both imaginative and scientific use of historical data from over 100 small unit battles in the Boer War and the two World Wars. He demonstrated that when troops are under fire in actual combat, there is an additional degradation of performance by a factor ranging between 7 and 10. A degradation virtually of an order of magnitude! And this, mind you, on top of a comparable built-in degradation to allow for the difference between field conditions and proving ground conditions.

Not only does Rowland’s study corroborate SLAM Marshall’s observations, it shows conclusively that field exercises, training competitions and demonstrations give results so different from real battlefield performance as to render them useless for validation purposes.

Which brings us back to military history. For all of the imprecision, internal contradictions, and inaccuracies inherent in historical data, at worst the deviations are generally far less than a factor of 2.0. This is at least four times more reliable than field test or exercise results.

I do not believe that history can ever repeat itself. The conditions of an event at one time can never be precisely duplicated later. But, bolstered by the Rowland study, I am confident that history paraphrases itself.

If large bodies of historical data are compiled, the patterns are clear and unmistakable, even if slightly fuzzy around the edges. Behavior in accordance with this pattern is therefore typical. As we have already agreed, sometimes behavior can be different from the pattern, but we know that it is untypical, and we can then seek for the reason, which invariably can be discovered.

This permits what I call an actuarial approach to data analysis. We can never predict precisely what will happen under any circumstances. But the actuarial approach, with ample data, provides confidence that the patterns reveal what is to happen under those circumstances, even if the actual results in individual instances vary to some extent from this “norm” (to use the Soviet military historical expression).

It is relatively easy to take into account the differences in performance resulting from new weapons and equipment. The characteristics of the historical weapons and the current (or projected) weapons can be readily compared, and adjustments made accordingly in the validation procedure.

In the early 1960s an effort was made at SHAPE Headquarters to test the ATLAS Model against World War II data for the German invasion of Western Europe in May, 1940. The first excursion had the Allies ending up on the Rhine River. This was apparently quite reasonable: the Allies substantially outnumbered the Germans, they had more tanks, and their tanks were better. However, despite these Allied advantages, the actual events in 1940 had not matched what ATLAS was now predicting. So the analysts did a little “fine tuning,” (a splendid term for fudging). After the so-called adjustments, they tried again, and ran another excursion. This time the model had the Allies ending up in Berlin. The analysts (may the Lord forgive them!) were quite satisfied with the ability of ATLAS to represent modern combat. (Or at least they said so.) Their official conclusion was that the historical example was worthless, since weapons and equipment had changed so much in the preceding 20 years!

As I demonstrated in my book, Options of Command, the problem was that the model was unable to represent the German strategy, or to reflect the relative combat effectiveness of the opponents. The analysts should have reached a different conclusion. ATLAS had failed validation because a model that cannot with reasonable faithfulness and consistency replicate historical combat experience, certainly will be unable validly to reflect current or future combat.

How then, do we account for what I have said about the fuzziness of patterns, and the fact that individual historical examples may not fit the patterns? I will give you my rules of thumb:

  1. The battle outcome should reflect historical success-failure experience about four times out of five.
  2. For attrition rates, the model average of five historical scenarios should be consistent with the historical average within a factor of about 1.5.
  3. For the advance rates, the model average of five historical scenarios should be consistent with the historical average within a factor of about 1.5.
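The three rules of thumb above can be read as a simple pass/fail checklist. What follows is only a minimal sketch of one such reading, not Dupuy’s own procedure: it assumes that “within a factor of about 1.5” means the ratio of model average to historical average lies between 1/1.5 and 1.5, and every name and number in it (`within_factor`, `check_rules_of_thumb`, the five scenarios) is hypothetical and made up for illustration.

```python
from statistics import mean


def within_factor(model_value, historical_value, factor=1.5):
    """True if the two positive values agree within the given factor.

    Assumption: "within a factor of 1.5" is read as
    1/1.5 <= model/historical <= 1.5.
    """
    if model_value <= 0 or historical_value <= 0:
        raise ValueError("rates must be positive for a ratio comparison")
    ratio = model_value / historical_value
    return 1.0 / factor <= ratio <= factor


def check_rules_of_thumb(scenarios, min_hit_rate=0.8, factor=1.5):
    """Apply the three rules of thumb to a set of historical scenarios."""
    hits = sum(s["model_winner"] == s["hist_winner"] for s in scenarios)
    return {
        # Rule 1: winner predicted correctly about four times out of five.
        "outcome": hits / len(scenarios) >= min_hit_rate,
        # Rule 2: average attrition rate within a factor of about 1.5.
        "attrition": within_factor(
            mean(s["model_attrition"] for s in scenarios),
            mean(s["hist_attrition"] for s in scenarios),
            factor,
        ),
        # Rule 3: average advance rate within a factor of about 1.5.
        "advance": within_factor(
            mean(s["model_advance"] for s in scenarios),
            mean(s["hist_advance"] for s in scenarios),
            factor,
        ),
    }


# Five invented scenarios (attrition in %/day, advance in km/day).
scenarios = [
    {"model_winner": "A", "hist_winner": "A", "model_attrition": 1.2,
     "hist_attrition": 1.0, "model_advance": 3.0, "hist_advance": 2.0},
    {"model_winner": "D", "hist_winner": "D", "model_attrition": 0.8,
     "hist_attrition": 1.0, "model_advance": 5.0, "hist_advance": 2.0},
    {"model_winner": "A", "hist_winner": "D", "model_attrition": 2.0,
     "hist_attrition": 1.5, "model_advance": 2.0, "hist_advance": 1.0},
    {"model_winner": "A", "hist_winner": "A", "model_attrition": 1.5,
     "hist_attrition": 1.2, "model_advance": 8.0, "hist_advance": 3.0},
    {"model_winner": "D", "hist_winner": "D", "model_attrition": 1.0,
     "hist_attrition": 1.3, "model_advance": 4.0, "hist_advance": 2.0},
]
results = check_rules_of_thumb(scenarios)
print(results)
```

In this invented data set the model picks the winner in four of five cases and its average attrition is close to the historical average, but its average advance rate is more than double the historical one, so it would fail the third rule.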

Just as the heavens are the laboratory of the astronomer, so military history is the laboratory of the soldier and the military operations research analyst. The scientific basis for both astronomy and military science is the recording of the movements and relationships of bodies, and then analysis of those movements. (In the one case the bodies are heavenly, in the other they are very terrestrial.)

I repeat: Military history is the laboratory of the soldier. Failure of the analyst to use this laboratory will doom him to live with the scientific equivalent of Ptolemaic astronomy, whereas he could use the evidence available in his laboratory to progress to the military science equivalent of Copernican astronomy.

Human Factors In Warfare: Combat Effectiveness

An Israeli tank unit crosses the Sinai, heading for the Suez Canal, during the 1973 Arab-Israeli War [Israeli Government Press Office/HistoryNet]

It has been noted throughout the history of human conflict that some armies have consistently fought more effectively on the battlefield than others. The armies of Sparta in ancient Greece, for example, have come to epitomize the warrior ideal in Western societies. Rome’s legions have acquired a similar legendary reputation. Within armies too, some units are known to be better combatants than others. The U.S. 1st Infantry Division, the British Expeditionary Force of 1914, Japan’s Special Naval Landing Forces, the U.S. Marine Corps, the German 7th Panzer Division, and the Soviet Guards divisions are among the many superior fighting forces from history.

Trevor Dupuy found empirical substantiation of this in his analysis of historical combat data. He discovered that in 1943-1944 during World War II, after accounting for environmental and operational factors, the German Army consistently performed more effectively in ground combat than the U.S. and British armies. This advantage—measured in terms of casualty exchanges, terrain held or lost, and mission accomplishment—manifested whether the Germans were attacking or defending, or winning or losing. Dupuy observed that the Germans demonstrated an even more marked effectiveness in battle against the Soviet Army throughout the war.

He found the same disparity in battlefield effectiveness in combat data on the 1967 and 1973 Arab-Israeli wars. The Israeli Army performed uniformly better in ground combat than all of the Arab armies it faced in both conflicts, regardless of posture or outcome.

The clear and consistent patterns in the historical data led Dupuy to conclude that superior combat effectiveness on the battlefield was attributable to moral and behavioral (i.e. human) factors. The factors he believed were the most important contributors to combat effectiveness were:

  • Leadership
  • Training or Experience
  • Morale, which may or may not include cohesion

Although the influence of human factors on combat effectiveness was identifiable and measurable in the aggregate, Dupuy was skeptical whether all of the individual moral and behavioral intangibles could be discretely quantified. He thought this particularly true for a set of factors that also contributed to combat effectiveness, but were a blend of human and operational factors. These include:

  • Logistical effectiveness
  • Time and Space
  • Momentum
  • Technical Command, Control, Communications
  • Intelligence
  • Initiative
  • Chance

Dupuy grouped all of these intangibles together into a composite factor he designated as relative combat effectiveness value, or CEV. Together, the CEV and the environmental and operational factors (Vf) make up the Circumstantial Variables of Combat, which, when multiplied by force strength (S), determine the combat power (P) of a military force in Dupuy’s formulation.

P = S × Vf × CEV

Dupuy did not believe that CEVs were static values. As with human behavior, they vary somewhat from engagement to engagement. He did think that human factors were the most substantial of the combat variables. Therefore any model or theory of combat that failed to account for them would invariably be inaccurate.
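As a minimal sketch of how the formula P = S × Vf × CEV separates combat power from raw numbers, the snippet below multiplies the three terms for two sides and compares the resulting ratios. The function names and all input values are hypothetical; in particular, treating the attacker’s CEV as a baseline of 1.0 and expressing the defender’s CEV relative to it is an assumption made here for illustration, and how S, Vf and CEV are actually derived in the QJM/TNDM is not shown.

```python
def combat_power(strength, vf, cev=1.0):
    """Combat power P = S * Vf * CEV (Dupuy's formulation as quoted above)."""
    return strength * vf * cev


def ratios(attacker, defender):
    """Return (numerical strength ratio, combat power ratio), attacker:defender."""
    numerical = attacker["strength"] / defender["strength"]
    power = combat_power(**attacker) / combat_power(**defender)
    return numerical, power


# Hypothetical inputs: the attacker is more numerous, but the defender
# benefits from favorable circumstances (Vf) and a higher relative CEV.
attacker = {"strength": 40000, "vf": 0.75, "cev": 1.0}
defender = {"strength": 16000, "vf": 1.5, "cev": 1.25}

numerical_ratio, power_ratio = ratios(attacker, defender)
print(f"numerical ratio {numerical_ratio:.2f}, combat power ratio {power_ratio:.2f}")
```

With these invented inputs a 2.5-to-1 numerical superiority corresponds to combat power parity, which is the kind of gap between headcount and combat power that the formulation is meant to expose.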

NOTES

This post is drawn from Trevor N. Dupuy, Numbers, Predictions and War: Using History to Evaluate Combat Factors and Predict the Outcome of Battles (Indianapolis; New York: The Bobbs-Merrill Co., 1979), Chapters 5, 7 and 9; Trevor N. Dupuy, Understanding War: History and Theory of Combat (New York: Paragon House, 1987), Chapters 8 and 10; and Trevor N. Dupuy, “The Fundamental Information Base for Modeling Human Behavior in Combat,” presented at the Military Operations Research Society (MORS) Mini-Symposium, “Human Behavior and Performance as Essential Ingredients in Realistic Modeling of Combat – MORIMOC II,” 22-24 February 1989, Center for Naval Analyses, Alexandria, Virginia.

Human Factors In Warfare: Interaction Of Variable Factors

The Second Battle of Ypres, 22 April to 25 May 1915 by Richard Jack [Canadian War Museum]

Trevor Dupuy thought that it was possible to identify and quantify the effects of some individual moral and behavioral (i.e. human) factors on combat. He also believed that many of these factors interacted with each other and with environmental and operational (i.e. physical) variables in combat as well, although parsing and quantifying these effects was a good deal more difficult. Among the combat phenomena he considered to be the result of interaction with human factors were:

Dupuy was critical of combat models and simulations that failed to address these relationships. The prevailing approach to the design of combat modeling used by the U.S. Department of Defense is known as the aggregated, hierarchical, or “bottom-up” construct. Bottom-up models generally use the Lanchester equations, or some variation on them, to calculate combat outcomes between individual soldiers, tanks, airplanes, and ships. These results are then used as inputs for models representing warfare at the brigade/division level, the outputs of which are then fed into theater-level simulations. Many in the American military operations research community believe bottom-up models to be the most realistic method of modeling combat.
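For readers unfamiliar with what the bottom-up models build on, the simplest textbook form of Lanchester’s equations is the “square law” for aimed fire: dA/dt = −βB and dB/dt = −αA, where A and B are the two force strengths. The sketch below integrates that pair with a basic Euler step and compares the result with the closed-form answer implied by the conserved quantity αA² − βB². It is an illustration of the textbook equations only, not the implementation used in any particular Defense Department model, and the effectiveness coefficients and force sizes are hypothetical.

```python
import math


def lanchester_square(a0, b0, alpha, beta, dt=0.001, max_steps=1_000_000):
    """Euler-integrate dA/dt = -beta*B, dB/dt = -alpha*A.

    alpha: rate at which each A unit destroys B units (hypothetical value)
    beta:  rate at which each B unit destroys A units (hypothetical value)
    Stops when either side reaches zero or after max_steps.
    """
    a, b = float(a0), float(b0)
    for _ in range(max_steps):
        if a <= 0.0 or b <= 0.0:
            break
        # Both updates use the strengths from the start of the step.
        a, b = max(a - beta * b * dt, 0.0), max(b - alpha * a * dt, 0.0)
    return a, b


def analytic_survivors(a0, b0, alpha, beta):
    """Survivors at the end of a fight to the finish.

    Uses the invariant alpha*A^2 - beta*B^2, which the square-law
    equations keep constant over time.
    """
    q = alpha * a0 ** 2 - beta * b0 ** 2
    if q > 0:
        return math.sqrt(q / alpha), 0.0
    if q < 0:
        return 0.0, math.sqrt(-q / beta)
    return 0.0, 0.0


simulated = lanchester_square(100, 80, alpha=0.1, beta=0.1)
exact = analytic_survivors(100, 80, alpha=0.1, beta=0.1)
print(simulated, exact)
```

With equal per-unit effectiveness, 100 against 80 leaves the larger side with √(100² − 80²) = 60 survivors, so a 25 percent numerical edge produces a lopsided result. Note that the only inputs are numbers and kill rates; nothing in the equations represents the human factors discussed here.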

Dupuy criticized this approach for many reasons (including the inability of the Lanchester equations to accurately replicate real-world combat outcomes), but mainly because it failed to represent human factors and their interactions with other combat variables.

It is almost undeniable that there must be some interaction among and within the effects of physical as well as behavioral variable factors. I know of no way of measuring this. One thing that is reasonably certain is that the use of the bottom-up approach to model design and development cannot capture such interactions. (Most models in use today are bottom-up models, built up from one-on-one weapons interactions to many-on-many.) Presumably these interactions are captured in a top-down model derived from historical experience, of which there is at least one in existence [by which, Dupuy meant his own].

Dupuy was convinced that any model of combat that failed to incorporate human factors would invariably be inaccurate, which put him at odds with much of the American operations research community.

War does not consist merely of a number of duels. Duels, in fact, are only a very small—though integral—part of combat. Combat is a complex process involving interaction over time of many men and numerous weapons combined in a great number of different, and differently organized, units. This process cannot be understood completely by considering the theoretical interactions of individual men and weapons. Complete understanding requires knowing how to structure such interactions and fit them together. Learning how to structure these interactions must be based on scientific analysis of real combat data.[1]

While this unresolved debate went dormant some time ago, bottom-up models became the simulations of choice in Defense Department campaign planning and analysis. It should be noted, however, that the Defense Department disbanded its campaign-level modeling capabilities in 2011 because the use of the simulations in strategic analysis was criticized as “slow, manpower-intensive, opaque, difficult to explain because of its dependence on complex models, inflexible, and weak in dealing with uncertainty.”

NOTES

[1] Trevor N. Dupuy, Understanding War: History and Theory of Combat (New York: Paragon House, 1987), p. 195.

Human Factors In Warfare: Diminishing Returns In Combat

[Jan Spousta; Wikimedia Commons]

One of the basic problems facing military commanders at all levels is deciding how to allocate available forces to accomplish desired objectives. A guiding concept in this sort of decision-making is economy of force, one of the fundamental and enduring principles of war. As defined in the 1954 edition of U.S. Army Field Manual FM 100-5, Field Service Regulations, Operations (which Trevor Dupuy believed contained the best listing of the principles):

Economy of Force

Minimum essential means must be employed at points other than that of decision. To devote means to unnecessary secondary efforts or to employ excessive means on required secondary efforts is to violate the principle of both mass and the objective. Limited attacks, the defensive, deception, or even retrograde action are used in noncritical areas to achieve mass in the critical area.

How do leaders determine the appropriate means for accomplishing a particular mission? The risk of assigning too few forces to a critical task is self-evident, but is it possible to allocate too many? Determining the appropriate means in battle has historically involved subjective calculations by commanders and their staff advisors of the relative combat power of friendly and enemy forces. Most often, it entails a rudimentary numerical comparison of numbers of troops and weapons and estimates of the influence of environmental and operational factors. An exemplar of this is the so-called “3-1 rule,” which holds that an attacking force must achieve a three to one superiority in order to defeat a defending force.

Through detailed analysis of combat data from World War II and the 1967 and 1973 Arab-Israeli wars, Dupuy determined that combat appears subject to a law of diminishing returns and that it is indeed possible to over-allocate forces to a mission.[1] By comparing the theoretical outcomes of combat engagements with the actual results, Dupuy discovered that a force with a combat power advantage greater than double that of its adversary seldom achieved proportionally better results than a 2-1 advantage. A combat power superiority of 3 or 4 to 1 rarely yielded additional benefit when measured in terms of casualty rates, ground gained or lost, and mission accomplishment.

Dupuy also found that attackers sometimes gained marginal benefits from combat power advantages greater than 2-1, though less proportionally and economically than the numbers of forces would suggest. Defenders, however, received no benefit at all from a combat power advantage beyond 2-1.
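The pattern described in the two paragraphs above can be caricatured in a few lines of code. This is a deliberately crude sketch and not a formula from Dupuy: he reports a tendency in historical data, whereas the function below imposes a hard threshold at 2-to-1. The name `effective_advantage`, the `attacker_marginal` slope of 0.25, and the sample ratios are all hypothetical choices made only to show the shape of the relationship.

```python
def effective_advantage(power_ratio, posture, cap=2.0, attacker_marginal=0.25):
    """Toy model of diminishing returns on combat power superiority.

    power_ratio: combat power ratio (not numerical ratio) in favor of this side
    posture: "attacker" or "defender"
    cap: ratio beyond which extra superiority stops paying off proportionally
    attacker_marginal: hypothetical fraction of the excess an attacker retains
    """
    if posture not in ("attacker", "defender"):
        raise ValueError("posture must be 'attacker' or 'defender'")
    if power_ratio <= cap:
        return power_ratio
    if posture == "defender":
        # Defenders are described as gaining nothing beyond 2-1.
        return cap
    # Attackers are described as gaining some marginal benefit beyond 2-1.
    return cap + attacker_marginal * (power_ratio - cap)


for ratio in (1.5, 2.0, 3.0, 4.0):
    print(ratio,
          effective_advantage(ratio, "attacker"),
          effective_advantage(ratio, "defender"))
```

Under these assumed parameters, doubling a combat power advantage from 2-to-1 to 4-to-1 raises the attacker’s effective advantage only from 2.0 to 2.5 and leaves the defender’s unchanged, which is the economy-of-force point in miniature.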

Dupuy believed that two human factors contributed to this apparent force limitation: Clausewitzian friction and breakpoints. As described in a previous post, friction accumulates on the battlefield through the innumerable human interactions between soldiers, degrading combat performance. This phenomenon increases as the number of soldiers increases.

A breakpoint represents a change of combat posture by a unit on the battlefield, for example, from attack to defense, or from defense to withdrawal. A voluntary breakpoint occurs due to mission accomplishment or a commander’s order. An involuntary breakpoint happens when a unit spontaneously ceases an attack, withdraws without orders, or breaks and routs. Involuntary breakpoints occur for a variety of reasons (though contrary to popular wisdom, seldom due to casualties). Soldiers are not automatons and will rarely fight to the death.

As Dupuy summarized,

It is obvious that the law of diminishing returns applies to combat. The old military adage that the greater the superiority the better, is not necessarily true. In the interests of economy of force, it appears to be unnecessary, and not really cost-effective, to build up a combat power superiority greater than two-to-one. (Note that this is not the same as a numerical superiority of two-to-one.)[2] Of course, to take advantage of this phenomenon, it is essential that a commander be satisfied that he has a reliable basis for calculating relative combat power. This requires an ability to understand and use “combat multipliers” with greater precision than permitted by U.S. Army doctrine today.[3] [Emphasis added.]

NOTES

[1] This section is drawn from Trevor N. Dupuy, Understanding War: History and Theory of Combat (New York: Paragon House, 1987), Chapter 11.

[2] This relates to Dupuy’s foundational conception of combat power, which is clearly defined and explained in Understanding War, Chapter 8.

[3] Dupuy, Understanding War, p. 139.