
The Dupuy Air Campaign Model (DACM)

[The article below is reprinted from the April 1997 edition of The International TNDM Newsletter. A description of the TDI Air Model Historical Data Study can be found here.]

The Dupuy Air Campaign Model
by Col. Joseph A. Bulger, Jr., USAF, Ret.

The Dupuy Institute, as part of the DACM [Dupuy Air Campaign Model], created a draft model in a spreadsheet format to show how such a model would calculate attrition. Below are the actual printouts of the “interim methodology demonstration,” which shows the types of inputs, outputs, and equations used for the DACM. The spreadsheet was created by Col. Bulger, while many of the formulae were the work of Robert Shaw.

The Dupuy Institute Air Model Historical Data Study

British Air Ministry aerial combat diagram that sought to explain how the RAF had fought off the Luftwaffe. [World War II Today]

[The article below is reprinted from the April 1997 edition of The International TNDM Newsletter.]

Air Model Historical Data Study
by Col. Joseph A. Bulger, Jr., USAF, Ret.

The Air Model Historical Data Study (AMHDS) was designed to lead to the development of an air campaign model for use by the Air Command and Staff College (ACSC). This model, never completed, became known as the Dupuy Air Campaign Model (DACM). It was a team effort led by Trevor N. Dupuy and included the active participation of Lt. Col. Joseph Bulger, Gen. Nicholas Krawciw, Chris Lawrence, Dave Bongard, Robert Schmaltz, Robert Shaw, Dr. James Taylor, John Kettelle, Dr. George Daoust, and Louis Zocchi, among others. After Dupuy’s death, I took over as the project manager.

At the first meeting of the team Dupuy assembled for the study, it became clear that this effort would be a serious challenge. In his own style, Dupuy was careful to provide essential guidance while, at the same time, cultivating a broad investigative approach to the unique demands of modeling air combat. It would have been no surprise if the initial guidance had established a focus on the analytical approach, level of aggregation, and overall philosophy of the QJM [Quantified Judgment Model] and TNDM [Tactical Numerical Deterministic Model]. It was clear, however, that Trevor had no intention of steering the study into an air combat modeling methodology based directly on the QJM/TNDM. To the contrary, he insisted on a rigorous derivation of the factors that would permit the final choice of model methodology.

At the time of Dupuy’s death in June 1995, the Air Model Historical Data Study had reached a point where a major decision was needed. The early months of the study had been devoted to developing a consensus among the TDI team members with respect to the factors that needed to be included in the model. The discussions tended to highlight three areas of particular interest—factors that had been included in models currently in use, the limitations of these models, and the need for new factors (and relationships) peculiar to the properties and dynamics of the air campaign. Team members formulated a family of relationships and factors, but the model architecture itself was not investigated beyond the surface considerations.

Despite substantial contributions from team members, including analytical demonstrations of selected factors and air combat relationships, no consensus had been achieved. On the contrary, there was a growing sense of need to abandon traditional modeling approaches in favor of a new application of the “Dupuy Method” based on a solid body of air combat data from WWII.

The Dupuy approach to modeling land combat relied heavily on the ratio of force strengths (largely determined by firepower as modified by other factors). After almost a year of investigations by the AMHDS team, it was beginning to appear that air combat differed in a fundamental way from ground combat. The essence of the difference is that in air combat, the outcome of the maneuver battle for platform position must be determined before the firepower relationships may be brought to bear on the battle outcome.
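The maneuver-before-firepower sequencing described above can be sketched as a two-stage resolution. Everything in this sketch is illustrative: the function names, the ratio-based position probability, and the single-winner simplification are my assumptions, not formulas from the DACM.

```python
import random

def resolve_engagement(maneuver_a, maneuver_b, firepower_a, firepower_b, rng=None):
    """Resolve one air engagement in two sequenced stages.

    Stage 1: the maneuver contest decides which side gains firing position.
    Stage 2: only the positioned side's firepower is brought to bear.
    The ratio-based position probability is a placeholder, not a DACM formula.
    """
    rng = rng or random.Random()
    # Stage 1: side A's chance of winning position grows with its maneuver edge.
    p_a_positions = maneuver_a / (maneuver_a + maneuver_b)
    if rng.random() < p_a_positions:
        return {"positioned": "A", "expected_kills": firepower_a}
    return {"positioned": "B", "expected_kills": firepower_b}
```

The point of the sketch is the sequencing: the firepower values never enter stage 1, mirroring the team's finding that the battle for platform position must be resolved before firepower relationships can affect the outcome.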

At the time of Dupuy’s death, it was apparent that if the study contract was to yield a meaningful product, an immediate choice of analysis thrust was required. Shortly before Dupuy’s death, other members of the TDI team and I had recommended that we adopt the overall approach, level of aggregation, and analytical complexity that had characterized Dupuy’s models of land combat. We had also agreed on the time-sequenced predominance of the maneuver phase of air combat. When I was asked to take the analytical lead for the contract in Dupuy’s absence, I was reasonably confident that there was overall agreement.

In view of the time available to prepare a deliverable product, it was decided to prepare a model using the air combat data we had been evaluating up to that point—June 1995. Fortunately, Robert Shaw had developed a set of preliminary analysis relationships that could be used in an initial assessment of the maneuver/firepower relationship. In view of the analytical, logistic, contractual, and time factors discussed, we decided to complete the contract effort based on the following analytical thrust:

  1. The contract deliverable would be based on the maneuver/firepower analysis approach as currently formulated in Robert Shaw’s performance equations;
  2. A spreadsheet formulation of outcomes for selected (Battle of Britain) engagements would be presented to the customer in August 1995;
  3. To the extent practical, a working model would be provided to the customer with suggestions for further development.

During the following six weeks, the demonstration model (programmed as a Lotus 1-2-3-style spreadsheet) was constructed, mechanized, and demonstrated to ACSC in August 1995. The final report was delivered in September 1995.

The working model demonstrated to ACSC in August 1995 suggests the following observations:

  • A substantial contribution to the understanding of air combat modeling has been achieved.
  • While the relationships developed in the Dupuy Air Campaign Model (DACM) are not fully mature, they are analytically significant.
  • The approach embodied in DACM derives its authenticity from the famous “Dupuy Method,” thus ensuring its strong correlation with actual combat data.
  • Although demonstrated only for air combat in the Battle of Britain, the methodology is fully capable of incorporating modern technology contributions to sensor, command and control, and firepower performance.
  • The knowledge base, fundamental performance relationships, and methodology contributions embodied in DACM are worthy of further exploration. They await only the expression of interest and a relatively modest investment to extend the analysis methodology into modern air combat and the engagements anticipated for the 21st Century.

One final observation seems appropriate. The DACM demonstration provided to ACSC in August 1995 should not be dismissed as a perhaps interesting but largely simplistic approach to air combat modeling. It is a significant contribution to the understanding of air combat relationships that will prevail in the 21st Century. The Dupuy Institute is convinced that further development of DACM makes eminently good sense. An exploitation of the maneuver and firepower relationships already demonstrated in DACM will provide a valid basis for modeling air combat with modern technology sensors, control mechanisms, and weapons. It is appropriate to include the Dupuy name in the title of this latest in a series of distinguished combat models. Trevor would be pleased.

Validating Trevor Dupuy’s Combat Models

[The article below is reprinted from Winter 2010 edition of The International TNDM Newsletter.]

A Summation of QJM/TNDM Validation Efforts

By Christopher A. Lawrence

There have been six or seven different validation tests conducted of the QJM (Quantified Judgment Model) and the TNDM (Tactical Numerical Deterministic Model). As the changes to these two models have been evolutionary and do not fundamentally alter their underlying structure, the whole series of validation tests across both models is worth noting. To date, this is the only model we are aware of that has been through multiple validations. We are not aware of any DOD [Department of Defense] combat model that has undergone more than one validation effort; most DOD combat models in use have not undergone any validation at all.

The Two Original Validations of the QJM

After its initial development using a 60-engagement WWII database, the QJM was tested in 1973 by application of its relationships and factors to a validation database of 21 World War II engagements in Northwest Europe in 1944 and 1945. The original model proved to be 95% accurate in explaining the outcomes of these additional engagements. Overall accuracy in predicting the results of the 81 engagements in the developmental and validation databases was 93%.[1]

During the same period the QJM was converted from a static model that only predicted success or failure to one capable of also predicting attrition and movement. This was accomplished by adding variables and modifying factor values. The original QJM structure was not changed in this process. The addition of movement and attrition as outputs allowed the model to be used dynamically in successive “snapshot” iterations of the same engagement.
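The snapshot mechanic described above can be illustrated with a small loop: a static model is re-run each period on the strengths produced by the previous iteration. The attrition and advance formulas below are placeholders of my own, not QJM factor values.

```python
def run_snapshots(att_strength, def_strength, power_ratio_fn, n_snapshots=3):
    """Iterate a static engagement model as successive 'snapshots'.

    power_ratio_fn maps current strengths to an attacker/defender combat
    power ratio; the attrition and advance-rate formulas here are
    illustrative placeholders, not the QJM's published relationships.
    """
    history = []
    for _ in range(n_snapshots):
        ratio = power_ratio_fn(att_strength, def_strength)
        # Placeholder outputs: a favorable ratio lowers attacker losses,
        # raises defender losses, and drives the rate of advance.
        att_losses = att_strength * 0.02 / ratio
        def_losses = def_strength * 0.02 * ratio
        advance_km = max(0.0, (ratio - 1.0) * 2.0)
        # Each snapshot's outputs become the next snapshot's inputs.
        att_strength -= att_losses
        def_strength -= def_losses
        history.append((round(att_strength, 1), round(def_strength, 1),
                        round(advance_km, 1)))
    return history
```

The structural point is the feedback: adding attrition and movement as outputs is what let a one-shot success/failure model be chained into a dynamic sequence.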

From 1973 to 1979 the QJM’s formulae, procedures, and variable factor values were tested against the results of all 52 significant engagements of the 1967 and 1973 Arab-Israeli Wars (19 from the former, 33 from the latter). The QJM was able to replicate all of those engagements with an accuracy of more than 90%.[2]

In 1979 the improved QJM was revalidated by application to 66 engagements. These included 35 from the original 81 engagements (the “development database”), and 31 new engagements. The new engagements included five from World War II and 26 from the 1973 Middle East War. This new validation test considered four outputs: success/failure, movement rates, personnel casualties, and tank losses. The QJM predicted success/failure correctly for about 85% of the engagements. It predicted movement rates with an error of 15% and personnel attrition with an error of 40% or less. While the error rate for tank losses was about 80%, it was discovered that the model consistently underestimated tank losses because input data included all kinds of armored vehicles, but output data losses included only numbers of tanks.[3]

This completed the original validation efforts for the QJM. The data used for the validations, and parts of the results, were published, but no formal validation report was issued. The validation was conducted in-house by Colonel Dupuy’s organization, HERO [Historical Evaluation Research Organization]. The data were mostly from division-level engagements, although they included some corps- and brigade-level actions. We count these as two separate validation efforts.

The Development of the TNDM and Desert Storm

In 1990 Col. Dupuy, with the collaborative assistance of Dr. James G. Taylor (author of Lanchester Models of Warfare (2 vols.), published by the Operations Research Society of America, Arlington, Virginia, in 1983), introduced a significant modification: the representation of the passage of time in the model. Instead of resorting to successive “snapshots,” the introduction of Taylor’s differential equation technique permitted the representation of time as a continuous flow. While this new approach required substantial changes to the software, the relationship of the model to historical experience was unchanged.[4] This revision of the model also included the substitution of formulae for some of its tables so that there was a continuous flow of values across the individual points in the tables. It also included some adjustments to the values and tables in the QJM. Finally, it incorporated a revised OLI [Operational Lethality Index] calculation methodology for modern armor (mobile fighting machines) to take into account all the factors that influence modern tank warfare.[5] The model was reprogrammed in Turbo PASCAL (the original had been written in BASIC). The new model was called the TNDM (Tactical Numerical Deterministic Model).
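As a minimal illustration of what a continuous-time treatment replaces the snapshots with (and not the TNDM’s actual equations), consider the classic Lanchester square law, dA/dt = −b·B, dB/dt = −a·A, integrated numerically:

```python
def lanchester_square(a0, b0, kill_a, kill_b, t_end, dt=0.01):
    """Integrate dA/dt = -kill_b*B, dB/dt = -kill_a*A with Euler steps.

    Illustrates continuous-time attrition only; the TNDM's own
    relationships are far richer than this classic square law.
    """
    A, B = float(a0), float(b0)
    t = 0.0
    while t < t_end and A > 0 and B > 0:
        dA = -kill_b * B * dt
        dB = -kill_a * A * dt
        A, B = max(0.0, A + dA), max(0.0, B + dB)
        t += dt
    return A, B
```

With equal kill rates, the square law conserves kill_a·A² − kill_b·B², so the larger force wins disproportionately; the continuous formulation removes the arbitrary period length that the snapshot approach imposes.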

Building on its foundation of historical validation and proven attrition methodology, in December 1990, HERO used the TNDM to predict the outcome of, and losses from, the impending Operation DESERT STORM.[6] It was the most accurate (lowest) public estimate of U.S. war casualties provided before the war. It differed from most other public estimates by an order of magnitude.

Also, in 1990, Trevor Dupuy published an abbreviated form of the TNDM in the book Attrition: Forecasting Battle Casualties and Equipment Losses in Modern War. A brief validation exercise using 12 battles from 1805 to 1973 was published in this book.[7] This version was used for creation of M-COAT[8] and was also separately tested by a student (Lieutenant Gozel) at the Naval Postgraduate School in 2000.[9] This version did not have the firepower scoring system, and as such neither M-COAT, Lieutenant Gozel’s test, nor Colonel Dupuy’s 12-battle validation included the OLI methodology that is in the primary version of the TNDM.

For counting purposes, I consider the Gulf War the third validation of the model. In the end, for any model, the proof is in the pudding: can the model be used as a predictive tool or not? If not, then there is probably a fundamental flaw or two in the model. Still, the validation of the TNDM was somewhat second-hand, in the sense that the closely related previous model, the QJM, had been validated in the 1970s against some 200 World War II and 1967 and 1973 Arab-Israeli War battles, but the TNDM itself had not been. Clearly, something further needed to be done.

The Battalion-Level Validation of the TNDM

Under the guidance of Christopher A. Lawrence, The Dupuy Institute undertook a battalion-level validation of the TNDM in late 1996. This effort tested the model against 76 engagements from World War I, World War II, and the post-1945 world, including Vietnam, the Arab-Israeli Wars, the Falklands War, Angola, Nicaragua, etc. This effort was thoroughly documented in The International TNDM Newsletter.[10] It was probably one of the most independent and best-documented validations of a casualty estimation methodology ever conducted, in that:

  • The data was independently assembled (assembled for other purposes before the validation) by a number of different historians.
  • There were no calibration runs or adjustments made to the model before the test.
  • The data included a wide range of material from different conflicts and times (from 1918 to 1983).
  • The validation runs were conducted independently (Susan Rich conducted the validation runs, while Christopher A. Lawrence evaluated them).
  • The results of the validation were fully published.
  • The people conducting the validation were independent, in the sense that:

a) there was no contract, management, or agency requesting the validation;
b) none of the validators had previously been involved in designing the model, and had only very limited experience in using it; and
c) the original model designer was not able to oversee or influence the validation.[11]

The validation was not truly independent, as the model tested was a commercial product of The Dupuy Institute, and the person conducting the test was an employee of the Institute. On the other hand, this was an independent effort in the sense that the effort was employee-initiated and not requested or reviewed by the management of the Institute. Furthermore, the results were published.

The TNDM was also given a limited validation test against its original WWII data around 1997 by Niklas Zetterling of the Swedish War College, who retested the model against some 15 Italian Campaign engagements. This effort included a complete review of the historical data used for the validation back to their primary sources, and details were published in The International TNDM Newsletter.[12]

There has been one other effort to correlate outputs from QJM/TNDM-inspired formulae to historical data using the Ardennes and Kursk campaign-level (i.e., division-level) databases.[13] This effort did not use the complete model, but only selective pieces of it, and achieved various degrees of “goodness of fit.” While the model is hypothetically designed for use from squad level to army group level, to date no validation has been attempted below battalion level, or above division level. At this time, the TNDM also needs to be revalidated back to its original WWII and Arab-Israeli War data, as it has evolved since the original validation effort.

The Corps- and Division-level Validations of the TNDM

Having done one extensive battalion-level validation of the model and published the results in our newsletter (Volume 1, issues 5 and 6), we were presented with an opportunity in 2006 to conduct two more validations of the model. These are discussed in depth in two articles in this issue of the newsletter.

These validations were again conducted using historical data: 24 days of corps-level combat and 25 cases of division-level combat drawn from the Battle of Kursk during 4-15 July 1943. They were conducted using an independently researched data collection (although the research was conducted by The Dupuy Institute), with one person conducting the model runs (although that person was an employee of the Institute) and another compiling the results (also an employee of the Institute). To summarize the results of this validation (the historical figure is listed first, followed by the predicted result):

There was one other effort, done as part of work we performed for the Army Medical Department (AMEDD). This is fully explained in our report Casualty Estimation Methodologies Study: The Interim Report, dated 25 July 2005. In this case, we tested six different casualty estimation methodologies against 22 cases. These consisted of 12 division-level cases from the Italian Campaign (4 where the attack failed, 4 where the attacker advanced, and 4 where the defender was penetrated) and 10 cases from the Battle of Kursk (2 where the attack failed, 4 where the attacker advanced, and 4 where the defender was penetrated). These 22 cases were randomly selected from our earlier 628-case version of the DLEDB (Division-Level Engagement Database; it now has 752 cases). Again, the TNDM performed as well as or better than any of the other casualty estimation methodologies tested. As this effort reused the Italian engagements previously used for validation (although some had been revised due to additional research) and three of the Kursk engagements later used for our division-level validation, it is debatable whether one would want to call it a seventh validation effort. Still, it was done as above, with one person assembling the historical data and another conducting the model runs. This effort was conducted a year before the corps- and division-level validations described above and influenced them to the extent that we chose a higher CEV (Combat Effectiveness Value) for the later validation. A CEV of 2.5 was used for the Soviets in this test, vice the CEV of 3.0 used for the later tests.

Summation

The QJM has been validated at least twice. The TNDM has been tested or validated at least four times: once against an impending war, once against battalion-level data from 1918 to 1989, once against division-level data from 1943, and once against corps-level data from 1943. These last four validation efforts have been published and described in depth. The model continues, regardless of which validation is examined, to accurately predict outcomes and make reasonable predictions of advance rates, loss rates, and armor loss rates. This holds regardless of the level of combat (battalion, division, or corps), historical period (WWI, WWII, or modern), the situation of the combats, or the nationalities involved (American, German, Soviet, Israeli, various Arab armies, etc.). As the QJM, the model was effectively validated against around 200 World War II and 1967 and 1973 Arab-Israeli War battles. As the TNDM, it was validated against 125 corps-, division-, and battalion-level engagements from 1918 to 1989 and used as a predictive model for the 1991 Gulf War. This is the most extensive and systematic validation effort yet done for any combat model. The model has been tested and re-tested. It has been tested across multiple levels of combat and in a wide range of environments. It has been tested where human factors are lopsided and where human factors are roughly equal. It has been independently spot-checked several times by others outside the Institute. It is hard to say what more can be done to establish its validity and accuracy.

NOTES

[1] It is unclear what these percentages, quoted from Dupuy in the TNDM General Theoretical Description, specify. We suspect it is a measurement of the model’s ability to predict winner and loser. No validation report based on this effort was ever published. Also, the validation figures seem to reflect the results after any corrections made to the model based upon these tests. It does appear that the division-level validation was “incremental.” We do not know if the earlier validation tests were tested back to the earlier data, but we have reason to suspect not.

[2] The original QJM validation data was first published in the Combat Data Subscription Service Supplement, vol. 1, no. 3 (Dunn Loring VA: HERO, Summer 1975). (HERO Report #50) That effort used data from 1943 through 1973.

[3] HERO published its QJM validation database in The QJM Data Base (3 volumes), Fairfax VA: HERO, 1985 (HERO Report #100).

[4] The Dupuy Institute, The Tactical Numerical Deterministic Model (TNDM): A General and Theoretical Description, McLean VA: The Dupuy Institute, October 1994.

[5] This had the unfortunate effect of undervaluing WWII-era armor by about 75% relative to other WWII weapons when modeling WWII engagements. This left The Dupuy Institute with the compromise methodology of using the old OLI method for calculating armor (Mobile Fighting Machines) when doing WWII engagements and using the new OLI method when doing modern engagements.

[6] Testimony of Col. T. N. Dupuy, USA, Ret, Before the House Armed Services Committee, 13 Dec 1990. The Dupuy Institute File I-30, “Iraqi Invasion of Kuwait.”

[7] Trevor N. Dupuy, Attrition: Forecasting Battle Casualties and Equipment Losses in Modern War (HERO Books, Fairfax, VA, 1990), 123-4.

[8] M-COAT is the Medical Course of Action Tool created by Major Bruce Shahbaz. It is a spreadsheet model based upon the elements of the TNDM provided in Dupuy’s Attrition (op. cit.). It uses a scoring system derived from elsewhere in the U.S. Army. As such, it is a simplified form of the TNDM with a different weapon scoring system.

[9] See Gözel, Ramazan. “Fitting Firepower Score Models to the Battle of Kursk Data,” NPGS Thesis. Monterey CA: Naval Postgraduate School.

[10] Lawrence, Christopher A. “Validation of the TNDM at Battalion Level.” The International TNDM Newsletter, vol. 1, no. 2 (October 1996); Bongard, Dave “The 76 Battalion-Level Engagements.” The International TNDM Newsletter, vol. 1, no. 4 (February 1997); Lawrence, Christopher A. “The First Test of the TNDM Battalion-Level Validations: Predicting the Winner” and “The Second Test of the TNDM Battalion-Level Validations: Predicting Casualties,” The International TNDM Newsletter, vol. 1 no. 5 (April 1997); and Lawrence, Christopher A. “Use of Armor in the 76 Battalion-Level Engagements,” and “The Second Test of the Battalion-Level Validation: Predicting Casualties Final Scorecard.” The International TNDM Newsletter, vol. 1, no. 6 (June 1997).

[11] Trevor N. Dupuy passed away in July 1995, and the validation was conducted in 1996 and 1997.

[12] Zetterling, Niklas. “CEV Calculations in Italy, 1943,” The International TNDM Newsletter, vol. 1, no. 6. McLean VA: The Dupuy Institute, June 1997. See also Research Plan, The Dupuy Institute Report E-3, McLean VA: The Dupuy Institute, 7 Oct 1998.

[13] See Gözel, “Fitting Firepower Score Models to the Battle of Kursk Data.”

Logistics in Trevor Dupuy’s Combat Models

Trevor N. Dupuy, Numbers, Predictions and War: Using History to Evaluate Combat Factors and Predict the Outcome of Battles (Indianapolis; New York: The Bobbs-Merrill Co., 1979), p. 79

Mystics & Statistics reader Stiltzkin posed two interesting questions in response to my recent post on the new blog, Logistics in War:

Is there actually a reliable way of calculating logistical demand in correlation to “standing” ration strength/combat/daily strength army size?

Did Dupuy ever focus on logistics in any of his work?

The answer to his first question is, yes, there is. In fact, this has been a standard military staff function since before there were military staffs (Martin van Creveld’s book, Supplying War: Logistics from Wallenstein to Patton (2nd ed.), is an excellent general introduction). Staff officers’ guides and field manuals from various armies from the 19th century to the present are full of useful information on field supply allotments and consumption estimates intended to guide battlefield sustainment. The records of modern armies also contain reams of bureaucratic records documenting logistical functions as they actually occurred. Logistics and supply is a woefully under-studied aspect of warfare, but not because there are no sources upon which to draw.

As to his second question, the answer is also yes. Dupuy addressed logistics in his work in a couple of ways. He included two logistics multipliers in his combat models, one in the calculation for the battlefield effects of weapons, the Operational Lethality Index (OLI), and also as one element of the value for combat effectiveness, which is a multiplier in his combat power formula.
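The multiplicative structure described above, force strength scaled by a chain of factors, one of which is combat effectiveness, can be sketched as follows. The variable names and the flat product form are illustrative stand-ins; Dupuy’s published factor tables are far richer than this.

```python
def combat_power(oli_total, terrain=1.0, weather=1.0, posture=1.0,
                 logistics=1.0, cev=1.0):
    """Combat power as a simple product of strength and multipliers.

    oli_total stands for summed weapon OLI values; the remaining
    arguments are illustrative factor names. The logistics factor
    here corresponds to the supply multiplier discussed in the text;
    in the model it also feeds into combat effectiveness (CEV).
    """
    return oli_total * terrain * weather * posture * logistics * cev
```

Because the factors multiply rather than add, a logistics shortfall scales the entire force's combat power, which is why even a modest multiplier like 0.95 matters at scale.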

Dupuy considered the impact of logistics on combat to be intangible, however. From his historical study of combat, Dupuy understood that logistics impacted both weapons and combat effectiveness, but in the absence of empirical data, he relied on subject matter expertise to assign it a specific value in his model.

Logistics or supply capability is basic in its importance to combat effectiveness. Yet, as in the case of the leadership, training, and morale factors, it is almost impossible to arrive at an objective numerical assessment of the absolute effectiveness of a military supply system. Consequently, this factor also can be applied only when solid historical data provides a basis for objective evaluation of the relative effectiveness of the opposing supply capabilities.[1]

His approach to this stands in contrast to other philosophies of combat model design, which hold that if a factor cannot be empirically measured, it should not be included in a model. (It is up to the reader to decide if this is a valid approach to modeling real-world phenomena or not.)

Yet, as with many aspects of the historical study of combat, Dupuy and his colleagues at the Historical Evaluation Research Organization (HERO) had taken an initial cut at empirical research on the subject. In the late 1960s and early 1970s, Dupuy and HERO conducted a series of studies for the U.S. Air Force on the historical use of air power in support of ground warfare. One line of inquiry looked at the effects of air interdiction on supply, specifically at Operation STRANGLE, an effort by the U.S. and British air forces to completely block the lines of communication and supply of German ground forces defending Rome in 1944.

Dupuy and HERO dug deeply into Allied and German primary source documentation to extract extensive data on combat strengths and losses, logistical capabilities and capacities, supply requirements, and aircraft sorties and bombing totals. Dupuy proceeded from a historically-based assumption that combat units, using expedients, experience, and training, could operate unimpaired while only receiving up to 65% of their normal supply requirements. If the level of supply dipped below 65%, the deficiency would begin impinging on combat power at a rate proportional to the percentage of loss (i.e., a 60% supply rate would impose a 5% decline, represented as a combat effectiveness multiplier of .95, and so on).
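The 65% threshold rule stated above translates directly into a small function. The threshold and the point-for-point penalty come from the text; the functional form itself is a straightforward reconstruction.

```python
def supply_effect_multiplier(supply_pct, threshold=65.0):
    """Combat effectiveness multiplier from supply level (% of normal).

    Per the rule described in the text: at or above the threshold the
    unit operates unimpaired (multiplier 1.0); below it, combat power
    declines point-for-point with the shortfall (60% supply -> 0.95).
    """
    if supply_pct >= threshold:
        return 1.0
    return 1.0 - (threshold - supply_pct) / 100.0
```

Note that the 6.8% average reduction reported for Operation STRANGLE is an average over the campaign period, not a direct application of this formula to the 41.8% figure.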

Using this as a baseline, Dupuy and HERO calculated the amount of aerial combat power the Allies needed to apply to impact German combat effectiveness. They determined that Operation STRANGLE was able to reduce German supply capacity to about 41.8% of normal, which yielded a reduction in the combat power of German ground combat forces by an average of 6.8%.

He cautioned that these calculations were “directly relatable only to the German situation as it existed in Italy in late March and early April 1944.” As detailed as the analysis was, Dupuy stated that it “may be an oversimplification of a most complex combination of elements, including road and railway nets, supply levels, distribution of targets, and tonnage on targets. This requires much further exhaustive analysis in order to achieve confidence in this relatively simple relationship of interdiction effort to supply capability.”[2]

The historical work done by Dupuy and HERO on logistics and combat appears unique, but it seems highly relevant. There is no lack of detailed data from which to conduct further inquiries. The only impediment appears to be lack of interest.

NOTES

[1] Trevor N. Dupuy, Numbers, Predictions and War: Using History to Evaluate Combat Factors and Predict the Outcome of Battles (Indianapolis; New York: The Bobbs-Merrill Co., 1979), p. 38.

[2] Ibid., pp. 78-94.

[NOTE: This post was edited to clarify the effect of supply reduction through aerial interdiction in the Operation STRANGLE study.]