The failure of the Russian military in Ukraine during 2022 came as a surprise to many who accepted official (government intelligence agency) assessments of military power. This is not a new problem, and it comes from the difficulty of determining military capabilities during peacetime.
In peacetime it is difficult to determine whose armed forces are superior because realistic assessments often contradict government claims. There is a demand for independent rankings, and the most popular method is counting manpower and weapons, then adding logistical capabilities, geography, natural resources and economic strength. Left out are nuclear weapons, which have led nations that have them to avoid conflict with other nuclear-armed countries. This has led to what is known as the “nuclear peace” that has produced a record number of years without a war between major (nuclear armed) powers.
Some key factors are avoided in peacetime analysis, and these usually include critical qualities like training and leadership. One of the more comprehensive conventional analyses is provided by the annual GFI (Global Firepower Index). The rankings seem accurate until you take into account the actual performance of different nations. For example, GFI ranks Israel 20th, behind other Middle Eastern nations like Saudi Arabia (17th), Iran (14th), Egypt (13th) and Turkey (11th). Before the 2011 rebellion, and continuing civil war, Syria was considered to rank somewhere between Egypt and Saudi Arabia. Iraq, before its quality flaws were exposed in the 1991 and 2003 wars, was also ranked higher than Syria. Yet military historians and senior officers in many nations note that you cannot ignore historical performance, which is usually described as the quality factor or quality multiplier, and look only at weapons and other aspects of military power that can be counted and compared. The quality multiplier is often applied as a percentage decrease to a more conventional analysis like GFI. The firm producing the GFI admits this and stresses that its rankings depend on unclassified data about what each country possesses.
There are ways to obtain an accurate quality multiplier. Not a perfect quality multiplier, but one that provides more accurate rankings. One way to do this is by refighting historical battles after applying the kind of force analysis used in wargame design, which deals with concepts like total force quality in a way that can be tested. Wargame designers use a “guess and test” approach that applies a fraction to raw (theoretical) combat power to account for less capable leadership, equipment quality, support, training and other "soft" factors. Think of it as an efficiency rating, with "100" being perfect and "55" being a more common 55 percent efficiency.
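The efficiency rating described above can be sketched in a few lines of Python. This is only an illustration: the force names, raw power figures and efficiency ratings are made-up assumptions, and `effective_power` is a hypothetical helper, not a formula taken from any published wargame or from the GFI.

```python
def effective_power(raw_power: float, efficiency: float) -> float:
    """Scale raw (theoretical) combat power by an efficiency rating of 0 to 100."""
    if not 0 <= efficiency <= 100:
        raise ValueError("efficiency must be between 0 and 100")
    return raw_power * efficiency / 100.0


# Hypothetical forces: (raw power from counting troops and weapons, efficiency rating).
forces = {
    "Force A": (1000.0, 55),  # larger on paper, but only 55 percent efficient
    "Force B": (700.0, 90),   # smaller on paper, but well trained and led
}

# Rank by effective power instead of raw power.
ranked = sorted(forces, key=lambda name: effective_power(*forces[name]), reverse=True)

for name in ranked:
    raw, eff = forces[name]
    print(f"{name}: raw={raw:.0f} efficiency={eff}% effective={effective_power(raw, eff):.0f}")
```

With these assumed numbers the force that is smaller on paper ranks first, which is the kind of reversal a count-only ranking cannot show.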
Wargames were a mid-19th century development that used data on the outcomes of Napoleonic Wars (1792-1815) battles to estimate the capabilities of contemporary forces, in order to get an idea of how future battles would be fought and who was more likely to win. Further historical analysis of the earlier use of these concepts indicated that for thousands of years the more successful commanders had a good sense of how total force quality worked and applied it as the “commander's estimate”. The 19th century wargames documented how this worked because the Napoleonic Wars battles were well documented, as wars go, and because part of the Industrial Revolution was a scientific revolution that led to what we now call statistical and predictive analysis. By the 20th century senior officers were familiar with these concepts and prepared to accept them for measuring combat power and accurately predicting whose forces had an edge.
Peacetime politics often prevented openly discussing the application of total force quality. This changed in the 1960s and 70s when accurate commercial wargames, mainly from Avalon Hill in the late 1950s and SPI (Simulations Publications Inc) in the late 1960s, became available and most nations acknowledged the value of this kind of analysis, even though peacetime politics continued to hinder its widespread application in managing the peacetime military.
The commercial wargames were a catalyst because they accepted the concept of validation. This was not an alien concept. Engineers, especially those developing software or complex machinery, have long understood the need for validating their products. Without this double checking and extensive testing, new software or equipment might not do what was intended and get people killed or just limit sales.
Pre-1970s professional wargames were similar in design but differed in how they were used. Many were predictive, or attempted to simulate unpredictable combat situations. However, in peacetime, there is no real war to keep the wargames, and generals, honest. Instead, it is common for many politicians, generals and policy makers to want a specific outcome from wargames. Put bluntly, the results are often decided on before the wargames come into play. Many professional wargames are quite accurate, and occasionally the users will do some validation work to demonstrate this, using a recent battle to show that the wargame produces outcomes similar to the actual historical event. But, in general, validation was not a high priority and was avoided as much as possible during peacetime.
Within the professional wargame community, there have been quiet debates over this issue for decades. Initially validation did not catch on in a big way. But in the decades following the introduction of commercial wargames, more and more military leaders, especially in the West, rose in the ranks with their knowledge of commercial wargames and the crucial importance of validation. Without a lot of publicity, American commanders increasingly applied validation to their training and development of tactics. When the 1990-91 Gulf War came along, the world was shocked by the swift and devastating defeat inflicted on the Iraqi forces in a ground war that lasted, literally, only a hundred hours. This made the mass media curious about what was going on and it was pointed out, in simple terms the media could use, that the military had learned to measure and apply what had made forces superior in the past. Other examples include the three Arab-Israeli wars in which Israel, with a greatly inferior force according to the measures used by the Global Firepower Index, quickly and decisively defeated large Arab coalitions led by Egypt and Syria. On paper the Israelis should have lost every time, but they didn’t, and after the 1973 war the Arabs took it as a fact that the Israelis would probably keep defeating them. Gradually the Arab states made peace with Israel and by 2020 they were openly joining Israel in a coalition against a common enemy: Iran.
Historically, the Iranians were the local superpower, defeating all local opponents and sometimes major powers from outside the region, like the Romans, the Russians and, for a while, the Ottoman Turks. The current Iranian government is a religious dictatorship that believes it is on a mission from God and that its success is inevitable. This is unusual for Iranian governments and diminishes the Iranian chances of success but not their ability to hurt their enemies, especially those that are neighbors.
The method by which wargame designers determined who was worth what in historical battles was quite simple. One of the key aspects of wargame design is assigning combat values to the military units involved. This is a fairly simple process initially. You identify the worst unit, as best you can determine it from the historical record, and assign it a value of "1." It's a good idea at this point to take the best unit, as best you can determine it, and ask yourself the question: how much better is the best unit than the worst unit? If you come up with a number no greater than, say, 9, you're probably in the ballpark. At that point all you have to do is fit the remaining units in the game in between the best and the worst by asking yourself the question: how capable is this unit in comparison with the best and the worst? If it sounds simple, it is. There's no mystery involved. The system is further refined when you start playing the game, because any misjudgments you have made quickly become evident in your attempts to recreate the historical event. You then modify the values on units and eventually end up with a rather accurate numerical appraisal of each unit's combat ability.
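The two steps above, placing units on a scale from 1 to about 9 and then adjusting values until the game reproduces history, can be sketched as follows. Everything here is an assumption made for illustration: the unit names, the rough quality scores, the two "historical" battles, and the toy rule that the side with the higher total value wins. Real designs use far richer combat rules; this only shows the shape of the guess and test loop.

```python
def assign_values(scores, best_ratio=9):
    """Fit rough quality scores onto a 1..best_ratio scale (worst unit = 1)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {
        unit: 1 if span == 0 else round(1 + (s - lo) / span * (best_ratio - 1))
        for unit, s in scores.items()
    }


def predicted_winner(values, battle):
    """Toy combat rule: the side with the larger total value wins (ties go to side_b)."""
    a = sum(values[u] for u in battle["side_a"])
    b = sum(values[u] for u in battle["side_b"])
    return "side_a" if a > b else "side_b"


def guess_and_test(values, battles, max_rounds=20):
    """Nudge unit values until every battle replays with its historical winner."""
    values = dict(values)
    for _ in range(max_rounds):
        misses = [b for b in battles if predicted_winner(values, b) != b["winner"]]
        if not misses:
            break
        for b in misses:
            loser = "side_b" if b["winner"] == "side_a" else "side_a"
            for u in b[b["winner"]]:
                values[u] += 1                     # historical winners were underrated
            for u in b[loser]:
                values[u] = max(1, values[u] - 1)  # historical losers were overrated
    return values


# Hypothetical first guesses at unit quality, read off the historical record.
scores = {"militia": 2, "line infantry": 5, "veterans": 7, "guards": 10}
initial = assign_values(scores)

# Hypothetical historical outcomes used for validation.
battles = [
    {"side_a": ["veterans"], "side_b": ["line infantry", "militia"], "winner": "side_a"},
    {"side_a": ["guards"], "side_b": ["veterans", "militia"], "winner": "side_b"},
]

tuned = guess_and_test(initial, battles)
print("initial:", initial)
print("tuned:  ", tuned)
```

In this made-up case the first guess overrates the guards, the second battle replays with the wrong winner, and one round of adjustment brings the values into line with both outcomes.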
Some people in the Pentagon caught onto this in the 1970s and soon realized that the combat models they were currently using, with Global Firepower type values, were clearly unrealistic if you tried to use them to refight a World War II or post-1945 battle. The younger combat officers accepted this more quickly than their seniors because the young officers were buying and using a lot of these historical wargames. It was at their urging that SPI, one of the early commercial wargame developers, took a chance and developed wargames based on contemporary forces and battles that had not been fought yet. These contemporary wargames were very popular, in part because so many officers and their troops were buying and playing them. In doing so these early adopters came to accept the underlying value of the analysis that went into creating an accurate wargame.
While the U.S. Navy, and navies in general, had adopted these wargame techniques early in the 20th century, this was seen as a unique aspect of naval warfare and not really adaptable to ground and air combat. The commercial wargames showed otherwise and by the late 1970s the army and marines, followed by the air force in the 1980s, were creating and using wargames designed with the validation techniques of historical wargames. This produced the surprising, to people outside the military, “100 hour war” outcome in 1991 and similar events ever since.
One thing these validation techniques made clear was that highly trained, volunteer soldiers were the best way to go for post-World War II warfare. For the ground troops, this means "ordinary" soldiers being trained up to standards that would have made them elite commando type troops during World War II. In a parallel development, a lot of new military gadgets were developed, as well as some startling new tactics, that only worked when used by well-trained troops. This was demonstrated after 2001 in Afghanistan and Iraq.
Some post World War II historians had noted and measured the qualitative differences but their results were not widely recognized. One notable practitioner of this was American military historian and World War II artillery officer Trevor Dupuy. He noted that while the Germans lost World War I, they developed training and tactics late in the war that almost won the war for them and played a major role in how World War II developed. These well trained and equipped German World War I “Storm Troopers” simply used more effective training and tactics, and this approach worked. The Germans proceeded to train half the army up to Storm Trooper standards. Nothing magic was involved, just intense, well thought out training and excellent leadership. Officers and NCOs were carefully selected and given more training than those of any other nation. While most military experts dismissed German efforts after World War II ("after all, they've lost two world wars"), some noted that losing wars and losing battles were two different things. Dupuy undertook a closer examination of combat records and found, and documented, that German troops generally outfought their opponents. Even after adjusting for variables like defenses and who was attacking, German troops inflicted more casualties than the soldiers they fought. Not every combat division was the same in combat capability, and it was revealing to see how combat divisions matched up in effectiveness. It turned out that most generals were unaware of some of these differences in combat effectiveness. If it hadn't been for Dupuy's research in the 1970s and 80s, these critical differences might still sit unnoticed in musty archives. Dupuy's calculations brought forth the reasons why some allied, German, Russian and Japanese divisions were better than others. Mainly, it was the quality of training and leadership. Division commanders had (and still have) discretion in how their troops are trained.
During World War II, some divisions set up their own training operations and made sure troops and combat leaders received all the training they needed. This made a major difference on the battlefield. Ironically, the Germans noticed it more than the allies. They accurately identified the best trained and led American and British divisions. After the war, such analyses were generally ignored. After all, the Germans had lost the war.
But in the early 1970s, with the Vietnam war out of the way, the American army began to concentrate on the future. They did this by more diligently studying the past. What began the reform activity was the 1973 Arab-Israeli war. The Israelis won this one, but not in six days as they had in 1967. It took them 16 days, and a lot more casualties. Two aspects of this war caught the attention of American generals. One was the speed and violence made possible by the modern weapons each side was using. But more importantly, the Israelis were using the same attitudes towards training as the Germans had in the two World Wars. While many pundits dismissed the Israeli victory as inevitable because they were fighting Arabs, a closer look at the battles revealed that Israeli troops were exceptionally well trained and led, much more so than their Arab opponents.
The American generals in the 1970s then looked at their current foe, the mighty Red Army (of the Soviet Union). This force was arrayed from East Germany back to Moscow, ready to march to the English Channel by using wave after wave of tanks, infantry, and warplanes. There was no way the United States and its allies were going to match the Red Army in quantity, so it would have to be done with quality.
Veterans of the Vietnam war remembered that about ten percent of the American infantry were commandos, Special Forces or LRRPs (long range recon patrol troops), and that they did very well against the North Vietnamese, usually in the North Vietnamese backyard. After the war the U.S. armed forces were dealing with the end of conscription. All troops would be volunteers. Maybe the solution to all this was what the Germans discovered and applied late in World War I and into World War II.
While the media talked about the American "high tech armed forces" in Vietnam, the troops themselves knew that there would have to be a substantial increase in soldier skills before one could expect to "fight outnumbered and win." There was also the realization that press releases touting great new weapons would not add any combat power immediately. This led to the use of the phrase "come as you are war," meaning that you had to be realistic and train for combat knowing that future improvements in troop quality or equipment were not what you were currently going into combat with. This outbreak of pragmatism led to a largely unspoken, but very real, shift to the idea that every combat soldier should be a more effective fighter than the enemy counterpart and that this edge could be achieved right now and, with enough effort, maintained.
After 1991 Russia tried to do this, and failed. This was not a big news story but it was not a secret either. Even people in the intelligence agencies were aware of this, but official policy preferred to believe in an alternate reality. The American CIA found a novel solution to this. When official reports were prepared, the main text reported what the customer (current government) wanted while the reality was described in footnotes. In the Soviet Union the correlation of forces analysis method was used, which took into account numbers as well as capabilities for both sides. The government preferred to see results that always proved the superiority of Soviet forces. Occasionally a new analyst assigned to the STAVKA (senior military staff) was ordered to prepare an updated correlation of forces for the Soviets versus NATO. Sometimes the new analyst delivered an accurate report that showed NATO forces now had an edge. That was officially ignored and the new analyst was transferred to one of the secret scientific cities in the Ural Mountains, where talented but undisciplined experts lived and worked in golden cages that were difficult to escape from and difficult for enemy spies or diplomats to visit.
After the 1990s China built its first modern military force which now, for the first time in history, is larger and better equipped than that of the Russians. The Chinese also adopted Western training and wargaming techniques. The results of these validated wargames were considered state secrets. It was clear that these wargames demonstrated that China still had the reliability problems Chinese peacetime forces have suffered from for thousands of years. The government continues trying to fix those problems but corruption and lax military leadership persist. China has adapted by changing the tactics it uses against current enemies, like India, South Korea and nations with existing claims to islands in the South China Sea. China avoids armed combat with these foes and relies on economic and non-lethal force to wear the enemy down. This is expensive and doesn’t always work, but it is preferable to starting a shooting war that could escalate into something China cannot control.