Analysis of Forms
Introduction
One of the wonderful things about taekwondo, as with any martial art, is that there always remains infinitely more to discover. Over fourteen years, I’ve noticed some of my interests shift, others stay the same, and most deepen in some way. I’ve always loved forms (poomsae), but they mean more to me now: they mean something in a semantic sense as well as an embodied, physical sense. The harmony of these two senses, the semantic and the physical, is increasingly fascinating to me. Over time, I’ve also become more interested in statistics, and more obsessed with Microsoft Excel. The sum of these changes is the associated spreadsheet: Analysis of Forms.
As a caveat: this analysis is very circumscribed, and only represents the particular corpus of forms I have been taught, in a hybrid of ITF and WT styles. With the existing spreadsheet infrastructure, similar statistics can be readily calculated for other styles and forms, if desired.
Guide to the Spreadsheet
[NOTE: You will need to download the spreadsheet as an Excel (.xlsx) file; Google Sheets isn’t sophisticated enough to handle it. If you’re only looking at the preview, some of the data is going to be messed up. Even the OneDrive preview may not be entirely accurate.]
The spreadsheet is composed of many “tabs” (or “sheets”). Most record the moves of individual forms, and are respectively named and color-coded according to the style I have studied. For each form, I’ve encoded each move’s stance and technique (as applicable), using the vocabulary I’ve learned. Technically, each row is a “technique” which represents any individual action in a form. Any technique in this sense is discrete, but can be physically simultaneous, such as a double face punch. (In my style, a double face punch means two face punches at the same time; a double body punch is a one-two punch, generally beginning with the lead hand. I follow this terminology in the spreadsheet.) Each “move” often maps to individual techniques, but sometimes more than one technique comprises a single move, as in the case of a combination (especially in higher belt forms). In earlier forms, I tried to stick to the move breakdown found in our school’s form handouts, but for black belt forms I just had to do what I thought was best. Each row (technique) is also given an “aggression” value, which pulls from a hidden lookup table tab in order to quickly give all offensive techniques a positive aggression score (all defensive or neutral techniques get a null score). Kihaps (and absences of kihaps) are also recorded.
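For readers who think more easily in code than in spreadsheet rows, the encoding described above can be sketched roughly as follows. This is only an illustration: the field names, the contents of the lookup table, and the two sample rows are my own placeholders, not the actual columns or data of any form tab.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the hidden lookup tab: only offensive
# techniques are listed, each with a positive score.
AGGRESSION_LOOKUP = {"body punch": 1, "face punch": 1, "front kick": 1}


@dataclass
class TechniqueRow:
    """One spreadsheet row: a single technique within a move."""
    move: int                 # move number the technique belongs to
    stance: Optional[str]     # None when no stance applies
    technique: str
    kihap: bool = False

    @property
    def aggression(self) -> Optional[int]:
        # Defensive or neutral techniques are absent from the lookup,
        # so they come back as None (the "null score").
        return AGGRESSION_LOOKUP.get(self.technique)


# Two made-up rows, purely to show the shape of the data.
rows = [
    TechniqueRow(move=1, stance="front stance", technique="low block"),
    TechniqueRow(move=2, stance="front stance", technique="body punch", kihap=True),
]
scores = [row.aggression for row in rows]
print(scores)
```

Keeping aggression in a separate lookup, rather than typing it into every row, means a technique only has to be classified once.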
(If you read through the Ki Cho Hyung (white belt form) tab, you’ll probably get a sense of my encoding procedures. Do note that for simplicity’s sake (and ease of calculation), I’ve only encoded techniques, and not particular steps or motions between techniques. I acknowledge that these are just as integral to each form, but it’s beyond the scope of the present analysis.)
The Aggregate tab uses some fancy Excel functions to pull all the information from the individual Forms tabs into one location, for comprehensive analysis (found at the bottom of that sheet). Do note that if you add or subtract rows in individual Forms tabs, you may need to make corresponding adjustments to the Aggregate tab. The Aggregate tab also uses simple concatenation functions to list more clearly each technique of each form. (My idiosyncratic hyphenation can be ignored.)
The Analysis tab gives an overview of all other information in the spreadsheet, by form. The belt color (according to my school’s tradition), form name, original style, and meaning are recorded, as well as: number of moves and techniques (as previously defined); “loudness” (number of kihaps); “aggression” (percentage of offensive vs. defensive/neutral techniques); “left-ness”, “evenness”, and “right-ness” of both stances and techniques (I’ve used “even” for stances and “double” for techniques, just because this seems to be the prevailing nomenclature; they refer to essentially the same thing), as well as respective differentials; and “imbalance” (absolute value of the difference in left/right differentials between stances and techniques; in other words, which forms are the most side-imbalanced). This tab uses some nice formulas to calculate all this from the individual sheets simply by referencing the name of the form. (Do note that this requires the hidden “Form Key” column, since the name of the form has to match the name of the Form Tab exactly.)
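As a rough sketch of how these per-form numbers fit together, here is the same arithmetic in Python. I am assuming that a differential is the right share minus the left share, and that rows with no stance are simply left out of the stance percentages; the key names and the four sample rows are invented for illustration and are not taken from the spreadsheet.

```python
from collections import Counter


def side_shares(sides, neutral):
    """Share of left / right / neutral among the entries that have a side."""
    present = [s for s in sides if s is not None]
    counts = Counter(present)
    total = len(present)
    return {label: counts[label] / total for label in ("left", "right", neutral)}


def form_metrics(rows):
    """rows: list of dicts with 'stance_side', 'technique_side', 'offensive'."""
    stance = side_shares([r["stance_side"] for r in rows], neutral="even")
    technique = side_shares([r["technique_side"] for r in rows], neutral="double")
    # Assumed sign convention: positive means right-leaning.
    stance_diff = stance["right"] - stance["left"]
    technique_diff = technique["right"] - technique["left"]
    return {
        "techniques": len(rows),
        "aggression": sum(1 for r in rows if r["offensive"]) / len(rows),
        "stance_differential": stance_diff,
        "technique_differential": technique_diff,
        "imbalance": abs(stance_diff - technique_diff),
    }


# A made-up four-technique "form"; the last row has no stance recorded.
toy_form = [
    {"stance_side": "left", "technique_side": "left", "offensive": False},
    {"stance_side": "right", "technique_side": "right", "offensive": True},
    {"stance_side": "even", "technique_side": "double", "offensive": True},
    {"stance_side": None, "technique_side": "right", "offensive": True},
]
metrics = form_metrics(toy_form)
print(metrics)
```

Note that imbalance compares the two differentials with each other, so a form that leans the same way by the same amount in both stances and techniques still scores zero.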
Finally, the Pivot Tables tab (my favorite) offers more nuanced statistical information about every form compiled in the Aggregate sheet. Expanding and collapsing the levels of the pivot tables makes it easy to visualize different kinds of data. (You can right-click the pivot table in Excel and select “expand/collapse all.” Google does pivot tables totally differently, so don’t try to mess with it in the preview.) In this tab, you can explore the most common stances and techniques (as well as the most common sides—right or left—for each); the frequencies of different techniques of various “species” (blocks, punches, kicks, elbows, backfists, etc.); the most-kihapy techniques and stances; and a scatterplot visualizing the trend between belt rank and the number of techniques in the associated form.
It is my hope that a taekwondo student of any level will find at least some interesting bit of insight here.
Analysis
I’d like to draw your attention to a few of the most interesting (and/or most amusing) findings of this statistical analysis (again, with the caveat that this reflects the teachings only of the tradition I’ve learned, up to my current rank of 4th Dan).
The longest colored belt form is Toi Gye (high brown) at 41 techniques, and the shortest is Chun Ji (yellow) at 19 techniques. The longest black belt form (and longest form overall) is unsurprisingly Choong Jang, at 53 techniques. The shortest black belt form is Pyongwon at 29 techniques, which is just about the average technique count for colored belt forms (29.38 ±6.28); the average for black belt forms is 39.57 ±8.68. Overall, it’s 32.95 ±8.70 techniques per form. We can subsequently give each form a z-score (its distance from the mean, measured in standard deviations), which is displayed in the Analysis tab. (If this means nothing to you, don’t worry; it’s just a method of statistical comparison.)
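If you would like to see the z-score arithmetic spelled out, here is a minimal sketch using the rounded overall figures quoted above. I am assuming each form is compared against the overall mean and spread (rather than its own belt group), so the values may differ slightly from the spreadsheet, which works from unrounded numbers.

```python
def z_score(value, mean, sd):
    """How many standard deviations a value sits above or below the mean."""
    return (value - mean) / sd


# Rounded overall figures from the text: 32.95 ±8.70 techniques per form.
OVERALL_MEAN = 32.95
OVERALL_SD = 8.70

# Technique counts quoted in the text.
technique_counts = {
    "Choong Jang": 53,
    "Toi Gye": 41,
    "Pyongwon": 29,
    "Chun Ji": 19,
}

z_scores = {
    name: z_score(count, OVERALL_MEAN, OVERALL_SD)
    for name, count in technique_counts.items()
}
for name, z in z_scores.items():
    print(f"{name}: z = {z:+.2f}")
```

A positive z-score means the form is longer than average and a negative one means it is shorter; by this reckoning Choong Jang sits a bit more than two standard deviations above the mean.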
The “loudest” forms are Taeguk Pahl Jong (red) and Kwang Gae (black), at 3 kihaps each. (Though, whether Taebaek has one or two kihaps is up for debate.) The average number of kihaps is 1.77 ±0.58 and 2.00 ±0.53 for colored belt and black belt forms, respectively (and 1.85 ±0.57 total).
The most aggressive colored belt form is, perhaps surprisingly, Won Hyo (green), at 62% (of offensive techniques out of total techniques). That monk packs a punch, or rather, seven punches, four chops, four kicks, and one thrust. The most aggressive black belt form is Taebaek, at 71%. The least aggressive colored belt form is Toi Gye (high brown) at 37%, and for black belt we have Pyongwon at 38%—indeed a peaceful open plain. From this, we can see that aggression really depends as much on the proportion of defensive techniques as offensive techniques. All those mountain blocks in Toi Gye add up. Interestingly, no form is precisely 50% offensive/defensive. The average aggression scores are 52% ±7% for colored belt forms, 53% ±11% for black belt forms, and 53% ±9% overall. Aggression z-scores can be seen in the right adjacent column.
The most imbalanced form by a landslide is Ge Baek, surprising no one, I’m sure. Ge Baek is extremely left-leaning in the feet, at 59% left stances, but has 45% right techniques (which is the prevailing amount: do note that left, right, and even/double stances and techniques are all factored in, so 50% is not necessarily the expected value). The Analysis tab visualization is perhaps the best way to approach this data. I will note that only three forms are perfectly balanced: Ki Cho Hyung (white), Taeguk Som Jong (high green), and Taeguk Yuk Jong. (Chun Ji also has an imbalance score of zero, but that’s because it’s 53% right-leaning for both stances and techniques.) It’s also worth noting Keumgang, with an incredible 59% even stances: all those horse-riding stances, of course. (Lastly, also note that some forms have a different total number of techniques and stances, which can lead to apparently uneven differentials.)
From the Aggregate tab, we can see that of all the forms, there are 659 techniques with 53% overall aggression and 37 kihaps. Overall, stances lean 45% left, 41% right, and 14% even. Techniques skew 42% left, 47% right, and 10% double (equivalent to even, just different terminology). The Aggregate tab also allows simple searching; for instance, we can see that the first appearance of reverse elbow is in Taeguk Oh Jong (high blue), and the first horse-riding stance is in Taeguk Chil Jong (brown). My absolute favorite—the gem of this whole analysis—is that the first palm-heel strike—surely a beginner-level technique (I frequently see it taught to white belts)—does not appear until Keumgang, one of our 2nd Dan black belt forms, and then only once more in Choong Jang, our second 3rd Dan form. If you’d like to find all instances of a particular technique, a simple Ctrl+F in the Aggregate tab should be sufficient. (Just make sure your spelling matches mine—sorry. Also: because =vlookup is persnickety, I’m using the hidden column M, which should exactly copy column A, but don’t overlook that if you make changes to your own copy, especially if you add or subtract rows.)
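The same kind of search is easy to sketch outside Excel. The snippet below assumes the Aggregate data has been exported as a list of (form, technique) pairs in curriculum order; the four rows shown are just a tiny excerpt built from pairings mentioned in this write-up, and the function names are my own.

```python
def forms_containing(rows, query):
    """Forms (in listed order, no repeats) with a technique matching the query.

    Matching is a case-insensitive substring test, much like Ctrl+F.
    """
    needle = query.lower()
    found = []
    for form, technique in rows:
        if needle in technique.lower() and form not in found:
            found.append(form)
    return found


def first_appearance(rows, query):
    """First form containing the technique, or None if it never appears."""
    matches = forms_containing(rows, query)
    return matches[0] if matches else None


# Illustrative excerpt only, assumed to be in curriculum order.
aggregate_rows = [
    ("Taeguk Oh Jong", "reverse elbow"),
    ("Toi Gye", "mountain block"),
    ("Keumgang", "palm-heel strike"),
    ("Choong Jang", "palm-heel strike"),
]

print(first_appearance(aggregate_rows, "palm-heel"))
print(forms_containing(aggregate_rows, "palm-heel"))
```

As with Ctrl+F, the spelling still has to match the encoding, but the substring test at least forgives capitalization.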
At last, we get to the beauty that is pivot tables. I’ve tried to visualize everything as comprehensively and cleanly as possible. As mentioned above, you can adjust what level of data shows by expanding/collapsing from the right-click menu. That menu also allows you to re-sort things, if desired.
Interestingly, most types of stances are quite balanced left/right, even the more unusual ones. For techniques, we get a little more variation, with most techniques favoring the right side, probably a handedness bias that is less pronounced for the feet (a footedness bias?). And to no one’s surprise, the most common stances are front and back, and the most common techniques are body punch and low block. The most common kick, front kick, follows directly after. Interestingly, side kick lands a few places after that, and roundhouse is quite a ways down the list. More nuance could certainly be calculated by a more serious encoding of information: for instance, how many reverse vs. normal techniques there are. (You could theoretically compare side of technique with side of stance in the existing spreadsheet, but it would be tricky and annoying.)
With a number of technique “species” (punches, blocks, kicks, etc.) we see rather exponential distributions. Punches are a good example: out of every punch in every form, almost half are body punches. For all kicks, over half are front kicks. Not quite a Pareto distribution, but heading in that direction. We also see interesting sorts of “collocations” (though perhaps the semantics of my technique encoding has forced this). For instance, the only kind of “smash” is a knee-smash; there are no other smashes. Obviously different styles have dramatically different translations for various techniques, but looking at the distribution of an individual corpus such as this can be interesting.
Looking at the “top three” for each species, we see blocks characterized by low block, double knife, and hammer block (representing over a third of all blocks); punches characterized by body, face, and double-body (representing almost two thirds of all punches); kicks characterized by front, side, and roundhouse (four fifths of all kicks); as well as a surprising dominance of reverse elbows in the elbow species.
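A "top three" share like the ones above is just a frequency count, which the pivot tables do for you. For anyone working from an exported list instead, a minimal sketch might look like this; the kick counts below are invented so the arithmetic is easy to follow, and are not the spreadsheet's numbers.

```python
from collections import Counter


def top_share(techniques, n=3):
    """Return the n most common techniques and their combined share."""
    counts = Counter(techniques)
    top = counts.most_common(n)
    share = sum(count for _, count in top) / sum(counts.values())
    return top, share


# Made-up tallies for one species (kicks), for illustration only.
toy_kicks = (
    ["front kick"] * 5
    + ["side kick"] * 3
    + ["roundhouse kick"] * 2
    + ["crescent kick"]
    + ["jump front kick"]
)

top_three, share = top_share(toy_kicks)
print(top_three)
print(f"top three account for {share:.0%} of kicks")
```

Running the same function once per species reproduces the kind of comparison made above.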
It’s interesting—though ultimately not unexpected—that the distribution of techniques in forms does not directly map to the distribution of techniques learned by students. Of course, this is because students have other applications for techniques: sparring, self-defense, moving basics, kicking combination practice, etc. For this reason, forms are dominated by front kicks, with zero spin hook kicks, while sparring would show a predominance of roundhouses and hooks.
For my own amusement, I also looked at the kihap distribution. Body punch is apparently the most kihapy technique (though it’s also overrepresented across techniques in general), followed by face punch. (I think we can all agree that stick block is the least kihapy technique—one of the six surprising kihaps accompanying blocks.) The most kihapy kick looks to be jump front kick. For kihaps not occurring during kicks, the most common stance for a kihap is by far a front stance. (Due to current encoding methods, stance kihaps cannot be accurately summarized in a pivot table, so I’ve inputted that chart by hand.) It is interesting that, for a martial art prioritizing the legs over the arms, only 5 of 37 kihaps (13.5%) occur on kicks.
Finally, we can see from the scatterplot that the general trend in form advancement is an increase in number of techniques (r = 0.7). The vertical dotted line represents the transition from colored belt to black belt forms, and while we see the general trend continue, black belt forms notably demonstrate increasing variance.
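For the curious, the r reported above is a Pearson correlation coefficient, which Excel can compute directly. A hand-rolled sketch is below; I am assuming belt rank is encoded as a simple 1, 2, 3... ordering, and the five data points are made up to show the mechanics rather than taken from the spreadsheet.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical rank order and technique counts, for illustration only.
toy_rank = [1, 2, 3, 4, 5]
toy_techniques = [19, 28, 24, 38, 41]

r = pearson_r(toy_rank, toy_techniques)
print(f"r = {r:.2f}")
```

A value near 1 means technique count rises steadily with rank, a value near 0 means no linear trend, and the sign gives the direction.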
Conclusion
This is not a formal analysis, and therefore I will not write a formal conclusion. These calculations were done mostly for my own enjoyment, perhaps because I’ve gotten to a point where I look at forms differently. When you are a colored belt, only one form really matters to you at a time. When you are a black belt, you need to know all the forms, and at higher levels you need to be able to teach anybody any of the forms. By the time you know twenty forms, another level of analysis has emerged, which is not any individual form, but the set of all forms. Looking at individual forms as components of the set of all forms reveals insight and patterns (don’t forget: “the detail of the pattern is movement”) otherwise missed: idiosyncrasies of certain movements, coherence (and incoherence) between different parameters, and similarities in unexpected places—sometimes across many ranks.
It is my hope that this analysis has produced something at least interesting, and perhaps even useful, for students of taekwondo. I encourage you to analyze your own forms and sets of forms, in any style. There is certainly so much space for additional analysis. For instance: have you noticed that every single form begins with a block? (Let’s assume Kwang Gae’s explosive incipit constitutes a block—it’s at least not an offensive maneuver.) And that Choong Moo and Pyongwon are the only forms that don’t follow that block immediately with an offensive technique? In summary, I encourage you to download and make use of my spreadsheet, though do note you will have to get behind the scenes a little bit (unhiding things, adjusting formulas, etc.) if you plan on making your own changes and analyses. (You should also get familiar with =indirect.)
Happy kicking.