Compound Interest in IT

by Bernard Ng (updated Dec 20th 2000)


Section Contents
0 Background
1 Magic of Compound Interest
2 Nature of IT Work
3 Compound Interest in IT Work
4 Case Studies
5 Recurring Themes from the Case Studies
6 Recommendation to Software Engineers
7 Recommendation to IT Managers
8 Some Final Notes

0. Background

The magic of compound interest in IT bears a strong resemblance to its effect on one's personal finances. This similarity first occurred to me after I read "The Wealthy Barber" by David Chilton in 1990. The book clearly describes what most of us already know, that it is better to earn interest on our savings and investments than to live on credit and owe other people lots of money. The detailed explanation of the mechanics of compound interest led me to compare it to some frustration I was experiencing at work, and so I became motivated to write this document to share my hypothesis. Over the years, I have been observing knowledge workers in the IT industry, and all my observations have reinforced my conviction that there is plenty of truth in my hypothesis, although I believe it is too subjective and complex to prove scientifically. There were numerous other sources that contributed to my thoughts. Most noteworthy are the pearls of wisdom to be found in Peopleware by Tom DeMarco and Timothy Lister (I've summarized this book). Many of the phenomena I've observed are clearly explained in this book. The only other material deserving explicit credit is what I learned attending the "Managing Your Time" course at Sun University. I found it interesting to extrapolate organizational effectiveness from individual effectiveness when operating in each of the 4 time quadrants (1- important & urgent, 2- urgent but unimportant, 3- important but not urgent, 4- neither urgent nor important). It was only in 1997 that I felt compelled to share this personal revelation, so I scribbled the first draft during a trans-Pacific flight. I will keep updating this document with more case studies if I believe they add some value. If you are thoroughly familiar with the magic of compound interest applied to personal finance, then skip to section 2 on the Nature of IT Work.
If you possess deep insight into the IT industry, you may opt to familiarize yourself with the 3 variables I use (E, W & K) in the 1st paragraph of section 2, then jump to section 3 on Compound Interest in IT Work.

1. Magic of Compound Interest

Let's start with a simple example: If you deposit $1000 per month into an investment account that returns a modest 6% interest compounded monthly, it would take you 30 years to reach $1 million. You actually deposit a total of $360,000 (30 x 12 x $1000), which means that $640,000 is interest that the financial institution pays you. After 11 years and 8 months, the interest gained each month overtakes the amount you are depositing each month. Ignoring the effects of inflation, many of us would be happy to have started such an account some years ago. Now for a more impressive example: If you invested $12,000 per year in the S&P 500 Index, and it returned 10.5% per year, it would take you just 23 years to cross the same $1 million mark! Even factoring in painful realities such as income tax, having $1 million in financial instruments that match the performance of the S&P 500 gives you the comfort of having $60,000 to $70,000 per year to live on, even if you don't lift another finger to work for the rest of your life.
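
The two savings figures above can be checked with a short loop instead of a spreadsheet. This is only a sketch: the function name is made up for illustration, and it assumes interest is credited on the existing balance before each deposit arrives at the end of the period (depositing at the start of the period shifts the answers slightly).

```python
def grow_savings(deposit, annual_rate, periods_per_year, target):
    """Simulate regular deposits until the balance reaches `target`.

    Assumption: interest is credited on the existing balance first,
    then the deposit arrives at the end of each period.
    """
    rate = annual_rate / periods_per_year
    balance = 0.0
    period = 0
    crossover = None  # first period whose interest exceeds the deposit
    while balance < target:
        period += 1
        interest = balance * rate
        if crossover is None and interest > deposit:
            crossover = period
        balance += interest + deposit
    return period, crossover, balance


# $1000 per month at 6% compounded monthly
months, crossover, balance = grow_savings(1000, 0.06, 12, 1_000_000)
print(months // 12, "years; interest beats the deposit in month", crossover)

# $12,000 per year at 10.5% compounded yearly
years, _, index_balance = grow_savings(12_000, 0.105, 1, 1_000_000)
print(years, "years with yearly deposits")
```

Under these assumptions the first loop stops at month 360 (30 years), the monthly interest first exceeds the $1000 deposit in month 140 (11 years and 8 months), and the yearly case stops at year 23.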

Conversely, if you bought a car for $12,000 but decided to float it for 5 years on your credit card, which charges the typical 12.9% compounded monthly, you'd find yourself owing $23,000 (almost double) 5 years later. Instant gratification comes at a high price. Credit card companies love it when you just make the minimum payment each month: if the amount you owe is large enough, the total amount you owe will grow larger and larger each month. Taking the scenario of owing a credit card company $23,000 for a car or any other purchase that you couldn't really afford in the first place, you can make payments of about $250 per month ($3,000 per year) for the rest of your life and still die owing them $23,000!
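
The debt side can be sketched the same way. The function below is hypothetical; it assumes the monthly rate is simply the APR divided by 12 and that no fees or late charges apply.

```python
def debt_after(principal, apr, years, monthly_payment=0.0):
    """Balance owed after `years` of monthly compounding and fixed payments."""
    rate = apr / 12
    balance = principal
    for _ in range(years * 12):
        balance = balance * (1 + rate) - monthly_payment
    return balance


# $12,000 car left untouched on a 12.9% card for 5 years
owed = debt_after(12_000, 0.129, 5)
print(round(owed))

# Payment that only covers one month of interest on $23,000
interest_only = 23_000 * 0.129 / 12
print(round(interest_only, 2))

# Paying exactly that amount for 40 years leaves the debt where it started
still_owed = debt_after(23_000, 0.129, 40, monthly_payment=interest_only)
print(round(still_owed))
```

The untouched balance lands a little under $23,000, and the interest-only payment comes to $247.25 per month, which is why paying "about $250" barely dents the principal.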

You can convince yourself of the magic of compound interest by using a spreadsheet to vary the scenarios. My own experience with looking at the numbers is that the result is not always intuitive, but it is usually surprising. There seems to be some magic that accelerates the growth of what starts off as a small number into a much bigger number. So anybody with half a brain will conclude that they would like this magic to be working for them, and certainly not against them. Those who have built a fortune certainly understand this and use it to work for them.

Sadly, this seemingly simple phenomenon is not understood by the masses. It is used to benefit even fewer. Most people whom you'd think should be in good financial health but aren't are victims of compound interest acting against them. I have more friends and family who are such victims than I want to count. So if my hypothesis is plausible and this really has a parallel in IT, IT workers and organizations should also take heed and let the magic work for them, or at least not against them.

2. Nature of IT Work

IT work is inherently knowledge-based. All other things (like demeanour) being equal, a person's effectiveness (E) varies as a function (f) of the work (W) they have been assigned, and the knowledge (K) they currently possess to tackle that work. I've grouped theoretical knowledge and practical experience into K.

	E = f(W, K)
Don't misunderstand how much emphasis I place on an individual's ability to get along with others and exhibit great teamwork; I'm just trying to keep this discussion simple enough to be useful. I'll limit myself to basic formulae since the strength of my argument is directly proportional to how much experience you have in the IT industry anyway. Regardless of your experience, some things should be quite clear:
  1. If K is near zero, almost any reasonable W will result in low E. People with very little knowledge (even indirectly) applicable to W cannot contribute effectively because they have to learn from others, from mistakes or from some form of documentation. This is termed OJT (on the job training).
  2. The lower bound of knowledge (Klb) is not zero! It is some negative value limited by how misinformed this person has been. For example, if you need to drive to San Jose from San Francisco Airport and K represents the relative positions of the two cities, K = 0 when you have no idea where San Jose is. K is negative if you think San Jose is North of San Francisco. The exact value depends on how far beyond the Golden Gate Bridge you drive before you realize you are going the wrong way. The old saying that "A little knowledge is a dangerous thing" applies here.
  3. The lower bound of effectiveness (Elb) is not zero! It is some negative value limited by how much access the person has to systems and other people. A person with low enough K (or sheer ignorance) under intense work pressure to complete W can potentially inflict severe harm on the health of an IT system or project, and lower the effectiveness of others at the same time.
  4. For any fixed target of W, there is some minimum knowledge (Kmin) required. This either existed before the project, is acquired during the project whilst completing W, or is never acquired as W is never completed. Kmin for any W corresponds to Emin. To simplify the analysis, it is safe to assume that Emin is at some arbitrary industry average, and as negative as I am about the industry, it is probably somewhere above zero. If I really wanted to be controversial, I could cite several references suggesting Emin is actually negative. There can be no consensus due to the subjectivity of the topic. A good day at work for me constitutes doing something well or learning how to do something well. I know many people (all to remain anonymous) for whom a good day at work means they get to keep their jobs. The IT industry is so immature that the chains of incompetence are pervasive.
  5. If K < Kmin, as Kmin - K starts from zero and increases, E gets further below Emin non-linearly. I conjecture that the decline accelerates until E becomes negative, then decelerates and becomes asymptotic to Elb. With the current pace of technological advancement, although missing some required knowledge is commonplace enough, it can be overcome via OJT. In fact some ambitious people insist their work be challenging enough that Kmin is above the K they currently possess because they want to learn something on every project. The point is that the leap must not be too great: K << Kmin is a recipe for certain disaster unless this person only plays a bit part in the project and is primarily there to increase his or her own K in the hope of contributing at a later stage or on a future project, or unless W is research work.
  6. If K > Kmin, as K - Kmin starts from zero and increases, E gets further above Emin non-linearly as well. I believe this curve is gradual, E increases slowly and is asymptotic to some optimal value where having any more knowledge or experience no longer contributes because the knowledge is only distantly relevant. The point is that knowledgeable and experienced people are the minority because of the large amount of information out there, and the pace at which it changes. It takes a significant amount of effort to excel in any area, and constant attention to stay up-to-date.
In teams, projects or large organizations, the above relationships between K, W and E are further compounded by the knowledge (Kother) possessed by other people accessible to the individual. I won't attempt to formulate anything here but it suffices to state that a person's low K can lower Eother and a person's high K can raise Eother. Your own experience probably bears this out: Projects where the collective knowledge (Kcol) is greater than Kcol-min, and there aren't too many people (since communication costs don't scale well), have a decent chance of succeeding. Projects where no team member possesses Kmin are doomed to fail very badly.
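
The shape conjectured in points 5 and 6 can be made concrete with a toy curve for one fixed W. Everything here is an assumption chosen for illustration, not a measured relationship: the function name, the parameter values, the exponential rise above Kmin, and the 1/(1 + x^2) fall below it.

```python
import math


def effectiveness(k, k_min=0.0, e_min=1.0, e_lb=-3.0, e_opt=2.0, scale=1.0):
    """Toy E = f(W, K) for one fixed W (all parameters are illustrative)."""
    if k >= k_min:
        # Point 6: a gradual rise that saturates at e_opt.
        return e_min + (e_opt - e_min) * (1 - math.exp(-(k - k_min) / scale))
    # Point 5: the fall starts flat, steepens, then flattens toward e_lb.
    shortfall = (k_min - k) / scale
    return e_lb + (e_min - e_lb) / (1 + shortfall ** 2)


for k in (-5, -1, -0.5, 0, 0.5, 1, 5):
    print(f"K = {k:5.1f}  E = {effectiveness(k):6.3f}")
```

With these defaults (Elb = -3 x Emin) the steepest part of the fall coincides with E crossing zero, which mirrors the conjecture that the decline accelerates until E turns negative and decelerates afterwards. Other parameter choices move that point.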

3. Compound Interest in IT Work

Any unit of IT work (W) presumably has a benefit to it. This benefit may not be easily quantifiable but it should either:

  1. Increase the business revenue stream without a disproportionately large increase in cost.
  2. Lower the cost of doing business without adversely affecting revenue.
  3. Be a prerequisite activity that enables or promotes one of the above in the future.
Any IT industry veteran (or staunch Dilbert fan) will tell you that many projects do none of the above. But let's restrict our discussion to the ideal world where we can assign a positive dollar value to the long term contribution of W to the company.

Even in your wildest open-source dreams, IT workers do not work for free. The effectiveness (E) of an IT worker on the project, alternatively termed ROI (return on investment) or cost-effectiveness, can also be assigned a monetary value. Examining the simple case of a one-person project, E is positive when the value of W exceeds the wage (salary & benefits) and material costs of achieving the project goals. (Note that a positive E can still be unacceptable, especially when E < Emin.) I've encapsulated all the complexity of time, morale, etc. into the function f() because those factors are not central to our discussion. They are given separate treatment in section 5 on Recurring Themes.

The adage, "Knowledge is Power" is as true in IT as any other business. Knowledge (K) is the pivotal element in the IT cost-effectiveness formula:

	E = f(W, K)
A unit of work (W), like setting up a web storefront to sell cameras, where you make a projection that the new storefront will increase your quarterly sale of camera units by some quantity, is relatively quantifiable. K is impossible to quantify on its own. For example, if a subset of your K is your absolute mastery of the COBOL language and development environment, you could have made a small fortune during the Y2K panic but will find it much harder to locate lucrative projects requiring those skills now. In that sense, K cannot be measured in absolute dollar terms but is much closer in nature to frequent flyer miles. It is acquired through various activities like reading books and magazines, attending training and conferences, benchmarking, experimentation, prototyping, and in the course of doing technical work like administration, design, development and troubleshooting. Like frequent flyer miles, K also has the notion of expiry although it resembles a half-life function more than a binary one. If you want to carry the analogy further, this gradual obsolescence of knowledge is similar to cost of living inflation. K differs from mileage points in that using it on one project doesn't reduce its quantity for the next project. Herein lies the effect of compound interest in IT. If you've worked hard to acquire a high K level, your E returns for a typical unit of W will be high. In real terms, this translates to finishing the work on target with satisfactory quality such that you have time and/or justification to assign yourself to other K-increasing activities. I've observed and been fortunate to experience this multiplier effect that closely resembles a positive cash flow situation. Once K breaks out of the Kmin orbit, one uses up less energy to increase it.
Or in simpler language, once you've attained a critical mass of knowledge that allows you to exceed your job requirements comfortably, you tend to be more motivated, have more time, and have to use less energy to acquire more knowledge. Here is a more detailed enumeration of the reasons behind this effect:
  1. Although the industry as a whole has a terrible reputation for underestimating the time and effort required for projects, there is still a rough, first-approximation consensus out there for any W.
  2. If your K level is far above average, you can meet or exceed expectations and still reserve time for K-building activities.
  3. High K levels are unfortunately not the norm in the IT industry where demand for a skilled workforce far outstrips the supply. As such, if you've attained a high K level in a particular area, you probably understand its value and are more motivated to retain that advantage.
  4. Although the barrier to entry is relatively low (compared to medicine or law for example), it takes a fair amount of time to build up the experience (the facet of K as important as theoretical knowledge) necessary to tackle the complex information tools and systems in use today.
  5. Once you've demonstrated a high K level to your peers, they tend to approach you for critique or to help them out of an impasse. This may consume your precious bandwidth but also offers a wide variety of scenarios for you to learn from without having to be fully engaged in those projects. The secret here is to reprimand those who can't be bothered to RTFM, so that the quality (complexity) of problems brought to you is raised, and most of them turn out to be win-win situations because you also learn something while helping them.
  6. Many areas of K are prerequisites to other areas, thus a high K level facilitates building it up further.
  7. Knowledge in diverse areas of technology may give you insight into design patterns, which makes it much easier for you to pick up new K in an area that uses the same patterns.
  8. It is common knowledge that there are many ways to solve any problem. The humorous saying, "If the only tool you know is the hammer, every problem looks like a nail." is a different way of saying this. With a high K level, you will be equipped with alternative approaches to various problems, and choosing the most appropriate one will increase the quality of your work.
  9. A strange phenomenon I've experienced even when tackling W that demands Kmin exceeding the K which I possess is that with my K level being high enough, I have a calm confidence that allows me to chip away at the problem without being fazed.
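
The half-life and multiplier ideas in this section can be combined into a toy simulation. All names and numbers here are invented for illustration: K decays with a fixed half-life, everybody picks up a little K each period, and the surplus above Kmin (or the shortfall below it) adds to or subtracts from that learning.

```python
def simulate_knowledge(k_start, k_min=1.0, half_life=5.0,
                       base_learning=0.1, reinvest_rate=0.2, periods=10):
    """Toy model of K over time; every parameter is an assumption."""
    keep = 0.5 ** (1.0 / half_life)  # fraction of K that survives one period
    k = k_start
    history = [k]
    for _ in range(periods):
        spare = k - k_min  # positive: slack for study; negative: firefighting
        k = k * keep + base_learning + reinvest_rate * spare
        history.append(k)
    return history


high = simulate_knowledge(2.0)  # starts comfortably above Kmin
low = simulate_knowledge(1.0)   # starts exactly at Kmin
print([round(k, 2) for k in high])
print([round(k, 2) for k in low])
```

With these made-up parameters there is a tipping point at a K of roughly 1.4: a worker who starts above it pulls further ahead every period, while one who starts below it, even exactly at Kmin, slowly loses ground to obsolescence.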

Even in IT, the alternative to positive cashflow is deficit spending. It is a very unpleasant predicament that I found myself in during the early years of my career and have resolved never to subject myself to it again. Nowadays, I find it disconcerting just to watch others flounder in that sorry state. Nevertheless, I list the reasons behind this negative force since an understanding of it should help others overcome it.

  1. The hardest part of any project is the start. There is enough natural inertia in getting a project underway. This inertia increases when K < Kmin.
  2. There is good reason why they say "Hindsight is 20/20". When you are missing some of the K required to complete W, it may not always be obvious what exactly that K is, and how to go about acquiring it. There is always the risk of picking up superfluous K (although it may benefit a future project) before homing in on the relevant K. This reduces E when the delay becomes significant.
  3. If there are people with high K accessible to you, you might be disturbing them with really trivial RTFM-class questions. This not only lowers their E on whatever else they are working on, but may decrease your accessibility to them in the future when you really need it.
  4. Trying to acquire missing K under schedule pressure is counter-productive. It has been scientifically proven that humans are less productive when they experience low self-esteem, and when they are under the emotional stress that comes from not seeing light at the end of the tunnel.
  5. Unlike constructing a building, where badly tilted structural beams are quite obvious even to the untrained eye, software is so intractable that severe deficiencies may not be obvious until a system goes into pilot or production. Even the addition of ultra-high K members to a team at such a late stage may not be enough to salvage a project. At best, a lot of effort is already wasted and W is still delivered but with much lower E.
  6. Workers with deficient K are prone to making mistakes. I have not found any correlation between how low their K levels were and the magnitudes of the mistakes, only between how much they were responsible for and how much they managed to screw up. Some mistakes have far-reaching effects and can result in many other people (regardless of K level) scrambling around to try to resolve the problem. My only advice is to keep low-K personnel off mission-critical systems and projects. Either invest in them to upgrade their K levels, or get rid of them. Since there are hardly ever any one-person projects or systems, this negative multiplier is perhaps the single most significant contributor to compound interest working against you in IT.

4. Case Studies

OK, so far all I've shared with you is theory. How do theories come about in the first place? Either someone is a genius who can simulate extremely complex natural occurrences within their mind, or someone has made some observations through their own experience and noticed some patterns. The latter is true in my case. As I mentioned earlier, it's impractical for me to scientifically prove this hypothesis, but I hope my experience will stimulate reflection on your own experiences and lead you to the same conclusions. Whether that happens or not, I'd be happy to hear about your experiences and conclusions. I've deliberately omitted all names of people and projects to protect the guilty but have been quite liberal with providing lots of other details. If you are among the anonymous guilty and are sure you can explain how the bad things that happened on your project were unavoidable, I will promptly remove it from this document. If I find your excuses lame, I will name you and the project. This is to reduce the amount of time I waste on useless activities. My detailing of case studies is not meant as a personal attack; I would use full names, dates and places if it were. Rather, the case studies are meant to illustrate the good and bad effects of compound interest in the IT workplace. They are grouped by the activity they were observed in just to demonstrate the ubiquity of this effect. If you don't have time to look at the 25+ case studies which I've painstakingly detailed, you can jump ahead to section 5 where I've extracted some Recurring Themes.

4.1 Designing

Low K levels beget low K levels.
I once worked for a development manager that knew enough to get her job done (K > Kmin) but not too much more. As a result, she was wary of a few of us in the group who were passionate about programming and willing to try out new things just to see how they worked. There was a project which a coworker (also a friend) and I thought was highly appropriate for implementation in C++, but since our manager only knew C, she cited instability in the new technology as an excuse not to even try it. Fortunately, my friend also had more stamina than my boss, and she conceded the point after they debated one evening for a few hours. Humans are susceptible to feeling insecure. A supposedly technical manager with low K lowers the group E.

Lower-level platform K is a good thing.
The above-mentioned manager assigned me to tune an application she had written because its run-time of 11 hours was impacting factory production hours. It was an assortment of C programs and shell scripts that filtered through a dump file of our bill-of-materials (BOM) to extract the relevant part hierarchies for the various shopfloor databases. When I mentioned trying to use the relatively new mmap() call (memory-mapping) to speed up file I/O, she vehemently opposed using any 'fancy' techniques. She said all I needed to do was to reduce the run-time to below 7 hours. I avoided further discussion, used mmap() and pointer swizzling instead of staying with the inefficient method of continually forking filter programs from a script, and cut the run-time down to 4 minutes. Evidently, my solution was scalable enough since my extract program remained in production for 9 years until we replaced the entire shopfloor system. I'd love to gloat about what happened to that manager but that is irrelevant to this discussion. Fortunately, my earlier 2 managers had high K levels, and encouraged me to master the many capabilities of our powerful OS. The software we typically manipulate is built upon layers and layers of other software, so building up high K levels in the lower platform layers can only help.
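
For readers who have not used memory-mapping, here is a minimal sketch of the idea using Python's standard mmap module. It is not the original C program: the file name, the record layout and the "ASSY-" prefix are all made up, and the point is only that one process scans the whole file in place instead of forking a filter program per pass.

```python
import mmap
import os
import tempfile


def count_matching_lines(path, prefix):
    """Count lines starting with `prefix` by scanning a memory-mapped file."""
    count = 0
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            for line in iter(view.readline, b""):
                if line.startswith(prefix):
                    count += 1
    return count


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical BOM dump: one "part-id|description" record per line.
    dump_path = os.path.join(workdir, "bom_dump.txt")
    with open(dump_path, "wb") as out:
        out.write(b"ASSY-1|board\nPART-7|resistor\nASSY-2|chassis\n")
    assembly_count = count_matching_lines(dump_path, b"ASSY-")
    print(assembly_count)
```

The operating system pages the file in on demand, so the program avoids both per-line read calls and the process-creation cost of a script that repeatedly launches filters.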

Better to have less high-quality K, than more low-quality K.
Sometimes, when a person's K doesn't rest on a strong foundation, or has never been tested in a real production environment, the K becomes a liability. My peer technical lead and I were flown to another continent by the project manager in the hope that we would be convinced to use technology developed by his 'blue-eyed boy' from their time together in another company. We patiently listened to a proposal for a communications infrastructure for our new factory test system, and were appalled to hear that the messaging was based on a non-terminating broadcast scheme. In other words, messages would traverse LAN segment boundaries and be propagated until there were no more customized router daemons at the ends of a 'logical ring', even if the intended recipient was residing on the same host! We promptly shot the proposal down. Some principal engineers are worth less than a fresh graduate.

4.2 Experimenting

Academic qualifications are no guarantee of anything.
We had a PhD candidate from MIT in a technology group I belonged to. He was a highly intelligent person who had a strong conceptual grasp of technological issues, and a nice person as well. I didn't notice until later that he avoided working on any experimental project alone. In retrospect, we realized that we did all the programming on all the projects he was on, and that he was hardly capable of stringing a 5-line script together. He moved on (for a nice raise) to Xerox XSoft, which was probably a better fit since they hardly produced anything either (OK, low blow). His one unforgettably wonderful deed was to bring in Jim Coplien, an ex-colleague of his, from Bell Labs to instruct us on OO and C++. I still track Jim's work on patterns.

The lower bound of E is a function of the people and systems accessible by the person with low K.
Reinforcing the previous point, my boss hired someone with a master's degree in Computer Science (MSCS) to strengthen my team. I was leading a contingency effort to port our existing mission-critical test system to a new OS because new workstations did not have sufficient diagnostics on the old OS. The new hire was obviously quite a ways behind; she is best remembered for lifting up an RS232 connector and asking another colleague, "Ethernet?". Her limitations soon became obvious, so my boss assigned her the development of some 'harmless' utility script while 3 of us plodded away on the main problem. One day, she came running frantically into my lab asking if anyone knew what had happened to her script. To our dismay, we discovered that her script had blown the entire filesystem away, taking down itself as well as 2 days' worth of our prototyping work. Call me an extremist but I believe that low K personnel should be isolated to separate systems and networks, firewalled if possible.

4.3 Implementing

Some K is highly applicable to areas it was not intended for.
I took a college course in "VLSI Design" that involved the use of a multitude of CAD tools, most of which were stages in the automatic generation of a chip from logic specifications. It was bad enough that 50 students had to share 12 workstations; worse, most of the students I observed were typing in long incantations of piped commands to get their work done. The more software-savvy were embedding these commands into scripts. Thanks to having worked during my internship at Sun with people who pointed me to source control and automatic builds, I configured the dependencies in the pipelined stages with relative ease and all my project team had to type after that was "make".

K level may not be proportional to how vocal a person is.
On a 15-person, cross-geo team designing a new test system, there was a particularly vocal person who took pleasure in excessive demonstration of his K level (he had an MSCS of course). He used a condescending tone when making presentations and challenged design decisions frequently. It would have been tolerable except he was wrong most of the time. After wasting time proving him wrong in a few agitated public debates, I forced the project manager to remove him from the project to avoid further slowdown. People with K << Kmin should be booted off projects unless they have a nice personality.

Low quality K can be worse than no K.
On the topic of personality, a really nice coworker (with an MSCS) on the above project was tasked to write a resource manager daemon. Unfortunately, he was only superficially trained in OO and coding for performance. While reading a database table with R rows and C columns, instead of speeding up access by batching up the reads, he hit the database R x C times! The high-level design had already passed through design review, but schedule pressure prevented us from performing a code review until he had amassed 15,000 lines of molasses. Another high K colleague and I subsequently wrote an infinitely faster, fault-tolerant version in less than 4,000 lines. It was the first time I had worked with this person, and I'd gladly work with this person again because he is a genuinely nice person, but the complexity of the job (Kmin) must be within reach of his current K level.
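
The difference between R x C round trips and one batched read is easy to demonstrate. The sketch below uses an in-memory SQLite database purely as a stand-in; the table name, columns and row count are hypothetical and have nothing to do with the original daemon.

```python
import sqlite3

COLUMNS = ("name", "kind", "capacity")  # hypothetical resource table columns


def make_db(row_count):
    """Build a throwaway in-memory table with ids 1..row_count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resources "
        "(id INTEGER PRIMARY KEY, name TEXT, kind TEXT, capacity INTEGER)"
    )
    conn.executemany(
        "INSERT INTO resources (id, name, kind, capacity) VALUES (?, ?, ?, ?)",
        [(i, f"res{i}", "tester", i * 10) for i in range(1, row_count + 1)],
    )
    return conn


def read_cell_by_cell(conn, row_count):
    """The slow pattern: one query per cell, R x C queries in total."""
    queries = 0
    table = []
    for row_id in range(1, row_count + 1):
        row = []
        for column in COLUMNS:
            # Column names come from the fixed tuple above, never user input.
            cursor = conn.execute(
                f"SELECT {column} FROM resources WHERE id = ?", (row_id,)
            )
            row.append(cursor.fetchone()[0])
            queries += 1
        table.append(tuple(row))
    return table, queries


def read_batched(conn):
    """The batched pattern: a single query returns every row and column."""
    cursor = conn.execute(
        f"SELECT {', '.join(COLUMNS)} FROM resources ORDER BY id"
    )
    return cursor.fetchall(), 1


conn = make_db(row_count=50)
slow_rows, slow_queries = read_cell_by_cell(conn, 50)
fast_rows, fast_queries = read_batched(conn)
conn.close()
print(slow_queries, "queries versus", fast_queries)
```

Both functions return the same data; with 50 rows and 3 columns the first issues 150 queries and the second issues 1. Against a database server on another host, each of those queries would also pay a network round trip.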

You only need enough K to leverage off the K that others have.
I began using Perl around 1990 to automate system administration tasks. I started using it again in the last few years for CGI processing. I am by no means a Perl expert but just knowing how to use it, and of the existence of the CGI module (CGI.pm) allowed me to accomplish my tasks very efficiently. K does not need to be very deep for you to be very efficient. You still need to have enough K to know which little K to acquire to leverage off others.

4.4 Testing

Safety in numbers: High K coworkers give you the courage to follow your convictions.
From the military point of view, I don't believe in taking the chance that I've advanced so fast that enemies are left behind my lines. You don't need to have served in the army to sense that an enemy behind you is a lot more dangerous than one in front. (The notable exception being MacArthur's leapfrogging of immobilised Japanese troops in the final stages of the Pacific War). Being a programmer since 1978 (high school), I have always had a deep sense that I am more productive when I test the heck out of a freshly written block of code than amassing tons of code before I even run it once. The rationale behind incremental, continual testing is very sound: Context switching, an expensive activity for operating systems, is much more expensive for humans. When you try to locate and fix a problem with something you've done a few minutes ago, it will take you a lot less time than if you try to fix it a few days later. The comparative cost goes up with code complexity and duration of elapsed time, peaking with the scenario of someone fixing code so long after writing it that it might as well be someone else's code. Unfortunately, typical IT management seems to reward people for hitting deadlines and reaching milestones, not for the quality of their work. Since one needs K to recognize quality, it is a lot easier for the typical manager, who can't actually do what his people do, to just measure quantity. This brings about an individual contributor's quest for 'pseudo productivity', just churning out as many lines or modules as he or she can get away with in the shortest amount of time. Or better yet, independent of actual output, being able to report as many line items of apparent progress on a status report as possible. I have an excellent ex-colleague and friend to thank for giving me the moral courage to depart from this norm. His code is so well surrounded by unit and integration tests that it inspired me to do the same for mine. 
He wrote the first public domain C++ wrappers for XView, the RPC-based ToolTalk database server, and finally made it big after 6 start-ups. He himself was influenced by a friend who is an expert in operating systems and an author of a prominent multithreaded programming book. And I believe that person was mentored by the guy who invented the Self language and implemented early Smalltalk environments. My point is that many passionate novices possess the right instincts for high E in IT work. When surrounded by high K coworkers, their potential is maximized because they acquire the courage to do what is right even when politically incorrect. And an obvious compounding effect takes place because it costs a lot less to get it right earlier than to come back and try to fix things later on.

Slowing down in order to go faster.
How many IT projects do you know come in on time or early? Having had strong influence from high K coworkers as described above, I developed enough K myself to implement 2 unconventional ideas in a large project: a throwaway prototype and mandatory regression tests in all modules. We were building a test system that required a persistent messaging system. The old system was built on RPC (transient messaging) so there were a lot of unknowns for us to handle. With high K coworkers supporting my idea, we spent 3 weeks on a throwaway prototype (unthinkable for a typical manager to swallow) and got all the important questions on our list (including scalability) answered. The additional time required to write regression tests and harnesses was even more significant. By my estimate, it may have added more than 6 man-months to the 150 man-month project. But I know that we would not have completed the implementation early without the rigor and discipline. We used the remaining time to test the heck out of it at system level and went into production on time. When I first joined the group supporting the old system, I got paged twice a week on average. After the new system went live, I probably got paged twice since.

Chicken or the egg? Test or the code?
The second question is actually a rhetorical one. After interacting with many high K people at OOPSLA conferences, I've become convinced that it is preferable to write a regression test harness and some basic tests BEFORE writing code whenever possible. For example, if I'm developing an XSLT to transform one document type into another, the first thing I'd do nowadays is to write or find a few examples of correct corresponding source and target documents so I immediately know when my work is done. An advantage over the traditional sequence of writing the code first is that people who write the code first will not be able to resist the temptation of running source documents through it to examine the output. Depending on the type of application, one can be more susceptible to accepting erroneous output as correct, delaying the discovery of the problem to a later date or never. I've internalized this now, thanks to the K levels of the people in the XP (Extreme Programming) movement. An illustration of how their K level has compounded my E as well as their own.

4.5 Deploying

Spend $1 on a credit card and owe $1.1 million in 78 years.
Ignoring late charges which would further accelerate the shock, and suspending the reality that nobody would give you such a high credit limit, the above scenario is numerically possible. Such are the parallels in IT where one tiny mistake can end up costing a company orders of magnitude more than the cost of avoiding the mistake in the first place. As a student intern, one of the first projects I was assigned was to determine how much test data collection we were losing over several months, and root-cause why it was happening. Coming from Berkeley where Ingres originated, I was able to hit the ground running and immediately modify the data entry program to log all test data collected into a flat file as well as to the Ingres backend. Comparing the daily totals, I reported that we were missing between 2% and 10% of the entries in the database. I analyzed the timestamps from the multiple logs and realized that the missing entries fell into a regular pattern: more data was lost when multiple factory staff were entering data simultaneously. Picking up a DBA manual for the first time ever, I realized that the cron job my low K coworker wrote restarted Ingres in single-user mode, meaning it didn't bother to enable locking and multiple clients were clobbering each other's entries. The version of Ingres we used appended rows to each table file one disk block at a time for efficiency. Without the multi-user mode protection of serializing row insertions from different clients, each client assumed it had solitary control over appending to every table. That meant the last client to write a row that filled up a partial block overwrote any other client that was assigned the same starting block address. If the culprit had invested 2 hours to RTFM, we would have avoided 400 man-hours in meetings, investigation and troubleshooting, not to mention the data that was lost forever.
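The credit card figure in the heading of this case study can be checked with a few lines of arithmetic. The sketch below is only an illustration: the interest rate is not stated above, so the 18% annual rate compounded monthly is an assumption, and the function name is made up for this example.

```python
def balance_after(principal, annual_rate, years, periods_per_year=12):
    """Balance of a debt that is never paid down, compounded periodically."""
    periods = years * periods_per_year
    periodic_rate = annual_rate / periods_per_year
    return principal * (1 + periodic_rate) ** periods


# Assumption: 18% APR compounded monthly (the rate is not given in the text).
debt = balance_after(1.00, 0.18, 78)
print(f"$1 left unpaid for 78 years grows to about ${debt:,.0f}")
```

Under that assumed rate, the balance lands slightly above $1.1 million. A lower rate or annual compounding gives a much smaller number, which is itself a reminder of how sensitive compounding is to small differences.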

Low K and high speed, a deadly combination.
There was a system administrator in manufacturing support who didn't like being told what to do. When I used to advise him how to take precautions and remind him that the tiniest mistake of bringing the shopfloor down would cost us $4 million in revenue per hour, he repeatedly chanted out, "I know, I know. No problem." He was actually a diligent person, always in a hurry to get things done, but he had a big problem admitting there was anything he didn't know. One day my pager went off and I was summoned into an emergency meeting to find out why production went down as a result of all 3 dataservers rebooting. I knew it was statistically close to impossible that 3 servers could experience disk failure within 2 minutes of each other and volunteered to investigate the situation. Nobody admitted to any fault so it smelled very fishy to me. When I pulled the server room access logs from security, it turned out that the sysadmin entered the room 5 minutes before 1 dataserver went down and 7 minutes before the other 2. A drive did fail on 1 dataserver (not a problem as production would go on) but after he halted the server to replace the drive, he walked to the rear of the aisle of servers and turned off the power to the rack that contained the other 2 dataservers. I'll leave what happened next to your imagination. I don't know what became of this person but if he had had enough K then, he would have been confident enough to admit lack of K and honest mistakes, and to realize that it is OK not to know a lot of things. And that could have prevented him from making many small mistakes as well as that very memorable one. For myself, the more K I acquire, the more K I realize I am missing. The important thing is to have enough K to make steady progress without lowering everybody else's E.

A little knowledge is truly a dangerous thing.
When I was an escalation engineer, our team was requested to help customers recover lost data on many occasions. One particular case I was assigned had a customer sysadmin who insisted that our Veritas RAID solution was buggy and demanded we help recover their data. Upon investigation, I discovered that they had configured 5 plexes into a RAID-0 (striped) metadisk. I was horrified and asked if they had done a backup. They said they had, but they were wondering why the hot spare didn't kick in for automatic recovery. I then gave them the bad news that recovering from failed drives was only possible with RAID-0+1 (striping & mirroring) which required another 5 drives, or RAID-5 (which may offer slower performance). Their sysadmin was worse than clueless: he knew just enough to lure them into a false sense of security. As you can guess, their backup recovery didn't work either.

Two heads better than one? Twenty can be worse than one if all are low K.
A case similar to the one above involved a telco complaining about the quality of the drives we supplied them. They complained that our inferior quality product was causing them excessive downtime. This was a strategic customer who was data mining from a few hundred gigabytes of phone calls, and it was political in nature since the group championing our platform was under heavy attack from other groups in the telco using competing platforms. I was instructed that I should drop everything else to concentrate on this 'Severity 1' case. After our account manager gave me the case history and I scoured through the system and database configuration, my response was a nonchalant, "What do they expect?" In a nutshell, their architect (or lame excuse for one) had utilized almost 200 drives but had it configured such that any drive failure would bring down the entire system. At a big meeting where their entire team was in attendance with some development consultants, 2 sales people and 2 engineers representing my company, I explained that all hardware devices had a Mean Time Between Failure (MTBF) rating which could be used to predict system availability. I was thoroughly familiar with the concept having come from the factory where we housed our own MTBF Lab to keep our parts suppliers honest. Based on their configuration, we could expect a drive to fail every 3 to 4 weeks, which was close to what was happening. I don't usually seek to persecute people but this was a kill or be killed situation. So with my company's interest at heart, I had to discredit their architect in front of their management and recommend that they add redundancy with a ton more drives.
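The MTBF argument in this case can be reproduced with back-of-the-envelope arithmetic. The sketch below assumes independent drives with a constant failure rate and no redundancy, so the array's expected time between failures is the per-drive MTBF divided by the number of drives. The 100,000-hour drive rating is a hypothetical figure chosen for illustration; the actual rating of the drives in question is not given above.

```python
HOURS_PER_WEEK = 24 * 7


def weeks_between_failures(drive_mtbf_hours, drive_count):
    """Expected interval between drive failures somewhere in the array.

    Assumes independent drives with a constant failure rate, so the
    failure rates simply add up across the array.
    """
    array_mtbf_hours = drive_mtbf_hours / drive_count
    return array_mtbf_hours / HOURS_PER_WEEK


# Assumption: each drive is rated at 100,000 hours MTBF (hypothetical figure).
weeks = weeks_between_failures(100_000, 200)
print(f"Expect a drive failure roughly every {weeks:.1f} weeks")
```

With 200 drives and no redundancy, every one of those failures is a system outage, which is why an individually respectable drive rating still translates into downtime every few weeks.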

Low K personnel should not be allowed access to production systems without close supervision.
When coworkers demonstrate low K levels, we should isolate them in lab environments to do their work, then check it thoroughly before letting them release it to a production environment. That would have averted the nightmare I am about to describe. This person was assigned to help on the case management project under supervision because she was taken off another development effort that required a lot more initiative and K. She added some new rules to the alert management module and didn't bother to test them well. Next morning in another timezone earlier than ours, many engineers had their message pagers flooded with alerts to the point that many of them were simply turning them off. Several hours later, the new rules were removed and the changes taken into isolation for root cause analysis (RCA). The details of the bug are not as significant as my general observations. Low K workers tend to remain low K because they have no drive to increase their K. Most of the time, they are not only technically low K, they tend not to be too aware of the business issues as well, so they don't realize the importance of the systems. And in line with that, they also tend not to be thorough enough to be left wandering around production environments, as this case demonstrated.

4.6 Troubleshooting

Just enough K to do your job is often way too little to do it well.
A colleague in a development and integration group approached me to help troubleshoot a performance problem they were experiencing on a logistics application. The client was supposedly run in Japan while the server was in Singapore. They had raised the issue with the network infrastructure group 2 months earlier but the other group kept telling them that they didn't detect any network bottlenecks. When I investigated the matter, I was horrified to find out that the client was actually also running in Singapore but X-displayed over to the warehouse in Japan. Despite our ATM backbone, the event and display packets would still have to get back and forth between Singapore, Osaka and Tokyo on ATM, and between Tokyo and Atsugi on leased lines. I immediately informed my colleague that this situation was not very scalable and a poor use of bandwidth. And even if I helped to solve this problem, I would hold them accountable if there were future complaints about performance. It turned out that the person in the network infrastructure group had no idea what sort of traffic this application generated and had conducted ping timings on each network segment. The default payload size for a ping is 64 bytes, nowhere close to simulating X traffic. I employed the same ping test on all the segments, but with a 32K packet size resembling an X screen refresh, and detected severe packet loss on the leased line segment between Tokyo and Atsugi. A day after I informed a high K friend I had in the network infrastructure group, she confirmed my observations and fixed the problem by changing the priority queuing configuration on that segment. Here we have a classic case of the ball being dropped in no-man's-land. All the application people knew was that the response rate on the screens was unacceptably slow. All the network infrastructure people knew was that the network had no problems for 'normal' traffic.
Real world IT environments are extremely complex and getting more so; there is a need for more people with multi-disciplinary skills. Even high K levels can be insufficient if they are all concentrated in a narrow field; the K needs to span many boundaries. Don't fall into the trap of identifying your core competencies and only focusing effort on them. If there are other areas of K you are dependent on, someone in your organization better have the first clue about them.
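One way to see why the small default ping looked healthy while the 32K ping did not is to count IP fragments. This is a simplified model, not a reconstruction of the actual fault (which was a priority queuing misconfiguration): it assumes a 1500-byte MTU, an illustrative 2% loss rate per fragment, independent losses, and traffic in one direction only. A datagram is delivered only if every one of its fragments arrives.

```python
import math


def fragments_needed(payload_bytes, mtu=1500, ip_header=20, icmp_header=8):
    """Number of IP fragments needed to carry one ICMP echo of this size."""
    data_per_fragment = mtu - ip_header  # 1480 bytes with the assumed MTU
    return math.ceil((payload_bytes + icmp_header) / data_per_fragment)


def delivery_probability(payload_bytes, fragment_loss_rate):
    """Chance that all fragments of one datagram arrive (independent losses)."""
    return (1 - fragment_loss_rate) ** fragments_needed(payload_bytes)


# Assumption: 2% of fragments are dropped on the bad segment (illustrative).
small = delivery_probability(56, 0.02)      # typical default ping payload
large = delivery_probability(32768, 0.02)   # 32K payload, like a screen refresh
print(f"small ping delivered: {small:.0%}, 32K ping delivered: {large:.0%}")
```

Under these assumptions the small ping gets through about 98% of the time, which looks like a healthy link, while the 32K payload needs 23 fragments and arrives intact only about 63% of the time. A test has to resemble the real traffic before its results mean anything.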

Some inherently complex problems require a lot of K. If effort is not spent making the necessary K available, more effort will be spent on the problem.
Have you ever considered walking into an electronics store and asking to buy a DVD player that is no longer being sold and doesn't come with a warranty? I have witnessed the purchase of a large, complex CRM system that makes as much sense as that. There are always extenuating circumstances for strange decisions but the jury is still out on the wisdom of this decision. Some cite the overriding factor as balance of trade, but is anyone actually tracking how much trade is being balanced? I will not expand the acronyms I use here while this incident is so recent, but enough details remain to illuminate this important lesson. Despite company S1 merging with S2 and focusing solely on S1's products for the future, we decided to deploy the ST product from S2 for production use. There was no real reason why we could not wait for the S2000 product from S1 if we truly wanted to forge a long term relationship. Since the companies were merging and ST was effectively EOL-ed, all the knowledgeable engineers who worked on ST left the company. The SCS part of ST was the most problematic as it was a VisiBroker-based CORBA server written in C++, heavily multithreaded, and had to run over Bristol Technologies Wind/U as it had been written on Windows NT synchronization primitives and deployed on Solaris. I kept hearing about how this server kept crashing or hanging everywhere we deployed it, bringing all client connections down and forcing our engineers to log in again, losing any work they had in progress since the last save. Having experience in C++, VisiBroker, multithreading and troubleshooting such hangs and crashes, I volunteered to help root-cause the problems despite my ignorance of Wind/U. My repeated requests to be provided full Purify output to help them determine if there were memory-related problems were never satisfied.
And my suggestion to have a full development and debugging environment with complete source code access must have run into political barriers as it never materialized either. The good reason why everyone lost interest in the problem was that the situation became less critical because some workarounds were put into the client to allow it to reconnect to another server when its server stopped responding. Perhaps I got involved too late to help fix it permanently. What I do know from experience is that we should have forced the vendor to put in the right (high K) resources from the start, or done that ourselves. The hundreds and thousands of man-hours lost, the credibility lost with our users, and the delays to the project should have been arrested much earlier on.

Missing a little K is sufficient to lower E by a lot.
While attending a conference, I called a coworker just to exchange the latest news when he told me a website our division was about to launch was having severe performance problems. This group was using the Trilogy engine to generate quotes and it was about 60 times too slow. They had been trying to ramp it up for weeks and their go-live date was at stake. My coworker had been roped in due to his K and reputation for being a problem solver. He first confirmed some of his observations with me about the use of various Solaris performance measurement tools. Then he sought my opinion due to my love of multithreading issues. After 5 minutes of detailed symptom description, I concluded that they must be using the reference version of the JVM without the native threads pack installed. That would effectively limit their use of any powerful server to 1 CPU. 3 days later when the other group heeded my advice, my theory was proven correct and the new benchmarks showed a 600% improvement; now they were just 10 times too slow. My next obvious suggestion of adding CPUs didn't yield much of an improvement so we were faced with the reality that inefficient design or implementation was at fault. I suggested using JProbe, OptimizeIt or any other memory profiling tool to see how hard the garbage collector was working. True enough, samples taken before and after quote generation showed that many unnecessary objects were being created and destroyed. If you know JDBC well, you'd agree that DataSource objects should be Singletons or a limited pool at most. The poor quality Trilogy code was creating a few of these objects per run and causing excessive garbage collection. I did not get further involved in this project but there is an important conclusion we can draw from my earliest encounter. 5 minutes of my time was enough to boost the team past a 600% bottleneck. Hundreds of man-hours could have been saved if I had been involved the instant this problem was detected.
But in all honesty and humility, the K I possessed to be able to help them was very basic. Any team attempting a project of this scale should have known those basics beforehand, not learned them on the job (OJT) at such great cost.

A little K can be so dangerous that it takes a lot of K to offset it.
I was an escalation engineer assigned a Severity 1 kernel panic case from a foreign air cargo terminal company. This company had long refused to migrate to our more stable operating system versions and for strange reasons decided to keep their mission-critical system running on Solaris 2.3. After questioning our on-site engineers and customer's sysadmins, checking Explorer (system configuration and log) output and patch levels, I didn't see anything obvious so I began the tedious exercise of performing analysis on a 2 GB kernel core image. Our group had the best tools and full source code access so it was just a matter of time before I would isolate some pattern. Unfortunately, I observed some symbols which didn't match the kernel source and totally stumped me. Way out of my comfort zone, I documented all work up to this stage and escalated the problem to the kernel CTE (corporate technical escalation) group for expert treatment. A few days later, a CTE engineer notified me that the NFS module was definitely from Solaris 2.4. I told our on-site engineer to collect the file checksum of that kernel module and he confirmed that it was indeed from 2.4. I promptly closed the case and warned the customer that their system would not be supported until they performed a complete OS reinstallation. So much bandwidth was wasted because some idiot knew where to replace kernel modules but not how dangerous the action is. More empirical evidence for my compound interest hypothesis. IT complexity is such that there may be perhaps hundreds, perhaps thousands of different ways to build a working system, but certainly many millions of ways to screw it up.

4.7 Training

With proper foundation, acquiring new K is much easier.
In 1995, when Java first exploded onto the scene, I was an escalation engineer in the fly-and-fix team covering Asia. My boss pulled me into his office and asked if I was interested to be a Java trainer. He said that I was the only person in the region available with a strong development background and it would bring the group visibility. I was aware of the history of Java due to being hungry for K in general. I paid close attention to the Internet (and maintained a project website since 1993) and attended TOI (transfer of information) sessions on topics ranging from new microprocessors, cache architectures, computers and peripherals to new operating system capabilities, windowing systems, network protocols, compiler features and development methodologies. So even though it was a tall order to attend a T3 (train the trainer) then teach my first class 2 weeks later, I pounced at the opportunity to immerse myself back into a development related activity. The ramp was not too steep since I had heavy-duty exposure to multithreading (liblwp & libthread), network programming (UDP, TCP & RPC), and GUI programming (SunWin, NeWS, XView, Xlib & CDE Motif), had forays into OO (Smalltalk, C++ & Objective-C) and was aware of garbage collection (Smalltalk & Lisp). I didn't sleep much the 2 weeks after T3, testing every ambiguity and corner case I could think of. People who value K tend to feel a moral obligation not to pass on bogus K to other people. 2 years, 257 students and 5 Asian cities later, I terminated my ad-hoc role as the first Java instructor in Asia with average student feedback of 9.2 (the old range was from 0 to 10). The pace of the IT industry provides ample opportunities for us to increase our K levels. I think the frequency of these opportunities is proportionate to our existing K levels.

The acquisition of new K is dependent upon one's level of interest.
It is no secret that many people are in the IT industry purely for the money. As an extension of that, they would naturally seek to 'pad their resumes' with anything that would increase their market value. I had the displeasure of helping out two such people in the early Java days when we were short on instructors. These guys would have failed my certification test and should not have been allowed to teach but for the education manager seeking to beat his revenue goals. They would go up in front of a class full of eager students and read off the slides, then scurry back to my office to get me to answer their students' questions for them. Some people have no pride. Were we grooming Java instructors or parrots? The relevance of this case study is to show that you can't force K into a person who is not motivated to truly learn, no matter how conducive the environment is. It's best to save the resources for people who are really interested, and talk this other category into non-technical work.

An experience is worth a thousand theories.
A support engineer based in another country was given a teaching assignment that I had turned down on principle. His manager continually over-committed his team to tasks they were not qualified for, and often got my team to help them out for the company's sake. I was already laden with teaching and consulting Java on top of my regular escalation queue so I told them I would provide backline support instead of flying there. The root of the problem was that an important customer had already been promised the "Multithreaded Programming" class on-site and nobody in the other team had done it before! It angered me that many managers had no respect for technical K and assumed they could simply assign their staff to quickly pick up the K and use it for business. Invariably, these would be managers who have had little exposure to the type of technical work IT people do nowadays. At best, they have cruised through their lives as individual contributors using soft skills and performing primarily non-technical tasks. Anyway, I prepared this engineer by highlighting the more difficult parts of the course to focus his study on, encouraged him to spend the preceding 10 days completing all the exercises, and asked him to contact me if he needed clarification. During the first 2 days of the course, he would call me to clarify student questions on thread lifecycle, synchronization primitives etc. and it seemed to be going well. But on the 3rd and last day, he called me in a rather frantic tone stating that all the students' programs were executing in single-threaded mode and they were all totally stumped. He emailed me some code which looked as though it should have worked, then wandered off into fantasy land asking if there were kernel bugs or known patches that may not have been applied on the customer's systems. I questioned him about detailed output from the programs and what he saw in the /proc filesystem.
Then it dawned upon me that he probably did not know the basics about compiling multithreaded programs. I asked for his Makefile but he wasn't using one so I asked for the exact compilation command. They simply forgot to specify "-mt"!

4.8 Consulting

Unscrupulous people will take advantage if they think you have low K.
We have had a horrendous experience using a contract system from an ISV. It all stemmed from the mandate to "buy instead of build". Buying instead of building makes all the sense in the world if you carry it out diligently but our software procurement has turned out to be the mother of all nightmares from Elm Street. The root of this evil is that many managers have the misconception that it is easier to buy than to build. I suggest that it is much more difficult to do it well. When you build a system from scratch, your focus is on the business requirements, corporate architecture and interfaces to other systems. With enough time and talent, you will successfully build and deploy it. Talent with high enough K might even deliver a highly supportable system resulting in high E. But when you buy a system, you get sidetracked by features you may not need, politics, platform and pricing issues. Often, you may not dig deep enough to ensure that the extensibility of the product takes care of your esoteric business needs, requirements that users are not going to compromise on. IT then ends up hacking the product to the point that it is no longer supportable or upgradable, taking away some important benefits of buying instead of building. Usually, insufficient attention is paid to interfacing with other existing systems and the cost and effort are not factored into the equation. And last but not least, there is not enough emphasis put on conformance to current corporate architecture, such that the benefits of having a corporate IT architecture are lost because of the mish-mash of very different architectures in existence. The vicious cycle becomes complete when procurement is not conducted in a very technical manner because the higher K workers will leave for more engineering-oriented jobs. Once the ISVs know they are dealing with a lower K workforce, they will hold your company to ransom, charging an arm and a leg to slightly tweak a possibly deficient system when requested.
Our experience with our contract system not only fits the above description, it was so inefficient that we had to spend many millions upgrading our networks to carry the load. My boss assigned me to attend a full day presentation by the same vendor to vet their next generation system architecture. They had been working with us for a few years and were quite familiar with our architectural direction. All the right buzzwords were in there, CORBA interfaces, EJB server, Java APIs etc. But by the afternoon, I nitpicked enough to get really suspicious about their implementation and decided to maintain my poker face and go in for the kill. Chatting casually to their chief architect during a coffee break, I got him to say what was obviously omitted from all the slides. It was being built over a Distributed Smalltalk core, something we were not willing to accept. They knew that and attempted to window dress the new system with buzzwords that were in line with our company's direction. Although their next generation looked significantly better than their current (not difficult to), the last thing we needed was a painful migration to another extremely proprietary platform that we weren't going to build expertise on, with lots of excessively expensive consulting necessary to deploy and maintain. With sufficient K, another snow job averted and millions saved for my company.

With proper foundation K, simple mistakes can be avoided.
As a result of teaching many students Java programming (not my job), an important customer made a special request for me to provide them consulting (also not my job). Another $30K revenue for my company and higher visibility for my escalation group was hard for my boss to turn down so I was rewarded for my extra skills with a lot more extra work (that's how it works). The customer operated the busiest port in the world and was having performance problems with an applet they had written for cargo manifest manipulation. This was in the early days of JDK 1.0.2 where object serialization was not available so they had to devise their own protocols to transport a large hierarchy of objects representing the cargo within containers on a ship. They used an encoding scheme based on concatenating strings with delimiters and found that transmission time for the cargo of a large container ship took 50 seconds. When I got on-site, the first thing I taught them was to instrument their code to profile the transaction and determine where all the time was being spent. It turned out that they were simply using the "+" operator to concatenate strings, which are immutable in Java, instead of using the StringBuffer class. After explaining the difference and helping them rewrite the encoding and decoding, transaction time dropped down to a more acceptable 6 seconds. I chided a few of them for not listening carefully in class and encouraged them to look at their notes to strengthen their foundation. Most expensive mistakes come from a lack of proper foundation K.
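The cost difference between repeated "+" concatenation and a growable buffer can be shown with a simple count of characters copied. This is a model rather than a benchmark, and it is written in Python as a stand-in for the Java in the case study: it assumes that each concatenation of immutable strings builds a brand new string by copying everything accumulated so far, while a buffer copies each field only once (ignoring occasional buffer growth). The field count, field size and delimiter are hypothetical.

```python
def chars_copied_by_repeated_concat(field_lengths):
    """Model of `s = s + field`: every step copies the whole string so far."""
    total_copied = 0
    current_length = 0
    for length in field_lengths:
        current_length += length
        total_copied += current_length
    return total_copied


def chars_copied_by_buffer(field_lengths):
    """Model of appending to a growable buffer: each field is copied once."""
    return sum(field_lengths)


def encode(fields, delimiter="|"):
    """Build the delimited message in one pass instead of piece by piece."""
    return delimiter.join(fields)


# Assumption: a manifest of 10,000 fields of 10 characters each (hypothetical).
lengths = [10] * 10_000
naive = chars_copied_by_repeated_concat(lengths)
buffered = chars_copied_by_buffer(lengths)
print(f"repeated concat copies {naive:,} chars, buffer copies {buffered:,}")
```

In this model the repeated concatenation copies about 500 million characters to build a 100,000-character message, while the buffer copies 100,000. The quadratic growth is the point: the larger the manifest, the worse the naive approach gets.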

5. Recurring Themes from the Case Studies

I hope the case studies I detailed have stirred up memories of your own experiences. There are a few recurring themes one can pick up from my case studies. They are consolidated here to reinforce the argument that there is a compound interest type of effect at work.

5.1 Knowledge Ramp Too Steep

When K << Kmin, the probability of failure is very high. The main reason is that the more K you are missing, the more likely you are to NOT know what K you are missing nor how to acquire it. Success or failure is not taken as a binary function here. Failure can come in the form of late delivery, increased costs and/or lower quality. Even though the first 2 forms are more explicit, low system quality can be much more insidious. It can end up costing an IT department much more than the project itself! Owing too much money and never being able to earn enough to pay off the loan is a perfect analogy. After a system goes live, there is natural resistance to perform a major overhaul. Besides the obvious political incorrectness for a group to admit its inadequacy, the overriding reason is now that the business will not tolerate any prolonged disruption. Low quality systems use up a lot of resources for support and maintenance. Resources that could otherwise be put to acquiring K for design and implementation of higher quality systems. So begins a vicious cycle that can bog down departments or divisions for years.

5.2 Multi-Disciplinary Knowledge is Compulsory

I didn't intend to stretch the financial analogy to its limits but this particular comparison is highly relevant. If you take care of your monthly cash flow but neglect retirement planning, you are at financial risk. If you take care of both but don't buy a property, you could get into trouble if property prices skyrocket and raise your rent significantly. And just to flog the dead horse to shreds, if you take care of all of the above but don't have any form of insurance coverage, your family is still susceptible to financial disaster. You get the point: personal finances dictate a holistic approach, and so does IT work. I've met managers who have stated that their staff are "application people" who don't need to be trained in other areas like system administration, databases or networks. -- WRONG! -- Unless all the necessary specialists are ALWAYS accessible to their group in a proactive and reactive way, they are missing the first clue about IT systems. An application invariably runs on top of an operating system which controls access to limited resources. Even if your application has very few users and immensely powerful client and server machines, there is always a risk of creating performance problems from ignorance of how the underlying operating system works. It is safe to say that all enterprise applications are accessed through networks. I have also observed several applications perform poorly and waste network bandwidth because the IT people involved didn't have enough K to factor in the network. All useful systems also need to access databases. It is well known that typical applications get the biggest performance boost from application level tuning, but database tuning comes in a very close second. Expertise in the workings of databases can avoid many performance problems. Performance is just one aspect of it.
Failure to understand the central issues in any of the diverse areas can also manifest itself in potentially more serious issues like reliability or data integrity. Many organizations are infected with the "we're not in charge of that" disease. There are usually separate infrastructural groups handling the networks, data and application servers, client desktops, helpdesk etc. Problems occur because groups in charge of applications don't train or staff up for the multi-disciplinary K that is necessary. Upper management is lulled into a false sense of security that the K exists somewhere in the organization but what they fail to realize is that the people with the K don't talk to the application folks everyday. We are taught from young in the context of healthcare that "Prevention is better than cure" . Unfortunately, in this case, cure can be much more expensive because people with the right K to fix any platform-related problems probably won't know the application and may have to take time to understand it before they can locate and correct the problem. In the worst case (which I've seen too often), the application may be implemented in a way that doesn't lend itself to any quick fixes.

5.3 One Step Ahead

When children are taught to save from an early age, they internalize the benefits of delayed gratification for the rest of their lives and are less likely to get into financial trouble. If taught more comprehensive approaches to financial planning, they may even execute a plan that insulates their family monetarily from extreme misfortune. I contend that it is possible to make such an approach work in the IT world as well. I should know because I have lived on both sides of the fence and spent the latter part of my career in positive K-flow. In the early years, I worked in a few groups that supported mission-critical factory applications. Half the time, our pagers interrupted development work in progress and forced us to troubleshoot emergencies that were hindering production. The rest of the time, we were enhancing systems that weren't designed to easily adapt to business process changes. New screens or reports often caused chain reactions of performance degradation or incorrect behaviour. I had a deep sense then that rewriting some of the applications from scratch may have been more cost effective than our patchwork. The biggest turning point for me was joining a technology group which had the charter to determine best practices and tools, and lead department-wide standardization. We were given time to read, experiment, meet vendors, compare tools and techniques, and build prototypes to validate our ideas. We acted as a sounding board for other groups to bounce their ideas off, and we pitched in to help projects requiring expertise we had acquired. It was this foundation that brought my K level up to the point that, as technical lead for one of the geographies in a distributed development project, I contributed to building a new test system that required an order of magnitude less support than before, and offered extensibility, customizability and scalability to the point that it is still in use now 8 years after FCS. 
My high K manager encouraged our pursuit of K in diverse areas that offered potential for productivity improvements. We introduced incremental compilers and memory checkers to the mainstream and investigated OO languages and databases. It was in such a community that I realized the value of technical journals, interest group meetings, TOI sessions, talks, trade shows and conferences. As a result, I picked up on hardware capabilities, C++, system interface programming, UNIX internals, CORBA and OO design techniques many years before I had to use them for serious work. Through activities as mundane sounding as open houses at our research labs, I have gained exposure to powerful languages like Self and lightning fast implementations like PJava. I was aware of work like the slab allocator and the zero copy framework for I/O long before those features and others from Spring made it into Solaris. It is exposure to these types of ideas that enhances one's base K and pushes the bounds of one's creativity outwards and upwards. It is an unending road. I will always have gaping holes in my K but with diligence towards proactively mastering frameworks like J2ME, J2EE (Connector, CORBA, EJB, JDBC, JMS, JNDI, JSP, JTS, Servlet and XML) and Jini, and immersive APIs like JavaSpeech, JMF and JTAPI, I will hopefully continue to be able to contribute at a high E level.

5.4 IT Workers subject to Human Emotions

I have spoken to many demoralized people in IT groups working in deficit spending mode. Most of them are either developing against impossible deadlines with deficient tools and insufficient K or supporting unstable systems that demand lots of tedious activity that does not help them build up their K. When people have to face unpleasant situations all the time, are set up for failure, or are not given a sense that they are developing themselves, they will not perform at their best. Unless lots of human resources are thrown at these situations (a waste in itself), there will be a general downward spiral in the health of the organization and its systems. I've seen a system so lame that it had to be patched every week, and another that crashed or hung so often that the implementation team was fully occupied for a while, just reporting on the failures, discussing them amongst themselves, explaining the situation to the users, and building scaffolding to prop it up artificially. Often, these undesirable situations are of a magnitude that management cannot ignore, yet there is a lot of hesitation to admit the full extent of these mistakes. Only with the acceptance that it is more cost effective to replace rather than maintain the really bad systems will the organization be able to get out of the rut. Reorganizations would actually serve this purpose well. New management is much more likely to recognize the true extent of the problems than those who let it happen in the first place. Unfortunately, what happens in reality is that funding is poured into building different new systems instead of backtracking to fix old systems. A close analogy is a person in negative cashflow who, instead of controlling his expenses, spends his energy on starting a sideline business at night. But if he has not learned the real lessons, the sideline will probably be operating at a loss as well. 
On the flipside, when IT staff are well trained, they will be confident, calm and collected, performing their jobs in an optimal manner. When faced with challenges, their already high morale will give them the guts and composure to execute whatever needs to be done at high E. This is simply the way humans are. We see it in sports teams, and we can certainly see it in IT teams if we look closely enough. Just as it is difficult to maintain financial equilibrium, i.e. our assets are either accruing or dwindling at some rate, an IT group's aggregate K and E levels are either spiraling up or down due to the human factor. It is up to us to ensure the direction is up by focusing on K.

6. Recommendation to Software Engineers

There are many types of software engineers. Whether you're an IT architect, technical lead, business analyst, systems integrator, professional services consultant, programmer, technical writer, systems administrator, database administrator, instructor, support engineer, language lawyer, tools-smith, test engineer, GUI designer, configuration manager etc. etc. (you get the picture), this hypothesis and recommendation probably still apply to you. Unless your job is really simple (in which case you might still want to follow my recommendation since you are easily replaceable :-), the knowledge it requires is probably built on numerous layers of other knowledge. You may know the most applicable layer well, but it is in your interest to make sure you also know some basics about all the other layers. Try to shake off the "I don't really need to know that" attitude. If you really cannot be bothered to extend your K levels beyond the immediately necessary, make sure you have people with K in the other layers involved in your projects. A close parallel I can think of is driving a car. People who spend the least and get the most out of their vehicles are those who know how to take good care of them and do most things themselves. If you can only be bothered to drive and pump fuel in when the 'E' light comes on, the least you should do is make sure a mechanic looks at it every once in a while. Consequences of not doing so can range from a dirty windscreen due to no washer fluid (inconvenience) to blowing your head gasket because your engine oil level is too low (total failure).

If you are already in the high K zone, don't let up. Maintain a high bank balance until you are approaching retirement. One has to put in considerable effort to remain at the cutting edge of software technology. I have heard estimates that the half life of the skills you have is approximately 3 to 4 years. Conservatively speaking, if any K you have took 4 years to be worth K/2, you'd have to replenish 16% of your knowledge every year just to remain at the same level. If the half life is 3 years, you'd have to replenish 21% of your skills each year. And if you've been struggling like I have to keep pace with the rate of change surrounding Java and XML, you might agree that the half life feels more like 1 or 2 years! But there is no need to get alarmed, since specific skills are just one part of the equation. K comprises specific skills as well as experience. Experience lasts a lot longer, especially if you have internalized it. For example, when Java burst onto the scene, it obsoleted my knowledge of lex and yacc as those were C-based tools. But when I learned JavaCC, my experience with syntactic rules and parsing was very helpful in ramping up my skill with the Java-based tool. The point is that each individual is responsible for his or her own K development. No matter what type of IT job you have, it is in your own interest to acquire and maintain a high K level. I will not get deep into the subject of high quality versus low quality K. Reading what I've written so far would already tell you that I think it's better to know fewer things very well than to have superficial, and sometimes dangerous, knowledge of many things. So if there is K you need to know for your job that you feel shaky about, backtrack and solidify your foundation before going forwards to acquire new K.
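
The 16% and 21% figures can be checked with a few lines of arithmetic. The sketch below assumes that K decays exponentially, so that a half life of h years leaves a fraction 0.5^(1/h) of any given K after one year; the function name annual_replenishment is purely illustrative and not taken from any library.

```python
def annual_replenishment(half_life_years: float) -> float:
    """Fraction of K that must be replaced each year to stay level.

    Assumes exponential decay: after one year, 0.5 ** (1 / half_life_years)
    of the original K is still worth something, and the rest must be replenished.
    """
    return 1.0 - 0.5 ** (1.0 / half_life_years)


if __name__ == "__main__":
    # Half lives mentioned in the text: 4 and 3 years, plus the
    # "feels more like 1 or 2 years" cases.
    for half_life in (4, 3, 2, 1):
        rate = annual_replenishment(half_life)
        print(f"half life {half_life} yr: replenish about {rate:.1%} per year")
```

Under that assumption the 4-year and 3-year cases come out at roughly 15.9% and 20.6%, which round to the 16% and 21% quoted above. A half life of 2 years would call for about 29% a year, and a half life of 1 year for 50%.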

If I have convinced you that it is worth your while to be on the positive cashflow side of compound interest, I am happy that I've shared something useful. If you follow or are already practising the above recommendations, then I'm happy for you. You must be as happy at your job as I am.

7. Recommendation to IT Managers

Yes, despite my conviction to keep politely declining any 'promotions' to managerial positions (10 and counting) for reasons that I will not disclose here, I do give a damn about you because your conduct or misconduct affects the lives of many who are trying to make a living as IT workers.

One of the most significant contributions I've made to my company is to actively participate in the recruitment process. I have interviewed more than 300 people (I know because I keep the resumes) for positions in various departments, and this has resulted in the hiring of almost 20 good people (not to be confused with 20 almost good people :-). Perhaps more significant is the rejection of the others who were either under-qualified for the job (K << Kmin), didn't fit into our corporate culture, or didn't have the integrity to represent their skillsets accurately. Too often have I observed managers hiring the wrong people because they rushed to fill up positions so as not to 'lose the headcount'. Worse yet, I've seen managers hire the wrong people because no due diligence was applied to ensure that candidates truly knew what they claimed to know. When a ringmaster is "Hiring a Juggler" for the circus, the resume and interview are one consideration, but isn't it most important to get the juggler to juggle in front of you? It is hard work testing every promising looking candidate to see if they are for real, but it is much harder work trying to get someone without the K to be productive. You need high K to ensure recruitment of high K personnel. If you don't have such resources at your disposal, borrow a high K resource from another group just to help with the interview process. Remember that managers are partly measured by what their direct reports produce, so it is in your interest to hire people with high K to salary ratios.

Whether you are in the midst of whirlwind recruitment or a hiring freeze, chances are you already manage quite a few people. Your job is to maximize these resources in pursuit of your organizational charter. The single most important long term activity you can undertake is to interactively supervise each person's career development plan, align their personal goals with the goals of the group, and encourage them to raise their K level every year through training and experimentation. Most managers I've observed employ 'Spanish Theory Management' and try to squeeze as much of the 'fixed value' out of employees as possible. The better alternative is to invest in people by raising their capabilities so the compound interest effect can kick in. They will not only produce more and better value for you in the future; you will have differentiated yourself so much that they will probably follow you wherever you go.

My final recommendation for you goes hand-in-hand with the previous one. I recommend making a conscious choice of quality over quantity unless commanded by your managers to do otherwise. It seems to be in vogue to report as many milestones in as little time as possible. One seldom hears anyone talking about the high quality of any system, only that it FCS-ed or went into production. If you change the focus of your group to quality instead of quantity, you would at least be pointing your people in the right direction to positive cashflow. The next step is to not cave in to external pressure when you know the right thing to do is budget enough time for projects to include architectural and design reviews, prototyping if applicable, load and regression testing, features for supportability, and mechanisms for user customization. The more we pay up front to make sure a system is sound and usable, the less we have to pay later (with compound interest) to keep it going. Once quality becomes a way of life in your organization, quantity will take care of itself, because your high K staff will want to produce more to showcase their high E.

8. Some Final Notes

Despite the lack of formal proof, I hope I've convinced you of the existence of a compound interest effect in the IT world. I suspect some of the same compound interest magic is at play in other industry sectors as well. I know it is more noticeable in software than in hardware due to the intractable nature of software. If a business system is klunky, one can still fix data by hand or work around it using some maintenance screens; but when a microprocessor fails, system failure is usually total. An incompetent (low K) hardware engineer is less likely to cause catastrophic harm than a comparably incompetent software engineer because hardware failures are usually more obvious. In all probability, the low K hardware engineer will not be assigned more responsibilities after demonstrating his or her inadequacy. Since software is so easily patched on the fly, a low K engineer can rise up the ranks from writing lousy routines, then lousy programs, up to lousy collections of systems. I know this because I have witnessed this first-hand. In a culture where there is no time for regular design and code reviews, an engineer's political skills are far more important for career advancement than his technical skills. The pace of the industry is such that upper management is more concerned about immediate problems and willing to live with the existing systems that 'sorta kinda work'.

Abstracting yet another level up, one can say that most human endeavours offer the same choice between instant and delayed gratification. While that may be true, I suspect it is more pronounced in IT for two reasons. The first is that IT work is much more knowledge-centric than effort-centric. One way to elaborate on this is to compare a software engineer to a neurosurgeon. The surgeon may have taken 10 to 15 years to complete specialized training but his contribution to the world still occurs one patient at a time. He has to operate for hours at a time, and only affects the patient, and his team to a lesser extent. Even if he contributed to the body of knowledge by becoming a medical researcher, other surgeons only benefit by reading and internalizing his work. A programmer, still in high school, may come up with a revolutionary new algorithm that improves an area of graphics processing performance. If this were incorporated into the Java Development Kit, millions of other programmers and users would benefit without even lifting a finger. The second reason for the compound interest effect being more pronounced is that software engineering is highly collaborative in nature. Using the above example in a negative light, even if the surgeon made judgement errors, he can only kill one patient at a time. Though the typical software bug may not result in fatalities, chances are high that it could cost hundreds of people dozens of hours to detect it and find a workaround. This is exacerbated by a lack of mandatory certification and licensing in the software industry.

If you generally agree with what I've presented, chances are you have a wealth of your own case studies to share with others. I am not going to incorporate anyone else's case studies on my site but I'll be happy to maintain a list of links to your own passionate venting.