Irreducible Complexity ReVisited
by Mike Gene
The concept of irreducible complexity (IC), as introduced by biochemist Mike Behe, is often cited as a phenomenon that proves evolution impossible. Unfortunately, the concept of IC cannot shoulder this burden. To appreciate this, let us consider the basic proposal of IC. Behe defined it as follows:
"A single system composed of several well-matched, interacting parts that contribute to the basic function, wherein the removal of any one of the parts causes the system to effectively cease functioning."
IC describes a system whose function is dependent on the interaction of multiple components, such that the removal of even one component results in the complete loss of function. IC can thus be represented as follows:
A + B + C + D → F
where A, B, C, and D represent specific components (gene products) and F represents the function that is elicited by the interaction of these four parts. From this observation, it is commonly argued that F could not possibly evolve, as F requires the presence of all four components. In other words, there would be no selective advantage in an organism having parts A, B, and D compared to an organism having only parts A and B. Why? Because both combinations fail to elicit the function.
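The definition can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the names `is_irreducibly_complex` and `elicits_f` are hypothetical and do not come from Behe's text. They simply encode the formula above as a test on sets of parts.

```python
from typing import Callable, FrozenSet

Parts = FrozenSet[str]


def is_irreducibly_complex(parts: Parts, functions: Callable[[Parts], bool]) -> bool:
    """Return True if the full system works and removing any single part stops it."""
    if not functions(parts):
        return False  # a system with no function cannot be IC
    return all(not functions(parts - {p}) for p in parts)


def elicits_f(parts: Parts) -> bool:
    """Toy predicate (an assumption): F appears only when A, B, C, and D are all present."""
    return {"A", "B", "C", "D"} <= parts


full_system = frozenset("ABCD")
print(is_irreducibly_complex(full_system, elicits_f))
# Neither partial combination elicits F, so selection for F cannot prefer one over the other.
print(elicits_f(frozenset("AB")), elicits_f(frozenset("ABD")))
```

Under this toy predicate, a system carrying a fifth, dispensable part would not count as IC, since removing that part leaves F intact.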
The basic flaw in this argument is as follows: While it is true that function F requires components A, B, C, and D to exist, it does not follow that parts A, B, C, and D require function F to exist. And it is this basic flaw that has been exploited by the opponents of ID. There are three basic routes to circumvent the IC obstacle. Yet, while they exploit the inability to prove the impossible, whether they serve as general explanations for the evolutionary origin of IC systems is highly doubtful. To see this, let's go back to the original IC formula, yet make one modification and discuss systems in which F is dependent on an IC state. That is, the function can only exist if multiple gene products interact with each other.
1. Original Helping Activity (OHA) Becomes Essential
In this scenario, one envisions a component fortuitously associating with a protein complex that initially serves a non-essential, but helpful activity. But as the organism containing this modified system itself evolves, the originally helpful activity now becomes essential in this new context. Yet this explanation is fatally flawed.
OHA may be a plausible explanation for the modification of an IC system, but it fails to explain the origin of the IC system. For example, we can imagine the following modification:
A + B + C ~D → F
where A, B, and C are essential for function F and ~D helps to make a more efficient F. D is thus not part of the IC system, as F can exist without it. Then, as the organism evolves, this increased efficiency becomes essential to maintaining the newly evolved state, giving:
A + B + C + D → F
The problem is that this scenario begins with an IC system needing A, B, and C. Thus, for OHA to explain the origin of an IC system, we would need to see the following.
First,
A → F
Then,
A ~B → F
Then A + B → F, etc.
But this explanation violates the assumption that F is dependent on an IC state. For example, consider the following molecular machines and their functions (long recognized by molecular biologists):
MOLECULAR MACHINE | FUNCTION
Ribosome | Translate mRNA to synthesize proteins
Flagellum | Propel bacterial cells
F-ATP Synthase | Convert proton/ion gradient into ATP
Replisome | Replicate pre-existing DNA
The functions carried out by these machines could no more be carried out by a single gene product than an internal combustion engine could work if built from only one part. This is obvious from our study of the functions carried out by these machines, as several subsystems divide up the labor and then integrate to generate the function. Thus, it is no surprise that the entire living world provides not one single example of these functions being carried out by single gene products.
OHA therefore fails completely as an explanation for the origin of IC systems. However, again it may be a useful scenario for explaining how an existing IC system can be modified over time, considering that many IC systems are species specific.
2. Elimination of Function Redundancy (EFR)
In this scenario, one envisions the original functional state as being complex (involving many gene products) and thus bypasses the fatal flaw of the OHA scenario. However, the complex state originally envisioned is redundantly complex:
A + B + C + D + E + G → F
where B/E and C/G share the same subfunction. In other words, a loss of B or C or E or G alone still allows for function F to be elicited. Thus, such a system fails the classic definition of IC as provided by Behe. The thinking then is that since B/E and C/G share redundant subfunctions, one member of each two-member family can be lost by mutation. For example, an IC system could then evolve through the loss of components E and G, yielding:
A + B + C + D → F.
Thus, an IC system can evolve.
This explanation, however, involves some sleight of hand. While the originally proposed 6-part system does not need all six parts, it still contains an essential IC core. That is, this system still requires a four-part interaction involving A, B/E, C/G, and D to elicit F; IC is simply embedded in the redundant complexity. The scenario assumes an IC state in order to explain the origin of IC systems; thus elimination of functional redundancy also fails to explain the origin of the IC system. Like the OHA scenario, however, it may explain how IC systems have been modified since their origin. For example, in using the above scenario, it might explain why one organism carries out function F using components A, B, G, and D and another uses A, E, C, and D, etc. After all, it is not unreasonable to suppose that some originally designed complex systems were redundantly complex. This would buffer these systems against failure and also provide an avenue to front-load evolution. Therefore, if IC harkens back to an originally redundant complexity, this does not damage the IC → ID inference one bit.
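The embedded core can be illustrated with a short sketch. It assumes, purely for illustration, that F requires four roles and that B/E and C/G are interchangeable fillers of two of those roles. The names `ROLES`, `elicits_f`, and `is_ic` are hypothetical.

```python
from typing import FrozenSet

# Each role needed for F, with the interchangeable parts that can fill it (assumed).
ROLES = [{"A"}, {"B", "E"}, {"C", "G"}, {"D"}]


def elicits_f(parts: FrozenSet[str]) -> bool:
    """F appears only if every one of the four roles is filled by some part."""
    return all(role & parts for role in ROLES)


def is_ic(parts: FrozenSet[str]) -> bool:
    """Behe's test: the system works, and removing any single part stops it."""
    return elicits_f(parts) and all(not elicits_f(parts - {p}) for p in parts)


redundant = frozenset("ABCDEG")
reduced = redundant - {"E", "G"}

print(is_ic(redundant))  # the six-part system tolerates the loss of B, C, E, or G alone
print(is_ic(reduced))    # after losing E and G, every remaining part is essential
# Alternative outcomes of the same pruning: A, B, G, D and A, E, C, D.
print(is_ic(frozenset("ABGD")), is_ic(frozenset("AECD")))
```

In this toy model the four-role requirement in `ROLES` never changes; only the number of parts available to fill each role does, which is the sense in which the IC core is assumed rather than explained.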
3. Cooption of Alternative Function (CAF)
This explanation best exploits the logical flaw in the "IC = evolution is impossible" argument. That is, since the existence of A, B, C, and D need not be F-dependent, CAF simply proposes that A, B, C, and D did indeed exist prior to F, whereby these components performed some alternative, original function. As such, this is really the only evolutionary explanation that has the potential to explain the origin of an IC system. Thus, let's take a closer look at it.
This explanation would look as follows:
A → G
B → H
C → I
D → J
A + B + C + D → F
where G, H, I, and J are functions that previously employed components A, B, C, and D, respectively. A, B, C, and D could be directly donated into the newly formed IC system if functions G - J become disposable. Or, a gene duplication may occur for each of the gene products, allowing the duplicates to be recruited into the newly formed IC system. Or, gene products A - D could exist and now carry out dual roles in the cell.
While such a scenario provides a working explanation for the origin of the IC system, a serious investigator will want to know if there is reason to think this scenario is relevant to the origin of any particular IC system in question (the mere ability to imagine such scenarios is not evidence that such a scenario happened). Put simply, we need evidence to think this scenario applied.
Back in 1997, Julie Thomas posted an analysis of IC to the talk.origins newsgroup that is relevant here. Thomas describes what is needed when considering component (player) C, but keep in mind the same analysis would apply for A, B, and D:
However, in order for alternative activity to pose a serious challenge to the IC status of actual player C, several things must be demonstrated:
1. Evidence must exist that indicates the similar activity is older. Since this explanation proposes the acquisition of function F after the existence of the similar activity, alternative activity fails as an objection to IC if the similar activity post-dates function F. Put simply, the secondary activity must reflect a more ancient state and not a recent by-product of actual player C's role.
2. The similar activity should exist at biologically relevant states. This is important as in vitro evidence can be misleading. For example, if actual player C is a DNA-binding protein, but binds to RNA in the test tube under conditions that are not seen in the cell, the similar activity is biologically suspect and may simply be an artifact of the unnatural in vitro conditions.
3. Is the alternative activity present in the organism with the IC system in question? Similar activities, detected by in vitro tests using extracts from two very different organisms is of questionable biological relevance since the lineage of the organism with the IC system in question may have never possessed anything like the alternative activity.
4. The similar activity should not be part of another IC system. Otherwise, the argument travels in a circle. For example, single-stranded binding (ssb) proteins are involved in DNA replication and DNA recombination. If one explains away the role of ssb proteins in replication by appealing to recombination, yet explains away the role of ssb proteins in recombination by appealing to replication, we have gotten nowhere and have only the appearance of a refutation of actual player C's role in an IC system.
Such analyses will go a long way in resolving IC claims. If the similar activity post-dates player C's role, it fails as an explanation. If it is found only in test tube assays, the explanation is severely weakened. If the similar activity is part of another IC system, the original role may be in question, but some IC role remains.
However, even if a particular system successfully overcomes these obstacles, it is not clear CAF applies. CAF makes an assumption about cell biology that is increasingly untenable, namely that the cell is basically a soup. This soupy aspect of the cell is needed for A, B, C, and D to escape their original functional states in order to fortuitously interact. Yet it is becoming increasingly clear that many machine components are assembled into the complex very quickly after being synthesized and/or targeted to specific sites of assembly. For example, it's becoming more and more clear that certain metabolic enzymes are secured in various places and interact as tightly fitting complexes that directly hand off product/substrate. Where they are found and how they are arranged is just as important as their existence. This was beautifully illustrated with some mutant work in Drosophila that showed a specific glycolytic isozyme was required for flight, as the existence of another isozyme was not functional due to its mislocation. As one reviewer of this study commented, "The presence of catalytically functional proteins alone is therefore not adequate; they must be properly located." Therefore, for the CAF scenario to work, the alternative function should not anchor the component-to-be-borrowed and if it does, some cellular change must be invoked to liberate it.
Yet the most basic problem with CAF is its complete reliance on chance. If we return to the originally proposed pathway above, we are asked to believe that while A, B, C, and D have long been shaped by selection to carry out their original alternative functions, a fortuitous interaction among them all would spontaneously give rise to a brand new function. Selection might be invoked to fine-tune and improve this new function, but the bottom line remains that raw chance is being credited for the creation of a novel function. I explained this elsewhere as follows:
"Co-option is the most commonly cited non-teleological means to
generate an IC system. Yet, it is essentially a return to raw coincidence to
account for apparent design. The brilliance of
The problem of invoking chance to explain the origin of a new function is quite serious when dealing with IC molecular machines. For these machines to work, their components are usually tightly fitted into a whole through the interactions of their complementary conformations. It would be unlikely that four different proteins, pruned by selection to carry out their original functions, just happened to have sufficient conformational complementarity to assemble into a novel machine with a novel function (which explains why no one has ever observed cooption to spawn a new molecular machine). Unless, of course, certain machines were designed to channel evolution by cooption, meaning that certain cooption events were rigged to occur. And this brings me to another problem I highlighted before:
"But the problems with co-option are deeper. Once we leave the random tweaking of a protein along a linear axis guided by selection and instead appeal to the multiple coincidences entailed by different, independent proteins being shaped for various other functions that just happen to coalesce into a brand new system, the role of coincidence itself is brought into question. As I have explained elsewhere, design might also be reflected as a front-loaded state that is likely to find various anticipated solutions. In such cases, IC may serve as the springboard by which one detects design through a front-loaded original state. In other words, IC may have evolved through co-option and RM&NS, but that is not the whole story. The whole story may entail whether the original state was stacked such that a random search was likely to stumble upon a new IC state through shuffling and co-opting what was originally designed. In other words, we may have a situation where evolution was designed to spawn certain IC systems when the conditions were right."
Thus, for CAF to be truly an alternative to ID, we need to modify it as coincidental cooption of alternative functions, CCAF.
There is also the problem of universality found in many large IC cores. CCAF does not predict such universality, but instead predicts various permutations of the core as a consequence of evolving over long periods of time and co-opting any protein that just happens to work. I will explain this problem in more detail in another article.
Given the many problems associated with CCAF, we need some rather strong independent evidence to ensure that it applies to any system in question. Without such evidence, CCAF functions only to remind us that there are other possible explanations for IC apart from ID. But that's all it can offer.
Irreducible Complexity Again
So where do we stand? IC as formulated by Behe does not prove evolution is impossible (it should be pointed out that Behe himself stated this in his book). So what is its utility to the ID theorist?
I think the primary utility of IC is that it helps bring a high resolution focus to any origin event in question. That is, while IC may not make it impossible for evolution to produce something, I think it can be likened to a rate-limiting step in a metabolic pathway. Biochemists often focus on rate-limiting steps, as these steps are the "logjams" of pathways that not only serve to dictate the ultimate speed of the pathway, but also can serve as effective points of regulation. We can think of IC as obstacles for evolution for the following reasons:
a) IC rules out the Darwinian mechanism that has been most firmly established and observed: change along a linear axis. Consider the examples of the evolution of the giraffe neck, the finch beak, or wing color in moths. None of these examples represents the evolution of IC. A pre-existing neck is lengthened, a pre-existing beak is reshaped, and a pre-existing wing is darkened. Thus, the most intuitive examples of Darwinian selection provide no basis to infer the same explanation for the origin of IC. This is clearly seen from an example found in Richard Dawkins' writings. Dawkins explains how photoreceptor cells (with ninety-one layers of photon-capturing membranes) could have evolved to become more efficient at capturing photons:
"The point is that ninety-one membranes are
more effective in stopping photons than ninety, ninety are more effective than
eighty-nine, and so on back to one membrane, which is more effective than zero.
This is the kind of thing I mean when I say there is a smooth gradient up
But Dawkins is flat-out wrong in believing there are no such hidden discontinuities, as IC provides one example. In the examples mentioned above, A does not give us 10% function, A+B does not give us 25% function, etc. Instead, removal of A or B or C or D results in complete loss of function, the abrupt precipice. All of this means that Dawkins' example of adding membranes to improve photon capture rates does not apply to the origin of IC systems. The most intuitive and well documented examples of Darwinian evolution are rendered irrelevant by IC. This conclusion was also shared by biologists Thornhill and Ussery who, writing in the only paper that discusses IC in the scientific literature, observed that Darwinian evolution along such a linear axis "cannot generate irreducibly complex structures."
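The contrast between a smooth gradient and an abrupt precipice can be sketched in a few lines. Both functions below are hypothetical toy models: `graded_function` assumes, for simplicity only, that each added part contributes an equal share of function, while `ic_function` assumes no function at all until every part is present.

```python
def graded_function(parts_present: int, total_parts: int = 4) -> float:
    """Smooth gradient (assumed linear for simplicity): each added part adds some function."""
    return parts_present / total_parts


def ic_function(parts_present: int, total_parts: int = 4) -> float:
    """Abrupt precipice: no function at all until every part is present."""
    return 1.0 if parts_present == total_parts else 0.0


# Compare the two landscapes as parts are added one at a time.
for n in range(5):
    print(n, graded_function(n), ic_function(n))
```

In the graded model every step is an improvement that selection could act on; in the IC model the first three steps are indistinguishable from having nothing.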
b) Because of a), IC critics turn to OHA, EFR, and CCAF. But as we have seen, the first two explanations don't really explain the origin of IC. For example, consider the bacterial flagellum. When you survey any bacterial species, the flagellum is usually composed of 30-40 gene products. But when you compare all bacteria, only 20-25 are universally shared. We can interpret these 20-25 as a core IC system needed for flagellar function.
OHA fails to explain this set since it would have to assume that propulsion could have been originally carried out by one gene product. But we know this is false, as a beaker full of any one of these 20-25 gene products in isolation does not elicit even a trace of rotary motion. In fact, it should be increasingly clear after discussing a) that OHA ultimately boils down to a Dawkinsian explanation along a single axis (where helping activity improves the original function). This is further reason for dismissing this explanation.
EFR fails to explain the original set of 20-25 since it actually assumes an even larger initial set that included these 20-25 gene products. Furthermore, there is not the slightest scrap of evidence for thinking the original flagellum was composed of something like 75-100 proteins (a prediction of EFR).
However, I should mention that both OHA and EFR may explain the differences seen in the 10-15 flagellar proteins that are not part of the IC core, as some of these proteins are widely distributed in a mosaic pattern (hinting at EFR), while others appear more species specific (hinting at OHA). The utility of OHA and EFR is found in explaining the modification of previously IC systems, not their origin.
CCAF is the only viable evolutionary explanation for the origin of the IC core, yet the rather large size of this core appears to preclude coincidence as a mechanism. Furthermore, there is no evidence that the various components of the core existed prior to the flagellum. Nevertheless, this is a commonly invoked explanation in cyberspace, propped up by the finding that many components of the type III protein secretion system share similarity with flagellar components. Yet this appeal fails Thomas' tests mentioned above, given that the evidence indicates the type III system evolved from the flagellum. Thus, the core 20-25 flagellar components remain intact (not to mention that the type III system itself is quite IC and without an evolutionary explanation if we reject its flagellar origin).
While IC may not have proven it is impossible to evolve a flagellum (our example), it does present an obstacle to evolution that can only be overcome by coincidental cooption of alternative functions, an explanation without evidential support and plagued with problems. When we consider that the flagellum is a sophisticated molecular machine, whose appearance coincides with the appearance of bacteria, ID remains a very reasonable explanation for its origin.
Types of IC
I would like to introduce an important distinction concerning the IC nature of molecular machines when compared to the IC nature of metabolic pathways. The IC nature of the two differs in that the former depends on direct, specific contacts between components while the latter does not.
A molecular machine functions when energy (usually in the form of ATP binding/hydrolysis) enters the complex through a specific input portal. This energy input then triggers a cascade of unidirectional, conformational changes among the parts. Using our symbolic conventions above, think of A changing shape to induce a shape change in B, which induces a shape change in C, which induces a shape change in D, which then elicits the functional output. The parts require a series of direct contacts to convey the energy/information/mechanical flow that brings about the function. Removing any part interrupts this flow and renders the entire complex functionless, such that all the non-affected gene products are now without function.
A metabolic pathway functions differently. In this case, protein A reacts with an original substrate to produce product 1. Product 1 is then converted into product 2 by protein B. Protein C then converts product 2 into product 3. And finally, protein D converts product 3 into product 4. We then define the appearance of product 4 from the original substrate as the function (F). One could reasonably interpret this as IC, as proteins A, B, C, and D are needed to convert the original substrate into product 4. Yet such a pathway could evolve because various metabolic pathways are typically interlocked through their substrates/products. That is, product 2 may bind to protein C (for conversion into product 3), but product 2 may also react with several other proteins and thus be useful elsewhere. Since metabolic pathways produce products that can exist apart from any particular pathway, it is easier for the proteins in a pathway to exist apart from any particular function.
To appreciate just how different this is from an IC machine, consider the following. As mentioned above, you can isolate each individual component of the bacterial flagellum and fill test tubes with those individual parts. Most of the parts will not do anything in that test tube, except for the input portal which binds ATP. In contrast, you can take any protein from a metabolic pathway and fill a test tube with it and it will do something significant - catalyze a specific chemical reaction. In other words, the components of a pathway are not F-dependent; they will perform their subfunction apart from F. In contrast, most parts of a molecular machine are F-dependent and will not perform their subfunction apart from F.
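The test-tube contrast can be sketched in code. This is a toy model, not a biochemical simulation: the compound labels, the `PATHWAY` table, and the `catalyze` and `machine_output` helpers are hypothetical names chosen for illustration, and the machine model ignores the ATP-binding input portal for simplicity.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Toy pathway: each enzyme converts one compound into the next (labels are assumed).
PATHWAY: Dict[str, Tuple[str, str]] = {
    "A": ("substrate", "product1"),
    "B": ("product1", "product2"),
    "C": ("product2", "product3"),
    "D": ("product3", "product4"),
}


def catalyze(enzyme: str, compound: str) -> Optional[str]:
    """An isolated enzyme still converts its own substrate, with or without the others."""
    source, product = PATHWAY[enzyme]
    return product if compound == source else None


def machine_output(assembled: FrozenSet[str]) -> Optional[str]:
    """Toy machine: output exists only when A, B, C, and D are assembled together."""
    return "F" if assembled >= frozenset("ABCD") else None


# Enzyme C alone in a "test tube" still performs its subfunction.
print(catalyze("C", "product2"))
# Part C alone from the machine yields nothing.
print(machine_output(frozenset("C")))
```

The pathway parts interact only through the compounds they pass along, so each entry in `PATHWAY` is meaningful on its own; the machine parts have no separate entry at all, because in this model they do nothing until assembled.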
This means that a metabolic pathway may be poised to evolve (as we might expect from ID). Using nothing more than simple chemical rules and selection pressures, various components of metabolic pathways may realign themselves to carry out new functions. Since the proteins of any pathway interact through the intermediaries of their substrates and products, and these can exist independently of the proteins, readjustments are plausible, especially when aided by gene duplication. The parts of a molecular machine, however, are not interacting through independent intermediaries, but through direct physical contact. And this places much more stringent constraints on the system.
I should mention that many metabolic pathways do involve direct communication between proteins. As a result, a product from one enzyme is directly channeled into the active site of another protein. And it appears that the physical contacts between the two proteins may form these channels, such that the product never sees the aqueous state. But such a hookup is different from the molecular machine, as these metabolons (as they are called) work to increase the efficiency of the reaction. We know this because a test tube full of freely diffusing proteins can still carry out the reaction. A molecular machine, in contrast, does not function if its components are freely diffusing about a test tube. For example, ribosomes only function after assembly has occurred. It's the assembly-dependent nature of the machine's function that makes it different from metabolons.
What all of this means is that any evidence for the evolution of metabolic pathways does not translate as evidence for the evolution of IC machines. This is especially true if dealing with a sluggish pathway, composed of three enzymes, all borrowed from other pre-existing pathways, as with the PCP degradation pathway. There is a huge difference between molecular machines like the ribosome or flagellum and the PCP degradation pathway.
Nevertheless, the origin of metabolic pathways remains an interesting question. The explanations above assume the modification of pathways or the creation of new pathways that are really latent in any cell simply as a function of the potential for hooking up new sequences. But there is the thorny question of the minimal number of metabolic pathways required to sustain cellular life and the minimal number of enzymes in those pathways. This minimal set may indeed represent an IC state where the function is Life. Where did it come from? Non-teleological appeals to chance to explain their origin would be quite unconvincing (Behe outlined several problems with traditional explanations for the origin of metabolic pathways). And finally, there are the metabolons. There is decent circumstantial evidence that many are quite ancient (pointing again to the sophistication of the original life forms). What if it is the case for some metabolons, especially core ones, that the efficiency brought about by channeling is indeed essential to support cellular life? An ID theorist might want to look for ways to disrupt structural contacts without affecting catalytic ability to see if such dispersal is lethal or produces a cell unlikely to survive out of the lab. If this is the case, the IC nature of metabolons may join the molecular machines as a serious problem for non-teleologists.
Conclusion
Hordes of IC critics have appeared since Behe published his book. A few books, many review papers, dozens of web pages and thousands of forum messages have dissected Behe's concept in every way imaginable. Yet despite all this effort, the non-teleological payoff has been meager. They have successfully prevented IC from being used as a proof of the impossibility of evolution. But that's about as far as they have gotten. Because of IC, they have lost the most powerful Darwinian mechanism (change along a single axis) and must appeal to indirect explanations, two of which likewise fail to explain the origin of IC, leaving only one mechanism which turns out to be an appeal to raw chance. For example, because of IC analyses, we now know that the bacterial flagellum is a sophisticated molecular machine without any fingerprint of a Darwinian origin. Those who still insist on Darwinian explanations for the origin of such a system are drawing upon their expectations that all biotic features have a Darwinian origin. They are free to expect this, but they err in demanding that others think as they do.
[edited 6/20/04]