by Manolis Spanakis

Wednesday, 13 March 2013

Specified Complexity

Criticism of the pseudo-scientific notion of specified complexity as it is applied to linguistics and to molecular genetics.

William Albert Dembski is an American mathematician and Evangelical Protestant philosopher, and a proponent of Intelligent Design (ID). He is widely known for promoting the concept of Specified Complexity (SC), which, together with M. Behe's Irreducible Complexity (IC), is considered by its proponents to be a logical proof of the involvement of a Designer in the origin of species. The practical consequences of SC are always explained using a handful of examples, such as an authored phrase or, from the biological world, the bacterial flagellum. Dembski also proposed a method for detecting ID, which he calls the Explanatory Filter (EF) and which is shown schematically in the flowchart below. SC has never been published in any peer-reviewed journal and the mathematics has been widely criticized by specialists. I will try to explain my own objections.
 Explanatory filter
Contingency
In philosophy and logic, contingency is the status of propositions that are neither true under every possible valuation (i.e. tautologies) nor false under every possible valuation (i.e. contradictions). A contingent proposition is neither necessarily true nor necessarily false. Propositions that are contingent may be so because they contain logical connectives which, along with the truth values of their atomic parts, determine the truth value of the proposition. This is to say that the truth value of the proposition is contingent upon the truth values of the sentences which comprise it. In common terms, contingency means chance. The first step of the EF method is to decide if an apparent pattern of a natural object is contingent or, instead, can be explained by some physical or chemical necessity. The EF chart does not allow chance and necessity to work together, as if there were no such possibility. Snowflakes and rainbows are used here as typical examples of patterns formed by pure necessity. It is true that such phenomena are governed by physical laws. It is also true, however, that chance plays an important role in shaping the details of such patterns. Rainbows are sometimes broken and no two snowflakes are identical in size or shape. Anyway, Dembski considers such patterns as being simple and explains them by physical necessity; there is no design and no Designer. Attention! According to the EF scheme, physical laws have nothing to do with life! And life is not contingent, because life is complex.

[Image: frost water crystal]

Complexity
In general usage, complexity tends to be used to characterize something with many parts in intricate arrangement. The study of these complex linkages is the main goal of complex systems theory. In science there are at this time a number of approaches to characterizing complexity. Dembski defines complexity as a pattern consisting of many different elements. The higher the number of elements and the greater their diversity, the higher the complexity of the pattern is. A repeatedly mentioned example is a sequence of letters or an authored phrase. Here comes the second step in the EF flowchart for recognition of ID: the probability of reproducing a given pattern by chance decreases exponentially with complexity. Given 55 keys (letters and punctuation) on a computer keyboard, the probability that 3 falling objects will hit the keys D, E and S in that order is 1/55^3 = 1/166,375. That is to say, some 166,375 triplets of objects must fall on the keyboard before the sequence DES could, theoretically, be expected to appear on the screen. The sequences 2 and 3, both consisting of 6 letters in a particular order, have equal probability of being produced by falling objects (chance); this probability can be calculated to be about 1/28 billion for each sequence. Sequence 4 is already so complex that if all the stars of the universe fell on my keyboard they couldn't reproduce it exactly. But complexity, Dembski says, is not sufficient to suggest ID. Sequence 3 presents yet another property.
1: DES
2: NEDGIS
3: DESIGN
4: KILDTIKTKAIQAAQDGQTSDSRRALQSDIIRLLEELDNIANTTSFNGQQLLNGSFSNKEFQIGAYSNETVKVS
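The arithmetic behind these figures can be checked with a short sketch. It assumes, as the text does, 55 equally likely keys and independent keystrokes; the name `chance_of` is only illustrative.

```python
from fractions import Fraction
from math import log10

KEYS = 55  # number of keys assumed in the text


def chance_of(sequence, keys=KEYS):
    """A priori probability of typing `sequence` exactly with
    len(sequence) independent, uniformly random keystrokes."""
    return Fraction(1, keys ** len(sequence))


sequences = {
    1: "DES",
    2: "NEDGIS",
    3: "DESIGN",
    4: "KILDTIKTKAIQAAQDGQTSDSRRALQSDIIRLLEELDNIANTTSFNGQQLLNGSFSNKEFQIGAYSNETVKVS",
}

for number, text in sequences.items():
    p = chance_of(text)
    # Report the odds as a power of ten so that long sequences stay readable.
    print(f"sequence {number}: {len(text)} characters, "
          f"1 chance in about 10^{log10(p.denominator):.1f}")
```

Sequences 2 and 3 come out identical, which is the point: the a priori calculation sees only length and alphabet size, not meaning.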
Before moving to the third step of the EF, let me tell you that I took my dog for a walk on my keyboard and, as his nails hit the keys, sequence 4 appeared on the screen. How come? Dembski says it is extremely improbable to observe a sequence of such complexity by chance! Well, in probability theory, there are two types of probability: the a priori probability and the a posteriori probability. Indeed, it is practically impossible to reproduce sequence 4 by throwing objects on my keyboard with eyes closed. But if we see sequence 4 somewhere in nature, then its probability of occurrence is not astronomically small but as large as 1, because it did occur. So, now that we know that everything we observe in nature has an a posteriori probability of occurrence of 1 and, therefore, cannot be explained by chance alone, let us decide if the cause of the pattern is an Intelligent Design or... Oops! The EF scheme does not allow any other possibility for specified or non-specified complexity!

Specified Complexity
Dembski says that if we see NEDGIS on the screen, we may conclude that somebody fell asleep on the keyboard. If we see DESIGN on the screen we will certainly conclude, instead, that somebody wrote the word DESIGN on purpose (intending to write an essay on ID, for example). Why certainly? Because not only is sequence 3 too complex to have been composed by chance, e.g. by somebody falling asleep on the keyboard, but also the sequence DESIGN is a word with a meaning and with a function and it is noble and it is English (well... American) and we understand it and it must, therefore, have been written by an intelligent American author (designer). It is not simply complex, but it has specified complexity. Incidentally, none of the above 4 sequences was typed by chance; I have designed them! No, I am kidding, I have only found them all in nature and I have copied them with a specific purpose.
Paradoxically, if there is one sequence that was really designed in this text, it is NEDGIS. We can be certain that NEDGIS was designed because nobody has ever seen that word anywhere in nature before, and certainly Dembski has not. He designed it for the purpose of demonstrating non-specified complexity. Dembski considers frequently observed words as being specifically complex and previously unseen words as non-specifically complex. Yet, what determines the frequency of a pattern is not its probability of appearance but its rate (probability) of reproduction. The term "specified" complexity, thus, refers to complex things that replicate, not to really designed prototypes. Let me continue with some other prototype sequences and ask if they have specified complexity or not.
5: The explanatory filter can be explained using a death investigation by a coroner as an example.
6: The explanatory investigation filter using a coroner as a death can be explained by an example.
Sequence 5 presents specified complexity. It is too complex to have been typed by chance and, what is more, it makes sense. Sequence 6 is more challenging, but as our brain is conditioned to interpret any sequence of recognized words as a sentence, the American reader may well assume that even sequence 6 was designed by an intelligent author; perhaps by a not-so-intelligent French surrealist poet.
7: Gia na exigisume to filtro exigisis mporume na hrisimopiisume tin erevna enos thanatou gia paradigma.
8: GIANAEXIGISUMETOFILTROEXIGISISMPORUMENAHRISIMOPIISUMETINEREVNAENOSTHANATOUGIAPARADIGMA
The spelling correction program of my text editor (representing human intelligence) has underlined practically every word in sequence 7, warning me that none of them has been seen before and/or listed in the English dictionary. This speller was misled to "think" that the little words "to" and "tin" have been seen in English and that they are "specific", even though not so complex. From sequence 7, an intelligent American linguist may recognize tiny fragments, such as "paradigm", "thanato" and "filtr". He may also note that the patterns "exigis" and "gia" each appear twice in the sequence and, therefore, must have some function; at least "exigis", that is, because "gia" might be said to be simple enough to have been repeated by pure chance, whereas a chance repetition of "exigis", given the number of keys on the keyboard and the length of sequence 7, would be astronomically improbable. Someone who does not speak Greek will be unable to tell if the rest of "sentence" 7 makes any sense, but will usually presume that it does, because complex sequences of letters usually convey a message and are too complex to have been typed by chance anyway. Note that the spaces in sequences 5-7 are critical to our decision that we are, firstly, looking at words and, then, at a sentence; it is only after that decision has been taken that we may try to recognize the specified complexity of the patterns. To the American layman, sequences 8 and 4 look as if they were typed by randomly falling objects, having non-specified complexity. It is only after taking some courses in Greek that we may recognize meaning in sequence 8, if there is any of course; and only after taking some courses in genetics that we may recognize specified complexity in sequence 4, if there is any. Note that specified complexity is not a property of the patterns, after all, but merely a property of our brain and of the knowledge it carries.
How can we affirm that there is no specified complexity in snowflakes if we do not study their language? What I am saying is that SC may be everywhere in the universe that we understand and nowhere in the universe that we do not understand and, if it is so, the concept is useless. We can sit back and admire the universe that we understand because the universe that we do not understand is made by chance and therefore is not worth investigating.
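The kind of pattern-spotting attributed to the linguist above can be imitated mechanically. The sketch below is only an illustration (the function name `repeated_fragments` is made up for it): it scans sequence 8 for fragments that occur more than once, without knowing a word of Greek.

```python
from collections import Counter

SEQUENCE_8 = ("GIANAEXIGISUMETOFILTROEXIGISISMPORUMENAHRISIMOPIISUME"
              "TINEREVNAENOSTHANATOUGIAPARADIGMA")


def repeated_fragments(text, length):
    """Return every fragment of the given length that occurs more than
    once in `text`, together with its number of occurrences."""
    counts = Counter(text[i:i + length]
                     for i in range(len(text) - length + 1))
    return {fragment: n for fragment, n in counts.items() if n > 1}


# Six-letter fragments that repeat, and the number of times "GIA" occurs.
print(repeated_fragments(SEQUENCE_8, 6))
print(SEQUENCE_8.count("GIA"))
```

The scan reports repetition and nothing more. Whether a repeated fragment carries a function or a meaning is a judgement added by a reader who already knows the language.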

Chance in Design
So, what alternatives do we have? According to the EF scheme, none! If we can recognize specification, then the pattern is created by design; if not, by chance. For sequence 6, the decision is difficult. Its words, for the sake of argument, have been designed but it seems as if they were put together by chance. Once the words have been designed, the complexity of sequence 6 falls exponentially from the value corresponding to about 96 elements (characters) arranged in a specified order to a value corresponding to only 16 elements (words) arranged in a specified order; and even less, if we consider that some words form phrases and move around together ("The explanatory", "can be explained", "a coroner", "an example"). The probability of forming a sensible sentence with 96 characters by chance may be astronomically small, but the probability of placing 11 pre-existing phrases in one of several acceptable orders by chance is very large by comparison. Thus, it is not impossible that sequence 6 was created by cooperation of design (phrases) with chance (reordering). This kind of writing was practiced by a number of French surrealist masters, such as André Breton, and is known as "écriture automatique". In the case of sequence 6, a plausible interpretation of the semi-specified complexity may be that a French surrealist poet has used the phrases that were Intelligently pre-Designed by the American philosopher in order to give the sentence another meaning. The opposite interpretation is also plausible: the Intelligent American philosopher may have read the French surrealist poem and thought 'hmm! sentence 6 does not make much sense' and transcribed sentence 6 into sentence 5 in order to make the poem more accessible to the general public.
A third interpretation would be assuming a single author who, for some unclear reason, wrote two different sentences using the same words and phrases in a different order to convey two different messages, or one and the same message with different examples for that matter. Unfortunately we will not be able to decide which interpretation is correct until we determine which of the two sentences (5 or 6) came about first.
To conclude with the Chance-Design interaction, if you give the same task in a Design class, you will have as many designs as you have designers. If you give the same subject to several writers, you will have as many different texts as you have writers. If you pick one designer or writer by chance then you will have a design or text drawn by chance. This is the effect of chance, it is everywhere in nature and you cannot get rid of it. The EF method must take chance into account even when design wins.
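How much the search space shrinks when sequence 6 is assembled from existing words or phrases, rather than from single keystrokes, can be estimated with a rough sketch. It assumes the 55-key keyboard used earlier and the count of 11 phrases given above. For simplicity it counts every ordering as distinct (ignoring that the word "a" occurs twice) and says nothing about how many orderings would be acceptable.

```python
from math import factorial, log10

SEQUENCE_6 = ("The explanatory investigation filter using a coroner "
              "as a death can be explained by an example.")
KEYS = 55      # keys on the keyboard, as assumed earlier in the text
PHRASES = 11   # number of pre-existing phrases, as counted in the text

characters = len(SEQUENCE_6)
words = len(SEQUENCE_6.split())

keystroke_space = KEYS ** characters  # strings of that length, key by key
word_orders = factorial(words)        # orderings of the ready-made words
phrase_orders = factorial(PHRASES)    # orderings of the ready-made phrases

print(f"keystrokes: about 10^{log10(keystroke_space):.0f} possible strings")
print(f"words:      {word_orders:,} possible orderings")
print(f"phrases:    {phrase_orders:,} possible orderings")
```

Each step from characters to words to phrases removes many orders of magnitude, which is the sense in which design (the modules) and chance (their reordering) can cooperate.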

Necessity in Design
If chance can be involved in the designer's work, so can necessity, and to a much greater degree. Sequence 9 contains the same words, in the same order, as sequence 5, though grammatical rules are not respected this time. The sentence so formed is more difficult to read and makes even less sense than sequence 6. Every writer knows that, in order to convey a message, it is not sufficient to use pre-existing words that everybody understands and put them in the right order. He must also respect the grammar, the syntax and the punctuation, and there is only so much poetic license he can use. Are these rules not part of the design? No! The writer, the designer, has no control over these rules; he does not design them, they are imposed on him, they are natural laws representing necessity, and he just has to comply with them. Otherwise, the product looks as if it were a product of chance.
9: The explanatories filterred could are explane use a deadly in vestigated bya coronerly as. A examplified
The EF scheme contains no connection between Design and Necessity and does not allow recognition and measurement of this interaction, i.e. estimation of the degree to which each cause contributes to the specified complexity of a pattern. So, what can we say about the pattern of sequence 9? Does it have specification? Is it designed?

Chance and Necessity in Design.
Take billions of writers from all over the world. Ask them to write a definition of specified complexity for children. Tell them that they can use any words of any language they know, respecting the grammar. Tell them also that the definition should not be longer than 100 words and that they must finish the text in 24 hours, when the papers will be collected. Then, ask billions of children to read as many definitions in their language as they can and vote for the best ones within 24 hours. Within 48 hours, you will have the best possible definition of specified complexity for children in every language on earth. Publish the best few definitions and discard the rest. Repeat the exercise a few days later. In some languages you may have the same definition, with minor revisions, winning again. In other languages, another definition may prevail. This is how chance and necessity help to improve design. Chance is represented by the diversity of languages and the diversity and creativity of writers within each language. Necessity is represented by the requirements of the readers. Design that involves chance and necessity evolves at each instance of the exercise and adapts to the local needs of the readers in real time.
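The write, vote, publish and repeat exercise can be caricatured in a few lines. The sketch below is a toy under loud assumptions: the readers' requirements are reduced to a fixed six-letter string (`TARGET`), each "writer" changes one random character of the current winner, and the "vote" simply counts matching characters. None of this comes from Dembski or from the exercise as described; it only shows the loop of chance plus necessity at work.

```python
import random

TARGET = "DESIGN"  # hypothetical stand-in for what the readers require
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def votes(draft):
    """Necessity: readers score a draft by how many characters suit them."""
    return sum(a == b for a, b in zip(draft, TARGET))


def revise(draft, rng):
    """Chance: a writer replaces one randomly chosen character."""
    i = rng.randrange(len(draft))
    return draft[:i] + rng.choice(ALPHABET) + draft[i + 1:]


def run_contest(writers=50, max_rounds=1000, seed=0):
    """Repeat the exercise until the winner fully satisfies the readers."""
    rng = random.Random(seed)
    winner = "".join(rng.choice(ALPHABET) for _ in TARGET)
    history = [winner]
    for _ in range(max_rounds):
        if winner == TARGET:
            break
        drafts = [revise(winner, rng) for _ in range(writers)]
        # The previous winner stays in the running, so quality never drops.
        winner = max(drafts + [winner], key=votes)
        history.append(winner)
    return history


history = run_contest()
print(f"{len(history) - 1} rounds: {history[0]} -> {history[-1]}")
```

A fixed target is the crudest possible model of the readers, whose needs in the real exercise differ by language and change between rounds. The point of the sketch is only that no single writer has to produce the winning text in one attempt.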
The EF does not have any chance-necessity-design interaction loop. It does not allow for more than one Designer nor for judges. A truly scientific method should provide for all plausible outcomes even if some of them were to be proven impossible. Dembski uses written language (or equivalent formal symbolic strings) as a model for his calculations. Therefore, he has no excuse to exclude the possibility of having more than one writer as well as a few readers in space and time. Besides, he knows very well that language evolves. I have to conclude that EF is simplistic, if honest at all.

Chance and Necessity without design.
Let us allow sequence 5 to evolve by mutation (inversion, insertion, deletion, substitution) so as to make it more functional for my purpose.
5: The explanatory filter can be explained using a death investigation by a coroner as an example.
10: The explanatory filter can be explained using as an example a death investigation by a coroner.
11: The explanatory filter can be explained using an example.
12: An example can explain the explanatory filter.
13: An example explains the explanatory filter.
14: An example explains the method.
14a: SUBJECT VERB OBJECT.
15: For example:
16: Example:
16a: e.g.
17:

Sequence 14a is a generic pattern representing necessity. In language, necessity is imposed by the human brain, the way it functions and the way it has been trained (OK!... the way it is Designed). Pattern 14a consists of 3 modules but it is not the only acceptable one. The modules may change order and modules may be added or removed. All the other sequences from 10 to 17 are plausible patterns designed by different authors, or plausible revisions by the same author. The null sequence 17 is listed to indicate that none of the other sequences is indispensable: the author can proceed to describe his example without previously announcing what example he will use. Given a modular structure, the complexity of a pattern may increase, decrease or loop in a stepwise manner. A single mutation, or very few mutations with a high overall probability of occurrence, would be sufficient for each step. Which version of a variable and/or evolving pattern is sampled by the reader is a matter of chance. For the calculation of his probabilities, Dembski assumes that the pattern of sequence 5, and every pattern, was conceived by its designer as a whole, at once, and that no previous version, module or part existed previously on earth. It is as if I were to invent 16 new words that nobody has ever heard before and make with them a sentence that everybody understands (having specified complexity). The probability of success would, indeed, be astronomically small. But specified complexity, and design itself, evolves, at least implicitly in the mind of a designer. All designers know that it is convenient to save a copy of the current version before creating the next one, particularly when more than one mutation will be required. During this evolutionary process, the mutating copy may momentarily lose its function before it gains a new one. Meanwhile, the function is assured by the previously saved version.
So what? you may argue. The modules, the phrases and the words themselves present specified complexity and must have been designed. Must they? I have just copied the word DESIGN and even the word of sequence 4. In fact, I have not designed any word so far. I have designed only the text, by selecting words and other modules from the wild and putting them in the particular order that seemed appropriate to me. I did not count how many versions and modifications I have made for each sentence because that would give me a headache. Unlike other natural patterns, whose origin is difficult to establish, language has been around only for as long as our civilization and, therefore, we can have evidence and theories for the origin of practically all the words and concepts used today.

"Design"
For example, DESIGN first appeared as a verb in the 1540s from the Latin designare (mark out, devise, choose, designate, appoint), which is composed of de- (out; see de-) + signare (to mark); signare, in turn, comes from signum (a mark, sign). The original meaning of DESIGN in English was to designate; DESIGNATE replaced DESIGN in its original meaning by the 1640s, when the noun DESIGNER also appeared. Many modern uses of DESIGN are metaphoric extensions. As a noun, DESIGN dates from the 1580s and came from Middle French desseign. DESIGNER (one who schemes) is the agent noun from design; the meaning "one who makes an artistic design or a construction plan" is from the 1660s. In fashion, the sense "bearing the label of a famous clothing designer" (thus presumed to be expensive or prestigious) dates from 1966. Designer drug is only attested from 1983. The root sign appears first as a noun in early 13c. (gesture or motion of the hand) from Old French signe (sign, mark, signature), which, in turn, came from Latin signum (mark, token, indication, symbol), with ultimate origins in the Proto-Indo-European base *sekw- (point out; see). The verb sign appeared c.1300 (to make the sign of the cross) from Old French signer, from Latin signare, from signum (see sign). The sense of "to mark, stamp" is attested from mid-14c.; that of "to affix one's name" is from late 15c.; the meaning "to communicate by sign language" is recorded from 1700. The noun sign, in the sense of a mark or device having some special importance, is recorded from late 13c.; that of "a miracle" is from c.1300; the sense of a characteristic device attached to the front of an inn, shop, etc., to distinguish it from others, is first recorded mid-15c. Sign language is recorded from 1847. Therefore, sekw > see > signare > signum > signer > sign > design > designate > design (another meaning) > designer > fashion design, drug design, vector design... the word has always been there in some form or another and keeps evolving.
In German, or in Greek, the space between the two words used for a concept would drop and the two words would merge ('pharmacodesign',  pharmacodynamics, pharmacogenetics...), like the words de- and -sign did before. Today, design is used in a myriad of different contexts. An isolated use of the pattern DESIGN on a computer screen has minimal information about the exact meaning of the word. Without context, we would not be too silly to assume that the word was just found on the screen by chance, chosen out of some 120 listed alternatives* in English alone, i.e. without sufficient specified complexity to assume that it forms part of a designed text.
*pattern, motif, configuration, figure, device, decorative pattern, composition, layout, conception, diagram, drawing, sketch, rough representation, draft, blueprint, prototype, picture, tracing, commercial design, architectural design, outline, depiction, chart, map, plan, tracery, delineation, perspective, treatment, idea, study, form, draught (UK), schema, graphic, image, line drawing, arrangement, create, originate, make up, devise, compose, invent, produce, conceive, come up with (slang), fabricate, form, shape, fashion, construct, build, delineate, mean, set apart, aim at, intend, style, destine, angle, pitch, aim, approach, architecture, arrange, arrangement, array, art, artifice, cartoon, compose, composition, conception, concoct, configuration, conformation, contemplation, contrivance, cook up, course, create, custom, decoration, designing, destine, device, diagram, discovery, dope out, draft, draft, drawing, dream up, drive at, emblem, emboss, end, engineer, engineering, ensign, format, intend, invent, lay, lay plans (for), layout

"Pattern"
Pattern is almost synonymous with Design in the context of Specified Complexity. Words and phrases are not products of design but patterns to be copied. The noun appeared in early 14c. It is particularly interesting and relevant to this argument that the word originally meant - not something conceived de novo for a purpose - but "the original proposed to imitation; the archetype; that which is to be copied; an exemplar", from Old French patron, itself from patronus (see patron) in Medieval Latin as written and spoken c.700-c.1500. The extended sense of "decorative design" was first recorded in the 1580s, from an earlier sense of a "patron" as a model to be imitated. The difference in form and sense between patron and pattern wasn't firm till the 1700s. The meaning "model or design in dressmaking" (especially one of paper) is first recorded in 1792, in Jane Austen. The verb phrase "pattern after" (take as a model) is from 1878.
Evolution: patronus > patron > pattern to be copied > pattern as de novo design

"Computer"
From 1640s, "one who calculates," agent noun from compute. The meaning "calculating machine" (of any type) is from 1897; in modern use, "programmable digital electronic computer" (1945; theoretical from 1937, as Turing machine). ENIAC (1946) usually is considered to be the first. The verb compute comes from Latin computare (to count, sum up, reckon together) which comes, in turn, from com- (with; see com-) + putare (to reckon, originally to prune; see pave) and, in 1630s, from French computer. The Proto-Indo-European root *pau- (to cut, strike, stamp) evolved to Latin putare (to prune) and Greek paiein (to strike) then pave; paving (from 1580s) and with co- or con- or com-, compute and computer.

"Explanatory"
From the very similar phonemes p, f, (then, p, f, pf,) and the Proto-Indo-European *pla- (*pla-no-), then plat, flat, plano, planus, plain, with ex- (out of) + planus (flat; plane) we have the Latin explanare (to make plain or clear, explain, make level, flatten); then, explanatorius (having to do with an explanation) and explanatory from 1610s on.
 
We can go on and demonstrate that every word in this text evolved from other, smaller words, by mutation and recombination, changing pattern or meaning or both. Today, there is massive variation between and within languages and vast amounts of horizontal transfer and mutation. Setting aside some very talented poets, words are not designed de novo but are rather copied from the wild and mutated. Even poetical words derive, most of the time, from recombination and mutation of pre-existing words, and so do new meanings, because words and phrases are also modular structures. Linguistic patterns evolve and the direction of their evolution follows the direction of evolution of our culture, most frequently from simple to complex. The specified complexity of sentences can be interpreted to be the product of selection and re-arrangement of naturally occurring linguistic patterns (modules), which is performed by us, the random readers-copiers who apply our own 'necessity'.

Perhaps the greatest error of the proponents of specified complexity is to present language, a system that obviously evolves, as an example of intelligent design. If we consider language as a human phenotypic trait, we wonder why one human phenotypic trait evolves the way it does (creatively) but none of the other traits does so. Another error is to combine the argument of specified complexity with that of irreducible complexity. This combination works against both arguments. If we were to admit that sequence 5 is specifically complex, and therefore designed, then Intelligent Design does not require irreducible complexity, because specified complexity is not necessarily irreducible. The question is, thus, reduced to the elementary one: is there a Designer or not? If there is, then He could have designed everything, specified or not specified, irreducible or reducible, complex or simple. If there is no Designer, then specified and irreducible complexity must have other causes. I have difficulty believing that a designer would design specifically and irreducibly complex objects - objects that we perceive as specifically and irreducibly complex - but would leave everything else to chance. It would be easier to accept that we attribute to a designer everything that we understand and, to chance, by definition, everything we cannot explain. But, if the specified complexity of language could have been created by a Darwinian evolutionary process all the way from phonemes to complex sentences, as seems to be the case, then all other specified complexity could, perhaps, be explained in the same terms.


Biological linguistics.
I will now argue that biological patterns are also modular. The modules consist of smaller, simpler, re-assembled patterns and can be re-arranged to form larger, more complex patterns. The biological patterns are copied and mutated. Some mutants are viable (a posteriori probability of occurrence p = 1) and some are not (p ~= 0). Let us take the example of the bacterial flagellum. This consists of a variable number of proteins (several tens); about 15 of them are thought to be essential parts of the irreducibly complex system because, by definition, they are found in all known flagella. Among those essential proteins is Flagellin, coded by at least 2 genes, FlgA and FlgB. The protein products FlgA and FlgB bind together to form the complex that makes up the spiral tail of the flagellum. The proportions of FlgA and FlgB in the flagellum tail may depend on environmental conditions, as if it were the environment that designs the tail of the flagellum. The story below is about FlgB, but molecular biologists know that the phenomena I will describe are observed in every protein ever studied, without exception.
All flagella have Flagellin but not all Flagellins are made up of the same proteins (genes) and not all bacterial species have flagella. Some bacteria have flagella in some environmental conditions and drop them under other environmental conditions. FlgA and FlgB are proteins with unrelated sequences which bind together and interact to make up a flagellum tail. Sequence 4, above, which would probably have seemed to a linguist to have been written by a monkey, is actually part of the sequence of FlgB and, as such, it may be thought to have highly specified complexity.
The figure below shows 16 known different versions of the FlgB protein, each belonging to a different bacterial species. The red regions are identical ("conserved") in that set of 16 species; the blue regions have variable sequence but conserved length; and the grey regions have variable sequence and are absent from at least one species. The latter are, therefore, non-essential and do not form part of any irreducible complexity that protein might have. Moreover, we do not know if those grey areas have any specified complexity at all, since they can be completely deleted without harming the FlgB function. Note that deletion of a grey region is not a "mutation" in the common sense of the term, because the organism that bears it is not a mutant but the "wild type" of a different species. Also note that only some 25% of the sequence of FlgB is conserved in the 16 bacterial species that have this gene. The blue regions have variable conservation. Some of them are conserved in all but one species and some are completely variable. The entire gene is replaced with other, more or less similar genes in other bacteria bearing flagella. Therefore, conservation of the specified complexity of FlgB, and indeed of all genes, is gradual and may range from 100% to complete absence of the entire gene, like the specified complexity of the linguistic sequences 10 to 17 above.

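[Figure: alignment of 16 versions of the FlgB protein; conserved regions in red, variable regions of conserved length in blue, dispensable regions in grey]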
Let us now take the FlgB sequence from position 81 to 160, which appears to be one of the most conserved segments. The table below shows this segment for the first three species. I have replaced blue with green for the letters that were not conserved in all of the 16 species but are conserved in the first three species.
           90       100       110       120       130       140       150       160
   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
81 AMDEQIKILDTIKTKAIQAAQDGQTSDSRRALQSDIIRLLEELDNIANTTSFNGQQLLNGSFSNKEFQIGAYSNETVKVS
81 AMDEQIKILDTIKTKAVQAAQDGQTLESRRALQSDIQRLLEELDNIANTTSFNGQQMLSGSFSNKEFQIGAYSNTTVKAS
81 AMDEQIKILDTIKTKAVQAAQDGQNADSRRALQSDITRLLEELDNIANTTAFNGQQLLNGSFSNKNFQIGAYSNETVKVS

Back when only the first three FlgB sequences were known, the specified complexity of that protein segment would have looked as in the following table, where conserved specified complexity is shown in red. Note that the specified complexity of a biological sequence depends on the number of species we have sequenced or, in other words, on the amount of knowledge we have accumulated. It also depends on chance. We know that the specified complexity of the segment is interrupted at position 146 because we had the chance to isolate and sequence the third species. All the other 15 species would show conserved specified complexity from position 140 to 154. The length of conserved specified complexity tends to shrink with time and there is no reason to believe that it will stop shrinking as future knowledge is added. If we were aware of only two of the three sequences, then the red segments would be even longer. If we knew only one of the sequences, we could have thought that the entire sequence was specifically complex, since it codes for a functional protein, and that nothing could be changed. It is impossible to tell with any reasonable confidence what segment of the sequence has specified complexity and where its limits are. The argument "it is conserved, therefore it has specified complexity" is an argument from ignorance: "we do not have evidence of variation, therefore it has specified complexity". Specified complexity can only be defined at a precise point in time and with a precise state of knowledge. Such a definition would only be valid and useful if we were to decide to stop all further research. We can believe either in specified complexity or in research, not in both.

            90        100       110       120       130       140       150       160
   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
81
AMDEQIKILDTIKTKAIQAAQDGQTSDSRRALQSDIIRLLEELDNIANTTSFNGQQLLNGSFSNKEFQIGAYSNETVKVS
81
AMDEQIKILDTIKTKAVQAAQDGQTLESRRALQSDIQRLLEELDNIANTTSFNGQQMLSGSFSNKEFQIGAYSNTTVKAS
81
AMDEQIKILDTIKTKAVQAAQDGQNADSRRALQSDITRLLEELDNIANTTAFNGQQLLNGSFSNKNFQIGAYSNETVKVS
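The dependence of "conserved" segments on the number of known sequences can be checked directly on the three sequences above. What follows is only a minimal sketch: it assumes that "conserved" simply means "identical in every sequence considered", and the function name `conserved_runs` is my own, not part of any bioinformatics package.

```python
START = 81  # position of the first residue shown in the alignment

SEQS = [
    "AMDEQIKILDTIKTKAIQAAQDGQTSDSRRALQSDIIRLLEELDNIANTTSFNGQQLLNGSFSNKEFQIGAYSNETVKVS",
    "AMDEQIKILDTIKTKAVQAAQDGQTLESRRALQSDIQRLLEELDNIANTTSFNGQQMLSGSFSNKEFQIGAYSNTTVKAS",
    "AMDEQIKILDTIKTKAVQAAQDGQNADSRRALQSDITRLLEELDNIANTTAFNGQQLLNGSFSNKNFQIGAYSNETVKVS",
]


def conserved_runs(seqs, start=START):
    """Return (first, last) positions of maximal runs of identical columns.

    Assumption: a column counts as conserved only if every sequence
    carries exactly the same residue at that position.
    """
    runs = []
    run_start = None
    for i, column in enumerate(zip(*seqs)):
        if len(set(column)) == 1:          # all sequences agree here
            if run_start is None:
                run_start = i
        elif run_start is not None:        # a variant closes the current run
            runs.append((start + run_start, start + i - 1))
            run_start = None
    if run_start is not None:              # run reaching the end of the segment
        runs.append((start + run_start, start + len(seqs[0]) - 1))
    return runs


def conserved_total(seqs):
    """Number of positions that are identical in all the given sequences."""
    return sum(last - first + 1 for first, last in conserved_runs(seqs))


if __name__ == "__main__":
    print("first two sequences:", conserved_runs(SEQS[:2]))
    print("all three sequences:", conserved_runs(SEQS))
```

With only the first two sequences, the run from 140 to 154 is unbroken. Adding the third sequence splits it at position 146, and the total number of conserved positions drops.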
Incidentally, the argument about the lack of intermediate forms in the fossil record, forms which would validate Darwin's gradual speciation by variation and natural selection, was first raised in the years immediately after the theory of natural selection was proposed and has not been revised since. Now, as genetic research advances, more and more evidence is accumulating for gradual evolution and intermediate forms. The FlgB example was selected here because of its connection to the flagellum, but it also illustrates well the notion of intermediate forms in bacterial speciation. Of course, it would be foolish to expect to find such evidence for "soft" traits in the fossil record.
A linguistic parenthesis
An English-only speaker may think that there is specified complexity in the word "mother". Change, omit or add any letter in the sequence "mother" and the word will lose its function. A linguist studying Indo-European languages, however, may recognize specified complexity only in the highly conserved letters (phonemes) "m-t-r" and their variants ("d", "dr", "th", "tk" ... ). If we examine a larger sample of world languages, we find one phoneme with almost universal conservation: "m", which probably derives from the baby's earliest and commonest primitive sounds "mmm", "ma", "ma - ma", which are interpreted as being "addressed" to the person most probably holding the baby, i.e. the mother. These primitive sounds are then formalized into a word according to each language's rules of grammar and pronunciation. Note that there is variation even within a single language, and there is spatial and temporal conservation even between very diverse languages (e.g. "materi" in the ancient Aeolian Greek dialect and in Slovenian; "mama" in various Indo-European languages but also in Chinese).
Proto-Indo-European Mater
Sanskrit Matar
Greek Aeolian 1 Mater (μάτερ)
Greek Aeolian 2 Materi (μάτερι)
Greek Classic Mitir (μήτηρ)
Greek Modern 1 Mitera (μητέρα)
Greek Modern 2 Mana (μάνα)
Greek Modern 3 Mama (μαμά)
Afrikaans ma
German Mutter
Basque ama
Byelorussian Maci
English mother
Bulgarian maĭka
Catalan mare
Chinese Māmā
Creole manman
Croatian majka
Danish mor
Spanish madre
French 1 mère
French 2 maman
Welsh mam
Hindi Māṁ
Hungarian anya
Finnish äiti
Irish mháthair
Icelandic 1 móður
Icelandic 2 mamma
Icelandic 3 mömmu
Icelandic 4 móðir
Italian madre
Latin matrem
Latvian māte
Lithuanian motina
Maltese omm
Dutch moeder
Norwegian mor
Polish matka
Portuguese mãe
Rumanian mamă
Serbian мајка
Slovak 1 matka
Slovak 2 mama
Slovenian 1 mater
Slovenian 2 materi
Slovenian 3 mati
Slovenian 4 mama
Slovenian 5 mamo
Swedish mamma
Swahili mama
Czech matka
Thai Mæ̀
Turkish anne
Vietnamese mẹ


Back to biological linguistics (parenthesis closed).
Words that have very similar meanings, like those in the above list, are called synonyms. Synonymous sequences are also well documented in molecular genetics. At the nucleic acid level, synonymous variants are nucleotide sequences that code for the same protein sequence. At the protein level, there are various groups of amino acids with similar physicochemical properties that can be interchanged to give proteins with very similar structure and function. Some of the blue amino acids in the sequence of FlgA, above, are synonymous in this sense. At the organelle level, different proteins may play very similar roles interchangeably; for example, FlgB can be replaced by another similar protein in the formation of a flagellum.
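Synonymy at the nucleic acid level can be illustrated with a few lines of code. This is only a sketch: the two DNA strings are hypothetical sequences made up for the illustration (they are not the real gene sequences of any species), and the table contains only the handful of codons of the standard genetic code needed for the example.

```python
# Partial standard genetic code: only the codons used in this illustration.
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "ATG": "M",                                      # methionine
    "GAT": "D", "GAC": "D",                          # aspartic acid
    "GAA": "E", "GAG": "E",                          # glutamic acid
    "CAA": "Q", "CAG": "Q",                          # glutamine
    "ATT": "I", "ATC": "I", "ATA": "I",              # isoleucine
    "AAA": "K", "AAG": "K",                          # lysine
}


def translate(dna):
    """Translate a DNA string codon by codon using the partial table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_differences(a, b):
    """Number of positions at which two equally long strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical DNA sequences, invented for this example.
dna_a = "GCTATGGATGAACAAATTAAA"
dna_b = "GCCATGGACGAGCAGATCAAG"

if __name__ == "__main__":
    print(translate(dna_a), translate(dna_b))
    print("nucleotide differences:", count_differences(dna_a, dna_b))
```

The two strings differ at six of their 21 nucleotides, yet both translate to the same peptide, AMDEQIK, which is how the FlgB segment shown above begins.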


What is a sequence logo?

A sequence logo is a graphical display of a multiple sequence alignment consisting of colour-coded stacks of letters representing amino acids at successive positions. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment that could otherwise be difficult to perceive.
The total height of a logo position depends on the degree of conservation in the corresponding multiple sequence alignment column. Highly conserved alignment columns produce tall logo positions.
The height of each letter in a logo position is proportional to the observed frequency of the corresponding amino acid in the alignment column.
The letters of each stack are ordered from most to least frequent, so that it is possible to read the consensus sequence from the top of the stacks.
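The description above can be sketched in code for a single alignment column. I am assuming the usual information-content measure for sequence logos (log2 of the alphabet size minus the Shannon entropy of the column, in bits) and I am leaving out the small-sample correction. The exact settings PROSITE uses are not stated here, and the function name `logo_column` is my own.

```python
import math
from collections import Counter


def logo_column(column, alphabet_size=20):
    """Return (total_height, stack) for one alignment column.

    Assumptions: the height is log2(alphabet_size) minus the Shannon
    entropy of the observed residues, in bits, with no small-sample
    correction. The stack lists (letter, height) from most to least
    frequent, i.e. from the top of the stack downwards.
    """
    counts = Counter(column)
    n = len(column)
    freqs = {aa: c / n for aa, c in counts.items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    total_height = math.log2(alphabet_size) - entropy
    # Each letter's height is proportional to its observed frequency.
    stack = sorted(
        ((aa, f * total_height) for aa, f in freqs.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return total_height, stack


if __name__ == "__main__":
    # Position 145 (K in all three FlgB sequences) and position 146 (E, E, N).
    print(logo_column("KKK"))
    print(logo_column("EEN"))
```

A fully conserved column reaches the maximum height of log2(20) bits. The column with one variant is shorter, and its most frequent letter, E, sits on top.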

PROSITE sequence logos

The sequence logos available from the PROSITE website have been built using WebLogo.
'#' in a sequence logo figure denotes the number of true positive hits detected in UniProtKB/Swiss-Prot that were used to build the sequence logo. Sequence logos aren't generated if the number of true positive hits in UniProtKB/Swiss-Prot is below four.
For patterns, each position is shown in the logo, whereas for profiles only match positions are considered, i.e. the length of the logo corresponds to the length of the profile.