The more I read about molecular biology, the more interesting it
becomes. Ironically, the thing that got me interested in this was
Darwin's Black Box, by Michael Behe, no less. (!)
As everybody knows, DNA is the blueprint that describes everything about
an organism. What colour are your eyes? How long is your left femur? How
fast do your fingernails grow? Each organism has a different genome - a
different set of DNA blueprints. If it were somehow possible to take a
mouse and replace the DNA in every single cell of its body with the DNA
of a dragonfly, the mouse would immediately transform into a perfect
little dragonfly.
...except that almost all of this is completely wrong.
The DNA does /not/ say "grow this femur until it is 47.2cm long". How
long your femur /actually/ ends up being depends on things like
nutrition and so forth as well. But more to the point, the DNA doesn't
contain a diagram that says "this is what the result should look like";
it contains something more like assembly instructions.
(As for turning a mouse into a dragonfly... yeah, good luck with that.
Their biochemistry is wildly different, and there's no easy way to
smoothly transition from one to the other. About the best you could hope
for is to turn a mouse *embryo* into a dragonfly embryo - and even that
would be a rather difficult feat.)
So how does this stuff *actually* work then?
Well, there is DNA - the famous double-helix molecule - and RNA, which
is like one half of the double-helix (but it's not quite that simple).
Either way, a strand of this stuff contains various nucleotides - the
famous A, C, G and T. They stand for adenine, cytosine, guanine and
thymine. (But that's only in DNA; in RNA, thymine is replaced by uracil.)
Each DNA strand is the complement of the other (A always pairs with T,
and C with G), and that's why they zip together. It also means that if
one strand is damaged, it can be repaired by reading the other strand.
So it's a kind of error-correcting code.
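Here's a minimal sketch of that pairing-and-repair idea in Python. The
sequence is made up, the '?' marker for a damaged base is my own
invention, and I'm ignoring the fact that real strands run in opposite
directions. It's only meant to show why having the partner strand lets
you recover a lost letter.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the partner strand, base by base (same left-to-right order)."""
    return "".join(PAIR[base] for base in strand)

def repair(damaged, partner):
    """Fill in unreadable bases (marked '?') by reading the partner strand."""
    return "".join(
        PAIR[p] if d == "?" else d
        for d, p in zip(damaged, partner)
    )

original = "ATGGCCTTA"            # made-up sequence
partner = complement(original)    # "TACCGGAAT"
damaged = "ATG?CC?TA"             # two bases lost on one strand

print(partner)
print(repair(damaged, partner))   # recovers the original
```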
So a strand of DNA (or RNA) encodes a long stream of codes, with 4
possible characters. So what does all this information "do", then?
Well, read a little further, and you'll discover that the code is
grouped into 3-letter chunks called "codons". Each codon selects
exactly one of the 20 standard amino acids. Read off the codons, glue
the indicated amino acids together in that order, and you've got
yourself a protein.
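To make that concrete, here's a rough sketch of "read off the codons and
glue the amino acids together". The table is the standard genetic code
written as one-letter amino acid abbreviations ('*' means stop); the
`translate` function and the example message are my own illustration,
and a real ribosome obviously does a great deal more than this.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order, one letter per amino acid; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(rna):
    """Read an RNA string three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino == "*":          # stop codon: the chain ends here
            break
        protein.append(amino)
    return "".join(protein)

message = "AUGGGCUUUAAAUAA"       # made-up message: AUG GGC UUU AAA UAA
print(translate(message))         # MGFK (Met-Gly-Phe-Lys)
```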
Proteins catalyse chemical reactions. Translation: proteins make
chemical reactions happen that would never ever happen by themselves.
(Or rather, they would, but excruciatingly slowly. So slowly as to be
basically non-existent.) Just about every single chemical that living
cells make involves one or more proteins synthesizing it.
Proteins that catalyse stuff are called enzymes. There are proteins that
do other stuff too. To quote Behe (!), "proteins are the ropes and
pulleys, motors and scaffolding of the cellular world".
[OK, I lied. I don't have his lie-riddled book in front of me right now.
But you get the gist of what he said - a mental image that you won't get
from any more technical publication, I might add.]
Some parts of cells are made of protein. Other parts are made of
chemicals synthesized by enzymes ( = proteins), usually in long and very
complicated chains of chemical reactions called metabolic pathways.
Proteins are also used to send signals from one part of a cell to
another, or indeed between neighbouring cells.
So, the DNA contains the codons which specify the amino acids which make
the proteins. Each "gene" is the sequence for one particular protein.
Make the proteins, and they either synthesize stuff, or lock together
(their shapes and electrical charges fit like jigsaw pieces) to make
cellular structures. And that's how cells work. Right?
Well, actually it's not quite so simple.
First of all, 4^3 = 64 possibilities, but there are in fact only 20
standard amino acids. So actually lots of codons select the exact same
amino acid. As it just so happens, the *last* letter in the codon tends
to be the least important. (E.g., *every* codon that begins with GG
selects Glycine, regardless of what the final character is.) There are
also three codons that don't select an amino acid; they mark the end of
the sequence - a kind of "stop marker". (There is also a "start marker"
- AUG - but it also codes for methionine.)
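You can check the redundancy for yourself by counting entries in the
standard codon table. This is just a sketch (the variable names are
mine); the table is the standard code, which has 20 amino acids plus
"stop".

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order, one letter per amino acid; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# How many codons map to each amino acid (or to stop)?
codons_per_amino = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, a in CODON_TABLE.items() if a == "*")
amino_acid_count = len(codons_per_amino) - 1   # don't count '*'

# Two-letter prefixes where the third letter makes no difference at all.
fixed_prefixes = [
    "".join(p) for p in product(BASES, repeat=2)
    if len({CODON_TABLE["".join(p) + last] for last in BASES}) == 1
]

print(len(CODON_TABLE), "codons for", amino_acid_count, "amino acids")
print("stop codons:", stop_codons)
print("third letter irrelevant after:", fixed_prefixes)
```

Half of the 16 possible two-letter prefixes (GG among them) already pin
down the amino acid, which is the "last letter matters least" effect.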
The fact that every single living organism on the face of the Earth uses
almost /exactly/ the same codes for the same amino acids is a pretty
good demonstration that all life is related. On the other hand,
http://en.wikipedia.org/wiki/Pyrrolysine
So the code does change slightly in certain obscure organisms. And then
there's things like
http://en.wikipedia.org/wiki/Selenocysteine
So it's not *just* about codons. Put that in your pipe and smoke it!
Next, proteins work not so much because of the chemical units (amino
acids) they contain, but because those units cause the chain to tangle
itself in a very specific way - the so-called "protein folding". Look up
any protein on the Internet and you will find "ribbon diagrams" looking
like tangled string, indicating the structure into which the proteins fold.
Some amino acids have weak electrical charges, some repel water, some
attract water, and so on and so forth, so that by sequencing the amino
acids together, they more or less automatically fold up into the right
shape - and the *shape* is what does the business.
Then we learn that actually, some proteins require special "chaperone"
proteins to help them fold up correctly. Next, some proteins are
actually multiple separate molecules fused together. For example,
haemoglobin is actually 4 "separate" globin proteins connected to a haem
molecule in the centre. (Haem isn't a protein at all, it's some other
random organic molecule, with an iron atom in the middle. It's that iron
atom that does the business.)
Quite apart from many proteins being separate amino acid chains that
need to be joined together, it's very common for proteins to undergo
"post-translational modification". In other words, after the amino acids
have been strung together, other enzymes come along and modify the
result, possibly by attaching new non-protein bits to it. Alternatively,
some proteins are manufactured in an inactivated form, and an enzyme is
required to activate them. (Usually when you need a lot of active protein
very quickly. You can stockpile the inactive form at your leisure, and
then release the activation enzyme when required.)
On top of all that, there are mechanisms for recycling certain proteins,
and even for detecting proteins which have mis-folded, and either
refolding them or recycling them before they do damage.
So the DNA tells you how to build proteins, one way or another. But what
tells a cell when to synthesize these things? If you look at a
single-celled organism, you find that most of them are capable of
digesting several different types of food. And these require different
sets of enzymes. And the cell only synthesizes the enzymes it needs for
the actual type of food it is currently feeding on. So... how?
Basically, different genes can be "turned on and off". (So-called "gene
regulation".) There are several ways that you can do this. One is to
glue proteins to the actual DNA strands, preventing the transcription
enzymes from locking on to that particular gene and thus synthesizing
the corresponding protein.
A typical sequence might look something like this:
+ There is an inhibitor protein glued to a particular gene.
+ There's an inactivated protein floating around the cell.
+ When something in the cell changes - say, the oxygen concentration
goes sufficiently low - the inactivated protein changes shape and hence
activates.
+ The activated protein strips the inhibitor off of the DNA strand.
+ The protein the gene codes for can now be synthesized - presumably
taking some appropriate action either to conserve remaining oxygen, or
else switch on the anaerobic metabolic pathways.
+ The inhibitor protein continues to be synthesized, but the activated
oxygen-sensor protein keeps chopping it up.
+ When oxygen concentrations rise again, the oxygen-sensor protein again
inactivates.
+ Since the inhibitor protein is no longer getting chopped up, it
reattaches to the DNA strand, disabling the gene.
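The sequence above boils down to a very small piece of logic, which can
be sketched as a toy model. Everything here is invented for
illustration: the 0-to-1 oxygen scale, the threshold, and the function
name. A real cell does this with concentrations and reaction rates
rather than clean true/false values.

```python
def cell_state(oxygen, threshold=0.2):
    """Toy model of the oxygen-sensing switch described in the list above.

    `oxygen` is on an arbitrary 0..1 scale; `threshold` is a made-up value.
    """
    # The sensor protein changes shape (activates) when oxygen runs low.
    sensor_active = oxygen < threshold
    # An active sensor strips the inhibitor off the DNA and keeps chopping it up.
    inhibitor_on_dna = not sensor_active
    # The gene can only be read while the inhibitor is out of the way.
    gene_on = not inhibitor_on_dna
    return {
        "sensor_active": sensor_active,
        "inhibitor_on_dna": inhibitor_on_dna,
        "gene_on": gene_on,
    }

# Oxygen drops, then recovers; the gene switches on and back off again.
for oxygen in [0.9, 0.5, 0.1, 0.05, 0.6]:
    state = cell_state(oxygen)
    print(f"oxygen={oxygen:.2f}  gene {'ON' if state['gene_on'] else 'off'}")
```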
Of course, in a *real* cell, it wouldn't be nearly that simple. Spend
more than 20 minutes reading about molecular biology and you rapidly
discover that almost every imaginable cellular process happens via the
most complex, indirect route imaginable.
Let's look at that for a moment. Intelligent Design asserts that living
organisms are "too perfect" to have arisen by chance. (Of course,
Darwin's theory of evolution doesn't claim that it _was_ chance!) But
evolution says that life isn't about "perfection", it's about survival.
If it works, keep it. If it doesn't work, throw it. In particular,
things don't have to be "perfect". They just have to be "better than
anybody else".
For example, apparently in New Guinea there are kangaroos that live /in
trees/. This despite the self-evident fact that kangaroos are obviously
"designed" to live on the ground. "A kangaroo in a tree sounds pretty
weird", you might say, "but I bet /these/ kangaroos are supremely well
adapted to it".
Um, not really, no. I watched some video of one, and after it spent 5
minutes hesitantly staggering from one branch to the one next to it, you
can't help feeling like you want to scream at it "what the hell are you
doing?! You obviously can't climb to save your life, so stop pratting
around in that tree and get down here where you belong!" But,
apparently, they never do. They live all their lives in trees. Even
though they can barely climb and it takes them hours to get anywhere.
Intelligent Design? I think not. But apparently, on New Guinea, there's
nothing that can climb trees any better, so they survive.
Come to think of it, whales live in the sea... but... they breathe air?
WTF? In fact, I heard the other day that some species need to eat in
order to take in water. If they don't feed, they die of dehydration.
Even though THEY LIVE *IN* WATER! Again, WTF?
No sane Designer would have designed it this way. But random chance
opportunistically finding new uses for old organs? Sure. You see it all
over the place. For example, reptiles have multiple bones in each lower
jaw, while mammals have only one. The extra bones have become the
tiny bones of the middle ear. One organ has turned into something
totally unrelated.
But when you look at molecular biology, you find utter craziness. This
kind of re-purposing has been happening for billions of years, and
unlike at the macroscopic level, you can actually *see* all of the
complexity at the molecular level.
Almost every single molecule has a dozen different purposes, all
/utterly/ unrelated. RNA is for storing data. Except that sometimes it's
also a catalyst. And sometimes it's structural. Glycine is an amino
acid, for building proteins. But it's *also* a neurotransmitter.
Adenosine is a component of DNA and RNA - but *also* a neurotransmitter.
Adrenaline is a hormone, but *also* a neurotransmitter. Haemoglobin
transports oxygen, but it also performs reduction reactions as part of
metabolic pathways. Get the idea?
On top of that, almost every chemical process you can name goes through
four-dozen different stages. For example:
http://en.wikipedia.org/wiki/Citric_acid_cycle
Look at that huge diagram on the right. Note the part where it says
"overview". A similarly intricate thing happens with vision:
http://en.wikipedia.org/wiki/Visual_phototransduction
In short, when a photon hits a protein, it changes shape, which switches
on another protein, which releases a chemical that switches something
else on, and so on and so forth, until eventually one of the ion
channels in the cell membrane closes, stopping electrically charged ions
flowing into the cell, and changing its electrical potential,
ultimately triggering a nerve impulse. (Nerve impulses are similarly
complex cascades, by the way.) And then there's a whole *other* cascade
to recharge the proteins ready to detect the next photon.
At the macroscopic level, you can watch organisms evolve. You can
imagine a leg gradually becoming longer, or fur gradually changing
colour. But at the molecular level, you can see actual genes change.
The processes that copy DNA are imperfect. Most of the time, copies are
almost identical. But occasionally, tiny differences arise. There are
several kinds. The most common one is for a single letter to get copied
wrong - a so-called "point mutation". Such a mutation might not actually
do anything; recall that several different codons produce the same amino
acid. And even if the amino acid is different, it might not affect the
functioning of the resulting protein.
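As a rough illustration, you can take one codon, try every possible
single-letter swap, and see what each one does. The table is the
standard genetic code; the function name and the three category labels
are my own, and GGA is just an arbitrary example codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order, one letter per amino acid; '*' = stop.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def classify_point_mutations(codon):
    """Try all 9 single-letter swaps of a codon and tally the outcomes."""
    original = CODON_TABLE[codon]
    outcomes = Counter()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue                      # not a mutation
            mutant = codon[:position] + base + codon[position + 1:]
            result = CODON_TABLE[mutant]
            if result == original:
                outcomes["silent"] += 1       # same amino acid as before
            elif result == "*":
                outcomes["stop"] += 1         # protein gets cut short
            else:
                outcomes["changed"] += 1      # different amino acid
    return outcomes

# GGA codes for glycine; any swap of the last letter is silent.
print(classify_point_mutations("GGA"))
```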
Or it might change the protein's properties. And when that happens,
natural selection comes into play.
Some changes make no difference at all. Some changes make only a small
difference. And some changes totally disable some finely tuned mechanism.
For example, if you happen to be born with a defective gene for the lung
surfactant protein... then you die within minutes of the umbilical cord
being cut. Offspring? No, not really.
If you happen to be born with a defective FOXP2 gene... actually, if
FOXP2 is defective, you don't get born. Sperm meets egg, the cell
divides a few dozen times, and then all the cells keel over and die.
In short, if you have a defect in some gene that codes for an utterly
vital protein, you probably aren't going to be alive very long. On the
other hand, suppose you happen to be born with a defective gene for,
say, L-gulonolactone oxidase, then you can't synthesize vitamin C.
...actually, you know what? The entire human race has this defect. As do
some but not all primate species. It turns out that the food we eat
already contains so much vitamin C that being unable to synthesize it
ourselves isn't fatal, nor even particularly detrimental. So humans
(and relatives) go through the entire synthesis pathway for vitamin C,
but the enzyme at the end is broken. It does nothing. It just sits there.
Fortunately, existing regulatory pathways ensure that not very much of
the precursor chemicals is synthesized. (Given that it isn't being used
up by being converted to vitamin C, it tends to accumulate quickly,
which down-regulates its synthesis like in any well-regulated metabolic
pathway.) Actually, a number of independent species have also got broken
genes for this enzyme, but broken in different ways. That's how we know
the breakage happened independently.
And that leads me to the really interesting part, the part I got from
The Ancestor's Tale, by Richard Dawkins. Mutations happen at a roughly
constant rate. Some mutations are utterly fatal, others are completely
harmless, and there are some in between. A tiny few are even beneficial.
Mutations that are fatal are vigorously eliminated by natural selection.
And indeed, you can find genes that have barely changed for billions of
years. These are the so-called "highly conserved sequences".
On the other hand, mutations that have no effect on fitness (e.g., point
mutations that don't change which amino acid a codon codes for) just
silently collect in the background. And by comparing, say, the actin
gene of a human to the actin gene of a fruit fly, you can see which bits
are the same, and which bits are different. Compare a few dozen species
together and you can figure out who's related to who, and when each
mutation took place.
Of course, it's not impossible for an A to mutate into a T, spread to a
dozen species, and then one of these species to mutate from a T back to
an A, making it look like it's less closely related to everything else.
One single point mutation isn't much evidence. But look at large blocks
of mutations and you can usually tell, in a statistical sense, what the
likely relationships are. (Unsurprisingly, this is way harder than it
sounds, and indeed many scientists devote their entire careers to it.)
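The crudest possible version of this comparison is just counting the
positions where two sequences differ. Here's a sketch with four made-up,
already-aligned sequences (the species names and letters are entirely
hypothetical). Real phylogenetics has to deal with alignment, unequal
lengths and back-mutations, none of which this attempts.

```python
from itertools import combinations

# Hypothetical, pre-aligned sequences of equal length.
sequences = {
    "species_a": "ATGGCCTTAGGA",
    "species_b": "ATGGCCTTAGGC",
    "species_c": "ATGGCATTCGGC",
    "species_d": "ATGACATTCGGT",
}

def differences(x, y):
    """Count the positions where two aligned sequences disagree."""
    return sum(1 for a, b in zip(x, y) if a != b)

# Pairwise difference counts; fewer differences suggests a closer relationship.
distances = {
    (p, q): differences(sequences[p], sequences[q])
    for p, q in combinations(sequences, 2)
}

for (p, q), d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{p} vs {q}: {d} differences")
```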
In case you want to try it:
http://www.wolframalpha.com/input/?i=ACTG2
That's one of the human actin genes (ACTG2). If you go to the stream of letters in the
middle and repeatedly jab the "more" button, it'll actually show you the
entire frigging DNA sequence for this gene, taken from the Human Genome
Project. (But this particular gene is quite large - 26,690 letters.) It
also shows you where this gene is, what else is nearby, what variants of
the gene exist in the human population, and so forth.
Scroll down to the very bottom and there's a graph showing you the
equivalent gene in other organisms, and how similar they are to the
human gene.
P. troglodytes = chimpanzee
M. musculus = mouse
R. norvegicus = rat
B. taurus = cow
C. familiaris = dog
G. gallus = red junglefowl (i.e., a bird)
D. rerio = zebrafish
D. melanogaster = fruit fly
C. elegans = a nematode worm
O. sativa = rice plant
A. thaliana = mouse-ear cress
So here we have mammals, birds, invertebrates and even plants, and
apparently the gene is nearly identical in all of them. You would think
that cress would have 0% of the same DNA as humans, but apparently not
so. Let's try another gene:
http://www.wolframalpha.com/input/?i=POMC
This is the gene for pro-opiomelanocortin, the precursor that gets cut up to make the adrenocorticotropic hormone (ACTH), among other things. As you can
see, it shows up only in birds and mammals (but that might just be due
to incomplete genome sequencing of other animals, or because we haven't
spotted it in other genomes yet - or because Wolfram's data is
incomplete :-P ). And where it does appear, the similarity values are
way, way lower. Clearly this gene evolved more recently.
It's not just point mutations though. Sometimes entire chunks of DNA
accidentally get skipped, or somehow get duplicated. The duplicates
don't necessarily end up anywhere near the original. For that matter,
sometimes chunks of DNA move from place to place without actually being
deleted or duplicated.
These events are much, much rarer than simple point mutations. And when
a duplication happens, you can usually tell which mutations happened
before the duplication (because they appear in both copies) and which
ones happened afterwards (because they appear in only one copy). This
provides lots of information for figuring out inheritance trees. For
example, humans have 4 slightly different haemoglobin genes, plus half a
dozen broken versions of the gene. By looking at lots of genomes, you
can figure out when these duplications happened, and who's related to who.
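The before/after reasoning is basically set arithmetic. Here's a toy
sketch; the three sequences are invented, and I'm assuming we somehow
know the ancestral sequence, which in real life has to be inferred from
other species.

```python
# Hypothetical sequences: an ancestral gene and its two present-day copies.
ancestor = "ACGTACGTAC"
copy_1 = "ACTTACGGAC"
copy_2 = "ACTTACGTAA"

def mutations(sequence, reference):
    """Return the set of (position, new_letter) changes relative to the reference."""
    return {
        (i, new)
        for i, (old, new) in enumerate(zip(reference, sequence))
        if old != new
    }

m1 = mutations(copy_1, ancestor)
m2 = mutations(copy_2, ancestor)

before_duplication = m1 & m2   # present in both copies
after_in_copy_1 = m1 - m2      # present only in copy 1
after_in_copy_2 = m2 - m1      # present only in copy 2

print("before duplication:", sorted(before_duplication))
print("after, copy 1 only:", sorted(after_in_copy_1))
print("after, copy 2 only:", sorted(after_in_copy_2))
```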
Here's an interesting case: Humans have 3-colour vision. How did that
happen? Well, it turns out that mutations in the light-receptive
pigments of the eye can change its absorption spectrum. Usually there's
no advantage or disadvantage to this. But then, what if a gene
duplication happens? Then you might end up with two identical genes,
making the same pigment. Now any point mutations on either gene would
tend to make the two pigments respond to light differently - yielding
2-colour vision.
Apparently almost all mammals have 2-colour vision. But humans and a
very few primates have 3-colour vision. The extra pigment is clearly a
slightly modified version of one of the existing pigments. Somehow it
got duplicated, and then the two copies mutated away from each other.
Natural selection enhanced this process, since superior colour
perception is presumably useful for something.
It's the "red" and "green" pigments that are duplicates; the "blue"
pigment is a lot more different. I should probably also point out that
the colours we call "red", "green" and "blue" are not very equally
spaced out; red and green are very similar wavelengths, while blue is
quite some way away by comparison. Also, none of the three pigments
respond to just one colour; actually each has an absorption spectrum
that's quite complicated. The human brain discerns colour by *comparing*
the inputs from each pigment.
Which brings us to another question. A mutation gave humans 3-colour
vision, but how did the brain learn to interpret this new data? For that
matter, how does the brain "know" which nerve signals come from "red"
cells, and which ones come from "green" cells?
Scientists did an experiment where they took mice (which, like most
mammals, have 2-colour vision) and inserted a third pigment gene into
their genome. When the little baby mice grew up, the scientists did
tests on them, and discovered that they actually /have/ superior colour
perception. Presumably the brain "notices" that different groups of
cells fire together, independent of their neighbours, and it can thus
sort out which ones are "red" cells, which are "green", and so on. (It's
worth pointing out that other animals have 4 or even 6 light pigments!
Just not mammals.)
The more you read about molecular biology, the more you realise how
complicated it is. A human being would have designed the system to be as
simple as possible, with everything happening in the most direct way,
with the fewest number of steps. But evolution doesn't work like that.
Everything happens by long and convoluted pathways that were happened
upon by fluke and then preserved by natural selection.
For example, people used to think that each gene creates a protein. But
then it turned out that some active proteins actually contain several
amino acid chains, or that other proteins have to modify them to make
them functional. But OK, whatever.
Actually, the situation is even more complicated. You see, genes have
"introns", bits of genetic data which don't make proteins. So rather
than copying DNA to RNA and then using the RNA to string the amino acids
together, there's also a middle step where all the introns get chopped
out, leaving only the exons (the bits that code for amino acids).
But it's more complicated than that, because some genes can be spliced
more than one way, yielding more than one protein. Sometimes by gluing
different markers to different parts of the RNA, you can select which
splicing you want (and hence, which protein).
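Here's a toy sketch of splicing, including one alternative splicing. The
"gene" is completely made up (exons in upper case, introns in lower
case), and choosing which exon to skip by index is my own
simplification of whatever the markers on the RNA really do.

```python
# A hypothetical pre-spliced RNA, broken into labelled segments.
GENE = [
    ("exon", "AUGGCC"),
    ("intron", "guaaguuuucag"),
    ("exon", "UUUAAA"),
    ("intron", "guccccag"),
    ("exon", "GGGUAA"),
]

def splice(gene, skip_exons=()):
    """Drop every intron, and optionally skip some exons (numbered from 0)."""
    exons = [seq for kind, seq in gene if kind == "exon"]
    return "".join(seq for n, seq in enumerate(exons) if n not in skip_exons)

print(splice(GENE))                  # all three exons joined together
print(splice(GENE, skip_exons={1}))  # same gene, middle exon left out
```

Same gene, two different messages, and hence two different proteins.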
On top of all that, RNA isn't just a data storage molecule. It can
catalyse things too (they call it a ribozyme). In fact, there is a
hypothesis ("RNA world") that the first life on Earth used RNA both for
data storage and chemical synthesis, and that protein synthesis and DNA
storage evolved later.
Damnit, the more you look, the more complicated the picture becomes! But
it's fascinating too...
Apparently over 80% of human DNA doesn't describe protein structures. So
what the hell does it do? Well, it turns out some of it is old, broken
copies of genes. (I mentioned there's half a dozen or so broken copies of
the globin that makes up haemoglobin, for example.) Some of it is the
markers that gene regulation proteins stick to. In other words, the
non-coding DNA regions can contribute to gene regulation. Indeed, if DNA
is the computer program, the various stuff currently stuck to it could
be considered the "memory" of the computer. I rather suspect that if you
took the entire human genome and just synthesized some DNA with that
sequence, it probably wouldn't work. It needs to be initialised with the
correct inhibitors and promoters to start the process off.
Speaking of which, here's a thing: Every single living cell in the human
body (with a few exceptions) has the exact same genome. Yet these cells
all behave very differently, having totally different sizes, shapes and
chemistries. How does that happen?
Well, I already sketched how a cell can respond to its environment to
change its chemistry. The main question is how cells in a developing
embryo "know" where they are. How does one cell "know" that it's at the
head end, and another "know" that it's at the tail end?
The answer is a set of chemical concentration gradients, present in the
cells surrounding the unfertilised egg. These chemical gradients "tell"
the dividing ball of cells which way is up or down (or rather, anterior
and posterior). And here's the astonishing fact: virtually every
animal on Earth seems to use almost the *exact* same chemicals
for this purpose. Take a look:
http://www.wolframalpha.com/input/?i=HOXA2
There's a whole lot of similarity there. Apparently you can take the
chemical signals from a mouse eye, inject them into a fruit fly, and it
grows an eye at the point where you injected it. A fly eye, mind you,
not a mouse eye. (The chemical doesn't say /how/ to make an eye; it just
labels the place where the eye should go.)
Another rather surprising fact is that apoptosis is quite important to
the structure of an organism. Apoptosis is when living cells
deliberately kill themselves. Doesn't sound very useful, does it? Until
you realise that your hands start out as flat sheets of tissue, and the
only reason that you have separate fingers (not to mention separate
bones) is because the cells between them deliberately committed suicide.
Mind you, apoptosis is an intricate and carefully controlled process.
The cell doesn't just kill itself. It carefully shrinks down to minimal
size, sends signals for workers to come and clear it away, and then
switches itself off. In contrast, if a cell is killed by hostile
influences, it usually ends up spilling its contents all over the place.
In apoptosis, the dead cell is neatly tidied away in an orderly fashion.
I could go on about this all day. Suffice it to say that it's very
interesting stuff, but not very easy to find comprehensible material
about. But take a look at this paper I found yesterday:
http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.0030387
That is some hard-core statistical analysis, right there. You can start
to see how people can spend an entire career just trying to figure out
one question about how a molecule evolved...
Also interesting is this chart of the evolutionary history of myosin:
http://upload.wikimedia.org/wikipedia/commons/0/0e/MyosinUnrootedTree.jpg