POV-Ray: Newsgroups: povray.off-topic: Molecular biology: Molecular biology

From: Invisible
Date: 7 Jan 2011 09:28:04
Message: <4d2722f4$1@news.povray.org>
The more I read about molecular biology, the more interesting it 
becomes. Ironically, the thing that got me interested in this was 
Darwin's Black Box, by Michael Behe, no less. (!)



As everybody knows, DNA is the blueprint that describes everything about 
an organism. What colour are your eyes? How long is your left femur? How 
fast do your fingernails grow? Each organism has a different genome - a 
different set of DNA blueprints. If it were somehow possible to take a 
mouse and replace the DNA in every single cell of its body with the DNA 
of a dragonfly, the mouse would immediately transform into a perfect 
little dragonfly.

...except that almost all of this is completely wrong.

The DNA does /not/ say "grow this femur until it is 47.2cm long". How 
long your femur /actually/ ends up being depends on things like 
nutrition and so forth as well. But more to the point, the DNA doesn't 
contain a diagram that says "this is what the result should look like"; 
it contains something more like assembly instructions.

(As for turning a mouse into a dragonfly... yeah, good luck with that. 
Their biochemistry is wildly different, and there's no easy way to 
smoothly transition from one to the other. About the best you could hope 
for is to turn a mouse *embryo* into a dragonfly embryo - and even that 
would be a rather difficult feat.)



So how does this stuff *actually* work then?

Well, there is DNA - the famous double-helix molecule - and RNA, which 
is like one half of the double-helix (but it's not quite that simple). 
Either way, a strand of this stuff contains various nucleotides - the 
famous A, C, G and T. They stand for adenine, cytosine, guanine and 
thymine. (But that's only in DNA; in RNA, thymine is replaced by uracil.)

Each DNA strand is the complement of the other (A always pairs with T, 
and C with G), and that's why they zip together. It also means that if 
one strand is damaged, it can be 
repaired by reading the other strand. So it's a kind of error-correcting 
code.
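
Here's a minimal Python sketch of that error-correcting idea. The function names and the '?' marker for a damaged letter are my own inventions for illustration, and it ignores the fact that the two real strands run in opposite directions, so treat it as a cartoon rather than how repair enzymes actually work.

```python
# Base pairing: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the partner strand, letter by letter."""
    return "".join(PAIR[base] for base in strand)

def repair(damaged, partner):
    """Fill in letters marked '?' (a made-up damage marker) using the partner strand."""
    return "".join(PAIR[p] if d == "?" else d for d, p in zip(damaged, partner))

original = "ATGCCGTA"
partner = complement(original)
print(partner)                      # TACGGCAT
print(repair("ATG??GTA", partner))  # ATGCCGTA
```

The point is only that each strand carries enough information to rebuild the other.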

So a strand of DNA (or RNA) encodes a long stream of codes, with 4 
possible characters. So what does all this information "do", then?

Well, read a little further, and you'll discover that the code is 
grouped into 3-letter chunks called "codons". Each codon uniquely 
selects one of the 20 standard amino acids. Read off the codons, glue 
the indicated amino acids together in that order, and you've got 
yourself a protein.
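
As a rough sketch of "read off the codons, glue the amino acids together", here's some Python. It uses the standard genetic code written in RNA letters (U instead of T) and one-letter amino acid abbreviations; the function name is just something I made up, and it also halts at the "stop" codons that the real code contains.

```python
from itertools import product

# The standard genetic code in its usual compact form: codons are listed in
# UCAG order, and '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(rna):
    """Read codons three letters at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)

print(translate("AUGGGAUUUUAA"))  # MGF (methionine, glycine, phenylalanine)
```

A real ribosome does a great deal more than this, but the table lookup is the heart of it.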

Proteins catalyse chemical reactions. Translation: proteins make 
chemical reactions happen that would never ever happen by themselves. 
(Or rather, they would, but excruciatingly slowly. So slowly as to be 
basically non-existent.) Just about every single chemical that living 
cells make involves one or more proteins synthesizing it.

Proteins that catalyse stuff are called enzymes. There are proteins that 
do other stuff too. To quote Behe (!), "proteins are the ropes and 
pulleys, motors and scaffolding of the cellular world".

[OK, I lied. I don't have his lie-riddled book in front of me right now. 
But you get the gist of what he said - a mental image that you won't get 
from any more technical publication, I might add.]

Some parts of cells are made of protein. Other parts are made of 
chemicals synthesized by enzymes ( = proteins), usually in long and very 
complicated chains of chemical reactions called metabolic pathways. 
Proteins are also used to send signals from one part of a cell to 
another, or indeed between neighbouring cells.

So, the DNA contains the codons which specify the amino acids which make 
the proteins. Each "gene" is the sequence for one particular protein. 
Make the proteins, and they either synthesize stuff, or lock together 
(by shape and electrical charge) to make cellular structures. And that's 
how cells work. Right?



Well, actually it's not quite so simple.

First of all, 4^3 = 64 possible codons, but there are in fact only 20 
standard amino acids. So actually lots of codons select the exact same amino 
acid. As it just so happens, the *last* letter in the codon tends to be 
the least important. (E.g., *every* codon that begins with GG selects 
Glycine, regardless of what the final character is.) There's also a 
couple of codons that don't select an amino acid, they mark the end of 
the sequence - a kind of "stop marker". (There is also a "start marker" 
- AUG - but it also codes for methionine.)
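
You can check that redundancy for yourself with a few lines of Python. This is only a sketch using the standard code table (RNA letters, one-letter amino acid abbreviations, '*' for stop); the rare extras linked below aren't in it.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons in UCAG order, '*' = stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

stops = sorted(c for c, a in CODON_TABLE.items() if a == "*")
per_amino = Counter(a for a in CODON_TABLE.values() if a != "*")

print(len(CODON_TABLE))                        # 64 codons, i.e. 4^3
print(stops)                                   # ['UAA', 'UAG', 'UGA']
print(len(per_amino))                          # 20 distinct amino acids
print({CODON_TABLE["GG" + b] for b in BASES})  # {'G'}: GG-anything is glycine
print(CODON_TABLE["AUG"])                      # M: the start codon is methionine
```

So the standard table has three stop codons, and 61 codons shared out among 20 amino acids.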

The fact that every single living organism on the face of the Earth uses 
almost /exactly/ the same codes for the same amino acids is a pretty 
good demonstration that all life is related. On the other hand,

http://en.wikipedia.org/wiki/Pyrrolysine

So the code does change slightly in certain obscure organisms. And then 
there's things like

http://en.wikipedia.org/wiki/Selenocysteine

So it's not *just* about codons. Put that in your pipe and smoke it!

Next, proteins work not so much because of the chemical units (amino 
acids) they contain, but because those units cause the chain to tangle 
itself in a very specific way - the so-called "protein folding". Look up 
any protein on the Internet and you will find "ribbon diagrams" looking 
like tangled string, indicating the structure into which the proteins fold.

Some amino acids have weak electrical charges, some repel water, some 
attract water, and so on and so forth, so that by sequencing the amino 
acids together, they more or less automatically fold up into the right 
shape - and the *shape* is what does the business.

Then we learn that actually, some proteins require special "chaperone" 
proteins to help them fold up correctly. Next, some proteins are 
actually multiple separate molecules fused together. For example, 
haemoglobin is actually 4 "separate" globin proteins, each holding its 
own haem molecule. (Haem isn't a protein at all, it's some other 
random organic molecule, with an iron atom in the middle. It's that iron 
atom that does the business.)

Quite apart from many proteins being separate amino acid chains that 
need to be joined together, it's very common for proteins to undergo 
"post-translational modification". In other words, after the amino acids 
have been strung together, other enzymes come along and modify the 
result, possibly by attaching new non-protein bits to it. Alternatively, 
some proteins are manufactured in an inactivated form, and an enzyme is 
required to activate them. (Usually when you need a lot of active protein 
very quickly. You can stockpile the inactive form at your leisure, and 
then release the activation enzyme when required.)

On top of all that, there are mechanisms for recycling certain proteins, 
and even for detecting proteins which have mis-folded, and either 
refolding them or recycling them before they do damage.



So the DNA tells you how to build proteins, one way or another. But what 
tells a cell when to synthesize these things? If you look at a 
single-celled organism, you find that most of them are capable of 
digesting several different types of food. And these require different 
sets of enzymes. And the cell only synthesizes the enzymes it needs for 
the actual type of food it is currently feeding on. So... how?

Basically, different genes can be "turned on and off". (So-called "gene 
regulation".) There are several ways that you can do this. One is to 
glue proteins to the actual DNA strands, preventing the transcription 
enzymes from locking on to that particular gene and thus synthesizing 
the corresponding protein.

A typical sequence might look something like this:
+ There is an inhibitor protein glued to a particular gene.
+ There's an inactivated protein floating around the cell.
+ When something in the cell changes - say, the oxygen concentration 
goes sufficiently low - the inactivated protein changes shape and hence 
activates.
+ The activated protein strips the inhibitor off of the DNA strand.
+ The protein the gene codes for can now be synthesized - presumably 
taking some appropriate action either to conserve remaining oxygen, or 
else switch on the anaerobic metabolic pathways.
+ The inhibitor protein continues to be synthesized, but the activated 
oxygen-sensor protein keeps chopping it up.
+ When oxygen concentrations rise again, the oxygen-sensor protein again 
inactivates.
+ Since the inhibitor protein is no longer getting chopped up, it 
reattaches to the DNA strand, disabling the gene.
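
The sequence above boils down to a little chain of on/off switches, which can be sketched in Python. Everything here (the function name, the 0.3 threshold, the oxygen readings) is invented for illustration; it's a toy model of the hypothetical example, not a real pathway.

```python
def gene_states(oxygen_levels, threshold=0.3):
    """Return, for each oxygen reading, whether the gene is switched on."""
    states = []
    for oxygen in oxygen_levels:
        sensor_active = oxygen < threshold   # sensor protein activates when oxygen is low
        inhibitor_bound = not sensor_active  # an active sensor strips the inhibitor off the DNA
        states.append(not inhibitor_bound)   # the gene can be read only when unblocked
    return states

print(gene_states([0.9, 0.5, 0.2, 0.1, 0.6]))
# [False, False, True, True, False]
```

The gene switches on while oxygen is low and off again once it recovers, with no step ever "deciding" anything.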

Of course, in a *real* cell, it wouldn't be nearly that simple. Spend 
more than 20 minutes reading about molecular biology and you rapidly 
discover that almost every imaginable cellular process happens via the 
most complex, indirect route imaginable.



Let's look at that for a moment. Intelligent Design asserts that living 
organisms are "too perfect" to have arisen by chance. (Of course, 
Darwin's theory of evolution doesn't claim that it _was_ chance!) But 
evolution says that life isn't about "perfection", it's about survival. 
If it works, keep it. If it doesn't work, throw it. In particular, 
things don't have to be "perfect". They just have to be "better than 
anybody else".

For example, apparently in New Guinea there are kangaroos that live /in 
trees/. This despite the self-evident fact that kangaroos are obviously 
"designed" to live on the ground. "A kangaroo in a tree sounds pretty 
weird", you might say, "but I bet /these/ kangaroos are supremely well 
adapted to it".

Um, not really, no. I watched some video of one, and after it spent 5 
minutes hesitantly staggering from one branch to the one next to it, you 
can't help feeling like you want to scream at it "what the hell are you 
doing?! You obviously can't climb to save your life, so stop pratting 
around in that tree and get down here where you belong!" But, 
apparently, they never do. They live all their lives in trees. Even 
though they can barely climb and it takes them hours to get anywhere. 
Intelligent Design? I think not. But apparently, on New Guinea, there's 
nothing that can climb trees any better, so they survive.

Come to think of it, whales live in the sea... but... they breathe air? 
WTF? In fact, I heard the other day that some species need to eat in 
order to take in water. If they don't feed, they die of dehydration. 
Even though THEY LIVE *IN* WATER! Again, WTF?

No sane Designer would have designed it this way. But random chance 
opportunistically finding new uses for old organs? Sure. You see it all 
over the place. For example, reptiles have multiple jaw bones, while 
mammals have only one bone in each jaw. The extra bones have become the 
tiny bones of the middle ear. One organ has turned into something 
totally unrelated.

But when you look at molecular biology, you find utter craziness. This 
kind of re-purposing has been happening for billions of years, and 
unlike at the macroscopic level, you can actually *see* all of the 
complexity at the molecular level.

Almost every single molecule has a dozen different purposes, all 
/utterly/ unrelated. RNA is for storing data. Except that sometimes it's 
also a catalyst. And sometimes it's structural. Glycine is an amino 
acid, for building proteins. But it's *also* a neurotransmitter. 
Adenosine is a component of DNA and RNA - but *also* a neurotransmitter. 
Adrenaline is a hormone, but *also* a neurotransmitter. Haemoglobin 
transports oxygen, but it also performs reduction reactions as part of 
metabolic pathways. Get the idea?

On top of that, almost every chemical process you can name goes through 
four-dozen different stages. For example:

http://en.wikipedia.org/wiki/Citric_acid_cycle

Look at that huge diagram on the right. Note the part where it says 
"overview". A similarly intricate thing happens with vision:

http://en.wikipedia.org/wiki/Visual_phototransduction

In short, when a photon hits a protein, it changes shape, which switches 
on another protein, which releases a chemical that switches something 
else on, and so on and so forth, until eventually one of the ion 
channels in the cell membrane closes, stopping electrically charged ions 
flowing into the cell, and changing its electrical potential, 
triggering a nerve impulse. (Nerve impulses are similarly complex 
cascades, by the way.) And then there's a whole *other* cascade to 
recharge the proteins ready to detect the next photon.



At the macroscopic level, you can watch organisms evolve. You can 
imagine a leg gradually becoming longer, or fur gradually changing 
colour. But at the molecular level, you can see actual genes change.

The processes that copy DNA are imperfect. Most of the time, copies are 
almost identical. But occasionally, tiny differences arise. There are 
several kinds. The most common one is for a single letter to get copied 
wrong - a so-called "point mutation". Such a mutation might not actually 
do anything; recall that several different codons produce the same amino 
acid. And even if the amino acid is different, it might not affect the 
functioning of the resulting protein.
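
Here's a small Python sketch that sorts a single-letter change into the usual categories. It uses the standard code table in RNA letters; the function name and the "stop-lost" label are my own shorthand, not anything official.

```python
from itertools import product

# Standard genetic code, codons in UCAG order, '*' = stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def classify(codon, position, new_base):
    """Classify the effect of changing one letter (position 0, 1 or 2) of a codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "silent"     # same amino acid (or still a stop)
    if after == "*":
        return "nonsense"   # the protein gets cut short here
    if before == "*":
        return "stop-lost"  # a stop marker turned into an amino acid
    return "missense"       # a different amino acid

print(classify("GGA", 2, "C"))  # silent   (GGC is still glycine)
print(classify("GGA", 1, "A"))  # missense (GAA is glutamate)
print(classify("GGA", 0, "U"))  # nonsense (UGA is a stop codon)
```

Whether a missense change actually matters depends on the protein, which the code can't tell you.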

Or it might change the protein's properties. And when that happens, 
natural selection comes into play.

Some changes make no difference at all. Some changes make only a small 
difference. And some changes totally disable some finely tuned mechanism.

For example, if you happen to be born with a defective gene for the lung 
surfactant protein... then you die within minutes of the umbilical cord 
being cut. Offspring? No, not really.

If you happen to be born with a defective FOXP2 gene... actually, if 
FOXP2 is defective, you don't get born. Sperm meets egg, the cell 
divides a few dozen times, and then all the cells keel over and die.

In short, if you have a defect in some gene that codes for an utterly 
vital protein, you probably aren't going to be alive very long. On the 
other hand, suppose you happen to be born with a defective gene for, 
say, L-gulonolactone oxidase, then you can't synthesize vitamin C.

...actually, you know what? The entire human race has this defect. As do 
some but not all other primate species. It turns out that the food we eat 
already contains so much vitamin C that being unable to synthesize it 
ourselves isn't fatal, nor even particularly detrimental. So humans 
(and relatives) go through the entire synthesis pathway for vitamin C, 
but the enzyme at the end is broken. It does nothing. It just sits there.

Fortunately, existing regulatory pathways ensure that not very much of 
the precursor chemicals is synthesized. (Given that it isn't being used 
up by being converted to vitamin C, it tends to accumulate quickly, 
which down-regulates its synthesis like in any well-regulated metabolic 
pathway.) Actually, a number of independent species have also got broken 
genes for this enzyme, but broken in different ways. That's how we know 
the breakage happened independently.



And that leads me to the really interesting part, the part I got from 
The Ancestor's Tale, by Richard Dawkins. Mutations happen at a roughly 
constant rate. Some mutations are utterly fatal, others are completely 
harmless, and there are some in between. A tiny few are even beneficial.

Mutations that are fatal are vigorously eliminated by natural selection. 
And indeed, you can find genes that have barely changed for billions of 
years. These are the so-called "highly conserved sequences".

On the other hand, mutations that have no effect on fitness (e.g., point 
mutations that don't change which amino acid a codon codes for) just 
silently collect in the background. And by comparing, say, the actin 
gene of a human to the actin gene of a fruit fly, you can see which bits 
are the same, and which bits are different. Compare a few dozen species 
together and you can figure out who's related to who, and when each 
mutation took place.

Of course, it's not impossible for an A to mutate into a T, spread to a 
dozen species, and then one of these species to mutate from a T back to 
an A, making it look like it's less closely related to everything else. 
One single point mutation isn't much evidence. But look at large blocks 
of mutations and you can usually tell, in a statistical sense, what the 
likely relationships are. (Unsurprisingly, this is way harder than it 
sounds, and indeed many scientists devote their entire careers to it.)
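
The crudest possible version of that comparison is just counting matching letters. Here's a Python sketch; the three ten-letter "genes" are made up purely for illustration (real work uses proper alignment and proper statistics), and the function names are mine.

```python
from itertools import combinations

def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (no alignment is done here)")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def closest_pair(sequences):
    """Return the two names whose sequences are most similar."""
    return max(combinations(sorted(sequences), 2),
               key=lambda pair: identity(sequences[pair[0]], sequences[pair[1]]))

# Hypothetical toy data, NOT real gene sequences.
toy = {
    "human": "ACGTACGTAC",
    "chimp": "ACGTACGTAT",
    "mouse": "ACGAACGTTT",
}

print(identity(toy["human"], toy["chimp"]))  # 0.9
print(identity(toy["human"], toy["mouse"]))  # 0.7
print(closest_pair(toy))                     # ('chimp', 'human')
```

Repeatedly grouping the closest pairs is roughly how you'd start sketching a family tree, although a single back-mutation is enough to fool something this naive.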

In case you want to try it:

http://www.wolframalpha.com/input/?i=ACTG2

That's a human actin gene. If you go to the stream of letters in the 
middle and repeatedly jab the "more" button, it'll actually show you the 
entire frigging DNA sequence for this gene, taken from the Human Genome 
Project. (But this particular gene is quite large - 26,690 letters.) It 
also shows you where this gene is, what else is nearby, what variants of 
the gene exist in the human population, and so forth.

Scroll down to the very bottom and there's a graph showing you the 
equivalent gene in other organisms, and how similar they are to the 
human gene.

P. troglodytes = chimpanzee
M. musculus = mouse
R. norvegicus = rat
B. taurus = cow
C. familiaris = dog
G. gallus = red junglefowl (i.e., a bird)
D. rerio = zebrafish
D. melanogaster = fruit fly
C. elegans = a nematode worm
O. sativa = rice plant
A. thaliana = mouse-ear cress

So here we have mammals, birds, invertebrates and even plants, and 
apparently the gene is nearly identical in all of them. You would think 
that cress would have 0% of the same DNA as humans, but apparently not 
so. Let's try another gene:

http://www.wolframalpha.com/input/?i=POMC

This is the gene for pro-opiomelanocortin, the precursor of the 
adrenocorticotropic hormone (ACTH). As you can 
see, it shows up only in birds and mammals (but that might just be due 
to incomplete genome sequencing of other animals, or because we haven't 
spotted it in other genomes yet - or because Wolfram's data is 
incomplete :-P ). And where it does appear, the similarity values are 
way, way lower. Clearly this gene evolved more recently.



It's not just point mutations though. Sometimes entire chunks of DNA 
accidentally get skipped, or somehow get duplicated. The duplicates 
don't necessarily end up anywhere near the original. For that matter, 
sometimes chunks of DNA move from place to place without actually being 
deleted or duplicated.

These events are much, much rarer than simple point mutations. And when 
a duplication happens, you can usually tell which mutations happened 
before the duplication (because they appear in both copies) and which 
ones happened afterwards (because they appear in only one copy). This 
provides lots of information for figuring out inheritance trees. For 
example, humans have 4 slightly different haemoglobin genes, plus half a 
dozen broken versions of the gene. By looking at lots of genomes, you 
can figure out when these duplications happened, and who's related to who.
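
The before/after logic is really just set arithmetic. A minimal Python sketch, with made-up mutation labels (position plus new letter) and a made-up function name; it assumes no mutation ever reverts or arises twice, which real data won't respect.

```python
def date_mutations(copy_a, copy_b):
    """Split mutations into those predating and those postdating a duplication.

    Each argument is the set of mutation labels seen in one copy of the gene.
    """
    before = copy_a & copy_b  # in both copies: inherited from the original gene
    after = copy_a ^ copy_b   # in only one copy: arose after the duplication
    return before, after

# Hypothetical labels, not real data.
copy_a = {"12T", "40G", "87A"}
copy_b = {"12T", "40G", "133C"}

before, after = date_mutations(copy_a, copy_b)
print(sorted(before))  # ['12T', '40G']
print(sorted(after))   # ['133C', '87A']
```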

Here's an interesting case: Humans have 3-colour vision. How did that 
happen? Well, it turns out that mutations in the light-receptive 
pigments of the eye can change their absorption spectra. Usually there's 
no advantage or disadvantage to this. But then, what if a gene 
duplication happens? Then you might end up with two identical genes, 
making the same pigment. Now any point mutations on either gene would 
tend to make the two pigments respond to light differently - yielding 
2-colour vision.

Apparently almost all mammals have 2-colour vision. But humans and a 
few other primates have 3-colour vision. The extra pigment is clearly a 
slightly modified version of one of the existing pigments. Somehow it 
got duplicated, and then the two copies mutated away from each other. 
Natural selection enhanced this process, since superior colour 
perception is presumably useful for something.

It's the "red" and "green" pigments that are duplicates; the "blue" 
pigment is a lot more different. I should probably also point out that 
the colours we call "red", "green" and "blue" are not very equally 
spaced out; red and green are very similar wavelengths, while blue is 
quite some way away by comparison. Also, none of the three pigments 
respond to just one colour; actually each has an absorption spectrum 
that's quite complicated. The human brain discerns colour by *comparing* 
the inputs from each pigment.
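
To see why *comparing* matters, here's a deliberately crude Python model. It assumes each pigment has a simple bell-shaped response; the peak wavelengths (roughly 420, 530 and 560 nm) are approximate, and the bell shape and the 40 nm width are pure assumptions on my part, since the real spectra are messier.

```python
import math

# Approximate peak sensitivities in nanometres; the curve shape is an assumption.
PEAKS = {"blue": 420.0, "green": 530.0, "red": 560.0}
WIDTH = 40.0  # made-up width of each bell curve

def responses(wavelength):
    """Relative response of each pigment to light of a single wavelength."""
    return {name: math.exp(-((wavelength - peak) / WIDTH) ** 2 / 2)
            for name, peak in PEAKS.items()}

for wavelength in (520, 600):
    r = responses(wavelength)
    print(wavelength, {name: round(value, 3) for name, value in r.items()})
```

In this toy model the "red" pigment responds identically to 520 nm and 600 nm light, so on its own it can't tell them apart. The "green" pigment responds strongly to one and weakly to the other, and it's the difference between the two signals that pins down the colour.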

Which brings us to another question. A mutation gave humans 3-colour 
vision, but how did the brain learn to interpret this new data? For that 
matter, how does the brain "know" which nerve signals come from "red" 
cells, and which ones come from "green" cells?

Scientists did an experiment where they took mice (which, like most 
mammals, have 2-colour vision) and inserted a third pigment gene into 
their genome. When the little baby mice grew up, the scientists did 
tests on them, and discovered that they actually /have/ superior colour 
perception. Presumably the brain "notices" that different groups of 
cells fire together, independent of their neighbours, and it can thus 
sort out which ones are "red" cells, which are "green", and so on. (It's 
worth pointing out that other animals have 4 or even 6 light pigments! 
Just not mammals.)



The more you read about molecular biology, the more you realise how 
complicated it is. A human being would have designed the system to be as 
simple as possible, with everything happening in the most direct way, 
in the fewest possible steps. But evolution doesn't work like that. 
Everything happens by long and convoluted pathways that were happened 
upon by fluke and then preserved by natural selection.

For example, people used to think that each gene creates a protein. But 
then it turned out that some active proteins actually contain several 
amino acid chains, or that other proteins have to modify it to make it 
functional. But OK, whatever.

Actually, the situation is even more complicated. You see, genes have 
"introns", bits of genetic data which don't make proteins. So rather 
than copying DNA to RNA and then using the RNA to string the amino acids 
together, there's also a middle step where all the introns get chopped 
out, leaving only the exons (the bits that code for amino acids).

But it's more complicated than that, because some genes can be spliced 
more than one way, yielding more than one protein. Sometimes by gluing 
different markers to different parts of the RNA, you can select which 
splicing you want (and hence, which protein).
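
Splicing is easy to mimic in Python. In this sketch the sequence, the exon coordinates and the function name are all invented; I've written the introns in lower case just so you can see them, which real RNA obviously doesn't do.

```python
def splice(pre_mrna, exons, keep=None):
    """Join the chosen exons together, dropping everything in between.

    exons is a list of (start, end) slices; keep is an optional list of
    exon indexes to retain, for alternative splicing.
    """
    chosen = exons if keep is None else [exons[i] for i in keep]
    return "".join(pre_mrna[start:end] for start, end in chosen)

# Hypothetical transcript: three exons (upper case) and two introns (lower case).
pre_mrna = "AUGGCC" + "guaag" + "UUUGGA" + "guuag" + "CCCUAA"
exons = [(0, 6), (11, 17), (22, 28)]

print(splice(pre_mrna, exons))               # AUGGCCUUUGGACCCUAA
print(splice(pre_mrna, exons, keep=[0, 2]))  # AUGGCCCCCUAA (middle exon skipped)
```

Same gene, two different messages, and hence two different proteins.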

On top of all that, RNA isn't just a data storage molecule. It can 
catalyse things too (they call it a ribozyme). In fact, there is a 
hypothesis ("RNA world") that the first life on Earth used RNA both for 
data storage and chemical synthesis, and that protein synthesis and DNA 
storage evolved later.

Damnit, the more you look, the more complicated the picture becomes! But 
it's fascinating too...

Apparently over 80% of human DNA doesn't describe protein structures. So 
what the hell does it do? Well, it turns out some of it is old, broken 
copies of genes. (I mentioned there's half a dozen or so broken copies of the 
globin that makes up haemoglobin, for example.) Some of it is the 
markers that gene regulation proteins stick to. In other words, the 
non-coding DNA regions can contribute to gene regulation. Indeed, if DNA 
is the computer program, the various stuff currently stuck to it could 
be considered the "memory" of the computer. I rather suspect that if you 
took the entire human genome and just synthesized some DNA with that 
sequence, it probably wouldn't work. It needs to be initialised with the 
correct inhibitors and promoters to start the process off.



Speaking of which, here's a thing: Every single living cell in the human 
body (with a few exceptions) has the exact same genome. Yet these cells 
all behave very differently, having totally different sizes, shapes and 
chemistries. How does that happen?

Well, I already sketched how a cell can respond to its environment to 
change its chemistry. The main question is how cells in a developing 
embryo "know" where they are. How does one cell "know" that it's at the 
head end, and another "know" that it's at the tail end?

The answer is a set of chemical concentration gradients, present in the 
cells surrounding the unfertilised egg. These chemical gradients "tell" 
the dividing ball of cells which way is up or down (or rather, anterior 
and posterior). And here's the astonishing fact: virtually every 
animal on Earth seems to use the *exact* same chemicals 
for this purpose. Take a look:

http://www.wolframalpha.com/input/?i=HOXA2

There's a whole lot of similarity there. Apparently you can take the 
chemical signals from a mouse eye, inject them into a fruit fly, and it 
grows an eye at the point where you injected it. A fly eye, mind you, 
not a mouse eye. (The chemical doesn't say /how/ to make an eye; it just 
labels the place where the eye should go.)
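
The concentration gradient idea can be sketched as a toy in Python. I'm assuming a signal that decays exponentially with distance from the head end, and cells that pick a fate by comparing the local concentration against two thresholds; all the numbers and names are invented, and real embryos use several overlapping signals.

```python
import math

def concentration(position, source=1.0, decay_length=0.25):
    """Signal strength at a position between 0.0 (head end) and 1.0 (tail end)."""
    return source * math.exp(-position / decay_length)

def fate(position, head_threshold=0.5, tail_threshold=0.05):
    """A cell 'reads' the local concentration to work out where it is."""
    c = concentration(position)
    if c >= head_threshold:
        return "head"
    if c <= tail_threshold:
        return "tail"
    return "middle"

print([fate(x) for x in (0.0, 0.5, 1.0)])  # ['head', 'middle', 'tail']
```

Every cell runs the same "program"; only the input differs.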

Another rather surprising fact is that apoptosis is quite important to 
the structure of an organism. Apoptosis is when living cells 
deliberately kill themselves. Doesn't sound very useful, does it? Until 
you realise that your hands start out as flat sheets of tissue, and the 
only reason that you have separate fingers (not to mention separate 
bones) is because the cells between them deliberately committed suicide.

Mind you, apoptosis is an intricate and carefully controlled process. 
The cell doesn't just kill itself. It carefully shrinks down to minimal 
size, sends signals for workers to come and clear it away, and then 
switches itself off. In contrast, if a cell is killed by hostile 
influences, it usually ends up spilling its contents all over the place. 
In apoptosis, the dead cell is neatly tidied away in an orderly fashion.



I could go on about this all day. Suffice it to say that it's very 
interesting stuff, but it's not very easy to find comprehensible material 
about it. But take a look at this paper I found yesterday:

http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.0030387

That is some hard-core statistical analysis, right there. You can start 
to see how people can spend an entire career just trying to figure out 
one question about how a molecule evolved...

Also interesting is this chart of the evolutionary history of myosin:

http://upload.wikimedia.org/wikipedia/commons/0/0e/MyosinUnrootedTree.jpg