POV-Ray: Newsgroups: povray.off-topic: Inside Win32

From: Invisible
Subject: Inside Win32
Date: 13 Oct 2009 05:08:22
Message: <4ad44386$1@news.povray.org>

There now follows a huge braindump of my research into the inner 
mysteries of how Win32 actually works. The information is correct as 
best as I can tell, although it's rather incomplete in places...



MS-DOS had two formats for runnable programs. There were *.COM files, 
and *.EXE files.

As far as I can tell, a *.COM file is just a file containing raw machine 
code. You load the entire file into memory at a predefined offset (0x100 
within the program's segment, just past the 256-byte Program Segment 
Prefix), and jump to byte 0 of the file. And that's all there is to it.

The MS-DOS *.EXE format is more complicated. For starters, it begins 
with a magic number ("MZ" in ASCII, the initials of Mark Zbikowski, one 
of the MS-DOS developers). I don't know anything further about it.

Unix System V introduced a format known as Common Object File Format, or 
COFF. Windows NT (designed largely by people who had previously worked 
on VMS at DEC) uses a format called Portable Executable, or PE, which is 
heavily based on COFF.

Anybody old enough to remember may recall that it used to be possible to 
switch between MS-DOS and Windows at will. (Windows was originally a 
trivial graphical shell that runs under MS-DOS. Only later did it become 
a true OS.) Back in those days, you could quite possibly try to run 
Notepad.exe or something from MS-DOS. Doing so would cause a short error 
message saying "this program requires Windows to run" or similar.

Apparently every PE file is also a valid MS-DOS *.EXE file, and it 
contains a tiny stub program which simply prints out that error message 
and then exits. After that stub program comes the real PE headers.
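
A minimal sketch of how a tool might find the PE headers behind the stub. 
It relies on the documented layout (a 4-byte offset at 0x3C in the MS-DOS 
header points at a "PE\0\0" signature); the function name and the 
synthetic test image are made up for illustration.

```python
import struct

def pe_header_offset(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MS-DOS executable (no MZ magic)")
    # e_lfanew: 32-bit little-endian offset stored at 0x3C in the DOS header
    (offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[offset:offset + 4] != b"PE\0\0":
        raise ValueError("MZ file without a PE signature")
    return offset

# Synthetic image: 64-byte DOS header, 16 bytes standing in for the stub
# program, then the PE signature.
fake = bytearray(0x50)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x50)
fake += b"PE\0\0"

print(hex(pe_header_offset(bytes(fake))))  # prints 0x50
```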

As well as runnable programs, Windows has Dynamic Link Libraries, or 
DLLs. These apparently use the exact same PE format; there's a single 
bit in the header which is different for a program and a library. Apart 
from that, they're both identical. (Also DLLs aren't always *.DLL; 
there's also *.OCX, *.CPL, etc.)

I never realised this, but apparently PE files can contain machine code 
for many, many different CPU types. (IA32, AMD64, IA64, DEC Alpha, MIPS, 
PowerPC, ARM...)
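
As a rough sketch, both the CPU type and the program/library bit live in 
the 20-byte COFF file header that follows the "PE\0\0" signature. The 
constants below are the documented Machine and Characteristics values; 
the function name and the sample header are invented for the example.

```python
import struct

# Documented values of the Machine field (a subset)
MACHINES = {
    0x014C: "IA32",
    0x8664: "AMD64",
    0x0200: "IA64",
    0x0184: "DEC Alpha",
    0x01F0: "PowerPC",
    0x01C0: "ARM",
}
IMAGE_FILE_DLL = 0x2000  # the bit that distinguishes a DLL from a program

def describe(coff_header: bytes):
    """Return (cpu_name, is_dll) for a 20-byte COFF file header."""
    machine, _nsections, _stamp, _symtab, _nsyms, _optsize, characteristics = \
        struct.unpack("<HHIIIHH", coff_header[:20])
    return MACHINES.get(machine, hex(machine)), bool(characteristics & IMAGE_FILE_DLL)

# Made-up header: AMD64, 3 sections, DLL bit set
sample = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 240, 0x2022)
print(describe(sample))  # prints ('AMD64', True)
```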

I've come across vague references to code and data sections in programs, 
but it turns out that a PE file can actually have an arbitrary number of 
sections in it. Each section is loaded at its own page-aligned address, 
and can have a different combination of protection settings (read / 
write / execute / etc).
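
A small sketch of reading those protection settings from one 40-byte 
section header. The three flag values are the documented 
IMAGE_SCN_MEM_* bits; the helper name and the sample ".text" header are 
assumptions made for the example.

```python
import struct

# Documented memory-protection bits in a section's Characteristics field
FLAGS = {0x20000000: "execute", 0x40000000: "read", 0x80000000: "write"}

def section_protection(header: bytes):
    """Return (section_name, [permissions]) for one 40-byte section header."""
    fields = struct.unpack("<8sIIIIIIHHI", header[:40])
    name = fields[0].rstrip(b"\0").decode("ascii", "replace")
    characteristics = fields[-1]
    perms = [label for bit, label in FLAGS.items() if characteristics & bit]
    return name, perms

# Made-up header for a typical code section (readable and executable)
text = struct.pack("<8sIIIIIIHHI", b".text",
                   0x1000, 0x1000, 0x200, 0x400, 0, 0, 0, 0, 0x60000020)
print(section_protection(text))  # prints ('.text', ['execute', 'read'])
```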

Apparently the layout of the data on disk is very similar to the layout 
in memory. When a DLL or EXE is loaded into memory, it becomes known as 
a "module". And you can actually page-align the data in the file to make 
it load into memory faster - at the expense of making the file bigger. 
This is why EXE files often have huge chunks of zeros inside them, it seems.
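
The size cost of alignment is simple arithmetic. A sketch, assuming the 
common values of 512 bytes for file alignment and 4096 bytes for a memory 
page (the section size 0x1234 is just an example):

```python
def align_up(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment

raw = 0x1234  # bytes of real data in a hypothetical section

for alignment in (0x200, 0x1000):
    on_disk = align_up(raw, alignment)
    # The difference is padding, normally filled with zeros
    print(hex(alignment), hex(on_disk), on_disk - raw)
# prints:
# 0x200 0x1400 460
# 0x1000 0x2000 3532
```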

Anyway, it seems that apart from executable machine code, a PE file can 
contain several other kinds of section. (Indeed, it seems to be possible 
to write PE files that contain no runnable code at all!)

- Resource sections contain "resources". This seems to include icons, 
cursors, bitmaps, menu and GUI descriptions, text strings and so forth. 
Several of these resources can have multiple versions, and APIs exist to 
fetch the appropriate version automatically. (E.g., you can have 
versions of a text string in multiple languages, and fetch the one for 
the current system language. Or multiple sizes of the same icon and 
fetch the requested size. And so on.) Having a GUI description appears 
to allow you to construct a complex GUI with just one API call, instead 
of programmatically building the entire GUI manually.

- Debug symbols. (I.e., names corresponding to various addresses. Useful 
if you want to crawl through the code with a low-level debugger.) I'm 
not sure precisely how this part works - but it's not very relevant to 
me anyway.

- You can have "data" sections which are either read-only or read-write. 
Apparently this kind of thing is used for global variables.

I note in passing that since the PE header is at the front of the file 
and a Zip header is at the back of a file, it's possible to make a file 
which is simultaneously a valid Zip file and a valid PE file. That's 
apparently how self-extracting Zip files work. You can also have other 
random data off the end of the PE file, past the last section (and this 
won't automatically be loaded into memory).
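
This is easy to try out. A sketch using Python's zipfile module, which 
locates the Zip directory from the end of the file and so tolerates 
arbitrary data in front of it. The 64-byte "stub" here is only a 
placeholder starting with MZ, not a real extractor program.

```python
import io
import zipfile

# Build an ordinary Zip archive in memory
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("hello.txt", "hi there")

# Glue a fake executable in front of it
stub = b"MZ" + bytes(62)
combined = stub + archive.getvalue()

# The combined file still opens as a Zip archive
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names = zf.namelist()
    text = zf.read("hello.txt").decode()

print(combined[:2], names, text)  # prints b'MZ' ['hello.txt'] hi there
```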

More interesting, a PE file can import and export symbols. A DLL will 
usually export the symbols for externally-callable functions, and EXE 
files typically import various symbols.

It seems there are in fact two ways to call the functions in a DLL. One 
way is to call LoadLibrary() to load your DLL, and GetProcAddress() to 
look up the address of a given function (either by name or by ordinal, 
i.e. its index number).
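
A hedged sketch of the first way, driven from Python via ctypes rather 
than from C. LoadLibraryW and GetProcAddress are the real kernel32 
functions; the wrapper name find_export is made up, and on anything other 
than Windows it simply returns None.

```python
import ctypes
import sys

def find_export(dll_name: str, func_name: str):
    """Return the address of func_name inside dll_name, or None if unavailable."""
    if sys.platform != "win32":
        return None  # LoadLibrary/GetProcAddress only exist on Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.LoadLibraryW.argtypes = [ctypes.c_wchar_p]
    kernel32.LoadLibraryW.restype = ctypes.c_void_p
    kernel32.GetProcAddress.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    kernel32.GetProcAddress.restype = ctypes.c_void_p
    module = kernel32.LoadLibraryW(dll_name)  # maps the DLL into this process
    if not module:
        return None
    # Look the function up by name; returns None if it is not exported
    return kernel32.GetProcAddress(module, func_name.encode("ascii"))

address = find_export("kernel32.dll", "GetTickCount")
print(address)
```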

The *other* way, which I didn't know about, is to import the symbols 
from that DLL. This basically means putting some info into the PE 
headers saying which functions in which DLLs you're interested in. When 
the program is loaded, this table gets copied into memory, the requested 
DLLs are loaded, and the slots in the table are filled in with the 
addresses of the requested functions. The generated machine code can 
then access these functions by indexing the table.
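
A toy model of that table-filling step, with no real loader involved. The 
DLL names are real, but the addresses are invented and the "loaded DLLs" 
are just dictionaries standing in for modules already mapped into memory.

```python
# Hypothetical loaded modules: DLL name -> {exported symbol: address}
LOADED = {
    "kernel32.dll": {"ExitProcess": 0x7C81CAA2, "Sleep": 0x7C802442},
    "user32.dll": {"MessageBoxA": 0x7E45058A},
}

def resolve_imports(import_table):
    """Fill one slot per requested (dll, symbol) pair, in table order."""
    slots = []
    for dll, symbol in import_table:
        exports = LOADED[dll.lower()]   # a real loader would map the DLL here
        slots.append(exports[symbol])   # KeyError models a missing export
    return slots

# What the PE headers ask for...
table = [("KERNEL32.dll", "Sleep"), ("USER32.dll", "MessageBoxA")]
# ...and what the code later indexes into
slots = resolve_imports(table)
print([hex(a) for a in slots])  # prints ['0x7c802442', '0x7e45058a']
```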

And then there's the fact that a DLL, being an ordinary PE file, can 
*also* import from other DLLs, which can import other DLLs, and so 
forth. Apparently it's possible to "bind" a DLL - that is, to precompute 
and fill out the function table, based on the versions of the referenced 
DLLs installed and their preferred base addresses. And this apparently 
speeds up DLL loading. (But if the information is not current, it is 
automatically ignored.)

And then there's the whole song and dance about what happens if a DLL 
can't be loaded at its preferred base address. (Which would seem a 
freakishly unlikely occurrence, but still...)
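
In outline, the fix-up amounts to adding the difference between the two 
base addresses to every absolute address baked into the image. The sketch 
below is a simplified 32-bit model: the list of offsets stands in for the 
relocation data a real image carries, and all addresses are invented.

```python
import struct

def rebase(image: bytearray, fixup_offsets, preferred_base: int, actual_base: int):
    """Patch each 32-bit absolute address in image for the new base address."""
    delta = actual_base - preferred_base
    for off in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)
    return image

# An 8-byte "image" whose second word is an absolute address that the
# linker computed assuming a base of 0x10000000.
image = bytearray(8)
struct.pack_into("<I", image, 4, 0x10001234)

rebase(image, [4], preferred_base=0x10000000, actual_base=0x20000000)
(patched,) = struct.unpack_from("<I", image, 4)
print(hex(patched))  # prints 0x20001234
```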



OK, so I have a program that will edit resources, and a program which 
shows exported symbols. But reading the above descriptions, it becomes 
clear that these are just two optional sections in a PE file. Apparently 
compiled .NET programs contain a whole bunch of special sections that 
only appear in .NET programs, and contain compiled CIL code. I haven't 
read anything about COM, but it seems likely that all the COM-related 
data goes in yet another section which I don't yet have a way to inspect 
- which is why some DLLs appear to be "empty". They don't have the 
symbol export section, but they have other useful sections in them.

It also becomes apparent that there's more than one way for the compiler 
and the linker to do their job. (Apparently finished programs are PE 
files, but "object files" are COFF files - go figure!) I'm told the 
section *names* are actually irrelevant [except that the linker uses 
them to decide which sections to merge together], and the OS only cares 
about the section options. It's all quite complex in there...

Well, my head hurts, I've learned a whole crap-load of stuff, and I'm 
still nowhere nearer to my goal of being able to use COM. :-}

From: Fredrik Eriksson
Subject: Re: Inside Win32
Date: 13 Oct 2009 11:56:50
Message: <op.u1qw80n07bxctx@e6600>

On Tue, 13 Oct 2009 11:08:22 +0200, Invisible <voi### [at] devnull> wrote:
>
> Well, my head hurts, I've learned a whole crap-load of stuff, and I'm  
> still nowhere nearer to my goal of being able to use COM. :-}

Because none of that stuff is relevant to using COM. It is kind of like  
learning to drive a car by studying the chemical properties of petroleum  
products.

-- 
FE

From: Mike Raiford
Subject: Re: Inside Win32
Date: 13 Oct 2009 13:48:55
Message: <4ad4bd87$1@news.povray.org>

On 10/13/2009 4:08 AM, Invisible wrote:

> Well, my head hurts, I've learned a whole crap-load of stuff, and I'm
> still nowhere nearer to my goal of being able to use COM. :-}

It will take quite a while to learn to use and understand COM, 
especially from C++.
-- 
~Mike

From: Orchid XP v8
Subject: Re: Inside Win32
Date: 13 Oct 2009 14:37:39
Message: <4ad4c8f3$1@news.povray.org>

Mike Raiford wrote:

> It will take quite a while to learn to use and understand COM, 
> especially from C++.

How about from a high-level cross-platform language which doesn't even 
support COM? :-}

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

From: Mike Raiford
Subject: Re: Inside Win32
Date: 13 Oct 2009 16:41:23
Message: <4ad4e5f3$1@news.povray.org>

On 10/13/2009 1:37 PM, Orchid XP v8 wrote:

>
> How about from a high-level cross-platform language which doesn't even
> support COM? :-}
>

Good Luck!
-- 
~Mike

From: Orchid XP v8
Subject: Re: Inside Win32
Date: 13 Oct 2009 16:42:43
Message: <4ad4e643$1@news.povray.org>

> Good Luck!

Why thank you. I believe I'll be needing it... :-/

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

From: Mike Raiford
Subject: Re: Inside Win32
Date: 13 Oct 2009 16:45:10
Message: <4ad4e6d6$1@news.povray.org>

On 10/13/2009 3:42 PM, Orchid XP v8 wrote:
>> Good Luck!
>
> Why thank you. I believe I'll be needing it... :-/
>

Indeed! You will need it.

-- 
~Mike

From: Invisible
Subject: Re: Inside Win32
Date: 14 Oct 2009 04:24:39
Message: <4ad58ac7$1@news.povray.org>

>>> Good Luck!
>>
>> Why thank you. I believe I'll be needing it... :-/
> 
> Indeed! You will need it.

All we need now is for Warp to pop up and tell me that if only I used a 
*real* programming environment like VisualStudio C++, I would only need 
to click on a button and everything would instantly work right...

From: Invisible
Subject: Re: Inside Win32
Date: 14 Oct 2009 04:39:06
Message: <4ad58e2a$1@news.povray.org>

>> Well, my head hurts, I've learned a whole crap-load of stuff, and I'm 
>> still nowhere nearer to my goal of being able to use COM. :-}
> 
> Because none of that stuff is relevant to using COM. It is kind of like 
> learning to drive a car by studying the chemical properties of petroleum 
> products.

Maybe. But it's pretty interesting, all the same... ;-)

From: scott
Subject: Re: Inside Win32
Date: 14 Oct 2009 04:52:38
Message: <4ad59156@news.povray.org>

> All we need now is for Warp to pop up and tell me that if only I used a 
> *real* programming environment like VisualStudio C++, I would only need to 
> click on a button and everything would instantly work right...

Or Scott to pop up and tell you to just use a .net language and be done with 
COM altogether.

What do you think of this?

http://www.ffconsultancy.com/dotnet/fsharp/rule30/code/1/rule30.fs
