In a nutshell:
http://msdn.microsoft.com/en-gb/magazine/cc163603.aspx
Make of that what you will...
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Warp wrote:
> Invisible <voi### [at] dev null> wrote:
>> Their approach seems to be to "verify" each program before it runs,
>> checking that it doesn't do any "bad" things.
>
> That's impossible. It can be proven that it's an unsolvable problem,
> exactly for the same reason as the halting problem is unsolvable. There's
> no way for any program to check if a piece of code is executed and how.
>
> It's also impossible for it to know, for example, the addresses of all
> pointers by simply examining the program (for example the address of a
> pointer could be calculated from user input).
If you only allow safe-mode managed code, pointer arithmetic is not
possible. I don't see a big problem with validating managed code to ensure
it doesn't do anything "bad" for a fixed definition of "bad".
Manuel
Invisible wrote:
> I don't know about Singularity and its particular design goals, but the
> other day I was thinking: What would happen if you set out to design a
> new OS completely from scratch? What would that look like?
Dude? That's Singularity. :-)
Download the release and read the design notes, too.
http://codeplex.com/singularity
> What if we wanted to shake that up a little? What might we decide to do
> differently?
Read the Singularity papers. This is exactly the premise they started with.
> How about if, say, the program could somehow "tell" the OS what
> arguments it actually accepts?
Yep! And what if, when you installed the program, it said "Sorry, but
this program requires the ability to connect to the USB printer, and you
don't have the right USB driver installed"? Or "I can't let you install
telnet[1] until you have some sort of TCP/IP stack installed."
[1] Note that "install" means "make available for running." Of course
the uninstalled code can sit out there.
> Hey, let's go one better. The majority of CLI arguments are either
> on/off switches or filenames, right?
Only in a UNIX-like OS. The majority of arguments are instructions on
how to treat the other arguments, or references to something in the file
system namespace, there.
Once you can specify as arguments things beyond stuff represented as
strings, you get a whole nuther ball of wax. You're not stuck with
stdin, stdout, and stderr, for example. You can say "Hey, this program
needs an interrupt and two DMA channels to serve as a driver" for example.
> Once you start thinking this way, you start to see that actually, if we
> get the OS to interpret the arguments and pass *structured* data to the
> program [rather than just a blob of textual data], suddenly all sorts of
> interesting ideas become possible.
Read the bits about "compile-time reflection". You really don't even
need to "tell" the OS this info, if it's reflected in the types your
system supports.
> If nothing else, it means that CLI arguments now have a standardised
> format, enforced by the OS, which makes it easier to learn how to
> operate each new program. Maybe all the program does is somehow list a
> bunch of settings it requires? Maybe then you can specify those either
> by CLI arguments, or a per-user or per-machine set of defaults? Maybe
> the OS has a database of these default settings somewhere? All kinds of
> interesting ideas to throw around.
Yep. Singularity calls it the "application manifest". An application is
a first-class object, rather than being "a pile of files full of code".
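To make the manifest idea concrete, here is a minimal sketch in Python. It is not Singularity's actual manifest format: the names `Manifest`, `ArgSpec`, `missing_requirements` and `resolve_args` are invented for illustration, and the lookup order (command line, then per-user defaults, then per-machine defaults) is just one plausible reading of the suggestion quoted above.

```python
from dataclasses import dataclass


@dataclass
class ArgSpec:
    """One setting the application declares (hypothetical structure)."""
    name: str
    kind: type            # constructor used to convert raw values, e.g. int
    default: object = None


@dataclass
class Manifest:
    """What the application tells the OS about itself (hypothetical)."""
    app: str
    requires: frozenset = frozenset()   # services that must be installed
    args: tuple = ()                    # declared, typed settings


def missing_requirements(manifest, installed):
    """Return the requirements that are not installed (empty set = OK)."""
    return set(manifest.requires) - set(installed)


def resolve_args(manifest, cli, user_defaults=None, machine_defaults=None):
    """Fill each declared setting from the first source that provides it."""
    sources = (cli, user_defaults or {}, machine_defaults or {})
    resolved = {}
    for spec in manifest.args:
        for source in sources:
            if spec.name in source:
                raw = source[spec.name]
                break
        else:
            raw = spec.default   # no source supplied it
        resolved[spec.name] = None if raw is None else spec.kind(raw)
    return resolved
```

With a manifest like this, the installer can refuse telnet when `missing_requirements` is non-empty, and the program itself only ever sees already-typed values.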
> Maybe we can do something more interesting here? Maybe we can pass
> *structured* data around instead of just plain character streams?
Yes. Structured and typed, including a finite state machine to say when
it's OK to send and what you need to be ready to receive, checked at
compile time to ensure your code actually obeys the protocol, then
compiled down to native code and never checked again.
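A rough sketch of what such a protocol state machine looks like, written in Python. Note the big difference: Singularity checks conformance at compile time, whereas this toy checks each message at run time. The contract, the message names and the `Endpoint` class are all made up for illustration.

```python
class ContractViolation(Exception):
    """Raised when a message is not allowed in the current state."""


# Hypothetical contract: state -> {(direction, message): next state}
PRINT_CONTRACT = {
    "Ready": {("recv", "StartJob"): "Job"},
    "Job":   {("recv", "Page"): "Job", ("recv", "EndJob"): "Ack"},
    "Ack":   {("send", "Done"): "Ready"},
}


class Endpoint:
    """One end of a channel that tracks where it is in the protocol."""

    def __init__(self, contract, start):
        self.contract = contract
        self.state = start

    def step(self, direction, message):
        """Advance the state machine, or reject the message."""
        try:
            self.state = self.contract[self.state][(direction, message)]
        except KeyError:
            raise ContractViolation(
                f"{direction} {message} not allowed in state {self.state}"
            ) from None
        return self.state
```

A static checker would walk every path through the program and prove that `step` could never raise, at which point the run-time check can be dropped entirely.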
> Maybe we can do something better here too? Maybe we could have a small
> set of standard categories like "program bug", "resource exhaustion",
> "the network won't answer me", and provide a set of application-specific
> codes for the actual failures that a particular program can have?
Nah. You just answer back on the stream that goes to whoever invoked
you. :-) Why would only the parent want to know how you exited?
> OS how to invoke different levels of logging? Just some ideas.
SDN 14 Tracing.pdf
> I suppose you could go down the route of having files contain structured
> data - but again you're going to get people arguing over the best way of
> structuring things.
Not any more than saying "you'll have people arguing about the best way
to represent structures".
You're still thinking UNIXy. Get rid of the mindset that you have to
agree on data formats and embrace the mindset that you only have to
agree on APIs. You don't need to stick some Perl script in the middle of
a pipeline to transform your data. You present the data in a
semantically-meaningful way.
I.e., your directory isn't a file with 16-byte entries, the first two
bytes of which are an i-node number which, if non-zero, is followed by a
nul-terminated file name of up to 14 bytes.
Your directory, instead, is a set of function calls like "read first",
"read next", "provide details". You don't have to come up with some
on-disk format to define.
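As a sketch of "agree on the API, not the format", here is that directory interface in Python. The method names follow the three calls mentioned above; the cursor type and the `InMemoryDirectory` backing store are assumptions made purely for the example.

```python
from abc import ABC, abstractmethod


class Directory(ABC):
    """Callers agree on these calls, not on any on-disk layout."""

    @abstractmethod
    def read_first(self):
        """Return a cursor for the first entry, or None if empty."""

    @abstractmethod
    def read_next(self, cursor):
        """Return the cursor after `cursor`, or None at the end."""

    @abstractmethod
    def details(self, cursor):
        """Return a dict describing the entry at `cursor`."""


class InMemoryDirectory(Directory):
    """One possible implementation; the storage is invisible to callers."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self._names = sorted(self._entries)

    def read_first(self):
        return 0 if self._names else None

    def read_next(self, cursor):
        nxt = cursor + 1
        return nxt if nxt < len(self._names) else None

    def details(self, cursor):
        name = self._names[cursor]
        return {"name": name, **self._entries[name]}


def list_names(directory):
    """Works with any Directory, whatever it stores underneath."""
    names = []
    cursor = directory.read_first()
    while cursor is not None:
        names.append(directory.details(cursor)["name"])
        cursor = directory.read_next(cursor)
    return names
```

`list_names` never needs to know whether the entries live in 16-byte records, a database, or memory.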
> I've often thought about what would happen if, say, Smalltalk was the
> entire OS. Then the OS would "know about" the internal workings of each
> program to a large degree, and that opens up some rather interesting
> possibilities. Things like highly structured IPC and so forth. Trouble
> is, now you can only run stuff implemented in Smalltalk...
Yep. That's traditionally been the problem. Singularity does this, but
makes MSIL the bottom level for applications and such. So anything you
can compile into structured typed assembler language you can use. This
includes C#, F#, Iron Python, etc.
> In short, once you sit down and start to question the way OSes work
> today, you start to see that there are actually many things we could be
> doing differently - ranging from the conservative to the highly radical.
> (To me, really radical ideas are interesting to think about but probably
> wouldn't work too well in practice.)
It seems to be working well in practice. For example, one radical idea
(which I always thought would be a good idea) is to use safe languages
for everything. Singularity does this, and in so doing, can run
everything in Ring 0 and with no hardware memory protection. It actually
runs faster, because it takes less time to enforce array bound checks
(for example) than it does to go thru the memory mapping hardware on
every access to memory. Turning off memory protection more than makes up
for doing it in software. And then, doing things like scheduling
threads, adding space to the thread stack, allocating and freeing memory
blocks ... all that is in user space, because the compiler can inline
the kernelesque instructions.
> Heh, if *I* had 3 years to sit and write an OS, maybe I could experiment
> with a few of these ideas? ;-)
Read the papers first. It's exactly what I've been wanting to do myself,
except they figured out what seems a really good way of doing it.
I like the stuff on permissions, too. Stuff like "setuid" not being a
privileged operation is kind of funky. :-)
Really, all the stuff you're speculating about, they've written about in
detail and implemented. It's very cool. I highly suggest if the idea
"what if we started over in *this* millennium?" interests you, you read
the literature they've published. :)
--
Darren New / San Diego, CA, USA (PST)
Ever notice how people in a zombie movie never already know how to
kill zombies? Ask 100 random people in America how to kill someone
who has reanimated from the dead in a secret viral weapons lab,
and how many do you think already know you need a head-shot?
Manuel Kasten <kas### [at] gmx de> wrote:
> If you only allow safe-mode managed code, pointer arithmetic is not
> possible.
So you can't have arrays?
--
- Warp
Warp wrote:
> Invisible <voi### [at] dev null> wrote:
>> Their approach seems to be to "verify" each program before it runs,
>> checking that it doesn't do any "bad" things.
>
> That's impossible. It can be proven that it's an unsolvable problem,
> exactly for the same reason as the halting problem is unsolvable. There's
> no way for any program to check if a piece of code is executed and how.
This isn't quite true. If you restrict the forms of programs you accept
to something you can verify, then you can do this pretty easily. It is,
for example, why people are willing to run java applets in their browser.
While it's true the halting problem prevents you from knowing whether
any arbitrary TM will halt, it doesn't prevent you from knowing whether
any arbitrary regular expression will halt, for example. If you
eliminate the instructions that let you write to arbitrary parts of
memory you don't own, then it's not too hard to check your language
works fine.
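A toy illustration of that point, in Python. The instruction set is invented and has nothing to do with real MSIL verification; the only claim is that "does this program contain a forbidden instruction or a wild jump?" is a property you can decide with a single pass, with no halting problem involved.

```python
# Hypothetical safe subset: note there is no raw "write to address" opcode.
ALLOWED = {"PUSH", "ADD", "LOAD_ELEM", "STORE_ELEM", "JUMP_IF_ZERO", "HALT"}


def verify(program):
    """Return a list of problems; an empty list means the program is accepted.

    `program` is a list of tuples such as ("PUSH", 1) or ("HALT",).
    One linear pass is enough, so verification always terminates.
    """
    errors = []
    for pc, (op, *operands) in enumerate(program):
        if op not in ALLOWED:
            errors.append(f"{pc}: opcode {op} is not in the safe subset")
        elif op == "JUMP_IF_ZERO":
            ok = (len(operands) == 1
                  and isinstance(operands[0], int)
                  and 0 <= operands[0] < len(program))
            if not ok:
                errors.append(f"{pc}: jump target is outside the program")
    return errors
```

The verifier says nothing about whether an accepted program ever finishes, only that it cannot do the specific "bad" things that were ruled out.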
It's also the case that hardware doesn't 100% solve the problem either.
You have to (a) trust the hardware not to be buggy, and (b) trust the OS
to correctly set up the hardware.
> It's also impossible for it to know, for example, the addresses of all
> pointers by simply examining the program (for example the address of a
> pointer could be calculated from user input).
No, because the OS won't install a program that calculates a pointer's
address from user input. Basically, you use C# or one of the other .NET
languages, which compile down to a strongly-typed assembly language.
Then, before you run the program, you gather up all
the strongly typed assembler,
>> Presumably verifying whether a program does or does not do something
>> "bad" is formally equivalent to the halting problem, so I imagine they
>> apply some arbitrary set of restrictions to simplify the problem.
>
> Those restrictions could seriously hinder compiler optimizations.
Actually, it turns out the compiler can do a *much* better job, because
it can track the usage of a whole bunch of stuff that's hard to track
when you allow arbitrary pointers.
> For example accessing the nth element of an array can usually be done
> with a simple CPU opcode. However, if the system restricts this because
> it cannot prove what that n might contain, it means that the compiler
> cannot generate the single opcode for accessing that array, but must
> perform something much more complicated to keep the system happy.
Right. They actually check this, and discover it's about a 4% overhead
to do the checks in software. And it's about a 6% overhead to do the
checks in hardware. Where Is Your God Now? Mwa ha ha ha! ;-)
And once you count up TLB misses, TLB flushes, frobbing stacks around
during an interrupt, etc., it's about a 33% overhead to actually put
processes in different address spaces and enforce that they can't change
the VM mapping by taking away the ring-0 instructions. That's compared to
checking at compile time that you don't go out of bounds, and enforcing
at runtime where you can't check at compile time.
> Ah, but that's the trend nowadays: Computers get faster and the amount
> of RAM grows exponentially with time. There's no need for highly optimized
> code.
You should read the papers. *Because* the input is actually structured,
they can compile the stuff and throw away (for example) fields and
methods that aren't used, include a GC that's specific to the problem
being solved (e.g., a higher-overhead real-time GC only for real-time
programs), and they get a tremendous efficiency boost.
--
Darren New / San Diego, CA, USA (PST)
Ever notice how people in a zombie movie never already know how to
kill zombies? Ask 100 random people in America how to kill someone
who has reanimated from the dead in a secret viral weapons lab,
and how many do you think already know you need a head-shot?
Warp wrote:
> Manuel Kasten <kas### [at] gmx de> wrote:
>> If you only allow safe-mode managed code, pointer arithmetic is not
>> possible.
>
> So you can't have arrays?
>
You can, but not in the sense of a contiguous block of memory containing
the data.
The question I have is: what if I want to develop an application (such as
a high performance image analysis package) against a platform that uses
only managed code? Could it use the CPU as efficiently as possible?
If I were restricted to "safe" code, is there a way to remove that
restriction for that app?
Mike Raiford <mra### [at] hotmail com> wrote:
> > So you can't have arrays?
> You can, but not in the sense of a contiguous block of memory containing
> the data.
But then the answer is "no".
--
- Warp
Darren New <dne### [at] san rr com> wrote:
> > For example accessing the nth element of an array can usually be done
> > with a simple CPU opcode. However, if the system restricts this because
> > it cannot prove what that n might contain, it means that the compiler
> > cannot generate the single opcode for accessing that array, but must
> > perform something much more complicated to keep the system happy.
> Right. They actually check this, and discover it's about a 4% overhead
> to do the checks in software. And it's about a 6% overhead to do the
> checks in hardware. Where Is Your God Now? Mwa ha ha ha! ;-)
I have a really hard time believing that if you, for example, calculate
the sum of all the integers in an array, adding boundary checks to every
single read operation will add only 4% of overhead.
Even if the boundary check took only 1 clock cycle, that would mean
that reading the value from the array and adding its value to a register
takes 25 clock cycles.
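The arithmetic behind that figure, spelled out as a tiny Python helper. It assumes the whole overhead comes from the bounds checks and that each check costs a fixed number of cycles; the function name is just for illustration.

```python
def implied_cycles_per_element(check_cycles, overhead_fraction):
    """Base work per element implied by a given relative overhead.

    If the check adds `check_cycles` to every element and that is
    `overhead_fraction` of the unchecked cost, then
    unchecked cost = check_cycles / overhead_fraction.
    """
    return check_cycles / overhead_fraction


# A 1-cycle check at 4% overhead implies 1 / 0.04 cycles of real work.
base = implied_cycles_per_element(1, 0.04)
```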
--
- Warp
Warp wrote:
> I have a really hard time believing that if you, for example, calculate
> the sum of all the integers in an array, adding boundary checks to every
> single read operation will add only 4% of overhead.
Accessing "everything in this array" is a pretty common operation - and
one that an optimising compiler can presumably spot and optimise pretty
easily.
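Roughly what that optimisation amounts to, sketched in Python. Python already bounds-checks list access on its own, so the explicit checks and the `checks` counter here only model the cost; both function names are made up. Whether Singularity's compiler does exactly this is an assumption on my part.

```python
def sum_checked(values, n):
    """Naive version: one explicit bounds check per element."""
    total = 0
    checks = 0
    for i in range(n):
        checks += 1
        if not 0 <= i < len(values):
            raise IndexError(i)
        total += values[i]
    return total, checks


def sum_hoisted(values, n):
    """Hoisted version: prove the whole index range is valid once."""
    checks = 1
    if not 0 <= n <= len(values):
        raise IndexError(n)
    total = 0
    for i in range(n):        # every i is in [0, n), already known safe
        total += values[i]
    return total, checks
```

Both return the same sum; the second does a single check no matter how long the array is, which is why a sequential walk over an array need not pay per-element overhead.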
Now, if you start accessing an array in some really random order... (And
let's face it, what the hell are arrays especially good at?)
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
>> I don't know about Singularity and its particular design goals, but
>> the other day I was thinking: What would happen if you set out to
>> design a new OS completely from scratch? What would that look like?
>
> Dude? That's Singularity. :-)
Not quite. They didn't do it the way *I* would do it. ;-)
Actually, I can think of *several* ways to do it, and I'd probably spend
the rest of my life analysing them and never write any real code! :-/
> Download the release and read the design notes, too.
> http://codeplex.com/singularity
Uh... why? It's an experimental research prototype. It probably doesn't
even run yet.
>> What if we wanted to shake that up a little? What might we decide to do
>> differently?
>
> Read the Singularity papers. This is exactly the premise they started with.
They were focused specifically on "how do we increase security?" I'm
just thinking "what widely-used but suboptimal abstractions might we
change?"
> Or "I can't let you install
> telnet[1] until you have some sort of TCP/IP stack installed."
Isn't this what RPM does?
> Once you can specify as arguments things beyond stuff represented as
> strings, you get a whole nuther ball of wax. You're not stuck with
> stdin, stdout, and stderr, for example. You can say "Hey, this program
> needs an interrupt and two DMA channels to serve as a driver" for example.
A device driver is a rather unusual type of program. I'm thinking more
about end-user level stuff. You know - the kind of thing you might
invoke by hand.
> Yep. Singularity calls it the "application manifest". An application is
> a first-class object, rather than being "a pile of files full of code".
As an aside... Whenever I compile a Haskell program, it generates a
*.manifest file that contains some random XML. Any idea WTF that's about?
>> Maybe we can do something better here too? Maybe we could have a small
>> set of standard categories like "program bug", "resource exhaustion",
>> "the network won't answer me", and provide a set of
>> application-specific codes for the actual failures that a particular
>> program can have?
>
> Nah. You just answer back on the stream that goes to whoever invoked
> you. :-) Why would only the parent want to know how you exited?
Maybe because it's a lights-out system and you want the failed process
to be started back up again? IDK.
>> I suppose you could go down the route of having files contain
>> structured data - but again you're going to get people arguing over
>> the best way of structuring things.
>
> Not any more than saying "you'll have people arguing about the best way
> to represent structures".
>
> You're still thinking UNIXy. Get rid of the mindset that you have to
> agree on data formats and embrace the mindset that you only have to
> agree on APIs.
I guess if you follow all this to its logical conclusion, you end up
with "the filesystem is a relational database" - and we all know what a
bad idea *that* was!
>> I've often thought about what would happen if, say, Smalltalk was the
>> entire OS. Then the OS would "know about" the internal workings of
>> each program to a large degree, and that opens up some rather
>> interesting possibilities. Things like highly structured IPC and so
>> forth. Trouble is, now you can only run stuff implemented in Smalltalk...
>
> Yep. That's traditionally been the problem. Singularity does this, but
> makes MSIL the bottom level for applications and such. So anything you
> can compile into structured typed assembler language you can use. This
> includes C#, F#, Iron Python, etc.
(Or Haskell, when they fix the bitrot in the MSIL backend.)
One day, I'll have to sit down and find out how the Java VM or the CLR work.
>> (To me, really radical ideas are interesting to think about
>> but probably wouldn't work too well in practice.)
>
> It seems to be working well in practice. For example, one radical idea
> (which I always thought would be a good idea) is to use safe languages
> for everything. Singularity does this, and in so doing, can run
> everything in Ring 0 and with no hardware memory protection.
Yah, but this only really works if you're not going to execute arbitrary
C code - which would be kind of a problem.
>> Heh, if *I* had 3 years to sit and write an OS, maybe I could
>> experiment with a few of these ideas? ;-)
>
> Read the papers first. It's exactly what I've been wanting to do myself,
> except they figured out what seems a really good way of doing it.
>
> I like the stuff on permissions, too.
Specifying access control by application seems like a perfectly logical
thing to want to do. That whole Unixy trip with creating a user and
group named "apache" and making sure the Apache httpd runs under that
account just seems like a huge kludge to me...
> Really, all the stuff you're speculating about, they've written about in
> detail and implemented. It's very cool. I highly suggest if the idea
> "what if we started over in *this* millennium?" interests you, you read
> the literature they've published. :)
...and what do you think I just spent my entire afternoon doing? :-P
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*