Warp wrote:
> Invisible <voi### [at] devnull> wrote:
>> Their approach seems to be to "verify" each program before it runs,
>> checking that it doesn't do any "bad" things.
>
> That's impossible. It can be proven to be an unsolvable problem, for
> exactly the same reason the halting problem is unsolvable. There's no
> way for any program to check whether a piece of code will be executed, and how.
This isn't quite true. If you restrict the forms of programs you accept
to something you can verify, then you can do this pretty easily. It is,
for example, why people are willing to run Java applets in their browser.
While it's true the halting problem prevents you from knowing whether
any arbitrary TM will halt, it doesn't prevent you from knowing whether
any arbitrary regular expression will halt, for example. If you
eliminate the instructions that let you write to arbitrary parts of
memory you don't own, then it's not too hard to check that programs in
your language behave.
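To make that concrete, here's a toy verifier in C. Everything in it - the
opcodes, the forward-jump rule - is invented for illustration; it isn't
anybody's real verifier. The point is just that once you restrict accepted
programs to a checkable subset, "does this halt / stay in bounds" stops
being the halting problem:

```c
#include <stddef.h>
#include <stdbool.h>

/* A toy instruction set: only these four opcodes are admitted. */
enum { OP_PUSH, OP_ADD, OP_JMP, OP_HALT };

typedef struct { int op; int arg; } insn;

/* Instead of deciding halting for arbitrary code, admit only programs
   built from known opcodes whose jumps go strictly forward and stay in
   range - such programs trivially halt. */
bool verify(const insn *prog, size_t n) {
    if (n == 0 || prog[n - 1].op != OP_HALT)
        return false;
    for (size_t i = 0; i < n; i++) {
        switch (prog[i].op) {
        case OP_PUSH: case OP_ADD: case OP_HALT:
            break;
        case OP_JMP:
            if (prog[i].arg <= (int)i || (size_t)prog[i].arg >= n)
                return false;   /* backward or out-of-range jump: reject */
            break;
        default:
            return false;       /* unknown opcode: reject, don't guess */
        }
    }
    return true;
}
```

A real verifier (the JVM's, or MSIL's) checks types and stack depths too,
but the shape is the same: reject anything you can't prove safe.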
It's also the case that hardware doesn't 100% solve the problem either.
You have to (a) trust the hardware not to be buggy, and (b) trust the OS
to correctly set up the hardware.
> It's also impossible for it to know, for example, the addresses of all
> pointers by simply examining the program (for example the address of a
> pointer could be calculated from user input).
No, because the OS won't install a program that computes a pointer from
user input. Basically, you use C# or one of the other .NET languages,
which compiles down to a strongly-typed assembly language. Then, before
you run the program, you gather up all the strongly typed assembler and
verify that it obeys the type system.
>> Presumably verifying whether a program does or does not do something
>> "bad" is formally equivalent to the halting problem, so I imagine they
>> apply some arbitrary set of restrictions to simplify the problem.
>
> Those restrictions could seriously hinder compiler optimizations.
Actually, it turns out the compiler can do a *much* better job, because
it can track the usage of a whole bunch of stuff that's hard to track
when you allow arbitrary pointers.
> For example accessing the nth element of an array can usually be done
> with a simple CPU opcode. However, if the system restricts this because
> it cannot prove what that n might contain, it means that the compiler
> cannot generate the single opcode for accessing that array, but must
> perform something much more complicated to keep the system happy.
Right. They actually measured this, and found it's about a 4% overhead
to do the checks in software, and about a 6% overhead to do the checks
in hardware. Where Is Your God Now? Mwa ha ha ha! ;-)
And it's about a 33% overhead to actually put processes in different
address spaces and enforce that they can't change the VM mapping by
taking away the ring-0 instructions, compared to checking at compile
time that you don't go out of bounds (and enforcing at runtime only
where you can't check at compile time), once you count up TLB misses,
TLB flushes, frobbing stacks around during an interrupt, etc.
> Ah, but that's the trend nowadays: Computers get faster and the amount
> of RAM grows exponentially with time. There's no need for highly optimized
> code.
You should read the papers. *Because* the input is actually structured,
they can compile the stuff and throw away (for example) fields and
methods that aren't used, include a GC that's specific to the problem
being solved (e.g., a higher-overhead real-time GC only for real-time
programs), and they get a tremendous efficiency boost.
--
Darren New / San Diego, CA, USA (PST)
Ever notice how people in a zombie movie never already know how to
kill zombies? Ask 100 random people in America how to kill someone
who has reanimated from the dead in a secret viral weapons lab,
and how many do you think already know you need a head-shot?
Warp wrote:
> Manuel Kasten <kas### [at] gmxde> wrote:
>> If you only allow safe-mode managed code, pointer arithmetic is not
>> possible.
>
> So you can't have arrays?
>
You can, but not in the sense of a contiguous block of memory
containing the data.
The question I have is: what if I want to develop an application (such
as a high-performance image analysis package) against a platform that
uses only managed code? Could it use the CPU as efficiently as possible?
If I were restricted to "safe" code, is there a way to remove that
restriction for that app?
Mike Raiford <mra### [at] hotmailcom> wrote:
> > So you can't have arrays?
> You can, but not in the sense of a contiguous block of memory
> containing the data.
But then the answer is "no".
--
- Warp
Darren New <dne### [at] sanrrcom> wrote:
> > For example accessing the nth element of an array can usually be done
> > with a simple CPU opcode. However, if the system restricts this because
> > it cannot prove what that n might contain, it means that the compiler
> > cannot generate the single opcode for accessing that array, but must
> > perform something much more complicated to keep the system happy.
> Right. They actually check this, and discover it's about a 4% overhead
> to do the checks in software. And it's about a 6% overhead to do the
> checks in hardware. Where Is Your God Now? Mwa ha ha ha! ;-)
I have a really hard time believing that if you, for example, calculate
the sum of all the integers in an array, adding boundary checks to every
single read operation adds only 4% of overhead.
Even if the boundary check took only 1 clock cycle, a 4% overhead would
mean that reading the value from the array and adding it to a register
takes 25 clock cycles.
--
- Warp
Warp wrote:
> I have a really hard time believing that if you, for example, calculate
> the sum of all the integers in an array, adding boundary checks to every
> single read operation adds only 4% of overhead.
Accessing "everything in this array" is a pretty common operation - and
one that an optimising compiler can presumably spot and optimise pretty
easily.
Now, if you start accessing an array in some really random order... (And
let's face it, what the hell are arrays especially good at?)
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
>> I don't know about Singularity and its particular design goals, but
>> the other day I was thinking: What would happen if you set out to
>> design a new OS completely from scratch? What would that look like?
>
> Dude? That's Singularity. :-)
Not quite. They didn't do it the way *I* would do it. ;-)
Actually, I can think of *several* ways to do it, and I'd probably spend
the rest of my life analysing them and never write any real code! :-/
> Download the release and read the design notes, too.
> http://codeplex.com/singularity
Uh... why? It's an experimental research prototype. It probably doesn't
even run yet.
>> What if we wanted to shake that up a little? What might we decide to do
>> differently?
>
> Read the Singularity papers. This is exactly the premise they started with.
They were focused specifically on "how do we increase security?" I'm
just thinking "what widely-used but suboptimal abstractions might we
change?"
> Or "I can't let you install
> telnet[1] until you have some sort of TCP/IP stack installed."
Isn't this what RPM does?
> Once you can specify as arguments things beyond stuff represented as
> strings, you get a whole nuther ball of wax. You're not stuck with
> stdin, stdout, and stderr, for example. You can say "Hey, this program
> needs an interrupt and two DMA channels to serve as a driver" for example.
A device driver is a rather unusual type of program. I'm thinking more
about end-user level stuff. You know - the kind of thing you might
invoke by hand.
> Yep. Singularity calls it the "application manifest". An application is
> a first-class object, rather than being "a pile of files full of code".
As an aside... Whenever I compile a Haskell program, it generates a
*.manifest file that contains some random XML. Any idea WTF that's about?
>> Maybe we can do something better here too? Maybe we could have a small
>> set of standard categories like "program bug", "resource exhaustion",
>> "the network won't answer me", and provide a set of
>> application-specific codes for the actual failures that a particular
>> program can have?
>
> Nah. You just answer back on the stream that goes to whoever invoked
> you. :-) Why would only the parent want to know how you exited?
Maybe because it's a lights-out system and you want the failed process
to be started back up again? IDK.
>> I suppose you could go down the route of having files contain
>> structured data - but again you're going to get people arguing over
>> the best way of structuring things.
>
> Not any more than saying "you'll have people arguing about the best way
> to represent structures".
>
> You're still thinking UNIXy. Get rid of the mindset that you have to
> agree on data formats and embrace the mindset that you only have to
> agree on APIs.
I guess if you follow all this to its logical conclusion, you end up
with "the filesystem is a relational database" - and we all know what a
bad idea *that* was!
>> I've often thought about what would happen if, say, Smalltalk was the
>> entire OS. Then the OS would "know about" the internal workings of
>> each program to a large degree, and that opens up some rather
>> interesting possibilities. Things like highly structured IPC and so
>> forth. Trouble is, now you can only run stuff implemented in Smalltalk...
>
> Yep. That's traditionally been the problem. Singularity does this, but
> makes MSIL the bottom level for applications and such. So anything you
> can compile into structured typed assembler language you can use. This
> includes C#, F#, Iron Python, etc.
(Or Haskell, when they fix the bitrot in the MSIL backend.)
One day, I'll have to sit down and find out how the Java VM or the CLR work.
>> (To me, really radical ideas are interesting to think about
>> but probably wouldn't work too well in practice.)
>
> It seems to be working well in practice. For example, one radical idea
> (which I always thought would be a good idea) is to use safe languages
> for everything. Singularity does this, and in so doing, can run
> everything in Ring 0 and with no hardware memory protection.
Yah, but this only really works if you're not going to execute arbitrary
C code - which would be kind of a problem.
>> Heh, if *I* had 3 years to sit and write an OS, maybe I could
>> experiment with a few of these ideas? ;-)
>
> Read the papers first. It's exactly what I've been wanting to do myself,
> except they figured out what seems a really good way of doing it.
>
> I like the stuff on permissions, too.
Specifying access control by application seems like a perfectly logical
thing to want to do. That whole Unixy trip with creating a user and
group named "apache" and making sure the Apache httpd runs under that
account just seems like a huge kludge to me...
> Really, all the stuff you're speculating about, they've written about in
> detail and implemented. It's very cool. I highly suggest if the idea
> "what if we started over in *this* millennium?" interests you, you read
> the literature they've published. :)
...and what do you think I just spent my entire afternoon doing? :-P
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Orchid XP v8 wrote:
>>> I don't know about Singularity and its particular design goals, but
>>> the other day I was thinking: What would happen if you set out to
>>> design a new OS completely from scratch? What would that look like?
>>
>> Dude? That's Singularity. :-)
>
> Not quite. They didn't do it the way *I* would do it. ;-)
Then you should have said "What would happen if *I* set out to design a
new OS completely from scratch?" ;-)
>> Download the release and read the design notes, too.
>> http://codeplex.com/singularity
>
> Uh... why? It's an experimental research prototype. It probably doesn't
> even run yet.
You say this with such confidence. Yet, surprisingly, you didn't
actually (say) read any of the papers, or notice that they're all dated
five years ago.
> They were focused specifically on "how do we increase security?" I'm
> just thinking "what widely-used but suboptimal abstractions might we
> change?"
No. You didn't read the papers, so you don't know what they're working
on. It's better to read what the authors wrote than what some blogger
says about what the authors wrote.
>> Or "I can't let you install telnet[1] until you have some sort of
>> TCP/IP stack installed."
>
> Isn't this what RPM does?
No.
>> Once you can specify as arguments things beyond stuff represented as
>> strings, you get a whole nuther ball of wax. You're not stuck with
>> stdin, stdout, and stderr, for example. You can say "Hey, this program
>> needs an interrupt and two DMA channels to serve as a driver" for
>> example.
>
> A device driver is a rather unusual type of program. I'm thinking more
> about end-user level stuff. You know - the kind of thing you might
> invoke by hand.
Right. Of course, since they're writing the OS, they're worried about
the sorts of problems drivers cause. But the same result applies to
things you invoke by hand or from other programs.
> As an aside... Whenever I compile a Haskell program, it generates a
> *.manifest file that contains some random XML. Any idea WTF that's about?
Well, a manifest is a list of what's included. Other than that, I
couldn't help you guess without seeing one.
>> Nah. You just answer back on the stream that goes to whoever invoked
>> you. :-) Why would only the parent want to know how you exited?
>
> Maybe because it's a lights-out system and you want the failed process
> to be started back up again? IDK.
No, I'm saying why would you want the exit status to *ONLY* go to the
parent process, and not to whoever you want it to go to? Why not list in
the application manifest all the applications that'll be interested in
knowing that program X failed? Wouldn't you want everyone using the TCP
stack to know that the nic driver failed?
>> You're still thinking UNIXy. Get rid of the mindset that you have to
>> agree on data formats and embrace the mindset that you only have to
>> agree on APIs.
>
> I guess if you follow all this to its logical conclusion, you end up
> with "the filesystem is a relational database" - and we all know what a
> bad idea *that* was!
Uh, no. You wind up with "everything is strongly typed", not necessarily
"everything is the same type". I am not sure I've discovered exactly
what they store in files - I'm still going thru the papers - but I'm
pretty sure it's not relational.
It's not unlike the Amiga OS in that respect, except safe and strongly
typed.
>> Yep. That's traditionally been the problem. Singularity does this, but
>> makes MSIL the bottom level for applications and such. So anything you
>> can compile into structured typed assembler language you can use. This
>> includes C#, F#, Iron Python, etc.
>
> (Or Haskell, when they fix the bitrot in the MSIL backend.)
Yep. Or most anything. I'm pretty impressed that they managed to get
functional languages doing their thing in an OO assembler language.
> One day, I'll have to sit down and find out how the Java VM or the CLR
> work.
It's ugly. I think the JVM is probably a little easier to understand.
But think of it merely as strongly typed assembler language with lots of
metadata about types and layouts.
> Yah, but this only really works if you're not going to execute arbitrary
> C code - which would be kind of a problem.
Exactly. Why do you need to execute arbitrary C code, tho? Other than
compatibility?
> Specifying access control by application seems like a perfectly logical
> thing to want to do. That whole Unixy trip with creating a user and
> group named "apache" and making sure the Apache httpd runs under that
> account just seems like a huge kludge to me...
Yes, exactly. Singularity lets you specify it as both, including the
history. So "PHP running from user Fred invoked via Apache" can have
different permissions from "PHP running from user Fred invoked via bash".
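The idea - permissions keyed by the whole invocation chain, not just the
user - can be sketched as a default-deny lookup table. Everything below
(the names, the single capability, the chain-as-string encoding) is
invented for illustration, not how Singularity actually stores it:

```c
#include <string.h>
#include <stdbool.h>

/* A permission rule keyed by user *and* invocation history. */
typedef struct {
    const char *user;
    const char *chain;      /* who invoked whom, e.g. "apache->php" */
    bool        may_write;  /* one toy capability */
} perm_rule;

static const perm_rule rules[] = {
    { "fred", "apache->php", false },  /* PHP via the web server: locked down */
    { "fred", "bash->php",   true  },  /* same program via a shell: trusted   */
};

/* Look up the capability for this (user, chain) pair; deny by default. */
bool chain_may_write(const char *user, const char *chain) {
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++)
        if (strcmp(rules[i].user, user) == 0 &&
            strcmp(rules[i].chain, chain) == 0)
            return rules[i].may_write;
    return false;
}
```

The same program gets different rights depending on how it was reached -
which is exactly what the apache-user kludge is trying to fake.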
> ...and what do you think I just spent my entire afternoon doing? :-P
OK. Well, some of your assertions about how it works were at odds with
what they wrote, so I assumed you hadn't read all the way through.
--
Darren New / San Diego, CA, USA (PST)
Ever notice how people in a zombie movie never already know how to
kill zombies? Ask 100 random people in America how to kill someone
who has reanimated from the dead in a secret viral weapons lab,
and how many do you think already know you need a head-shot?
Mike Raiford wrote:
> Warp wrote:
>> Manuel Kasten <kas### [at] gmxde> wrote:
>>> If you only allow safe-mode managed code, pointer arithmetic is not
>>> possible.
>>
>> So you can't have arrays?
>>
>
> You can, but not in the sense of a contiguous block of memory
> containing the data.
Actually, yes, you can. You just bounds-check the array. You can do
that in a C implementation, even. People just don't for some reason.
Indeed, the language they use for the OS has "representation structures"
which are specifically designed to (for example) land in certain
memory-mapped hardware bits.
There's no problem supporting arrays. Arrays are objects. The problem is
supporting arbitrary untyped pointers assigned non-pointer values -
i.e., the problem is casting an integer to a pointer.
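A minimal sketch of what "you just bounds-check the array" looks like in
plain C. The names (checked_array, checked_get) are made up, and a
managed runtime would throw an exception where this returns a flag:

```c
#include <stddef.h>
#include <stdbool.h>

/* A bounds-checked array "object": the length travels with the data,
   and every access goes through a check, the way a safe runtime does it. */
typedef struct {
    size_t len;
    int data[50];
} checked_array;

/* On success, write the element to *out and return true;
   on an out-of-range index, return false without touching memory. */
bool checked_get(const checked_array *a, size_t i, int *out) {
    if (i >= a->len)
        return false;       /* the bounds check the hardware never sees */
    *out = a->data[i];
    return true;
}

bool checked_set(checked_array *a, size_t i, int v) {
    if (i >= a->len)
        return false;
    a->data[i] = v;
    return true;
}
```

The data is still one contiguous block; the only difference from a bare
C array is that the index is validated before the load or store.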
> The question I have is what if I want to develop an application (such as
> a high performance image analysis package) against a platform that uses
> only managed code. Could it use the CPU as efficiently as possible?
Sure, why not? If the compiler can prove you're not violating the memory
constraints, why not? Note that their tests show it's actually more
efficient to check in software than in hardware, and the software checks
are pretty efficient.
> If I were restricted to "safe" code, is there a way to remove that
> restriction for that app?
No. That's the point.
I mean, I suppose, sure, you could. But it's not going to be a regular
app. You'd need to install it differently and prove you're allowed to.
That's how the kernel, for example, works. It's like asking "can I
write code in a Linux app that bypasses the memory mapping hardware?"
Sure, but it's far from normal.
--
Darren New / San Diego, CA, USA (PST)
Ever notice how people in a zombie movie never already know how to
kill zombies? Ask 100 random people in America how to kill someone
who has reanimated from the dead in a secret viral weapons lab,
and how many do you think already know you need a head-shot?
Warp wrote:
> Mike Raiford <mra### [at] hotmailcom> wrote:
>>> So you can't have arrays?
>
>> You can, but not in the sense of a contiguous block of memory
>> containing the data.
>
> But then the answer is "no".
I don't think Mike read the papers. Of course you can have an array of
contiguous memory. You declare it as an array of structs, just like you
would in C.
They even have a mechanism whereby you can declare a struct with a
definite memory layout that multiple different languages can reference,
and an operator that says "treat this as the representation of an
object", which basically adds the vtable after the fact for your
particular program. I.e., you can cast an object in memory from a flat
data structure into a full object-oriented object with inheritance and
methods and all that, without moving the memory that holds the fields.
And since you're sharing that memory with different languages, you can
have the different languages cast it into different objects without
munging it up for any one particular language.
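The flat-layout half of that can be sketched even in C. (C has no
vtables, so the "add the methods after the fact" part is only mimicked
here with plain functions over the representation; all the names are
invented for illustration.)

```c
#include <stddef.h>
#include <stdint.h>

/* A struct with a definite, language-neutral memory layout, in the
   spirit of the representation structs described above. */
typedef struct {
    int32_t width;
    int32_t height;
} rect_repr;

/* "Treat this memory as the representation of an object": reinterpret
   a buffer in place, without copying or moving the fields. A verifier
   would insist the buffer is big enough (and suitably aligned) first. */
rect_repr *as_rect(void *buf, size_t buflen) {
    return buflen >= sizeof(rect_repr) ? (rect_repr *)buf : NULL;
}

/* A "method" attached after the fact, operating on the flat data. */
int32_t rect_area(const rect_repr *r) {
    return r->width * r->height;
}
```

Two different "casts" like as_rect, from two different languages, can
view the same bytes without disturbing them - which is the point Darren
is making about sharing memory across languages.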
--
Darren New / San Diego, CA, USA (PST)
Ever notice how people in a zombie movie never already know how to
kill zombies? Ask 100 random people in America how to kill someone
who has reanimated from the dead in a secret viral weapons lab,
and how many do you think already know you need a head-shot?
Warp wrote:
> I have a really hard time believing that if you, for example, calculate
> the sum of all the integers in an array, adding boundary checks to every
> single read operation adds only 4% of overhead.
The compiler can be pretty smart. You can actually optimize out the
bounds checking most of the time.
  int x[50]; int y = 0; int i;
  for (i = 0; i < 50; i++) x[i] = i;
  for (i = 0; i < 50; i++) y += x[i];
That won't have any bounds-checking code included, because the compiler
can prove every index is in range.
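By contrast, when the index comes from data the compiler can't see, one
check per access has to stay (or at best get hoisted out of a loop). A
sketch - the function name is made up:

```c
#include <stddef.h>

/* The index arrives at runtime (say, from user input), so the compiler
   can't prove it's in range; the check below survives into the
   generated code. */
int lookup(const int *table, size_t len, size_t user_index, int fallback) {
    if (user_index >= len)
        return fallback;        /* the runtime bounds check */
    return table[user_index];
}
```

This is the case Warp is worried about - and it's one extra compare and
branch per access, which is where the measured single-digit overhead
comes from.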
--
Darren New / San Diego, CA, USA (PST)
Ever notice how people in a zombie movie never already know how to
kill zombies? Ask 100 random people in America how to kill someone
who has reanimated from the dead in a secret viral weapons lab,
and how many do you think already know you need a head-shot?