Warp wrote:
>   *That* is the beauty of encapsulation.
OK. So you agree that Python also has encapsulation?
Certainly that's one form of encapsulation, and indeed, it's a form also 
supported by DLLs, client-server architectures, operating systems, 
sandboxes, and many of the other things you said weren't encapsulation.
It's also very useful when designing the original program. But it's more the 
modularity part than the data-hiding part. In other words, if you go back to 
your original wikipedia link, it's the "language construct that facilitates 
the bundling of data with methods operating on that data", not the 
"mechanism for restricting access".
You can do that same sort of thing in any language, if you just make clear 
what's the public API and what isn't. C does this with FILE* structures, for 
example, as well as all the other standard libraries, without any real 
features for encapsulation beyond incomplete types. You can also do it in 
Smalltalk, which very clearly doesn't have anything beyond documentation to 
enforce its modularity.
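To make the FILE* point concrete, here is a minimal sketch of the incomplete-type trick, written as C++ that sticks to the C-style subset. The Counter type and the counter_* functions are made-up names for illustration only. In real code the first half would live in a header and the second half in a separate source file, so callers would never see the fields at all.

```cpp
// ---- what the public header would expose ----
struct Counter;                          // incomplete type: fields are not visible here
Counter* counter_open(int start);
void     counter_bump(Counter* c);
int      counter_value(const Counter* c);
void     counter_close(Counter* c);

// ---- what the implementation file would contain ----
struct Counter {
    int value;                           // only the implementation knows this exists
};

Counter* counter_open(int start)        { return new Counter{start}; }
void     counter_bump(Counter* c)       { ++c->value; }
int      counter_value(const Counter* c) { return c->value; }
void     counter_close(Counter* c)      { delete c; }
```

The only enforcement here is that client code cannot name a field it has never seen declared; nothing stops it from poking at the bytes behind the pointer.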
>   Do you really think that the compiler telling you "you must not access
> this member, I refuse to compile it" is not a better way to enforce
> encapsulation than just some documentation that nobody is going to read
> anyways? 
Not if you're not going to enforce it the other ways as well.
>   If you must have "visible" members in the class declaration for technical
> and/or efficiency reasons, it's better if the compiler offers a tool for
> telling the user "even though you can see this you really shouldn't be
> referencing it directly" rather than relying solely on a comment line.
Sure. What about a naming convention that "comes with the language"? I.e., 
not just a naming convention you made up, but one in common for every single 
program written in the language from day one?
>   So yes, it's much better if the compiler tells the user he is accessing
> what he shouldn't.
I agree it's better. But since you seem to like binary definitions, and you 
don't like me saying "C++ has a little bit of encapsulation but not much", 
I'm curious if you think that (say) Python has encapsulation, given that 
everyone knows that names with _ are private to the class they're in?
>   But where do you draw the line between what is and isn't encapsulation?
I don't make it a binary choice. I say "C has no encapsulation, Python has 
very little (more modularity than encapsulation), C++ has a little more than 
Python (because the compiler refuses to compile stuff that violates the 
language rules), Java and C# have more (because you have to use reflection 
or unsafe code), Erlang has a whole bunch (because only bugs in unsafe code 
can violate it), and Sing# and Hermes have the most (because they have no 
unsafe subsets at all)."
For me, the amount of benefit that encapsulation gives relative to 
documentation is nowhere near the amount of benefit that encapsulation gives 
relative to guarantees about relationships between code and data enforced by 
the language.
>   Clearly, a C struct offers no encapsulation. Also, clearly, a faulty RAM
> chip trashing your program's memory doesn't mean there's no encapsulation
> in the programming language. However, where is the line between those two
> extremes, where encapsulation becomes non-encapsulation?
Myself, I'd say when you can accidentally violate encapsulation, you have 
poor encapsulation.
>   If you call a library offered by the programming language, and that
> library is buggy and trashes your program's memory, does that break
> encapsulation? What if the library calls a system library (such as clib)
> and *that* has a bug which trashes your program's memory? Does that break
> encapsulation? What if instead of a language's own library it's a third-party
> library? What if the kernel has a bug which trashes your program's memory?
> Where exactly lies the line between a bug which trashes the program's memory
> not breaking encapsulation, and doing so? And why?
Those are all breaks of encapsulation, certainly. But those are in the 
implementations, not the languages. A buggy RAM chip that randomly changes 
values on the stack has nothing to do with whether C++ the language offers 
valuable encapsulation.
>> In contrast, C++ has well-defined mechanisms for bypassing the encapsulation 
>> that are used as a normal part of programming in that language.
> 
>   Actually it doesn't. You can read the binary representation of an object
> in memory, but the standard gives no guarantees as to which byte means what.
I don't believe that's true. It certainly wasn't in C.
> The memory layout of objects is completely implementation-defined
Nonsense. Arrays are contiguous. The elements of a structure are all between 
&x and &x + sizeof(x).  There's lots of guarantees. And I'm pretty sure that 
if you have "struct x {int a; int b;} y;" that the standard guarantees
&y.b > &y.a. Positive integers are represented in twos-complement coding.
Unsigned integers occupy the number of bits indicated by multiplying
sizeof(unsigned) by bits_per_char or whatever the appropriate #define is.
Now, if you're going to argue those guarantees don't count, Ok. I'd say some 
do and some don't - I wouldn't argue that point very hard. But if they're 
there, it's because enough people writing code in the language rely on them 
being there that it was worth standardizing.
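For what it's worth, here is a rough sketch of how one might check the layout properties listed above. The Pair struct and the helper names are invented for the example, and CHAR_BIT (from <climits>) is the macro half-remembered above as "bits_per_char". One caveat: as far as I know the standard permits padding bits in unsigned in principle, so the last helper reports the difference instead of assuming it is zero.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

struct Pair { int a; int b; };           // hypothetical example type

// Bytes from the start of `p` to the start of member `m`.
template <typename M>
std::ptrdiff_t byte_offset(const Pair& p, const M& m) {
    return reinterpret_cast<const unsigned char*>(&m) -
           reinterpret_cast<const unsigned char*>(&p);
}

// Every member starts and ends inside [&p, &p + sizeof p).
bool members_inside(const Pair& p) {
    const std::ptrdiff_t size = static_cast<std::ptrdiff_t>(sizeof p);
    const std::ptrdiff_t int_size = static_cast<std::ptrdiff_t>(sizeof(int));
    return byte_offset(p, p.a) >= 0 && byte_offset(p, p.a) + int_size <= size &&
           byte_offset(p, p.b) >= 0 && byte_offset(p, p.b) + int_size <= size;
}

// Members declared later (with the same access level) sit at higher addresses.
bool b_after_a(const Pair& p) { return &p.b > &p.a; }

// Array elements are exactly sizeof(element) bytes apart.
bool array_contiguous() {
    Pair arr[2] = {};
    return reinterpret_cast<const unsigned char*>(&arr[1]) -
           reinterpret_cast<const unsigned char*>(&arr[0]) ==
           static_cast<std::ptrdiff_t>(sizeof(Pair));
}

// Storage bits of unsigned minus its value bits (0 on mainstream platforms).
int unsigned_padding_bits() {
    return static_cast<int>(sizeof(unsigned) * CHAR_BIT) -
           std::numeric_limits<unsigned>::digits;
}
```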
>   The standard allows doing that even if it's undefined behavior simply
> because imposing restrictions on that would mean that compilers would
> have to generate significantly less efficient code even for programs
> which don't even attempt doing that. Many people don't mind.
Can I get you to come work for my company? :-)
-- 
Darren New, San Diego CA, USA (PST)
   The question in today's corporate environment is not
   so much "what color is your parachute?" as it is
   "what color is your nose?"
>> Scientific applications are a vanishingly small market segment.
> 
> I'm not so sure about that.
How many people use computers to do scientific calculations?
How many people use computers to run their office suite?
Yeah, exactly.
>>> That's not "mainstream", that's "popular". :-)
>> There's a difference?
> 
> Yes.
Care to elaborate on what it is?
>> Why would a branch have only one child? 
> 
> Populate the tree one element at the time. At least half the time, 
> you'll have a node with only one child.
How do you figure that?
Tree with 1 element: L
Tree with 2 elements: L + L
Tree with 3 elements: L + (L + L)
Tree with 4 elements: (L + L) + (L + L)
Every node has either zero or two children.
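As a sanity check, here is a small C++ sketch that builds trees with exactly those shapes. It is only one possible construction (split the leaves as evenly as possible, smaller half on the left), not necessarily the insertion algorithm either of us had in mind, and Node, build, show and zero_or_two are names made up for the example.

```cpp
#include <memory>
#include <string>

struct Node {
    std::unique_ptr<Node> left, right;   // both null for a leaf
};

// Build a tree with `leaves` leaves (leaves >= 1), split as evenly as possible.
std::unique_ptr<Node> build(int leaves) {
    std::unique_ptr<Node> n(new Node());
    if (leaves > 1) {
        n->left  = build(leaves / 2);
        n->right = build(leaves - leaves / 2);
    }
    return n;
}

// Render the tree in the "L + (L + L)" notation, with outer parentheses.
std::string show(const Node& n) {
    if (!n.left) return "L";
    return "(" + show(*n.left) + " + " + show(*n.right) + ")";
}

// True if every node has either zero or two children.
bool zero_or_two(const Node& n) {
    if (!n.left && !n.right) return true;
    if (!n.left || !n.right) return false;
    return zero_or_two(*n.left) && zero_or_two(*n.right);
}
```

With this construction, show(*build(3)) gives "(L + (L + L))" and no node ever ends up with a single child.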
> Perhaps the reason you don't see it a lot in Haskell is because the 
> definition of a node with only one leaf would be ugly. :-)
No, it would be as trivial as the usual binary tree.
>> I haven't seen an N-ary tree in an OO language. ;-) Only strictly 
>> *binary* trees.
> 
> Wow. You need to read more code.  You've never seen a hashtable full of 
> hashtables, or a parse tree?
I've never read the implementation of a hash table, for that matter.
Now a *parse tree* probably has a much more complicated structure than a 
simple binary tree. I can well imagine it containing nodes with all 
sorts of numbers of children. (And there might well be some complicated 
inheritance relationships between the node types.)
> Let's ask this: what good is encapsulation? Why do people think 
> encapsulation is a good thing?
> 
> Answer: because it limits where you have to look to understand the 
> behavior of code. If your language has encapsulation, you can look at 
> the class of that object to determine what accesses and modifies its 
> internal state. If the language has encapsulation but also an escape to 
> an unsafe language (like JNI or P/Invoke or so), then you have to look 
> at your class and all unsafe code that might change the contents of your 
> class.
> 
> Let's ask this: Why doesn't C++ have encapsulation?
> 
> Answer: Where in your code might lie a bug that is causing your class to 
> violate its invariants? If you have two variables in your instance, one 
> of which must always be two times the other, where do you have to look 
> if that is not the case? How much of your code base might cause that 
> change?
In just about any programming language imaginable, it would be possible 
to look at the raw assembly code generated to figure out where some data 
is physically stored in memory, and then link in some raw assembly to 
directly access that data. So by this argument, because it's technically 
possible to circumvent the language restrictions in just about any 
programming language, no programming language exists which supports 
encapsulation.
Which is, of course, nonsense. Real programs don't do this kind of 
thing. If your program doesn't attempt to circumvent encapsulation, then 
you get the benefits of encapsulation.
>>> What is this?
>>>
>>>    (lambda (x) (x + 1))
>>>
>>> It's a lambda expression.
>>>
>>>    y = (lambda (x) (x + 1))
>>>
>>> What is y?  It's a closure.
>>
>> I still don't get it. (But then, I don't even know what language that 
>> is...)
> 
> Do you understand the difference between classes and instances?
Yes. (But I'm not sure how this is related to functions...)
>> YOU wrote this paragraph, not me. ;-)
> 
> OK. My bad.  Altho that would certainly explain why I was confused. ;-)
Heh. Talking to yourself is the first sign of madness. ;-)
>> And this isn't the case for Haskell?
> 
> I don't know. Maybe it is, but you don't make it sound that way.
I can't even remember what this statement refers to now...
>> As I said, the likes of Galois and Well Typed make their money 
>> primarily designing systems where correctness is vital. What kind of 
>> systems do you suppose those are?
> 
> I couldn't guess. Galois has an empty web site and Well Typed are 
> consultants.
> 
> I'd honestly be quite surprised if either one builds systems in Haskell 
> where correctness is actually vital. As in, an error in the program 
> means people die.
No, I think it's more "failure would cost us craploads of money, very 
fast". I believe that's the kind of area they work in. (They also seem 
to do quite a bit of government work.)
>> (It's also conspicuous that both Galois and Well Typed employ people 
>> who are also GHC developers... so maybe that's your answer!)
> 
> Yep.
Of course, if you're the language implementers, you can implement 
whatever you want. ;-) (Within reason, of course...)
I wonder if any of the other Haskell compilers will be production-ready 
any time soon?
Invisible wrote:
>>> Scientific applications are a vanishingly small market segment.
>>
>> I'm not so sure about that.
> 
> How many people use computers to do scientific calculations?
> How many people use computers to run their office suite?
> Yeah, exactly.
Sure. I wouldn't say "vanishingly small", given the number of offices that 
actually do science for a living.
> Care to elaborate on what it is?
Not especially. Let's just mention "COBOL".
> Every node has either zero or two children.
Fair dinkum. I was thinking of a different algorithm.
> I've never read the implementation of a hash table, for that matter.
You should. I expect things like skip lists and skip graphs and DHTs are the 
sorts of things you'd get into.
> In just about any programming language imaginable, it would be possible 
> to look at the raw assembly code generated to figure out where some data 
> is physically stored in memory, and then link in some raw assembly to 
> directly access that data. 
So, because you can link assembly code to your Haskell, that means Haskell 
isn't functional? Because assembly code isn't functional?
> Which is, of course, nonsense. Real programs don't do this kind of 
> thing. If your program doesn't attempt to circumvent encapsulation, then 
> you get the benefits of encapsulation.
Uh, no. That's my point. You might get some of the benefits of encapsulation 
(like correct code being easier to understand), but if neither the compiler 
nor the runtime enforce encapsulation and you have no way of detecting 
errors that violate encapsulation, then your language is doing a poor job of 
encapsulation, at least for many of the benefits of encapsulation.
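Here's a sketch of the kind of thing I mean, using the "one variable must always be two times the other" example. The Doubled class and the corrupt function are invented for illustration; corrupt stands in for a wild pointer bug somewhere else in the program by flipping one bit in the object's bytes. It compiles without complaint, and the "private" members don't notice.

```cpp
class Doubled {
public:
    // The only sanctioned way to change state; keeps twice_ == 2 * base_.
    void set(int base) { base_ = base; twice_ = 2 * base; }
    bool invariant_holds() const { return twice_ == 2 * base_; }
private:
    int base_ = 0;
    int twice_ = 0;
};

// Stand-in for a stray write elsewhere in the program: it never names a
// private member, yet it changes one by touching the object's raw bytes.
void corrupt(Doubled& d) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&d);
    bytes[0] ^= 0x01;                    // flips one bit inside base_
}
```

After corrupt runs, invariant_holds() returns false, and nothing in the class's own source tells you where to look for the culprit.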
>>>> What is this?
>>>>
>>>>    (lambda (x) (x + 1))
>>>>
>>>> It's a lambda expression.
>>>>
>>>>    y = (lambda (x) (x + 1))
>>>>
>>>> What is y?  It's a closure.
>>>
>>> I still don't get it. (But then, I don't even know what language that 
>>> is...)
>>
>> Do you understand the difference between classes and instances?
> 
> Yes. (But I'm not sure how this is related to functions...)
"Lambda" is the class. "Closure" is the instance.
Lambda is an expression. A closure is the value of the expression.
What is this:
    \ x . x + y
It's a lambda expression. It's not a value, because you don't know what "y" 
is. It means different things depending on where it is in the program and 
when you evaluate it. Every time you evaluate it, you'll get something 
different back, depending on "y", yes?
That thing you get back, that encapsulates the value of "y"? That's a closure.
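If it helps, here's roughly the same idea as a C++ sketch (make_adder is just a name I picked). The lambda expression appears once in the source, but each call to make_adder evaluates it with a different "y" and hands back a different closure.

```cpp
#include <functional>

// The lambda expression [y](int x) { return x + y; } is written once;
// each call evaluates it and returns a closure holding its own copy of y.
std::function<int(int)> make_adder(int y) {
    return [y](int x) { return x + y; };
}
```

So make_adder(2) and make_adder(10) are two closures from the same lambda expression: calling the first with 1 gives 3, calling the second with 1 gives 11.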
> Heh. Talking to yourself is the first sign of madness. ;-)
Oh, you can't help that. We're all mad here.
-- 
Darren New, San Diego CA, USA (PST)
   The question in today's corporate environment is not
   so much "what color is your parachute?" as it is
   "what color is your nose?"
Darren New <dne### [at] san rr  com> wrote:
> Warp wrote:
> >   *That* is the beauty of encapsulation.
> OK. So you agree that Python also has encapsulation?
  Does the compiler enforce it?
> >   Do you really think that the compiler telling you "you must not access
> > this member, I refuse to compile it" is not a better way to enforce
> > encapsulation than just some documentation that nobody is going to read
> > anyways? 
> Not if you're not going to enforce it the other ways as well.
  Right. So if it's possible to modify the data by having some wild pointer
bug in the program, then it's not worth having compiler checks at all and
just have everything public.
  As you said, we'll just have to agree to disagree.
> >   If you must have "visible" members in the class declaration for technical
> > and/or efficiency reasons, it's better if the compiler offers a tool for
> > telling the user "even though you can see this you really shouldn't be
> > referencing it directly" rather than relying solely on a comment line.
> Sure. What about a naming convention that "comes with the language"? I.e., 
> not just a naming convention you made up, but one in common for every single 
> program written in the language from day one?
  By that reasoning you could argue that C is an object-oriented language.
After all, you can get OO'ish behavior if you follow certain coding
conventions.
> >   So yes, it's much better if the compiler tells the user he is accessing
> > what he shouldn't.
> I agree it's better. But since you seem to like binary definitions, and you 
> don't like me saying "C++ has a little bit of encapsulation but not much", 
> I'm curious if you think that (say) Python has encapsulation, given that 
> everyone knows that names with _ are private to the class they're in?
  How is it encapsulation if everything is public?
> >   But where do you draw the line between what is and isn't encapsulation?
> I don't make it a binary choice. I say "C has no encapsulation, Python has 
> very little (more modularity than encapsulation), C++ has a little more than 
> Python (because the compiler refuses to compile stuff that violates the 
> language rules)
  That actually contradicts what you said originally. Originally you said
that C++ has *no* encapsulation. Clearly you were drawing a clear line
somewhere.
> >> In contrast, C++ has well-defined mechanisms for bypassing the encapsulation 
> >> that are used as a normal part of programming in that language.
> > 
> >   Actually it doesn't. You can read the binary representation of an object
> > in memory, but the standard gives no guarantees as to which byte means what.
> I don't believe that's true. It certainly wasn't in C.
  What wasn't in C?
  As for the C++ standard, it does certainly not give any guarantees about
the bit representation of objects in memory. It's implementation-defined.
You can read the bits, but there's no guarantee about their meaning (nor
that the meaning won't change from class to class).
> > The memory layout of objects is completely implementation-defined
> Nonsense. Arrays are contiguous.
  Arrays are not objects. Arrays are a collection of objects.
> The elements of a structure are all between 
> &x and &x + sizeof(x).
  No, they aren't. The standard doesn't specify how much padding there may
be between struct elements. The compiler can add as much padding as it wants
(or none at all).
  Also, the standard doesn't guarantee that the first member of a struct
starts from the same memory address as the struct instance itself. (In
practice this can actually differ if the struct has virtual functions.
In that case the pointer-to-struct-instance will not point to the first
member of the struct. There will be some implementation-dependent vtable
pointer there instead.)
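  A quick way to see this on a given compiler is sketched below. The struct names and the helper are made up for the example. For the plain struct the offset is 0 (that much is guaranteed for standard-layout types); for the struct with a virtual function the result depends entirely on the implementation, so treat whatever number it reports as an observation, not a promise.

```cpp
#include <cstddef>

struct Plain { int first; };                               // no virtual functions
struct Poly  { virtual ~Poly() = default; int first = 0; }; // has a vtable pointer somewhere

// Distance in bytes from the start of the object to its member `first`.
template <typename T>
std::ptrdiff_t first_member_offset(const T& obj) {
    return reinterpret_cast<const char*>(&obj.first) -
           reinterpret_cast<const char*>(&obj);
}
```

  On common 64-bit implementations first_member_offset on a Poly typically reports 8, but nothing in the standard says so.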
  If a member is private, you have no safe, portable way of accessing it
from the outside.
>  There's lots of guarantees. And I'm pretty sure that 
> if you have "struct x {int a; int b;} y;" that the standard guarantees
> &y.b > &y.a.
  Probably, but if a and b are private, you have no portable way of getting
an address to them. You don't know the offset.
> Positive integers are represented in twos-complement coding. 
  I'm not completely sure the standard guarantees that.
> Unsigned integers occupy the number of bits indicated by multiplying 
> sizeof(unsigned) by bits_per_char or whatever the appropriate #define is.
  Well, duh, because sizeof(type) is *defined* as telling the amount of
bytes that the type requires.
  That still doesn't help you accessing private data of an object portably.
> Now, if you're going to argue those guarantees don't count, Ok.
  Don't count to what? The issue was whether it's *well-defined* to break
encapsulation in C++. Your own words.
  It's not well-defined. You can try, but the results are not guaranteed.
> >   The standard allows doing that even if it's undefined behavior simply
> > because imposing restrictions on that would mean that compilers would
> > have to generate significantly less efficient code even for programs
> > which don't even attempt doing that. Many people don't mind.
> Can I get you to come work for my company? :-)
  You mean C++ is causing problems there?
  I suppose I should consider myself lucky in that I get to work in
projects where I don't have to go fixing existing code made by incompetent
C++ programmers...
-- 
                                                          - Warp
Warp wrote:
>   Right. So if it's possible to modify the data by having some wild pointer
> bug in the program, then it's not worth having compiler checks at all and
> just have everything public.
Well, sure, it's a little bit better. Not a whole lot better. As I said, 
it's a sliding scale.
For me, not sufficiently important that it's worth worrying about. The fact 
that I can use a pointer to step through the data of a class with private 
members is no better than the fact that I can iterate over the members of a 
Python instance by groping thru the hash table that implements it.
>> Sure. What about a naming convention that "comes with the language"? I.e., 
>> not just a naming convention you made up, but one in common for every single 
>> program written in the language from day one?
> 
>   By that reasoning you could argue that C is an object-oriented language.
> After all, you can get OO'ish behavior if you follow certain coding
> conventions.
Well, that's what I'm asking. I find it odd that "encapsulation" covers 
exactly everything that C++ does, no more no less.
>>>   So yes, it's much better if the compiler tells the user he is accessing
>>> what he shouldn't.
> 
>> I agree it's better. But since you seem to like binary definitions, and you 
>> don't like me saying "C++ has a little bit of encapsulation but not much", 
>> I'm curious if you think that (say) Python has encapsulation, given that 
>> everyone knows that names with _ are private to the class they're in?
> 
>   How is it encapsulation if everything is public?
"""
A language construct that facilitates the bundling of data with the methods 
operating on that data.
"""
Again, it's the definition you pointed to.
>   That actually contradicts what you said originally. Originally you said
> that C++ has *no* encapsulation. Clearly you were drawing a clear line
> somewhere.
It has so little encapsulation (in the "restricting access" part of the 
definition) that it might as well not have encapsulation. If you're talking 
about the data hiding part, I'd say that unsafe languages don't have that. 
Many have the other "bundling data and code" kind of encapsulation.
In other words, I find that marking something as "private" doesn't give 
significantly more protection to private data than marking something with an 
underline does in Python, in practice. In practice, both are trivial to get 
around, accidentally or on purpose.
>>>> In contrast, C++ has well-defined mechanisms for bypassing the encapsulation 
>>>> that are used as a normal part of programming in that language.
>>>   Actually it doesn't. You can read the binary representation of an object
>>> in memory, but the standard gives no guarantees as to which byte means what.
> 
>> I don't believe that's true. It certainly wasn't in C.
> 
>   What wasn't in C?
That the standard said there's no guarantees as to which bytes mean what or 
how things are laid out in memory.
>   As for the C++ standard, it does certainly not give any guarantees about
> the bit representation of objects in memory. It's implementation-defined.
Funky. I was pretty sure at least unsigned had a guarantee that they were 
represented in two's complement binary (e.g., that the bit pattern 11 in 
binary was the integer value 3).
>>> The memory layout of objects is completely implementation-defined
>> Nonsense. Arrays are contiguous.
>   Arrays are not objects. Arrays are a collection of objects.
Yeah, I figured that would be your answer.
>> The elements of a structure are all between 
>> &x and &x + sizeof(x).
> 
>   No, they aren't. The standard doesn't specify how much padding there may
> be between struct elements. The compiler can add as much padding as it wants
> (or none at all).
That doesn't invalidate the formula. I didn't say there was nothing else there.
>   Also, the standard doesn't guarantee that the first member of a struct
> starts from the same memory address as the struct instance itself. (In
> practice this can actually differ if the struct has virtual functions.
> In that case the pointer-to-struct-instance will not point to the first
> member of the struct. There will be some implementation-dependent vtable
> pointer there instead.)
I didn't say that it did.
Wasn't one idea of C++ that structs without vtables would be laid out the 
same way as C structures?
>   If a member is private, you have no safe, portable way of accessing it
> from the outside.
Right. But I'm not talking about the safe parts. :-)
In any case, yes, there is, because the class can return a pointer to the 
private variable. You might say "that means it's not private any more", but 
it's still accessing the program's private variables. (Uh, that *is* legal 
in C++, right? Returning an int* that points to a private instance variable? 
If not, then color me suitably impressed. :-)
>>  There's lots of guarantees. And I'm pretty sure that 
>> if you have "struct x {int a; int b;} y;" that the standard guarantees
>> &y.b > &y.a.
> 
>   Probably, but if a and b are private, you have no portable way of getting
> an address to them. You don't know the offset.
You do inside the class. :-)  And besides, I'm just disputing your assertion 
that the language makes no guarantees about the layout of data.
I've worked with languages that make *no* guarantees about the layout. Where 
the lowest bit of *every* integer was 1, regardless of its value. Where 
pointers have reference counts and type flags in the high bits. Stuff like that.
Trust me, C++ makes some guarantees. :-)
>> Positive integers are represented in twos-complement coding. 
>   I'm not completely sure the standard guarantees that.
> 
>> Unsigned integers occupy the number of bits indicated by multiplying 
>> sizeof(unsigned) by bits_per_char or whatever the appropriate #define is.
> 
>   Well, duh, because sizeof(type) is *defined* as telling the amount of
> bytes that the type requires.
Yes. And then you multiply by bits_per_char to find out the range, which 
means unsigneds don't have padding in the middle, for example.
>   That still doesn't help you accessing private data of an object portably.
That's not what I'm discussing at this point. Besides, you can *access* them 
portably. You can't *interpret* them portably, perhaps, if the structure is 
complicated enough.
>> Now, if you're going to argue those guarantees don't count, Ok.
> 
>   Don't count to what? The issue was whether it's *well-defined* to break
> encapsulation in C++. Your own words.
Don't count the guarantees I listed. You said "C++ makes no guarantees on 
layout" and I listed a bunch it makes.
And breaking encapsulation is portably accessing private members from 
outside the class. Which I can certainly do using memcpy for example.
>   It's not well-defined. You can try, but the results are not guaranteed.
So write(out, &xyz, sizeof(xyz)) where xyz is a class or struct with private 
members might actually crash?
memcpy(&myarray[0], &myarray[1], sizeof(myarray[0])) isn't well-defined? 
There's no meaning for that statement when myarray[0] is a struct with 
private members?
Regardless of whether you know which private variable is laid out where, 
you're still "accessing private variables" in ways that more secure 
languages disallow.
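A minimal sketch of what I'm getting at, with a made-up Secret class and peek function. I've deliberately picked the easy case: a trivially copyable, standard-layout class whose only member is an int, so the first sizeof(int) bytes of the object are that int. With a more complicated class you could still copy the bytes out, you just couldn't portably say which bytes belong to which member.

```cpp
#include <cstring>
#include <type_traits>

class Secret {
public:
    explicit Secret(int v) : value_(v) {}
private:
    int value_;                          // no accessor is offered for this
};

// The assumptions this sketch leans on, checked at compile time.
static_assert(std::is_trivially_copyable<Secret>::value, "must be memcpy-able");
static_assert(std::is_standard_layout<Secret>::value, "first member must sit at offset 0");

// Read the private member from outside the class without ever naming it.
int peek(const Secret& s) {
    int out;
    std::memcpy(&out, &s, sizeof out);   // copies the bytes of value_
    return out;
}
```

The compiler's access check never fires, because no private name is ever mentioned.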
>   You mean C++ is causing problems there?
Oh, you wouldn't believe. Not C++ per se, but unsafe languages in general, 
poorly organized and badly documented.
>   I suppose I should consider myself lucky in that I get to work in
> projects where I don't have to go fixing existing code made by incompetent
> C++ programmers...
Yep. Well, honestly, the whole development chain is screwed. Not only is the 
code buggy, but the authors are unwilling to fix it, describe it, document 
it, or really take any responsibility at all for its sorry shape.
-- 
Darren New, San Diego CA, USA (PST)
   The question in today's corporate environment is not
   so much "what color is your parachute?" as it is
   "what color is your nose?"
Darren New <dne### [at] san rr  com> wrote:
> >   How is it encapsulation if everything is public?
> """
> A language construct that facilitates the bundling of data with the methods 
> operating on that data.
> """
> Again, it's the definition you pointed to.
  Only if you ignore the data hiding part, which is essential for
abstraction.
> >   That actually contradicts what you said originally. Originally you said
> > that C++ has *no* encapsulation. Clearly you were drawing a clear line
> > somewhere.
> It has so little encapsulation (in the "restricting access" part of the 
> definition) that it might as well not have encapsulation.
  You are exaggerating. On purpose.
> In other words, I find that marking something as "private" doesn't give 
> significantly more protection to private data than marking something with an 
> underline does in Python, in practice. In practice, both are trivial to get 
> around, accidentally or on purpose.
  You are exaggerating. On purpose.
  You can't bypass the access rights in C++ if you want to write a program
which is correct and portable. If you try to bypass it, your program will
not be portable, will be against the standard, and thus incorrect. C++ allows
you to write incorrect programs which will compile and run (on one given
system) for efficiency reasons, but that doesn't make the program correct.
  That's a *completely* different thing than having everything public and
only using a naming convention to denote "private" members. If the member
is public, then there will be a 100% correct way of accessing it from the
outside which is not implementation-defined.
  There's a huge difference. You make it sound like there was no difference
at all. Just for the sake of argument.
> >> I don't believe that's true. It certainly wasn't in C.
> > 
> >   What wasn't in C?
> That the standard said there's no guarantees as to which bytes mean what or 
> how things are laid out in memory.
  I don't think the C standard guarantees how much padding there will be
between members of a struct either. The bit representation of a struct
instance may change from system to system, and even from compiler to
compiler, and hence you must not make assumptions if you want your program
to be portable.
  (Moreover, C doesn't guarantee endianness either.)
> >   As for the C++ standard, it does certainly not give any guarantees about
> > the bit representation of objects in memory. It's implementation-defined.
> Funky. I was pretty sure at least unsigned had a guarantee that they were 
> represented in two's complement binary (e.g., that the bit pattern 11 in 
> binary was the integer value 3).
  An unsigned is not an object, mind you. (We are talking, after all, about
accessing the private members of an object. An unsigned integer has no such
thing.)
  Also the standard doesn't guarantee word endianness, so the bit pattern
may actually differ from one system to the next even when we are talking
about unsigned integers.
> >> The elements of a structure are all between 
> >> &x and &x + sizeof(x).
> > 
> >   No, they aren't. The standard doesn't specify how much padding there may
> > be between struct elements. The compiler can add as much padding as it wants
> > (or none at all).
> That doesn't invalidate the formula. I didn't say there was nothing else there.
  So you said that "a given type, all by itself, will require sizeof(type)
bytes of memory". Well, duh, that's how sizeof() is defined, after all.
  However, the subject in question here is objects and accessing their
private members. You have no portable way of knowing the offset of members
inside objects because their layout is not guaranteed. The first member
might not start from the same address as the object's pointer, and the
amount of padding between members is implementation-defined. Just because
you know that a member requires n bytes of memory doesn't help you much
in resolving its location inside the object.
  (That doesn't mean you can't have a pointer pointing to a member of an
object, in a portable way. However, that requires either for the member
to be public or for the object to give you the pointer.)
> >   Also, the standard doesn't guarantee that the first member of a struct
> > starts from the same memory address as the struct instance itself. (In
> > practice this can actually differ if the struct has virtual functions.
> > In that case the pointer-to-struct-instance will not point to the first
> > member of the struct. There will be some implementation-dependent vtable
> > pointer there instead.)
> I didn't say that it did.
  But we are talking about accessing private members here.
> Wasn't one idea of C++ that structs without vtables would be laid out the
> same way as C structures?
  But the C standard doesn't give guarantees about padding between members
either, AFAIK.
> In any case, yes, there is, because the class can return a pointer to the 
> private variable. You might say "that means it's not private any more", but 
> it's still accessing the program's private variables.
  Of course if the object exposes the private member in its public interface,
then the outside can have access to it. Returning a reference or pointer to
a member exposes it to the outside.
  (There's still the minor difference that it's slightly more abstract than
having the member in the public part of the class in that you could change
the method to return a reference/pointer to something else.)
> (Uh, that *is* legal 
> in C++, right? Returning an int* that points to a private instance variable? 
> If not, then color me suitably impressed. :-)
  It is possible to return a pointer to a member variable. (It's generally
regarded as bad design. Returning a const reference is a borderline case,
sometimes used because returning big members by value can be quite
inefficient.)
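A small sketch of both cases, with a made-up `Account` class (none of these names come from the thread). The const reference gives a read-only view without a copy; the raw pointer hands the caller write access to private state.

```cpp
#include <cassert>
#include <string>

// Hypothetical class, for illustration only.
class Account {
public:
    Account(const std::string& owner, int balance)
        : owner_(owner), balance_(balance) {}

    // Borderline but common: avoids copying a potentially large member.
    const std::string& owner() const { return owner_; }

    // Generally regarded as bad design: hands out write access to private state.
    int* balance_ptr() { return &balance_; }

    int balance() const { return balance_; }

private:
    std::string owner_;
    int balance_;
};

void check_exposure() {
    Account acct("alice", 100);
    *acct.balance_ptr() = -1;          // legal: the class itself exposed the member
    assert(acct.balance() == -1);
    assert(acct.owner() == "alice");   // read-only view, no copy made
}
```

The class could later change `balance_ptr()` to point at something else, which is the extra bit of abstraction mentioned above compared with a plain public member.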
> >>  There's lots of guarantees. And I'm pretty sure that 
> >> if you have "struct x {int a; int b;} y;" that the standard guarantees
> >> &y.b > &y.a.
> > 
> >   Probably, but if a and b are private, you have no portable way of getting
> > an address to them. You don't know the offset.
> You do inside the class. :-)
  "From the outside" implied that the struct doesn't want to expose them.
>  And besides, I'm just disputing your assertion 
> that the language makes no guarantees about the layout of data.
  Well, it doesn't. Not any that would help you access them from the
outside if they are unexposed.
  (Btw, I didn't mean "doesn't give any guarantees whatsoever". I meant
"doesn't give any guarantees that would help you resolve the address of
a private member from the outside in a portable way".)
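The ordering guarantee under discussion can be sketched as follows. `X` mirrors the struct from the quoted text; `Pair` and `b_after_a()` are hypothetical. The rule relied on is that non-static members declared with the same access control are allocated so that later members have higher addresses. With private members only the class itself can make the comparison, since outside code cannot name them.

```cpp
#include <cassert>

// Hypothetical class, for illustration only.
class Pair {
public:
    Pair() : a(1), b(2) {}
    // Only code inside the class can name the private members.
    bool b_after_a() const { return &b > &a; }
private:
    int a;
    int b;   // same access level as a, declared later
};

struct X { int a; int b; };

void check_order() {
    X y = {1, 2};
    assert(&y.b > &y.a);          // same access control, later member => higher address
    assert(Pair().b_after_a());   // same rule, but only checkable from the inside
}
```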
> >> Positive integers are represented in twos-complement coding. 
> >   I'm not completely sure the standard guarantees that.
> > 
> >> Unsigned integers occupy the number of bits indicated by multiplying
> >> sizeof(unsigned) by bits_per_char or whatever the appropriate #define is.
> > 
> >   Well, duh, because sizeof(type) is *defined* as telling the amount of
> > bytes that the type requires.
> Yes. And then you multiply by bits_per_char to find out the range, which
> means unsigneds don't have padding in the middle, for example.
  But if you had, for example, an array of two unsigneds, I'm not sure the
standard guarantees they will be at consecutive memory locations, with no
padding between them. (I could be wrong on this one, though.)
  Not that any compiler in the universe would put padding between them,
mind you, but it's probable that the standard doesn't *force* compilers
not to use padding (e.g. because some exotic hardware requires it).
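For what it's worth, the array question can be checked directly. The standard describes an array as a contiguously allocated sequence of elements, so `sizeof` of a two-element array is exactly twice the element size. A sketch, assuming C++11 for `static_assert` (the function name is made up); `CHAR_BIT` from `<climits>` is the "bits per char" macro referred to above:

```cpp
#include <cassert>
#include <climits>
#include <limits>

void check_unsigned_array() {
    unsigned pair[2] = {1u, 2u};

    // Array elements are contiguous: the second element starts exactly
    // sizeof(unsigned) bytes after the first.
    static_assert(sizeof(pair) == 2 * sizeof(unsigned), "no padding between elements");
    assert(&pair[1] == &pair[0] + 1);

    // Value bits can be fewer than the storage bits if the type has padding
    // bits, so this is <= rather than ==.
    assert(std::numeric_limits<unsigned>::digits <=
           static_cast<int>(sizeof(unsigned) * CHAR_BIT));
}
```

Any padding a platform needs therefore has to live inside `sizeof(unsigned)` itself, not between array elements.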
> >   That still doesn't help you access private data of an object portably.
> That's not what I'm discussing at this point. Besides, you can *access* them 
> portably. You can't *interpret* them portably, perhaps, if the structure is 
> complicated enough.
  Well, if "access" means "reading their bit pattern in memory", then yes,
you can "access" them portably.
> >> Now, if you're going to argue those guarantees don't count, Ok.
> > 
> >   Don't count to what? The issue was whether it's *well-defined* to break
> > encapsulation in C++. Your own words.
> Don't count the guarantees I listed. You said "C++ makes no guarantees on
> layout" and I listed a bunch it makes.
  The context in which I said it implied "no guarantees which would help you
resolve the address of a member variable from the outside".
> And breaking encapsulation is portably accessing private members from 
> outside the class. Which I can certainly do using memcpy for example.
  What you are doing with memcpy is copying the bit pattern of the object
as it is stored in memory. It's arguable whether that's "accessing the
private members" of an object.
  To me, accessing means that you actually read (or even write) a private
member variable and write code which depends on it. You can't do that with
memcpy because it doesn't tell you the value of *that* member variable in
question. It only gives you the byte data of the entire object. You can't
start interpreting that byte data in a portable way.
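A sketch of the distinction, using a made-up `Secret` class and `dump_bytes()` helper. The copy itself is fine, and copying the bytes back into an object of the same type restores it, but nothing in the byte buffer says where `value_` lives or how it is encoded.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class, for illustration only.
class Secret {
public:
    explicit Secret(int v) : tag_('s'), value_(v) {}
    char tag() const { return tag_; }
    int value() const { return value_; }
private:
    char tag_;
    int  value_;
};

// Copies the raw bytes of s into buf and returns how many bytes were copied.
// buf must have room for sizeof(Secret) bytes.
std::size_t dump_bytes(const Secret& s, unsigned char* buf) {
    std::memcpy(buf, &s, sizeof(Secret));
    return sizeof(Secret);
}

void check_dump() {
    Secret original(42);
    unsigned char bytes[sizeof(Secret)];
    assert(dump_bytes(original, bytes) == sizeof(Secret));

    // The bytes are an opaque blob: where value_ sits inside them, and in
    // what byte order, is not something portable code can rely on.
    // Copying the blob back into an object of the same type is fine, though.
    Secret restored(0);
    std::memcpy(&restored, bytes, sizeof(Secret));
    assert(restored.value() == 42);
    assert(restored.tag() == 's');
}
```

The round trip works here because `Secret` has no virtual functions and only compiler-generated copy operations; that restriction is what the memcpy discussion further down turns on.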
> >   It's not well-defined. You can try, but the results are not guaranteed.
> So write(out, &xyz, sizeof(xyz)) where xyz is a class or struct with private 
> members might actually crash?
  Exactly which member variable did you just access? "All of them" isn't an
answer.
  What you did was take the bit pattern of the object in memory and write
it somewhere.
> memcpy(&myarray[0], &myarray[1], sizeof(myarray[0])) isn't well-defined?
  Only in a very limited set of cases, actually. If myarray consists of
class instances, the result is undefined behavior (because you are
bypassing copy constructors).
  With basic types it's well-defined, and maybe for PODs with trivial
constructors and destructors.
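In later terminology (C++11, which postdates this thread) the dividing line is whether the type is trivially copyable, and `std::is_trivially_copyable` lets the compiler answer that. A sketch with hypothetical `Point` and `Named` types; note that private members on their own do not make a class non-trivially-copyable.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical types, for illustration only.
class Point {                 // private members, but no virtuals and
public:                       // compiler-generated copy operations
    Point(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }
    int y() const { return y_; }
private:
    int x_, y_;
};

struct Named {                // std::string has a non-trivial copy constructor
    std::string name;
};

static_assert(std::is_trivially_copyable<Point>::value,
              "memcpy between Point objects is well-defined");
static_assert(!std::is_trivially_copyable<Named>::value,
              "memcpy between Named objects is not");

void check_memcpy() {
    Point pts[2] = { Point(1, 2), Point(3, 4) };
    std::memcpy(&pts[0], &pts[1], sizeof(pts[0]));   // note the & on both operands
    assert(pts[0].x() == 3 && pts[0].y() == 4);
}
```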
> There's no meaning for that statement when myarray[0] is a struct with 
> private members?
  It depends on what kind of constructor/destructor/copy constructor the
struct has.
> Regardless of whether you know which private variable is laid out where, 
> you're still "accessing private variables" in ways that more secure 
> languages disallow.
  Well, C++ doesn't stop you from reading the bytes of an object in memory.
Nobody is denying that.
  Whether that constitutes "accessing a private member" is arguable.
> >   You mean C++ is causing problems there?
> Oh, you wouldn't believe. Not C++ per se, but unsafe languages in general, 
> poorly organized and badly documented.
  Safe languages never cause problems, then?
  (If I'm to believe thedailywtf.com, I don't think that's the case either.)
> >   I suppose I should consider myself lucky in that I get to work in
> > projects where I don't have to go fixing existing code made by incompetent
> > C++ programmers...
> Yep. Well, honestly, the whole development chain is screwed. Not only is the 
> code buggy, but the authors are unwilling to fix it, describe it, document 
> it, or really take any responsibility at all for its sorry shape.
  I'm not sure using a "safe" language would fix those problems either.
-- 
                                                          - Warp

Warp wrote:
> Darren New <dne### [at] san rr  com> wrote:
>>>   How is it encapsulation if everything is public?
> 
>> """
>> A language construct that facilitates the bundling of data with the methods 
>> operating on that data.
>> """
> 
>> Again, it's the definition you pointed to.
> 
>   Only if you ignore the data hiding part, which is essential for
> abstraction.
Really, Python isn't noticeably less private than C++. C++ shows you 
everything. The compiler says "don't reference that by name", instead of the 
IDE doing it.
>> It has so little encapsulation (in the "restricting access" part of the 
>> definition) that it might as well not have encapsulation.
> 
>   You are exaggerating. On purpose.
I disagree. I'm simply expressing an opinion that differs from yours. If I 
know something is private, I don't try to access it.
>> In other words, I find that marking something as "private" doesn't give 
>> significantly more protection to private data than marking something with an 
>> underline does in Python, in practice. In practice, both are trivial to get 
>> around, accidentally or on purpose.
> 
>   You are exaggerating. On purpose.
Again, I disagree. I disagree that C++ refusing to compile when you access 
by name something marked as private is significantly more encapsulated than 
Python's naming convention that you shouldn't access fields marked with an 
underscore by name.
>   There's a huge difference. You make it sound like there was no difference
> at all. Just for the sake of argument.
In practice, I disagree. I recognize the difference. I'm just of the opinion 
that the difference is so small as to be irrelevant. You think that having 
the compiler complain is important. I don't.
Do you think Java doesn't have private members because there's a way of 
accessing them via reflection?
>   I don't think the C standard guarantees how much padding there will be
> between members of a struct either. The bit representation of a struct
> instance may change from system to system, and even from compiler to
> compiler, and hence you must not make assumptions if you want your program
> to be portable.
Right. But that's far from "makes no guarantees" you see. You often make a 
broad sweeping statement, and when I call you on it, you backpedal and then 
say I'm exaggerating "on purpose".
>>>> The elements of a structure are all between 
>>>> &x and &x + sizeof(x).
>>>   No, they aren't. The standard doesn't specify how much padding there may
>>> be between struct elements. The compiler can add as much padding as it wants
>>> (or none at all).
> 
>> That doesn't invalidate the formula. I didn't say there was nothing else there.
> 
>   So you said that "a given type, all by itself, will require sizeof(type)
> bytes of memory". Well, duh, that's how sizeof() is defined, after all.
You said "there are no guarantees". I said yes, there is at least the 
guarantee that the elements of a class even with private members are laid 
out in the space between the start and the end.
What you mean to say is "there aren't enough guarantees on layout to ensure 
that you can do something reasonable in a portable way by breaking 
encapsulation."  That's far different from "there are no guarantees."
You exaggerate, on purpose. ;-)
>   (That doesn't mean you can't have a pointer pointing to a member of an
> object, in a portable way. However, that requires either for the member
> to be public or for the object to give you the pointer.)
Yep. Fully agreed.
>> I didn't say that it did.
>   But we are talking about accessing private members here.
I can still do that. I can't do something *useful* with them, but *in 
practice*, I can zero out your object by mistake. You may discount that as 
irrelevant, but I don't. I don't think that any amount of definition 
wrangling is going to resolve *that* disagreement.
Unsafe languages don't keep code from accessing your private variables. They
keep you from reliably and portably accessing private variables in a correctly
written system where every piece is bug free.  You're taking the latter to
mean the former. I'm not.
>>  And besides, I'm just disputing your assertion 
>> that the language makes no guarantees about the layout of data.
> 
>   Well, it doesn't. Not any that would help you access them from the
> outside if they are unexposed.
See what you did there?
>   (Btw, I didn't mean "doesn't give any guarantees whatsoever". I meant
> "doesn't give any guarantees that would help you resolve the address of
> a private member from the outside in a portable way".)
OK. I'll agree with that. :-)
>   But if you had, for example, an array of two unsigneds, I'm not sure the
> standard guarantees they will be at consecutive memory locations, with no
> padding between them. (I could be wrong on this one, though.)
I'm not sure.
>   Not that any compiler in the universe would put padding between them,
> mind you, but it's probable that the standard doesn't *force* compilers
> not to use padding (e.g. because some exotic hardware requires it).
Certainly there are machines where floats, for example, need to be on 
particular boundaries and which might not be packed to those boundaries. 
Stupid design, but conceivable. :-)  Of course, on such machines, you 
usually can't get a C compiler.
>   Well, if "access" means "reading their bit pattern in memory", then yes,
> you can "access" them portably.
And, as I said, you can overwrite them, which *I* feel violates 
data-security encapsulation.
>   To me, accessing means that you actually read (or even write) a private
> member variable and write code which depends on it. 
OK. And to me, "accessing" means reading it or writing it without going thru 
the interface defined by the class.
>> memcpy(&myarray[0], &myarray[1], sizeof(myarray[0])) isn't well-defined?
> 
>   Only in a very limited set of cases, actually. If myarray consists of
> class instances, the result is undefined behavior (because you are
> bypassing copy constructors).
Fair dinkum.
>   With basic types it's well-defined, and maybe for PODs with trivial
> constructors and destructors.
And for structs or classes with no vtable, as those are defined to have the 
same layout as in C, and it works in C, yes?
>   Well, C++ doesn't stop you from reading the bytes of an object in memory.
> Nobody is denying that.
> 
>   Whether that constitutes "accessing a private member" is arguable.
Right. I think it is, because if you can change the values of my private 
variables without going thru my member functions, I have no way of enforcing 
my code's correctness or my invariants.
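A sketch of that scenario, with a made-up `Divisor` class and a `clear_buffer()` function standing in for a bug elsewhere in the program. It assumes an all-zero byte pattern reads back as the `int` value 0, which holds on mainstream platforms. The class enforces "never zero" through every member function, and the invariant still ends up broken without any private name being mentioned.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class, for illustration only.
class Divisor {
public:
    // Invariant enforced by the interface: value_ is never zero.
    explicit Divisor(int v) : value_(v == 0 ? 1 : v) {}
    void set(int v) { if (v != 0) value_ = v; }
    int get() const { return value_; }
private:
    int value_;
};

// Stand-in for a buggy routine elsewhere in the program that clears the
// wrong block of memory.
void clear_buffer(void* p, std::size_t n) {
    std::memset(p, 0, n);
}

void check_invariant_break() {
    Divisor d(0);
    assert(d.get() == 1);          // constructor upheld the invariant
    d.set(0);
    assert(d.get() == 1);          // so did the setter

    clear_buffer(&d, sizeof d);    // compiles without complaint
    assert(d.get() == 0);          // invariant gone, no member function involved
}
```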
>>>   You mean C++ is causing problems there?
> 
>> Oh, you wouldn't believe. Not C++ per se, but unsafe languages in general, 
>> poorly organized and badly documented.
> 
>   Safe languages never cause problems, then?
Safe languages merely make it easier to prove who has the bug. They also tend
to catch the bug when (say) you open the file, instead of 10 minutes later
when you try to parse what you read from it.
>> Yep. Well, honestly, the whole development chain is screwed. Not only is the 
>> code buggy, but the authors are unwilling to fix it, describe it, document 
>> it, or really take any responsibility at all for its sorry shape.
> 
>   I'm not sure using a "safe" language would fix those problems either.
Not completely, no. But at least you could assign blame. Right now, if I 
write code that invokes their library and it segfaults, yet it doesn't
segfault in their trivial test application that only ever invokes one 
operation in the library before exiting, then it must be my fault.
In a safe language, you could say "Look, it throws an exception in your code.
Find where that exception is, and see what causes it."
-- 
Darren New, San Diego CA, USA (PST)
   The question in today's corporate environment is not
   so much "what color is your parachute?" as it is
   "what color is your nose?"

Darren New wrote:
> Really, Python isn't noticeably less private than C++. C++ shows you 
> everything. The compiler says "don't reference that by name", instead of 
> the IDE doing it.
Or, to phrase it a different way, I've never had a problem caused by someone 
intentionally accessing a private member in a Python-like environment that 
would have been solved by making the code not compile when that happens.
I have never known someone to accidentally reference x._y and think they 
were supposed to be accessing _y as a normal part of the functioning of the 
program.
Have you ever known someone to accidentally name a C function __XYZ__ and 
not be aware they might be stepping on the compiler's namespace? Is it 
really a problem that the compiler doesn't prevent you from using names like 
that?
-- 
Darren New, San Diego CA, USA (PST)
   The question in today's corporate environment is not
   so much "what color is your parachute?" as it is
   "what color is your nose?"

Warp wrote:
>   (That doesn't mean you can't have a pointer pointing to a member of an
> object, in a portable way. However, that requires either for the member
> to be public or for the object to give you the pointer.)
Actually, just thinking about it a bit more, if the class returns a pointer 
to a member variable, and you can look in the header and see that pointer is 
in an array, then you have a well-defined way of accessing the other private 
data in the same array.
Like, if the class returns a FILE* from open(), and you look in the class 
declaration and there's something along the lines of
    FILE open_files[MAX_OPEN_FILES];
then chances are you can wander up and down that array as you like. :-)
I wouldn't say that particularly breaks encapsulation tho. Just a thought.
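A sketch of the idea, with `int` standing in for `FILE` and a hypothetical `Pool` class. Pointer arithmetic that stays within one array is well-defined, so a single handed-out element pointer reaches its neighbours too.

```cpp
#include <cassert>

// Hypothetical class, for illustration only; int stands in for FILE.
class Pool {
public:
    Pool() { for (int i = 0; i < kSlots; ++i) slots_[i] = i * 10; }
    // Hands out a pointer to one element of the private array.
    int* slot(int i) { return &slots_[i]; }
    static const int kSlots = 4;
private:
    int slots_[kSlots];
};

void check_walk() {
    Pool pool;
    int* p = pool.slot(1);     // the only element the class handed out
    // Pointer arithmetic within the same array is well-defined, so the
    // neighbouring private elements are reachable too.
    assert(*(p - 1) == 0);
    assert(*(p + 1) == 20);
    *(p + 2) = 99;             // writes slots_[3] without any member function
    assert(*pool.slot(3) == 99);
}
```

The caller still has to learn the array bounds from the header, as described above; stepping outside them is undefined behaviour.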
-- 
Darren New, San Diego CA, USA (PST)
   The question in today's corporate environment is not
   so much "what color is your parachute?" as it is
   "what color is your nose?"

Darren New <dne### [at] san rr  com> wrote:
> Again, I disagree. I disagree that C++ refusing to compile when you access 
> by name something marked as private is significantly more encapsulated than 
> Python's naming convention that you shouldn't access fields marked with an 
> underscore by name.
  Then show me correct C++ which reads a certain private member variable of
a class, eg. by reading its value to a variable of the same type, without
the class deliberately exposing it. (Note that the class may have eg. virtual
functions.)
> >   There's a huge difference. You make it sound like there was no difference
> > at all. Just for the sake of argument.
> In practice, I disagree.
  In practice? Well, then show me in practice. Some actual code, please.
> >   I don't think the C standard guarantees how much padding there will be
> > between members of a struct either. The bit representation of a struct
> > instance may change from system to system, and even from compiler to
> > compiler, and hence you must not make assumptions if you want your program
> > to be portable.
> Right. But that's far from "makes no guarantees" you see. You often make a 
> broad sweeping statement, and when I call you on it, you backpedal and then 
> say I'm exaggerating "on purpose".
  You are calling me a liar now? Like, when I said "the standard offers no
guarantees about the layout of a class" I was really meaning "the standard
offers no guarantees whatsoever about the memory layout of any type" rather
than "the standard offers you no guarantees about the layout which would
allow you to access the private members of a class", but later I claimed
otherwise? I was always talking about accessing private members and why
you can't access them in a portable way. Calling me a liar is offensive.
  I might well claim that you "backpedaled" when you first claimed that
C++ has *no* encapsulation, and later changed it to "little" encapsulation.
> >   So you said that "a given type, all by itself, will require sizeof(type)
> > bytes of memory". Well, duh, that's how sizeof() is defined, after all.
> You said "there are no guarantees".
  I was talking about the memory layout of an object. There *are* no
guarantees about it.
> I said yes, there is at least the 
> guarantee that the elements of a class even with private members are laid 
> out in the space between the start and the end.
  You seriously don't understand the difference between "size" and "layout"?
> >   Well, if "access" means "reading their bit pattern in memory", then yes,
> > you can "access" them portably.
> And, as I said, you can overwrite them, which *I* feel violates 
> data-security encapsulation.
  Then, as said in earlier posts, there are no languages with encapsulation
because you can't physically protect data from being corrupted. There can
always be a glitch somewhere that corrupts it.
  (By your principle of "if it can't be protected from everything it's not
worth protecting at all", all these "safe" languages are useless. They can't
protect themselves from everything, hence why try it at all? Better to just
go back to C, then. No useless protection there.)
> >   With basic types it's well-defined, and maybe for PODs with trivial
> > constructors and destructors.
> And for structs or classes with no vtable, as those are defined to have the 
> same layout as in C, and it works in C, yes?
  Not if they or any of their members have a non-trivial copy constructor.
> Not completely, no. But at least you could assign blame. Right now, if I 
> write code that invokes their library and it segfaults, yet it doesn't
> segfault in their trivial test application that only ever invokes one 
> operation in the library before exiting, then it must be my fault.
  You don't use a debugger to resolve where the crash happens?
-- 
                                                          - Warp