Alain <ele### [at] netscapenet> wrote:
> > And thus, like any decent variable-length encoding scheme, it tries to
> > assign short codes to common symbols. (Although UTF-8 probably fails
> > horribly for, say, Japanese text. I don't actually know...)
> For Japanese and Chinese, it averages around 3 bytes per character. It's not so
> bad after all, as each character in those represents a whole word, and some even
> represent a whole phrase or some complex concept.
UTF16 is better because it uses 2 bytes for the vast majority of the most
commonly used kanjis and other symbols used in Japanese.
--
- Warp
Invisible wrote:
> (and AFAIK in plain C) you *must* expose all implementation details of a
> type in order for anybody else to be able to touch it.
Nah. Just use a forward-declared struct and pass pointers to it. You
know, like fopen/fread/fwrite/fclose. Note those haven't changed since
the first edition of K&R.
--
Darren New / San Diego, CA, USA (PST)
Remember the good old days, when we
used to complain about cryptography
being export-restricted?
Alain wrote:
> Garbage collection is BAAAAD!!!
> Any implementation that permits you not to use that is GOOD.
Uh, not sure why you'd say this. You're looking at one implementation of
one garbage collector, and deducing from this that the entire concept is
bad? You *are* aware there are such things as real-time garbage
collectors, for example?
I can't imagine a GC that takes 15 minutes on anything that isn't paging
like mad. In which case the problem isn't the GC, but the paging, in
which case you should say "paging is baaaaad!" :-)
I mean, really, how can you imagine any decent GC would take more than
(say) three times the length of time it takes to read all physical memory?
--
Darren New / San Diego, CA, USA (PST)
Remember the good old days, when we
used to complain about cryptography
being export-restricted?
Orchid XP v7 wrote:
> Yeah, old GC algorithms used to do this. Research has been done,
> solutions have been found, etc.
Plus, a lot of the complaints about GC slowness are caused in part by
the GC not interacting with the paging system well. I.e., the OS isn't
prepared for programs that're going to do GC in particular patterns.
There are read-aheads on disk files. Why would you wait until the GC
hits the bad page before starting to read it in?
Of course, paging systems nowadays don't even know the physical layouts
of the disks, so it's generally hard to do things like select sectors on
the swap partition in a way that paging things back in will be fast.
--
Darren New / San Diego, CA, USA (PST)
Remember the good old days, when we
used to complain about cryptography
being export-restricted?
Warp enlightened us on 2007/11/02 16:14:
> Alain <ele### [at] netscapenet> wrote:
>> Using UTF16 encoding, any character is 2 BYTES long, for a grand total of 65536
>> possible characters, not all of them being printable.
>
> Wrong. UTF16-encoding results in either 2-byte or 4-byte characters,
> depending on the unicode value.
>
> Perhaps you are confusing it with UCS2?
>
I think that I missed the bit about 4-byte UTF16 characters...
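
For anyone else who missed it, here is a minimal sketch of how the 2-byte and 4-byte cases come about. The function name utf16_encode is made up for illustration; the constants are the standard surrogate ranges. Code points below U+10000 fit in one 16-bit unit, and anything above is split into a high and a low surrogate.

```c
#include <stdint.h>

/* Encode one Unicode code point as UTF-16 code units.
 * Returns the number of 16-bit units written (1 or 2), or 0 if the
 * value is not a valid Unicode scalar value.
 * (utf16_encode is a hypothetical helper, not a library function.) */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                               /* out of range, or a lone surrogate */

    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                  /* one unit: 2 bytes */
        return 1;
    }

    cp -= 0x10000;                              /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate: bottom 10 bits */
    return 2;                                   /* two units: 4 bytes */
}
```

So U+65E5 (a common kanji) comes out as the single unit 65E5, while U+1F600 comes out as the pair D83D DE00. UCS2 is what you get if you stop at the first case.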
--
Alain
-------------------------------------------------
History, in general, only informs us of what bad government is.
Thomas Jefferson
Darren New enlightened us on 2007/11/02 20:04:
> Alain wrote:
>> Garbage collection is BAAAAD!!!
>> Any implementation that permits you not to use that is GOOD.
>
> Uh, not sure why you'd say this. You're looking at one implementation of
> one garbage collector, and deducing from this that the entire concept is
> bad? You *are* aware there are such things as real-time garbage
> collectors, for example?
>
> I can't imagine a GC that takes 15 minutes on anything that isn't paging
> like mad. In which case the problem isn't the GC, but the paging, in
> which case you should say "paging is baaaaad!" :-)
>
> I mean, really, how can you imagine any decent GC would take more than
> (say) three times the length of time it takes to read all physical memory?
>
OK, it was a LONG time ago!
There was no paging, and there couldn't be any paging: there was no HD to receive
anything. ALL of the data was in RAM. It was before HDs were common on PCs.
--
Alain
-------------------------------------------------
You know you've been raytracing too long when you know how to render a truly
photorealistic compact disc, and you're not going to tell anyone
(least of all a POV user ;) ).
-- Alex McLeod a.k.a. Giant Robot Messiah
Warp enlightened us on 2007/11/02 17:10:
> Alain <ele### [at] netscapenet> wrote:
>>> And thus, like any decent variable-length encoding scheme, it tries to
>>> assign short codes to common symbols. (Although UTF-8 probably fails
>>> horribly for, say, Japanese text. I don't actually know...)
>> For Japanese and Chinese, it averages around 3 bytes per character. It's not so
>> bad after all, as each character in those represents a whole word, and some even
>> represent a whole phrase or some complex concept.
>
> UTF16 is better because it uses 2 bytes for the vast majority of the most
> commonly used kanjis and other symbols used in Japanese.
>
But then, you don't have room for the Chinese ones, and then you need Vietnamese,
Korean, Hindi, Sanskrit, Latin, Cyrillic, Arabic, Inuktitut, math symbols, ...
--
Alain
-------------------------------------------------
Make yourself a better person and know who you are before you try and know
someone else and expect them to know you.
Alain <ele### [at] netscapenet> wrote:
> > UTF16 is better because it uses 2 bytes for the vast majority of the most
> > commonly used kanjis and other symbols used in Japanese.
> >
> But then, you don't have room for the Chinese ones, and then you need Vietnamese,
> Korean, Hindi, Sanskrit, Latin, Cyrillic, Arabic, Inuktitut, math symbols, ...
UTF16 can handle all those too, although some require 4 bytes.
I don't think Chinese gets any more compact with UTF8 than with UTF16,
though: kanjis in the 2-byte UTF16 range require 3 bytes in UTF8, and
the ones not in that range (above U+FFFF) require 4 bytes in both
encodings.
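
A quick way to check the byte counts being argued about is to compute them per code point. This is only a sketch; utf8_len and utf16_len are names made up for illustration, and the thresholds are the standard encoding boundaries. Note that, as far as I can tell, code points above U+FFFF need 4 bytes in both encodings.

```c
#include <stdint.h>

/* Bytes needed to encode one code point in UTF-8, or 0 if it is not a
 * valid Unicode scalar value. (Hypothetical helper.) */
int utf8_len(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogates are not encodable */
    if (cp < 0x80)      return 1;                /* ASCII */
    if (cp < 0x800)     return 2;                /* e.g. Latin-1 supplement, Cyrillic, Arabic */
    if (cp < 0x10000)   return 3;                /* rest of the 16-bit range, incl. most CJK */
    if (cp <= 0x10FFFF) return 4;                /* everything above U+FFFF */
    return 0;
}

/* Bytes needed to encode one code point in UTF-16, or 0 if invalid.
 * (Hypothetical helper.) */
int utf16_len(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x10000)   return 2;                /* one 16-bit unit */
    if (cp <= 0x10FFFF) return 4;                /* surrogate pair */
    return 0;
}
```

With these, U+0041 ('A') is 1 byte in UTF8 and 2 in UTF16; U+65E5 (a common kanji) is 3 and 2; U+20000 (a rare ideograph outside the 16-bit range) is 4 and 4.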
--
- Warp
Darren New wrote:
> Invisible wrote:
>> (and AFAIK in plain C) you *must* expose all implementation details of
>> a type in order for anybody else to be able to touch it.
>
> Nah. Just use a forward-declared struct and pass pointers to it. You
> know, like fopen/fread/fwrite/fclose. Note those haven't changed since
> the first edition of K&R.
I have no concept of what you're talking about - but you're probably
right...
Orchid XP v7 wrote:
> Darren New wrote:
>> Invisible wrote:
>>> (and AFAIK in plain C) you *must* expose all implementation details
>>> of a type in order for anybody else to be able to touch it.
>>
>> Nah. Just use a forward-declared struct and pass pointers to it. You
>> know, like fopen/fread/fwrite/fclose. Note those haven't changed
>> since the first edition of K&R.
>
> I have no concept of what you're talking about - but you're probably
> right...
In C, you can declare a pointer to a structure without defining the layout
of the structure. C allows this because all pointers to structure types
have the same representation and alignment requirements, so the compiler
can pass them around without knowing what they point to.
Then in your code file, you define the structure. Creating an object
involves allocating the structure and returning the opaque pointer to
it. Using the object requires passing the opaque pointer back in.
No implementation details are visible at all.
http://en.wikipedia.org/wiki/C_file_input/output
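
Here is a minimal sketch of the pattern, squeezed into one file so it's easy to read. The Counter type and the counter_* functions are invented for illustration; in real code the first part would live in a header and the last part in its own .c file.

```c
#include <stdlib.h>

/* ---- what would go in counter.h (hypothetical) ---- */
typedef struct Counter Counter;   /* forward declaration: no layout given */

Counter *counter_create(void);
void     counter_increment(Counter *c);
int      counter_value(const Counter *c);
void     counter_destroy(Counter *c);

/* ---- client code: compiles with only the declarations above ---- */
int client_demo(void)
{
    Counter *c = counter_create();
    if (c == NULL)
        return -1;                /* allocation failed */
    counter_increment(c);
    counter_increment(c);
    /* c->count would not compile here: Counter is still an incomplete type */
    int v = counter_value(c);
    counter_destroy(c);
    return v;
}

/* ---- what would go in counter.c (hypothetical) ---- */
struct Counter {
    int count;                    /* only the implementation sees this field */
};

Counter *counter_create(void)
{
    Counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->count = 0;
    return c;
}

void counter_increment(Counter *c)
{
    c->count++;
}

int counter_value(const Counter *c)
{
    return c->count;
}

void counter_destroy(Counter *c)
{
    free(c);
}
```

The client function sits before the struct definition on purpose: it only ever holds the pointer, so the fields can change later without touching any caller.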
--
Darren New / San Diego, CA, USA (PST)
Remember the good old days, when we
used to complain about cryptography
being export-restricted?