POV-Ray : Newsgroups : povray.off-topic : The truth about PDF
  The truth about PDF (Message 1 to 6 of 6)  
From: Invisible
Subject: The truth about PDF
Date: 1 Oct 2010 05:38:20
Message: <4ca5ac0c@news.povray.org>
There is a wide-spread belief that PDF is just a kind of "binary version 
of" PostScript. For example,

http://ansuz.sooke.bc.ca/software/pdf-append.php

"PDF is basically PostScript with compression and some document 
structuring conventions that supposedly make it easier to do operations 
like concatenation."

This is a wholly inaccurate description.

First: PostScript is, surprisingly, a general Turing-complete 
programming language. (Not many people seem to know this.)

PostScript and PDF do share similarities, and this apparently leads some 
people to believe that they are "the same thing". This is manifestly not 
the case.

There are many technical differences between the two, but let me skip 
straight to the really big difference: PDF IS NOT TURING-COMPLETE!

This single difference is probably the largest advantage of PDF for its 
intended purpose - representing finished *documents*.

A PostScript file is basically an arbitrary computer program that can 
optionally (!) generate printed pages as a side-effect of its execution. 
Note that it's completely possible to write PostScript programs that do 
normal file I/O and generally do what programs in any other language 
do. (It's just that nobody ever actually does this.) It's not uncommon, 
though, to have a PostScript file that doesn't actually print anything; 
it just configures the printer for a different mode, or installs a new 
font or something.

A PS file is a linear sequence of operators and operands, not unlike an 
assembly language program. In particular, it's tricky to find "page N" 
in a PS file. In the general case, you basically have to execute the 
whole program until N pages have been emitted, because page N could be 
produced inside a for-loop, or some other control structure. (And the 
"emit the page now" command could be arbitrarily renamed during the 
execution of the program, so it's no good just doing a textual search 
for it.)
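
To make the renaming problem concrete, here is a small Python sketch. The PostScript program in it is a made-up example, and `naive_page_count` is a hypothetical helper, not part of any real tool. The program emits five pages, yet the word "showpage" occurs in its text only once.

```python
import re

# Made-up PostScript program: "showpage" appears once in the text, but the
# renamed procedure "emit" runs it five times inside a for-loop.
PS_PROGRAM = """\
%!PS
/emit { showpage } bind def
1 1 5 { pop emit } for
"""


def naive_page_count(ps_source: str) -> int:
    """Count textual occurrences of the showpage operator (hypothetical helper)."""
    return len(re.findall(r"\bshowpage\b", ps_source))


print(naive_page_count(PS_PROGRAM))  # 1, although five pages are emitted
```

Getting the right answer means actually interpreting the program, which is exactly the problem described above.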

A PDF file is a random-access container of "objects", some of which 
describe what to print on the page. The page description language lacks 
flow control (no branching, looping or conditional execution) and 
doesn't even have variables. It's literally just a flat list of drawing 
operators and their coordinates. 
This is COMPLETELY DIFFERENT to PostScript!

In PDF, if you want 10 lines, you have to actually write ten line 
descriptions. In PS, you could just write a program loop.
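
As a rough illustration, the Python sketch below puts the two side by side. The coordinates and the helper name `pdf_content_stream` are invented for the example. The PostScript version is a loop; the PDF version has to be unrolled by whatever tool writes the file, because the page description itself cannot loop.

```python
# The same ten horizontal lines, first as a PostScript loop, then as
# unrolled PDF content-stream operators. All coordinates are illustrative.

POSTSCRIPT_LOOP = """\
0 1 9 {
  20 mul 100 add      % y = 100 + 20*i
  dup 72 exch moveto  % start the line at (72, y)
  540 exch lineto     % draw to (540, y)
  stroke
} for
showpage
"""


def pdf_content_stream(count=10, x0=72, x1=540, y0=100, step=20):
    """Return one 'move, line, stroke' description per line (hypothetical helper)."""
    lines = []
    for i in range(count):
        y = y0 + step * i
        # m = moveto, l = lineto, S = stroke in PDF's operator set
        lines.append(f"{x0} {y} m {x1} {y} l S")
    return "\n".join(lines)


print(pdf_content_stream())
```

The loop lives in the generating program, not in the PDF; the file only ever sees the ten finished descriptions.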

In addition, while a PS file is a linear series of program tokens that 
must be executed to yield pages, a PDF file is random-access. It is 
filled with self-contained "objects" which may be accessed in any order. 
Actually there is a nominated "root object", which links to the other 
objects to form a graph. (It's not actually a *tree* since it contains 
cycles - e.g., each page points back to its parent node in the page tree.)
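
A toy model of that graph, sketched in Python. The object table, the `Ref` class and the `reachable` function are all invented for illustration and only loosely mimic the real file layout. The point is that you start at the root and follow references, and you need a "seen" set because the /Parent links make the graph cyclic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Indirect reference to another object, by object number."""
    num: int


# Toy stand-in for a PDF's object table (made-up contents).
OBJECTS = {
    1: {"Type": "Catalog", "Pages": Ref(2)},  # the root object
    2: {"Type": "Pages", "Kids": [Ref(3), Ref(4)], "Count": 2},
    3: {"Type": "Page", "Parent": Ref(2)},  # points back up: a cycle
    4: {"Type": "Page", "Parent": Ref(2)},
}


def reachable(objects, root):
    """Return the sorted object numbers reachable from the root."""
    seen = set()
    stack = [root]
    while stack:
        num = stack.pop()
        if num in seen:
            continue  # already visited; this is what breaks the cycles
        seen.add(num)
        for value in objects[num].values():
            items = value if isinstance(value, list) else [value]
            stack.extend(item.num for item in items if isinstance(item, Ref))
    return sorted(seen)


print(reachable(OBJECTS, 1))  # [1, 2, 3, 4]
```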

Most especially, each page in a PDF file is completely independent. 
Nothing that has happened in any other page has any effect at all on the 
current page. There are "document structuring conventions" for 
PostScript which urge you to avoid affecting later pages in the current 
one, but nothing *enforces* this. It's just a "convention" that you're 
supposed to follow. With PDF, it's hard-wired into the document spec. 
Pages *cannot* affect each other, the end.

A PDF file also contains a crapload of metadata that a PostScript file 
does not. A PS file prints pages. Using something like Ghostscript, you 
can view those pages on screen rather than print them. But still, PS is 
for printing. PDF is for printing *and viewing*. It can contain 
*hyperlinks*. It can present electronic forms that you can fill in. It 
can contain JavaScript to validate those forms. It can contain metadata 
such as title, author, date, etc. Pages can be numbered. (Why? Well, in 
case you have blank pages or title pages which are not numbered. So you 
might want "page 1" to actually be the third sheet of paper.) You can 
even set up a PDF file to be like a PowerPoint slide show.

In short, PostScript is an executable program. PDF is a passive data 
container. And this makes the latter infinitely easier to deal with.

Also - not many people know this - you can write a PDF file as plain 
ASCII! All the main document constructs are plain ASCII. Usually various 
objects are compressed to save space (particularly fonts, images and 
page descriptions), but it *is* possible to write PDF files using 
Notepad. (You'd be insane to try, of course, but it's doable.) PDF can 
also be encrypted.
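
For the curious, here is a hedged Python sketch of what such a hand-written file might contain: one page with a single horizontal line, no compression, pure ASCII. The function name `build_minimal_pdf` and the page contents are made up for the example, and it is only a sketch of the general layout (header, objects, cross-reference table, trailer), not a validated reference file.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page, uncompressed, all-ASCII PDF (illustrative sketch)."""
    content = b"72 720 m 540 720 l S\n"  # one horizontal line
    bodies = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << >> /Contents 4 0 R >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"endstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    # Cross-reference table: this is what makes the file random-access.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


print(build_minimal_pdf().decode("ascii"))
```

Everything printed is readable text. The only fiddly part is the byte offsets in the cross-reference table, which is why nobody sane does this by hand in Notepad.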

So there we have it. PDF is *not* a binary PostScript. The two *do* have 
the same drawing operators. (Draw a line, draw a 3rd-order Bezier 
spline, draw a circular arc, transform the coordinate space, etc.) But 
that's the sum total of their similarity. (Indeed, PDF supports 
transparency, which PS curiously does not. PS has support for special 
printer instructions that 
PDF does not. PDF supports various interactive features like hyperlinks 
and forms that PS does not. And so on.)



From: Nekar Xenos
Subject: Re: The truth about PDF
Date: 1 Oct 2010 09:10:35
Message: <op.vjwevguaufxv4h@go-dynamite>
On Fri, 01 Oct 2010 11:38:18 +0200, Invisible <voi### [at] devnull> wrote:

> There is a wide-spread belief that PDF is just a kind of "binary version  
> of" PostScript.

Interesting thing I've read somewhere: PDF is derived from .ai (Adobe  
Illustrator). You can just rename a .pdf to .ai and it will open in  
Illustrator. This works most of the time but not always. I've also tried  
renaming .ai files to .pdf and it doesn't open in Acrobat...

-Nekar Xenos-



From: Invisible
Subject: Re: The truth about PDF
Date: 1 Oct 2010 10:43:06
Message: <4ca5f37a@news.povray.org>
>> There is a wide-spread belief that PDF is just a kind of "binary
>> version of" PostScript.
>
> Interesting thing I've read somewhere: PDF is derived from .ai (Adobe
> Illustrator). You can just rename a .pdf to .ai and it will open in
> Illustrator. This works most of the time but not always. I've also tried
> renaming .ai files to .pdf and it doesn't open in Acrobat...

Having never used or even seen Illustrator, I couldn't comment. I'd 
imagine whatever the file is named, it's looking for the magic number at 
the start to decide what to do with it.
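
A minimal sketch of that kind of check, in Python. The function name `sniff_format` is made up, and this is a guess at the general technique rather than what Illustrator or Acrobat actually do. PDF files begin with "%PDF-", and PostScript files conventionally begin with "%!".

```python
def sniff_format(data: bytes) -> str:
    """Guess the file type from its leading bytes, ignoring the file name."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"%!"):
        return "postscript"
    return "unknown"


print(sniff_format(b"%PDF-1.4\n..."))         # pdf
print(sniff_format(b"%!PS-Adobe-3.0\n..."))   # postscript
print(sniff_format(b"GIF89a..."))             # unknown
```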

There was once a rumour that if you rename a PostScript file to PDF, 
Acrobat Reader would be able to open it. This isn't true, and when you 
consider that PDF is just a data format while PostScript requires a 
language interpreter to actually *execute* it, it's not surprising really.



From: Francois Labreque
Subject: Re: The truth about PDF
Date: 4 Oct 2010 09:53:42
Message: <4ca9dc66$1@news.povray.org>

> But still, PS is for printing.

Not necessarily.  It may have been designed by Adobe for printing, but 
the display system on the NeXT Cube (Steve Jobs' foray into the 
high-end workstation market, while on a break from Apple) was 
PostScript-based, instead of being X11-based, like the other workstations.

-- 
/*Francois Labreque*/#local a=x+y;#local b=x+a;#local c=a+b;#macro P(F//
/*    flabreque    */L)polygon{5,F,F+z,L+z,L,F pigment{rgb 9}}#end union
/*        @        */{P(0,a)P(a,b)P(b,c)P(2*a,2*b)P(2*b,b+c)P(b+c,<2,3>)
/*   gmail.com     */}camera{orthographic location<6,1.25,-6>look_at a }



From: Invisible
Subject: Re: The truth about PDF
Date: 4 Oct 2010 10:22:04
Message: <4ca9e30c$1@news.povray.org>
>> But still, PS is for printing.
>
> Not necessarily. It may have been designed by Adobe for printing, but
> the display system on the NeXT Cube (Steve Jobs' foray into the
> high-end workstation market, while on a break from Apple) was
> PostScript-based, instead of being X11-based, like the other workstations.

That's true. My real point is that the design goal for PS was as a 
language for controlling physical printers (which is why it has 
sophisticated colour management, spot colours, screen print settings, 
etc.) while PDF was aimed at documents you can read on a screen as well 
as print (and hence PDF has interactive features that PS does not).

Of course, PS is a programming language. You can quite easily add any 
features to it that aren't there as standard. (E.g., by default there's 
no way to do transparency, but you could design a PS-based system that can.)



From: Darren New
Subject: Re: The truth about PDF
Date: 4 Oct 2010 12:13:41
Message: <4ca9fd35$1@news.povray.org>
Francois Labreque wrote:
> the display system on the NeXT Cube 

... was a pale and lame light next to the beauty that was NeWS, where 
*everything* was PostScript, thereby eliminating 95% of the problems X has 
with slow network links, as well as all the crap focus-lock BS you need when 
things happen asynchronously.

-- 
Darren New, San Diego CA, USA (PST)
   Serving Suggestion:
     "Don't serve this any more. It's awful."


