  Re: gathering infos from web pages  
From: Darren New
Date: 21 Nov 2007 11:48:44
Message: <4744616c$1@news.povray.org>
Nicolas Alvarez wrote:
> I would do it with PHP (outside a webserver), because I did many 
> scraping scripts that way. It's easy to parse HTML with PHP's DOM and 
> loadHTML, handles all the bad syntax for you.

As long as you start a new process for each page, you'll be OK. From 
what I can tell, PHP never, ever deallocates memory. Try walking through 
and processing a 600-million-row database table in CLI PHP, and you'll 
regret it.

You could write a script that fetches the URLs (or runs wget), then 
iterates over the resulting files, launching a separate PHP process for 
each one, or something along those lines.
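
Roughly, the driver could look like the sketch below. It's in Python only 
to keep the example self-contained; any language that can spawn a process 
would do. The function name is made up for illustration, and it assumes 
the pages have already been downloaded to local files and that the worker 
command takes one file path as its last argument.

```python
import subprocess


def process_each_in_own_process(files, command):
    """Run `command + [file]` once per file, each in a fresh process.

    `command` is the worker invocation as a list of arguments (an
    assumption of this sketch). Returns a dict mapping each file path
    to the worker's exit code.
    """
    results = {}
    for path in files:
        # One new process per page: whatever memory the worker piled up
        # goes back to the OS when that process exits.
        completed = subprocess.run(list(command) + [str(path)])
        results[str(path)] = completed.returncode
    return results
```

With a hypothetical worker script named scrape_one.php, the call would be 
something like process_each_in_own_process(paths, ["php", "scrape_one.php"]). 
A nonzero exit code in the result marks a page whose worker failed, so 
those can be retried without redoing the rest.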

Or use Tcl, which is what I did.

-- 
   Darren New / San Diego, CA, USA (PST)
     It's not feature creep if you put it
     at the end and adjust the release date.


