POV-Ray: Newsgroups: povray.beta-test: Timed out waiting for worker thread startup

From: Le Forgeron
Subject: Timed out waiting for worker thread startup
Date: 2 Mar 2008 04:37:48
Message: <47ca756c$1@news.povray.org>

Hello,

while looking for the reason for the "Timed out waiting for worker thread 
startup" error message (I'm on Gentoo ~amd64), I got a strange feeling.

Under ddd (gdb X11 interface), no problem.
Under normal shell, it happens (and a lot, after the first run).

Looking at vfeSession::Initialize (in vfe/vfesession.cpp), the lock on 
which the timed_wait is performed (and which fails when this issue occurs) 
is, if my C++ is not too rusty, built using m_InitializeMutex, a protected 
member of vfeSession (class in vfe/vfesession.h).

And it seems it is never explicitly initialized (there is only one 
reference to it), so only the constructor of vfeSession might in fact 
create that mutex.

It might be that g++ is over-optimizing the constructor of vfeSession by 
simply mapping the memory area, without resetting the mutex?

I'm trying to re-compile with other g++ versions (my current one is 4.2.3) 
to see if that works better...


From: Nicolas Alvarez
Subject: Re: Timed out waiting for worker thread startup
Date: 2 Mar 2008 10:08:30
Message: <47cac2ee$1@news.povray.org>

Le Forgeron wrote:
> Under ddd (gdb X11 interface), no problem.
> Under normal shell, it happens (and a lot, after the first run).

A heisenbug!

http://en.wikipedia.org/wiki/Unusual_software_bug#Heisenbugs


From: Sławomir Szczyrba
Subject: Re: Timed out waiting for worker thread startup
Date: 4 Mar 2008 04:29:59
Message: <slrn.fsq5k4.7m9.steev@hot.pl>

Thus spake Le Forgeron:

> [...] "Timed out waiting for worker thread 
> startup" error message (I'm on Gentoo ~amd64), [...]
>
Same thing here.
Fedora 8 / 32bit

> I'm trying to re-compile with other g++ versions (my current is 4.2.3) 
> and see if it is better...
>
Tested with :
gcc version 4.1.2
icc Version 10.1

Slawek
-- 
  ________ Segmentation fault. Core dumped.
_/ __/ __/ But the memory remains. -- Bart
 \__ \__ \_______________________________________________________________
 /___/___/ Slawomir Szczyrba                          steev/AT/hot\dot\pl


From: Chris Cason
Subject: Re: Timed out waiting for worker thread startup
Date: 4 Mar 2008 06:17:56
Message: <47cd2fe4@news.povray.org>

Le Forgeron wrote:
> And it seems it is never explicitly initialized (there is only one 
> reference to it), so only the constructor of vfeSession might in fact 
> create that mutex.

the mutex is a class in its own right and thus has its own constructor,
which is invoked when vfesession is constructed. so that's not likely to be
the issue.
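
A minimal sketch of that rule, using hypothetical stand-in types (`TrackedMutex` and `Session` are illustrative names, not the POV-Ray classes): a member of class type is default-constructed before the body of the enclosing constructor runs, even when the enclosing constructor never mentions it.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a mutex class that has its own constructor.
struct TrackedMutex {
    std::string state;
    TrackedMutex() : state("initialized") {}
};

// Hypothetical stand-in for a session class whose constructor never
// names its mutex member.
class Session {
public:
    Session() : m_Started(false) {
        // The member's constructor has already run by the time we get here.
        assert(m_InitializeMutex.state == "initialized");
    }
    const std::string& mutex_state() const { return m_InitializeMutex.state; }
    bool started() const { return m_Started; }

protected:
    TrackedMutex m_InitializeMutex;  // default-constructed automatically
    bool m_Started;
};
```

Constructing a `Session` and calling `mutex_state()` should therefore yield "initialized", with no explicit initialization anywhere in `Session`.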

looking at the code, I can see one possibility that jumps out at me: it may
be I misread the usage for boost::condition at the time, or simply slipped
up, but it appears that if the worker thread created via 'new boost::thread()'
at line 1082 of vfesession.cpp#23 was able to run as far as its call to
'm_InitializeEvent.notify_all()' (line 638) *prior* to the next line of
vfeSession::Initialize() executing, the notify_all would have no effect and
the subsequent timed_wait would time out*. (if this happens at all, it
would be more likely to occur on machines with a single core).

to test this theory, please try adding an fprintf right after line 1082
which reads like this:

  fprintf(stderr, "%p %d\n", m_Frontend, m_BackendState);

so you will then have:

  m_WorkerThread = new boost::thread(vfeSessionWorker(*this));
  fprintf(stderr, "%p %d\n", m_Frontend, m_BackendState);
  if (m_InitializeEvent.timed_wait(lock, t) == false)

if all is well you ought to get 0 and 0 (a null pointer and a zero state)
as the output. if you get an initialized pointer and a backend state that's
not 0, it's highly likely that what I am describing is what is happening
(if so, it won't be difficult to fix, and probably should be fixed
regardless of the outcome of your test).

-- Chris

* I do note I have a 'FIXME' on the worker thread code at around that
  point, indicating I was not happy with it at the time (e.g. just from
  glancing at it now I see the exception thrown at line 636 is unlikely to
  ever be seen; throwing it would also trigger a timeout in the waiting
  code; and what's worse m_LastError is not set in that case. yuck!).
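
One way to make a worker's startup failure visible to the waiting code, instead of surfacing only as a timeout, is to hand the outcome back through a `std::promise`. The sketch below is an assumption-laden illustration, not the vfe code: `start_worker` and its string return convention are hypothetical, and standard-library facilities are used in place of Boost.

```cpp
#include <cassert>
#include <chrono>
#include <exception>
#include <future>
#include <stdexcept>
#include <string>
#include <thread>

// Runs init on a worker thread and reports the outcome to the caller.
// Returns "" on success, the exception message on failure, or "timed out"
// if the worker did not report within the timeout.
template <typename Init>
std::string start_worker(Init init, std::chrono::milliseconds timeout) {
    std::promise<void> started;
    std::future<void> outcome = started.get_future();

    std::thread worker([&started, init]() mutable {
        try {
            init();
            started.set_value();
        } catch (...) {
            // Hand the exception to the waiting thread instead of losing it.
            started.set_exception(std::current_exception());
        }
    });

    std::string result;
    if (outcome.wait_for(timeout) != std::future_status::ready) {
        result = "timed out";
    } else {
        try {
            outcome.get();  // rethrows the worker's exception, if any
        } catch (const std::exception& e) {
            result = e.what();
        } catch (...) {
            result = "unknown error";
        }
    }
    worker.join();  // the sketch always joins before returning
    return result;
}

// Usage: success yields an empty string, failure yields the message.
void demo_start_worker() {
    assert(start_worker([] {}, std::chrono::seconds(5)).empty());
    assert(start_worker([] { throw std::runtime_error("backend init failed"); },
                        std::chrono::seconds(5)) == "backend init failed");
}
```

Because the future stores the result, it also does not matter whether the worker finishes before or after the caller starts waiting.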


From: Sławomir Szczyrba
Subject: Re: Timed out waiting for worker thread startup
Date: 4 Mar 2008 08:26:20
Message: <slrn.fsqjf7.6g4.steev@hot.pl>

Take the red pill, Chris Cason...

[...]

> if all is well you ought to get 0 and 0 as the output. if you get an
> initialized pointer and a backend state that's not 0, it's highly likely
>
Usually it displays:

./povray
0x83024b8 1
Timed out waiting for worker thread startup

(first number differs, of course :)
and sometimes :

./povray
(nil) 0
No input file provided

> -- Chris

Slawek
-- 
  ________ 
_/ __/ __/ BOFH excuse 48: bad ether in the cables
 \__ \__ \_______________________________________________________________


From: Chris Cason
Subject: Re: Timed out waiting for worker thread startup
Date: 4 Mar 2008 12:58:59
Message: <47cd8de3@news.povray.org>


> Usually it displays:
> 
> ./povray
> 0x83024b8 1
> Timed out waiting for worker thread startup
> 
> (first number differs, of course :)
> and sometimes :
> 
> ./povray
> (nil) 0
> No input file provided

I presume that in this latter case there is no pause ... that pretty much
nails it. it's a race condition not yet catered for. a temporary
work-around that will catch most instances is fairly simple: change

  if (m_InitializeEvent.timed_wait(lock, t) == false)

to read

  if (m_BackendState == kUnknown && m_InitializeEvent.timed_wait(lock, t)
    == false)

this is not a 100% solution as it still leaves a race condition, but the
window is much narrower and ought to get you going in most instances until
it's fixed properly.

-- Chris
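
For reference, the usual way to close the remaining window is to change the state while holding the same mutex the waiter uses, and to wait on a predicate rather than on the notification alone. The following is a minimal sketch under stated assumptions: `std::condition_variable` stands in for `boost::condition`, `StartupGate`, `MarkStarted`, `WaitForStartup` and `kReady` are hypothetical names, and only `kUnknown` and the member names echo the thread. It is not the fix that was applied to POV-Ray.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// kUnknown appears in the thread; kReady is a hypothetical "started" value.
enum BackendState { kUnknown, kReady };

// Hypothetical stand-in for the session's startup handshake.
class StartupGate {
public:
    // Worker side: change the state under the mutex, then notify.
    void MarkStarted() {
        {
            std::lock_guard<std::mutex> lock(m_InitializeMutex);
            m_BackendState = kReady;
        }
        m_InitializeEvent.notify_all();
    }

    // Waiting side: true once the state has left kUnknown, false on timeout.
    // The predicate is checked before blocking, so an early notify is not lost.
    bool WaitForStartup(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_InitializeMutex);
        return m_InitializeEvent.wait_for(
            lock, timeout, [this] { return m_BackendState != kUnknown; });
    }

private:
    std::mutex m_InitializeMutex;
    std::condition_variable m_InitializeEvent;
    BackendState m_BackendState = kUnknown;
};

// Usage: the wait succeeds even when the worker has already finished,
// and still times out when nothing ever starts.
void demo_startup_gate() {
    StartupGate early;
    std::thread worker([&early] { early.MarkStarted(); });
    worker.join();
    assert(early.WaitForStartup(std::chrono::milliseconds(50)));

    StartupGate never;
    assert(!never.WaitForStartup(std::chrono::milliseconds(20)));
}
```

The difference from the temporary work-around is that the state check and the decision to block happen atomically with respect to the worker's state change.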


From: Le Forgeron
Subject: Re: Timed out waiting for worker thread startup
Date: 4 Mar 2008 13:49:00
Message: <47cd999c@news.povray.org>

On Wed, 05 Mar 2008 04:58:56 +1100, Chris Cason modified small pieces of
the universe to make us read:
 
> this is not a 100% solution as it still leaves a race condition, but the
> window is much narrower and ought to get you going in most instances
> until it's fixed properly.
> 
> -- Chris

It's a working workaround for me. Good job.
(Additional information, if needed later: amd64 3500+ (2.2GHz), one core, 
Linux Gentoo, gcc 4.2.3, 64 bit binary(x86-64), kernel 2.6.23-gentoo-r9 
(stable kernel, remaining ~amd64: unstable, usually), libstdc++ 5.0.7 v3) 

fprintf display: 
(nil) 0
0x.... 1

(nil) on first run, 0x... thereafter. Might be linked to code already in 
cache. Or something totally different.

At least now, it works here.


From: Joe Peterson
Subject: Re: Timed out waiting for worker thread startup
Date: 4 Apr 2008 10:35:13
Message: <47f64ab1@news.povray.org>

I get this error fairly often (and randomly) as well.  I am running the
new beta25 Gentoo ebuild on x86.

					-Joe


From: Olaf Leidinger
Subject: Re: Timed out waiting for worker thread startup
Date: 18 Aug 2015 04:50:01
Message: <web.55d2f11ea5b681e4e72a1f2c0@news.povray.org>

On one machine I get:
$ povray
povray: cannot open the user configuration file
/home/usersLS/oleid/.povray/3.7/povray.conf: No such file or directory
(nil) 0

Problem with option setting
povray
No input file provided

When copying the same binary to another machine, I get on every run:

$ /tmp/povray
povray: cannot open the user configuration file
/home/usersLS/oleid/.povray/3.7/povray.conf: No such file or directory
(nil) 0
Timed out waiting for worker thread startup


Any idea how to further reduce the chance of race conditions?


From: Olaf Leidinger
Subject: Re: Timed out waiting for worker thread startup
Date: 18 Aug 2015 05:25:01
Message: <web.55d2f8faa5b681e4e72a1f2c0@news.povray.org>

UPDATE:

It seems to work for me when increasing the waiting time:

diff -Nur povray-3.7.0.0.orig/vfe/vfesession.cpp povray-3.7.0.0/vfe/vfesession.cpp
--- povray-3.7.0.0.orig/vfe/vfesession.cpp      2015-08-18 11:09:01.182876808 +0200
+++ povray-3.7.0.0/vfe/vfesession.cpp   2015-08-18 11:08:37.902965214 +0200
@@ -1066,11 +1066,11 @@
   boost::xtime t;
   boost::xtime_get (&t, POV_TIME_UTC);
   t.sec += 3 ;
-#ifdef _DEBUG
+
   t.sec += 120;
-#endif
+
   t.nsec = 0;
   m_WorkerThread = new boost::thread(vfeSessionWorker(*this));
