  Server failure (Message 79 to 88 of 88)  
From: Chris Cason
Subject: Re: Server failure
Date: 25 May 2021 20:07:58
Message: <60ad915e@news.povray.org>
On 26/05/2021 06:05, clipka wrote:
> Just trying to slowly get back up to speed, reading up on what's been 
> happening while I wasn't looking, and found that Thunderbird refused to 
> load a suspiciously large number of messages from back then, saying they 
> didn't exist anymore. Tried to repair the index, and now all messages 
> from 2019 seem to have been gone entirely (as well as a few months 
> before and after).
> 
> Could that be related to the March 9 server failure, or was there an 
> earlier separate problem that ate all the posts?

Any group in particular that is missing posts, or is it all?

For povray.general at least, posts seem to be there ...

   telnet news.povray.org 119
   group povray.general
   211 54252 3332 83275 povray.general selected

Estimated 54,252 articles in total, first article 3332 and last is 
83275. I'm able to retrieve both the first (from 1998) and the last 
(from today).
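
For anyone who would rather script this check than type it into telnet, here is a minimal Python sketch. It assumes only the standard NNTP GROUP exchange shown above (a 211 reply carrying the estimated count, the first and last article numbers, and the group name). The names `GroupStatus`, `parse_group_response` and `query_group` are made up for illustration, and the network part has not been tested against this particular server.

```python
import socket
from typing import NamedTuple


class GroupStatus(NamedTuple):
    count: int  # estimated number of articles in the group
    first: int  # lowest article number
    last: int   # highest article number
    name: str   # group name as echoed by the server


def parse_group_response(line: str) -> GroupStatus:
    """Parse a '211 count first last group ...' reply to a GROUP command."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError(f"unexpected GROUP reply: {line!r}")
    return GroupStatus(int(parts[1]), int(parts[2]), int(parts[3]), parts[4])


def query_group(host: str, group: str, port: int = 119) -> GroupStatus:
    """Connect to an NNTP server, issue GROUP, and return the parsed reply."""
    with socket.create_connection((host, port), timeout=30) as sock:
        with sock.makefile("rwb") as stream:
            stream.readline()  # discard the server greeting line
            stream.write(f"GROUP {group}\r\n".encode("ascii"))
            stream.flush()
            reply = stream.readline().decode("ascii", "replace").strip()
            stream.write(b"QUIT\r\n")
            stream.flush()
    return parse_group_response(reply)
```

Calling `query_group("news.povray.org", "povray.general")` should return the same three numbers as the telnet session; comparing them with what the newsreader reports is a quick way to tell a server-side gap from a client-side index problem.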

However that's not to say I know what exact requests Thunderbird is 
sending. If it happens again I suggest running tcpdump or wireshark on 
your end when you do the fetch so you can see exactly what Thunderbird 
is asking and what the server tells it in response. If you can send that 
to me I can take a look at it.

-- Chris


From: clipka
Subject: Re: Server failure
Date: 26 May 2021 08:40:18
Message: <60ae41b2$1@news.povray.org>
On 26.05.2021 at 02:06, Chris Cason wrote:

> Estimated 54,252 articles in total, first article 3332 and last is 
> 83275. I'm able to retrieve both the first (from 1998) and the last 
> (from today).

That about matches the number of posts I'm seeing now, after rebuilding 
the index (54243; not sure where the difference of 9 is from). Before 
that, though, there were something like 70k articles for which Thunderbird 
had retrieved header information.

Posts as far back as 1998 are fine, as are recent posts. What I'm 
SPECIFICALLY NOT seeing anymore are any posts dating between 2018-10-07 
and 2020-04-18. And I know there were quite a lot, even a couple I had 
read just before rebuilding the index. Even a couple I had posted myself.

The same happens in each and every other group: Any posts from 2019 that 
are still in the index over here, but that I hadn't read yet, all just 
prompt a message saying they no longer exist. And if I rebuild the index 
of any group, all posts from 2019 - including ones I had read just 
minutes before - are just gone, without any trace whatsoever.

The web interface still seems to have them, for some reason. But the 
news server apparently doesn't.

> However that's not to say I know what exact requests Thunderbird is 
> sending. If it happens again I suggest running tcpdump or wireshark on 
> your end when you do the fetch so you can see exactly what Thunderbird 
> is asking and what the server tells it in response. If you can send that 
> to me I can take a look at it.

I have no idea what tcpdump or wireshark even do (I may be exaggerating 
here, but not by much), let alone how to run them on my Windows machine 
to put them to any good use.


From: Cousin Ricky
Subject: Re: Server failure
Date: 26 May 2021 10:12:19
Message: <60ae5743$1@news.povray.org>
On 2021-05-26 8:40 AM (-4), clipka wrote:
> 
> Posts as far back as 1998 are fine, as are recent posts. What I'm
> SPECIFICALLY NOT seeing anymore are any posts dating between 2018-10-07
> and 2020-04-18. And I know there were quite a lot, even a couple I had
> read just before rebuilding the index. Even a couple I had posted myself.
> 
> The same happens in each and every other group: Any posts from 2019 that
> are still in the index over here, but that I hadn't read yet, all just
> prompt a message saying they no longer exist. And if I rebuild the index
> of any group, all posts from 2019 - including ones I had read just
> minutes before - are just gone, without any trace whatsoever.
> 
> The web interface still seems to have them, for some reason. But the
> news server apparently doesn't.

I have the same problem.


From: Thomas de Groot
Subject: Re: Server failure
Date: 26 May 2021 10:55:58
Message: <60ae617e$1@news.povray.org>
On 26-5-2021 at 16:12, Cousin Ricky wrote:
> I have the same problem.
> 
As do I.

-- 
Thomas


From: Chris Cason
Subject: Re: Server failure
Date: 30 May 2021 23:14:09
Message: <60b45481@news.povray.org>
On 26/05/2021 22:40, clipka wrote:
> The web interface still seems to have them, for some reason. But the 
> news server apparently doesn't.

Sounds like the server has a problem with re-indexing. I know it did 
have an issue with XOVER for more recent dates and hence I had to turn 
that off.

If it turns out 2018-10-07 refers to some 'special' number in 32-bit 
seconds since 1970 (or close to one) then it's likely an internal issue. 
If it doesn't then I'll have to dig further.

If it turns out I can't fix it I'll have to change to another NNTP 
server (and that assumes I can find a way to get the missing messages 
imported).

The message spool files are basically large text blobs, so they are 
readable, and I know I've had to do such an import once before, maybe 20 
years ago. But that's a last resort, as the spam filtering the server uses 
is a script custom-written for the current server. If e.g. INN supports 
spam filters I expect there will be a way to get it ported.

-- Chris


From: clipka
Subject: Re: Server failure
Date: 31 May 2021 09:09:12
Message: <60b4dff8$1@news.povray.org>
On 31.05.2021 at 05:14, Chris Cason wrote:

> If it turns out 2018-10-07 refers to some 'special' number in 32-bit 
> seconds since 1970 (or close to one) then it's likely an internal issue. 
> If it doesn't then I'll have to dig further.

2018-10-07 00:00 UTC would be 1538870400 seconds since the beginning of the 
Unix epoch (i.e. 1970).

The magic rollover for 32-bit signed values (or 31-bit unsigned) will be 
in early 2038.

The magic rollover for 1 bit less would have been in early 2004.

So no, that doesn't quite fit the symptoms.
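
For anyone who wants to check the arithmetic, a small Python sketch follows. It uses only the standard library; the helper names `epoch_seconds` and `rollover` are made up for this illustration, and the only assumption is that the server counts plain seconds since 1970-01-01 00:00 UTC.

```python
from datetime import datetime, timezone


def epoch_seconds(year: int, month: int, day: int) -> int:
    """Seconds from 1970-01-01 00:00 UTC to midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def rollover(bits: int) -> datetime:
    """UTC moment at which a seconds-since-1970 counter reaches 2**bits."""
    return datetime.fromtimestamp(2 ** bits, tz=timezone.utc)


cutoff = epoch_seconds(2018, 10, 7)
print("cutoff date :", cutoff, hex(cutoff))
print("2**31 hit at:", rollover(31))  # limit of a signed 32-bit counter
print("2**30 hit at:", rollover(30))  # limit with one bit less
```

The hex form is printed as well because a "special" value would usually stand out there (a run of zeros or of F digits), which makes it easy to see at a glance whether the cutoff sits near one.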


From: Chris Cason
Subject: Re: Server failure
Date: 9 Jun 2021 05:08:56
Message: <60c08528$1@news.povray.org>
On 26/05/2021 22:40, clipka wrote:
> Posts as far back as 1998 are fine, as are recent posts. What I'm 
> SPECIFICALLY NOT seeing anymore are any posts dating between 2018-10-07 
> and 2020-04-18. And I know there were quite a lot, even a couple I had 
> read just before rebuilding the index. Even a couple I had posted myself.

I can confirm I now see this behavior locally as well. It's not *every* 
message, but certainly a lot of them, predominantly in high-traffic groups.

In terms of what's causing it, I do note that the last message in one of 
the internal spool files (db_312.itm) is dated 7 Oct 2018 17:36:06. The 
next message in that thread, according to the web view, is 
<5bbacb83$1@news.povray.org>, dated 7 Oct 2018 23:14:11.

The next db_*.itm file, sequentially, is db_314.itm (there's no 313), 
however it contains posts from 2006. A grep through all the *.itm files 
shows that <5bbacb83$1@news.povray.org> is, in fact, missing entirely.

(The db_*.itm files aren't always sequential; some are, but sometimes it 
starts writing to a different sequence. I'm not sure of the trigger for 
this, but suspect it is intentional).

I did the same grep on a backup I have locally of the server prior to 
the crash and DID find the article: it's in db_1.itm, and in fact is 
also the very first article.
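
The grep step can be approximated in Python along the following lines. This is only a sketch: it assumes, as described above, that the db_*.itm files are plain-text blobs that contain each article's Message-ID verbatim. The function name `find_message_id` is made up, and the directory passed in is whatever path holds the spool or the backup.

```python
from pathlib import Path


def find_message_id(spool_dir, message_id, pattern="db_*.itm"):
    """Return (file name, byte offset) for each spool file containing the ID.

    Each file is read into memory in one go, which is simple but may be
    too heavy for very large spool files.
    """
    needle = message_id.encode("ascii")
    hits = []
    for path in sorted(Path(spool_dir).glob(pattern)):
        offset = path.read_bytes().find(needle)
        if offset != -1:
            hits.append((path.name, offset))
    return hits
```

Running it once against the live spool and once against the pre-crash backup gives the same comparison as the two greps; a small offset in the result indicates the article sits near the start of its file.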

So: at some point on 7 October 2018 the news server switched from 
writing to db_312.itm to db_1.itm. There's nothing in the command log 
file indicating I issued any particular command (e.g. re-index) on that 
day, nor do the server logs show a reboot or anything interesting, and 
finally the news server read log for the day shows articles being read 
before and after the cutoff time right through the evening without any 
apparent major interruption or break.

Returning to this year: grepping the nntp server log revealed that on 25 
March during a re-index the server decided that db_1.idx (and a bunch of 
others) was 'empty lost' (whatever that means) and nuked them. 
Subsequently (31 March) it created a new db_1.itm and filled that with 
new articles as they came in.

So ... the server nuked a bunch of article files for a reason that is 
not clear and has subsequently started re-using the db sequence numbers.

The good news is I have all the removed .itm files, other than the 
below, in a backup. The exception is that on Mar 27, Mar 31 and May 6 
the server also nuked, respectively, db_0.itm, db_1.itm and db_2.itm 
again, and these have subsequently been re-used. As these were 
post-crash I only have the current versions of them in my backup.

TL;DR a bunch of items were lost. I can try to restore them by copying 
over the nuked .itm files from a backup, re-naming them where necessary, 
in the hope that when I re-start the news server it will ingest them 
rather than deleting them. I'll experiment with this on a test server 
rather than risk breaking this one more.

-- Chris


From: Chris Cason
Subject: Re: Server failure
Date: 9 Jun 2021 05:15:01
Message: <60c08695$1@news.povray.org>
Following on from my previous post about the cause of the missing articles:

In the long term, given that I really don't have any insight into how our 
current news server does stuff internally (it's not open-source), and given 
that it's old and unsupported, I think I have no choice but to migrate to 
a different server (probably INN).

I should be able to code a means of converting articles between the 
respective spool formats to avoid needing to have the new server take a 
feed from the current one (this would add an unnecessary new component 
to the path of each article, amongst other possible tweaks).

For any articles still missing after I process the current spool and the 
previously nuked but saved db_*.itm files, I should be able to fetch 
them from the webview database. Once done I just need to port the spam 
filter and I should then be able to bring the new NNTP server up.

I will, if I can find a way to do it, attempt to keep the article 
numbers within the groups identical. If that's not possible, though, a 
full re-fetch of the groups will be necessary for NNTP users.

The priority I place on doing this will depend on whether or not I can 
get the missing articles back into the current server without it 
throwing a hissy fit and deleting them again.

-- Chris


From: Tor Olav Kristensen
Subject: Re: Server failure
Date: 9 Jun 2021 17:30:00
Message: <web.60c1323ec68e683b8e52cc8789db30a9@news.povray.org>
Thank you very much for trying to fix this Chris !

--
Tor Olav


From: Chris Cason
Subject: Re: Archive was Server failure
Date: 9 Jun 2021 20:59:21
Message: <60c163e9$1@news.povray.org>
On 15/04/2021 16:30, ingo wrote:
> Over the years I've lost my newsgroup archive several times and currently
> don't maintain it. Could it be made available as a (always up to date)
> download in some way?

If I end up migrating to another news server I will need to collate all 
the messages I can find (including getting any missing ones from the 
webview DB).

If I do that I could provide the collated messages in some form of 
download. I may be able to keep that updated by appending new messages 
to it as they come in, provided there's enough interest in it.

One complication of a simple append scheme, though, is that 
(intentionally) deleted messages won't get removed, and I think that any 
archive of this sort should have deleted messages taken out of it.

It may be better to just dump the stored messages from the DB as they 
will always reflect only undeleted messages. I'll decide when I look at 
the migration.
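
To make the trade-off concrete, here is a toy Python sketch of the two approaches. It is purely illustrative: the archive is modelled as an in-memory dict keyed by Message-ID, and `append_new`, `compact` and `live_ids` are hypothetical names, not anything the news server or the webview DB actually provides.

```python
def append_new(archive: dict, incoming: dict) -> None:
    """Append-only update: add unseen messages, never remove anything."""
    for message_id, text in incoming.items():
        archive.setdefault(message_id, text)


def compact(archive: dict, live_ids: set) -> dict:
    """Rebuild the archive so it holds only messages that still exist."""
    return {mid: text for mid, text in archive.items() if mid in live_ids}


archive = {"<a@example>": "first post", "<b@example>": "spam"}
append_new(archive, {"<c@example>": "new post"})

# Suppose <b@example> has since been deleted on the server.
live_ids = {"<a@example>", "<c@example>"}
print(sorted(archive))                     # append-only copy still holds <b@example>
print(sorted(compact(archive, live_ids)))  # rebuilt copy matches the live set
```

An append-only file is cheap to update but drifts away from the live set over time; rebuilding from the stored messages costs a full pass but always matches what is currently undeleted.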

BTW it's occurred to me that one useful outcome of having a downloadable 
archive of this sort is that we could auto-submit it to archive.org on a 
regular basis.

-- Chris


Copyright 2003-2021 Persistence of Vision Raytracer Pty. Ltd.