On 26/05/2021 06:05, clipka wrote:
> Just trying to slowly get back up to speed, reading up on what's been
> happening while I wasn't looking, and found that Thunderbird refused to
> load a suspiciously large number of messages from back then, saying they
> didn't exist anymore. Tried to repair the index, and now all messages
> from 2019 seem to have been gone entirely (as well as a few months
> before and after).
>
> Could that be related to the March 9 server failure, or was there an
> earlier separate problem that ate all the posts?
Any group in particular that is missing posts, or is it all?
For povray.general at least, posts seem to be there ...
telnet news.povray.org 119
group povray.general
211 54252 3332 83275 povray.general selected
Estimated 54,252 articles in total, first article 3332 and last is
83275. I'm able to retrieve both the first (from 1998) and the last
(from today).
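For anyone reading along, the 211 reply to GROUP lists the estimated article count, the lowest article number, the highest article number and the group name, in that order (RFC 3977). A minimal Python sketch of reading that line follows; the names `GroupStatus` and `parse_group_reply` are made up for illustration and are not part of any news client.

```python
from typing import NamedTuple


class GroupStatus(NamedTuple):
    count: int  # estimated number of articles in the group
    first: int  # lowest article number
    last: int   # highest article number
    name: str   # group name


def parse_group_reply(line: str) -> GroupStatus:
    """Parse an NNTP '211 count first last group' reply to a GROUP command."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError(f"not a 211 GROUP reply: {line!r}")
    # Anything after the group name (here the word "selected") is free text.
    return GroupStatus(int(parts[1]), int(parts[2]), int(parts[3]), parts[4])


status = parse_group_reply("211 54252 3332 83275 povray.general selected")
print(status.count, status.first, status.last, status.name)
```

The numbered range (83275 - 3332 + 1 = 79944) is larger than the estimated count, which is what you would expect when the article numbers have gaps.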
However that's not to say I know what exact requests Thunderbird is
sending. If it happens again I suggest running tcpdump or wireshark on
your end when you do the fetch so you can see exactly what Thunderbird
is asking and what the server tells it in response. If you can send that
to me I can take a look at it.
-- Chris
On 26.05.2021 at 02:06, Chris Cason wrote:
> Estimated 54,252 articles in total, first article 3332 and last is
> 83275. I'm able to retrieve both the first (from 1998) and the last
> (from today).
That about matches the number of posts I'm seeing now, after rebuilding
the index (54243; not sure where the difference of 9 is from). Before
though, there were something like 70k articles for which Thunderbird had
retrieved header information.
Posts as far back as 1998 are fine, as are recent posts. What I'm
SPECIFICALLY NOT seeing anymore are any posts dating between 2018-10-07
and 2020-04-18. And I know there were quite a lot, even a couple I had
read just before rebuilding the index. Even a couple I had posted myself.
The same happens in each and every other group: Any posts from 2019 that
are still in the index over here, but that I hadn't read yet, all just
prompt a message saying they no longer exist. And if I rebuild the index
of any group, all posts from 2019 - including ones I had read just
minutes before - are just gone, without any trace whatsoever.
The web interface still seems to have them, for some reason. But the
news server apparently doesn't.
> However that's not to say I know what exact requests Thunderbird is
> sending. If it happens again I suggest running tcpdump or wireshark on
> your end when you do the fetch so you can see exactly what Thunderbird
> is asking and what the server tells it in response. If you can send that
> to me I can take a look at it.
I have no idea what tcpdump or wireshark even do (I may be exaggerating
here, but not by much), let alone how to run them on my Windows machine
to put them to any good use.
On 2021-05-26 8:40 AM (-4), clipka wrote:
>
> Posts as far back as 1998 are fine, as are recent posts. What I'm
> SPECIFICALLY NOT seeing anymore are any posts dating between 2018-10-07
> and 2020-04-18. And I know there were quite a lot, even a couple I had
> read just before rebuilding the index. Even a couple I had posted myself.
>
> The same happens in each and every other group: Any posts from 2019 that
> are still in the index over here, but that I hadn't read yet, all just
> prompt a message saying they no longer exist. And if I rebuild the index
> of any group, all posts from 2019 - including ones I had read just
> minutes before - are just gone, without any trace whatsoever.
>
> The web interface still seems to have them, for some reason. But the
> news server apparently doesn't.
I have the same problem.
On 26-5-2021 at 16:12, Cousin Ricky wrote:
> I have the same problem.
>
As do I.
--
Thomas
On 26/05/2021 22:40, clipka wrote:
> The web interface still seems to have them, for some reason. But the
> news server apparently doesn't.
Sounds like the server has a problem with re-indexing. I know it did
have an issue with XOVER for more recent dates and hence I had to turn
that off.
If it turns out 2018-10-07 refers to some 'special' number in 32-bit
seconds since 1970 (or close to one) then it's likely an internal issue.
If it doesn't then I'll have to dig further.
If it turns out I can't fix it I'll have to change to another NNTP
server (and that assumes I can find a way to get the missing messages
imported).
The message spool files are basically large text blobs, so they are readable,
and I know I've had to do that once before maybe 20 years ago. But
that's a last resort as the spam filtering the server uses is a script
custom-written for the current server. If e.g. INN supports spam filters
I expect there will be a way to get it ported.
-- Chris
On 31.05.2021 at 05:14, Chris Cason wrote:
> If it turns out 2018-10-07 refers to some 'special' number in 32-bit
> seconds since 1970 (or close to one) then it's likely an internal issue.
> If it doesn't then I'll have to dig further.
2018-10-07 00:00 UTC would be 1538870400 seconds since the beginning of the
Unix epoch (i.e. 1970).
The magic rollover for 32-bit signed values (or 31-bit unsigned) will be
in early 2038.
The magic rollover for 1 bit less would have been in early 2004.
So no, that doesn't quite fit the symptoms.
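The arithmetic is easy to check with a few lines of Python. This is only a sketch: the helper names `unix_seconds` and `rollover` are invented for illustration, and it assumes the two dates bounding the gap are taken as midnight UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_seconds(dt: datetime) -> int:
    """Whole seconds between the Unix epoch and a timezone-aware datetime."""
    return int((dt - EPOCH).total_seconds())


def rollover(bits: int) -> datetime:
    """UTC moment at which a seconds counter reaches 2**bits."""
    return EPOCH + timedelta(seconds=2 ** bits)


# Both ends of the gap, assumed to be midnight UTC.
start = unix_seconds(datetime(2018, 10, 7, tzinfo=timezone.utc))
end = unix_seconds(datetime(2020, 4, 18, tzinfo=timezone.utc))
print(start, hex(start))
print(end, hex(end))

print(rollover(31).date())  # limit of a signed 32-bit counter
print(rollover(30).date())  # limit with one bit less
```

Both ends of the gap fall well inside the range between the two rollover dates, so neither sits on a power-of-two boundary.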
On 26/05/2021 22:40, clipka wrote:
> Posts as far back as 1998 are fine, as are recent posts. What I'm
> SPECIFICALLY NOT seeing anymore are any posts dating between 2018-10-07
> and 2020-04-18. And I know there were quite a lot, even a couple I had
> read just before rebuilding the index. Even a couple I had posted myself.
I can confirm I see this behavior locally now also. It's not *every*
message, but certainly a lot of them, predominantly in high-traffic groups.
In terms of what's causing it, I do note that the last message in one of
the internal spool files (db_312.itm) is dated 7 Oct 2018 17:36:06. The
next message in that thread, according to the web view, is
<5bbacb83$1@news.povray.org>, dated 7 Oct 2018 23:14:11.
The next db_*.itm file, sequentially, is db_314.itm (there's no 313),
however it contains posts from 2006. A grep through all the *.itm files
shows that <5bbacb83$1@news.povray.org> is, in fact, missing entirely.
(The db_*.itm files aren't always sequential; some are, but sometimes it
starts writing to a different sequence. I'm not sure of the trigger for
this, but suspect it is intentional).
I did the same grep on a backup I have locally of the server prior to
the crash and DID find the article: it's in db_1.itm, and in fact is
also the very first article.
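For anyone wanting to repeat that kind of check without grep, here is a rough Python equivalent. It is a sketch only: the function name `find_message_id` is invented, nothing is assumed about the internal layout of the .itm files beyond their being text blobs, and the demo files are throwaway stand-ins rather than real spool data.

```python
import tempfile
from pathlib import Path


def find_message_id(spool_dir: str, message_id: str,
                    pattern: str = "db_*.itm") -> list[str]:
    """Return the names of spool files whose raw bytes contain message_id."""
    needle = message_id.encode("ascii")
    hits = []
    for path in sorted(Path(spool_dir).glob(pattern)):
        # Assumption: each file fits in memory; a very large spool file
        # would need chunked reads instead.
        if needle in path.read_bytes():
            hits.append(path.name)
    return hits


# Demo against two made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "db_312.itm").write_bytes(
        b"Message-ID: <aaa@example.invalid>\r\n\r\nbody\r\n")
    Path(tmp, "db_314.itm").write_bytes(
        b"Message-ID: <bbb@example.invalid>\r\n\r\nbody\r\n")
    print(find_message_id(tmp, "<bbb@example.invalid>"))
    print(find_message_id(tmp, "<missing@example.invalid>"))
```

Running the same search over the live spool and over a backup directory gives the comparison described above: present in one, absent from the other.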
So: at some point on 7 October 2018 the news server switched from
writing to db_312.itm to db_1.itm. There's nothing in the command log
file indicating I issued any particular command (e.g. re-index) on that
day, nor do the server logs show a reboot or anything interesting, and
finally the news server read log for the day shows articles being read
before and after the cutoff time right through the evening without any
apparent major interruption or break.
Returning to this year: grepping the nntp server log revealed that on 25
March during a re-index the server decided that db_1.idx (and a bunch of
others) was 'empty lost' (whatever that means) and nuked them.
Subsequently (31 March) it created a new db_1.itm and filled that with
new articles as they came in.
So ... the server nuked a bunch of article files for a reason that is
not clear and has subsequently started re-using the db sequence numbers.
The good news is I have all the removed .itm files, other than the
below, in a backup. The exception is that on Mar 27, Mar 31 and May 6
the server also nuked, respectively, db_0.itm, db_1.itm and db_2.itm
again, and these have subsequently been re-used. As these were
post-crash I only have the current versions of them in my backup.
TL;DR a bunch of items were lost. I can try to restore them by copying
over the nuked .itm files from a backup, re-naming them where necessary,
in the hope that when I re-start the news server it will ingest them
rather than deleting them. I'll experiment with this on a test server
rather than risk breaking this one more.
-- Chris
Following on from my previous post about the cause of the missing articles:
In the long term, given that I really don't have any insight into how our
current news server works internally (it's not open-source), and given that
it's old and unsupported, I think I have no choice but to migrate to
a different server (probably INN).
I should be able to code a means of converting articles between the
respective spool formats to avoid needing to have the new server take a
feed from the current one (this would add an unnecessary new component
to the path of each article, amongst other possible tweaks).
For any articles still missing after I process the current spool and the
previously nuked but saved db_*.itm files, I should be able to fetch
them from the webview database. Once done I just need to port the spam
filter and I should then be able to bring the new NNTP server up.
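The collation step could look roughly like the Python sketch below. It assumes the articles have already been split out of each source into individual raw messages (the splitting itself depends on the spool format and is not shown), and that the sources are passed in priority order: current spool, then the saved db_*.itm files, then the webview database. The function names and sample messages are invented for illustration.

```python
from email import message_from_bytes
from typing import Dict, Iterable, Optional


def message_id_of(raw: bytes) -> Optional[str]:
    """Extract the Message-ID header from one raw article, if present."""
    value = message_from_bytes(raw).get("Message-ID")
    return str(value).strip() if value else None


def collate(*sources: Iterable[bytes]) -> Dict[str, bytes]:
    """Merge articles by Message-ID; earlier sources win on duplicates."""
    merged: Dict[str, bytes] = {}
    for source in sources:
        for raw in source:
            mid = message_id_of(raw)
            if mid is not None and mid not in merged:
                merged[mid] = raw
    return merged


# Made-up stand-ins for the three sources.
current = [b"Message-ID: <a@example.invalid>\r\n\r\ncurrent copy\r\n"]
backup = [b"Message-ID: <a@example.invalid>\r\n\r\nbackup copy\r\n",
          b"Message-ID: <b@example.invalid>\r\n\r\nfrom backup\r\n"]
webview = [b"Message-ID: <c@example.invalid>\r\n\r\nfrom webview\r\n"]

merged = collate(current, backup, webview)
print(sorted(merged))
```

Keying on Message-ID means an article present in more than one source is kept only once, with the copy from the highest-priority source.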
I will, if I can find a way to do it, attempt to keep the article
numbers within the groups identical. If that's not possible, though, a
full re-fetch of the groups will be necessary for NNTP users.
The priority I place on doing this will depend on whether or not I can
get the missing articles back into the current server without it
throwing a hissy fit and deleting them again.
-- Chris
Thank you very much for trying to fix this, Chris!
--
Tor Olav
On 15/04/2021 16:30, ingo wrote:
> Over the years I've lost my newsgroup archive several times and currently
> don't maintain it. Could it be made available as a (always up to date)
> download in some way?
If I end up migrating to another news server I will need to collate all
the messages I can find (including getting any missing ones from the
webview DB).
If I do that I could provide the collated messages in some form of
download. I may be able to keep that updated by appending to it new
messages as they come in, provided there's enough interest in it.
One complication of a simple append scheme, though, is that
(intentionally) deleted messages won't get removed, and I think that any
archive of this sort should have deleted messages taken out of it.
It may be better to just dump the stored messages from the DB as they
will always reflect only undeleted messages. I'll decide when I look at
the migration.
BTW it's occurred to me that one useful outcome of having a downloadable
archive of this sort is that we could auto-submit it to archive.org on a
regular basis.
-- Chris