On 14/02/2023 23:45, William F Pokorny wrote:
> pokorny:tmp$ telnet news.povray.org 119
> Trying 203.29.75.35...
> Connected to news.povray.org.
> Escape character is '^]'.
> 200 news.povray.org DNEWS Version 5.7b1,, S0, posting OK
> article <5e9cce19@news.povray.org>
> ... I get what looks like some large mime dump that I suspect is
> due to some attached images... Probably the header is there, but not sure how to see
> the header in this case.
You can use 'head <5e9cce19@news.povray.org>' instead of 'article' in this case. And
to get just the body minus the headers you can use 'body'.
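For anyone who would rather script this than type into telnet, here is a minimal Python sketch of the same exchange. It is only an illustration, not anything the server provides: the function names `read_multiline` and `fetch` are made up for this example, and it assumes the server answers HEAD, BODY and ARTICLE with the usual multi-line response that ends in a line holding a single ".".

```python
import socket

def read_multiline(lines):
    """Collect the lines of a multi-line NNTP response.

    `lines` yields the raw lines (bytes) that follow the status line.
    The block ends at a line holding a single "."; a line that starts
    with ".." has had a "." prepended by the server, which is removed.
    """
    out = []
    for raw in lines:
        line = raw.rstrip(b"\r\n")
        if line == b".":
            break
        if line.startswith(b".."):
            line = line[1:]
        out.append(line.decode("utf-8", errors="replace"))
    return out

def fetch(command, message_id, host="news.povray.org", port=119):
    """Send HEAD, BODY or ARTICLE for one message ID and return its lines."""
    with socket.create_connection((host, port), timeout=30) as sock:
        stream = sock.makefile("rwb")
        stream.readline()  # greeting line, e.g. "200 ... posting OK"
        stream.write(f"{command} {message_id}\r\n".encode("ascii"))
        stream.flush()
        status = stream.readline().decode("ascii", errors="replace")
        # 220 = article follows, 221 = head follows, 222 = body follows
        if not status.startswith("22"):
            raise RuntimeError(status.strip())
        lines = read_multiline(stream)
        stream.write(b"QUIT\r\n")
        stream.flush()
        return lines

# Example (needs network access):
#   for line in fetch("HEAD", "<5e9cce19@news.povray.org>"):
#       print(line)
```

Using HEAD here avoids pulling down the MIME-encoded attachments at all, which is the point of the suggestion above.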
> What to make of this? Some of my stuff is on the server and other stuff is not?
I can't say for sure what's there and what's not. To be totally sure I'd need to write
some code to cross-reference posts that are in the database against those in the news
server. Generally speaking, if an article is removed from the news server by some
automated process (expiry or disk usage) it will remain in the DB. If it is
explicitly removed (by the user or an admin) then it will be gone from both.
At the moment there's no way to search just the web view's DB (or more
specifically, no way unless google gets their shit together).
That said, I have improved the search pages by switching to google's paid search
service. The results on the free tier which I was using were pretty terrible.
However that doesn't fix the issue of missing pages or improper dates etc which we've
seen.
I have made another attempt to get google to digest the site by providing them with a
sitemap (tried this in the past and it didn't help, but worth trying again).
If you were really keen to see if a particular message is in the web DB you can grab
the sitemap and download the files it refers to, then grep for the message ID. The
sitemap is currently at https://news.povray.org/gsitemap.xml. Note this is (currently)
static, generated as of a few hours ago. If it works then I'll set up a cronjob to
keep it updated.
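A rough Python sketch of that "grab the sitemap, download what it lists, grep for the ID" procedure follows. The helper names (`sitemap_locs`, `mentions`, `find_message`) are invented for illustration, and it assumes each `<loc>` entry points at a plain, uncompressed file; if the sitemap turns out to list nested or gzipped sitemaps, this would need adjusting.

```python
import urllib.request
import xml.etree.ElementTree as ET

def sitemap_locs(xml_data):
    """Return the text of every <loc> element, whatever its XML namespace."""
    root = ET.fromstring(xml_data)
    return [el.text.strip() for el in root.iter()
            if el.tag.rsplit("}", 1)[-1] == "loc" and el.text]

def mentions(text, message_id):
    """True if the message ID (with or without '<' and '>') occurs in text."""
    return message_id.strip("<>") in text

def _get(url):
    """Download one URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()

def find_message(sitemap_url, message_id):
    """Return the URLs listed in the sitemap whose content mentions the ID."""
    hits = []
    for loc in sitemap_locs(_get(sitemap_url)):
        page = _get(loc).decode("utf-8", errors="replace")
        if mentions(page, message_id):
            hits.append(loc)
    return hits

# Example (needs network access, and may download a lot of files):
#   find_message("https://news.povray.org/gsitemap.xml",
#                "<5e9cce19@news.povray.org>")
```

Since the sitemap is static for now, a message posted after it was generated would not show up this way.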
In addition to the above I also retain log files containing all content posted to the
server since 2004 (at least in theory - the earliest is 2004 and latest is today, but
I have yet to check that they all line up wrt start/end so there's no missing dates).
But accessing those shouldn't be necessary provided the web DB has everything.
If checking the above-mentioned resources doesn't turn up a message you think is
missing, it probably wouldn't be too difficult to throw together a simple perl script
to pull every post you made from the log files.
-- Chris
On 19/02/2023 23:36, Chris Cason wrote:
> If you were really keen to see if a particular message is in the web DB you can grab
> the sitemap and download the files it refers to, then grep for the message ID. The
> sitemap is currently at https://news.povray.org/gsitemap.xml. Note this is (currently)
> static, generated as of a few hours ago. If it works then I'll set up a cronjob to
> keep it updated.
Forgot to mention there's another way if you know the message ID: just tack it onto
the end of the URL. For example <5e9cce19@news.povray.org> can be found via
https://news.povray.org/5e9cce19@news.povray.org (you can include the '<' and '>' if
you want, but it's not required).
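If you have a batch of message IDs to check, a tiny helper like the one below turns each into its web-view URL. The function name `message_url` is just for this example; the URL pattern is the one described above.

```python
def message_url(message_id, base="https://news.povray.org/"):
    """Build the web-view URL for a message ID, dropping optional '<' and '>'."""
    return base + message_id.strip().strip("<>")

print(message_url("<5e9cce19@news.povray.org>"))
```

The angle brackets are stripped only to keep the URL tidy; per the note above, leaving them in should work as well.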
-- Chris
On 15/02/2023 03:12, Kenneth wrote:
> Ah, that's more like it! :-)
>
> I see that my own search says "about 573 results". About?
>
> But web portal/Google only presents 10 pages of the hits, sadly.
I've updated the site search to use google's paid service, so it shouldn't show any
ads now and hopefully does return more useful results.
Sorting by date is useless, though. Google doesn't seem to pay attention to the
structured data containing the post dates.
It would be interesting to see if other search engines return more posts, specifically
if they return ones that google seems to not want to index.
-- Chris
Chris Cason <del### [at] deletethistoopovrayorg> wrote:
> I've updated the site search to use google's paid service,
Again, thank you. While I don't "know" from experience what it's like, I can
extrapolate based on managing large-ish projects and knowing many people who
have managed long-term projects. But nothing as long-term as THIS!
You should have a donation page AND a shrine.
> Sorting by date is useless, though. Google doesn't seem to pay attention to the
> structured data containing the post dates.
>
> It would be interesting to see if other search engines return more posts,
> specifically if they return ones that google seems to not want to index.
{sigh}
I hope that this new money and effort isn't wasted - I was just trying again
last night to reference some very recent posts of my own, and I didn't find what
I was looking for.
I often try to use the "site:news.povray.org search-terms" format to focus on
what I want to find, but I inevitably get a lot of results that unsurprisingly
have to do with bald eagle news stories and whatnot.
If there are any tricks (keywords added to posts, special characters, etc) that
would help find things, that would be good to know. At least I could start
writing better posts that would help me find them in the future.
Is there any extant special kung-fu that allows looking for posts only BY a
specific user, or that only have attachments? If there's an upper limit on the
number of results, that might help keep the desired posts from falling off that
far edge.
I'm also wondering if Thunderbird users have the ability to search their
downloaded messages more efficiently than G**gle, since I'm imagining that it
would be much like searching my email or using grep on local files.
- BW
"Bald Eagle" <cre### [at] netscapenet> wrote:
>
> You should have a donation page AND a shrine.
I second that!
>
> If there are any tricks (keywords added to posts, special characters, etc) that
> would help find things that would be good to know. At least I could start
> writing better posts that would help me find them in the future.
>
Well, in the web portal/Google, I do searches like...
kenneth isosurface color_map bald eagle
and it returns hits that have ALL (or any?) of the relevant terms. But I guess
you already know that.
I have also naively tried 'boolean search' (?), although I don't know much about
how this is syntactically structured, or if I even know what I'm doing:
kenneth + isosurface - color_map - bald eagle
... but that returns the same hits as previously.
On 20/02/2023 00:50, Bald Eagle wrote:
> Is there any extant special kung-fu that allows looking for posts only BY a
> specific user, or that only have attachments? If there's an upper limit on the
> number of results, that might help keep the desired posts from falling off that
> far edge.
This is possible in theory. I do include author info in the structured data. If I
switch to their JSON API it is possible to apply refinements like searching by author.
I could also set this up directly from the web interface (basically another set of
pages, one for each author), but if I did this at all I'd only apply it to ones who
had more than a (yet-to-be-decided) minimum number of posts.
(Come to think of it, that would be pretty handy. Hmmmm ...)
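To give an idea of what an author refinement through the JSON API might look like, here is a speculative Python sketch that only builds the request URL. Everything site-specific in it is an assumption: the structured-data field name `metatags-author` is hypothetical (the real one depends on how the author info is marked up in the pages), and `api_key` and `engine_id` are placeholders for credentials only the site owner has. It relies on the `more:pagemap:TYPE-NAME:VALUE` filter that Google documents for Programmable Search structured data.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def author_search_url(terms, author, api_key, engine_id,
                      field="metatags-author"):
    """Build a JSON API request URL restricted to one author.

    `field` is a hypothetical pagemap field name, used for illustration.
    """
    query = f"{terms} more:pagemap:{field}:{author}"
    params = {"key": api_key, "cx": engine_id, "q": query}
    return API_ENDPOINT + "?" + urlencode(params)

print(author_search_url("isosurface", "kenneth", "KEY", "CX"))
```

Author names containing spaces would likely need extra handling, since the filter value is matched as tokens; this sketch does not attempt that.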
hi,
Chris Cason <del### [at] deletethistoopovrayorg> wrote:
> On 15/02/2023 03:12, Kenneth wrote:
> > Ah, that's more like it! :-)
> > I see that my own search says "about 573 results". About?
> > ...
>
> I've updated the site search to use google's paid service, so it shouldn't show any
> ads now and hopefully does return more useful results.
tried just now, and the search gets more results than before, see attached.
regards, jr.
Attachments:
Download 'kw2.png' (15 KB)
hi,
"Kenneth" <kdw### [at] gmailcom> wrote:
> ...
> I have also naively tried 'boolean search' (?), although I don't know much about
> how this is syntactically structured, or if I even know what I'm doing:
>
> kenneth + isosurface - color_map - bald eagle
>
> ... but that returns the same hits as previously.
close, to exclude a word use -word (no space), and I think +word means must be
part of match.
regards, jr.
On 23/02/2023 02:06, jr wrote:
> close, to exclude a word use -word (no space), and I think +word means must be
> part of match.
'+' /used/ to do that, but unfortunately google changed it. Nowadays you need to put
the word(s) in quotes; so "word" instead of +word.
Even with that, google will still try to give you lots of results even if the search
doesn't generate many proper matches. So it can still insert junk not related to the
query.
To get it to stop screwing around and stick to the subject you need to use "verbatim"
search. Unfortunately there does not appear to be a way to get it to do verbatim with
a search operator (e.g. like "site:"). You need to click the 'Tools' menu just under
the right-hand side of the search box, then click the 'All Results' menu that appears
below on the left and change it to 'Verbatim'. This menu is only available once you've
done a search, which is a PITA also.
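As a small illustration of the operators mentioned above (quotes for "must include", a leading '-' for exclusion, and "site:"), here is a Python helper that assembles a query string. The function name `build_query` and its parameters are invented for this example; it does not switch on Verbatim mode, which as noted has to be picked from the Tools menu.

```python
def build_query(required=(), excluded=(), other=(), site=None):
    """Assemble a Google query string.

    required: terms wrapped in quotes so they must appear
    excluded: terms prefixed with '-' (multi-word phrases get quoted)
    other:    plain terms passed through unchanged
    site:     optional host for the site: operator
    """
    def phrase(term):
        # Quote only when the term contains whitespace.
        return f'"{term}"' if " " in term else term

    parts = [f'"{term}"' for term in required]
    parts += list(other)
    parts += [f"-{phrase(term)}" for term in excluded]
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)

print(build_query(required=["kenneth", "isosurface"],
                  excluded=["color_map", "bald eagle"],
                  site="news.povray.org"))
```

This corresponds to the earlier "kenneth + isosurface - color_map - bald eagle" attempt, rewritten with the current syntax.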
-- Chris
Attachments:
Download 'verbatim.png' (89 KB)
On 2023-02-23 11:55 (-4), Chris Cason wrote:
>
> Even with that, google will still try to give you lots of results even
> if the search doesn't generate many proper matches. So it can still
> insert junk not related to the query.
Which is super annoying. I long for the days when computer programs
didn't constantly second-guess the user. While I can appreciate
suggestions, they should remain just that.