On 4/02/2023 05:41, Bald Eagle wrote:
> There are instructions for "building your own search engine" - and i would
> imagine this would be pretty manageable given it could be restricted to just
> this site.
>
> No idea who could set up a rudimentary test of this, but at the very least it
> would crawl and index the entire site.
There are some open-source search engines that can index the site, and I have
considered using one.
If I did, it should solve the problem for local searches. However, at the moment fixing
the Google search results has higher priority, since Google is what a typical POV-Ray
user would use to ask about something. If Google doesn't index all pages, there will be
cases where a user searches for something whose answer exists in our groups but gets no
relevant result, because Google hasn't included that page.
> I think it would be really great to be able to scroll through all of the zip,
> inc, mcr, and other such files in a "digest" which was just the search
> results...
>
> Also, you could search by username, which would be particularly useful when
> hunting down a file when you know who the author was.
Google provides ways to help it understand what it's indexing. In particular, we can use
structured data as set out in
https://developers.google.com/search/docs/appearance/structured-data/intro-structured-data
All posts on the newsgroups have a JSON-LD object telling search engines that the
message is a forum post (the type is DiscussionForumPosting), along with other
attributes, specifically including the author.
If you are using the web view you can see this easily: view the source of this post in
your browser (usually Ctrl+U), then search for the word DiscussionForumPosting and
you will find these objects. For example, here's what we said about your post:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "DiscussionForumPosting",
  "@id": "#web.63dd555f13ac33281f9dae3025979125%40news.povray.org",
  "headline": "Re: Denoising POV-Ray images in Blender",
  "dateCreated": "2023-02-03T18:45:00+00:00",
  "datePublished": "2023-02-03T18:45:00+00:00",
  "author": {
    "@type": "Person",
    "name": "Bald Eagle"
  }
}
</script>
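As a rough illustration of how markup like this could be read back (for instance, for
the kind of search by username you mention), here is a minimal Python sketch using only
the standard library. It pulls the JSON-LD objects out of a page and filters them by
author. The class and function names (JsonLdExtractor, forum_posts, posts_by_author) and
the sample page are made up for this example; this is not how the site itself or Google
does it.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed body of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.objects = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several pieces, so buffer it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.objects.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip a malformed block rather than abort the whole page


def forum_posts(html):
    """Return all JSON-LD objects on the page typed as DiscussionForumPosting."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [obj for obj in parser.objects
            if isinstance(obj, dict)
            and obj.get("@type") == "DiscussionForumPosting"]


def posts_by_author(html, name):
    """Return the forum posts whose author name matches exactly."""
    matches = []
    for post in forum_posts(html):
        author = post.get("author")
        if isinstance(author, dict) and author.get("name") == name:
            matches.append(post)
    return matches


# Hypothetical page: one forum post plus an unrelated JSON-LD object.
SAMPLE_PAGE = """
<html><body>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "DiscussionForumPosting",
 "headline": "Re: Denoising POV-Ray images in Blender",
 "author": {"@type": "Person", "name": "Bald Eagle"}}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "WebSite", "name": "example"}
</script>
</body></html>
"""

if __name__ == "__main__":
    for post in posts_by_author(SAMPLE_PAGE, "Bald Eagle"):
        print(post["headline"])
```

The same filter could be pointed at any saved page from the web view; the exact-match
comparison on the author name is a simplification.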
If I run the current page of this thread through the Google 'Rich Results Test' it
shows it saw and understood all the posts (the warnings are just for optional fields).
I have expanded the row where it saw your message (see attachment).
As far as I can tell, the issue is that Google isn't indexing *all* of our pages,
regardless of what I do (for example, I tried using sitemaps in various forms and it
still didn't fetch all the URLs in the sitemap). I ran out of time trying to sort this
out and just left it alone, hoping Google would catch up eventually.
Attachments:
Download 'richresults.png' (186 KB)
