crystal faeries

divine love consciousness blog

robots
12th September 2016

Which are the good search engines?
Which ones do I use myself, and recommend?

  1. DuckDuckGo
  2. StartPage by ixquick

    This article is about the robots also known as spiders or crawlers, the ones which crawl the World Wide Web to build the data indexes we know as search engines. They should obey the world-wide standard by which a website can specify how much of it robots may scan, and how quickly. Unfortunately, given the adverse influence of hostile agendas such as those promulgated by the NSA, the standard itself is specified inside-out, upside-down, backwards, and negligently. The "robots.txt" protocol works by listing what should not be crawled, thereby instantly informing the NSA et al. of exactly what they're interested in: the stuff you don't wish to share with the world. Why you don't wish to share it is your own business, and the reasons may differ for different parts of your website, but it all remains voluntary for the crawling spider robot to "obey". This quickly leads to the realization that there are good bots and bad bots.
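
To make the "voluntary" part concrete, here is a minimal sketch of what a well-behaved robot does before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` name and the example.org URLs are made up for illustration; they are not taken from my own file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# Crawl-delay is a widely used extension, not part of the core standard.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A good bot asks first and simply skips whatever is disallowed.
allowed_public = parser.can_fetch("ExampleBot", "https://example.org/blog/robots.html")
allowed_private = parser.can_fetch("ExampleBot", "https://example.org/private/notes.html")

# It also waits this many seconds between requests, if a delay is given.
delay = parser.crawl_delay("ExampleBot")

print(allowed_public, allowed_private, delay)
```

Nothing in this check is enforced by the server. A bad bot just never runs it.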

    The absolute worst most evil corpseoration on the entire planet is Baidu, the Chinese Search Engine, whose robots don't "give a flying fuck" about anything; they will never give up trying to crawl everything. Because they absolutely ignore the robots.txt protocol, the only way to stop them is to ban them via IP address range. Today I discovered the Yandex Images bot also ignoring my robots.txt file, requiring me to ban it by its multiple IP address ranges, which I will have to continue to track down until I find them all and fully ban the entire corpseoration and all of its robots.
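
Banning by IP address range boils down to a membership test against a list of networks. The following is only a sketch using Python's standard `ipaddress` module; the ranges are placeholders from the reserved documentation blocks, not the real ranges of Baidu, Yandex or anyone else, and `is_banned` is a hypothetical helper name.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks (RFC 5737).
# They are NOT the address ranges of any real search engine.
BANNED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_banned(client_ip: str) -> bool:
    """Return True if the client address falls inside any banned range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in BANNED_RANGES)


for ip in ("192.0.2.17", "198.51.100.200", "203.0.113.5"):
    print(ip, "banned" if is_banned(ip) else "allowed")
```

In practice the same list would usually live in the web server or firewall configuration, which answers banned addresses with a 403 or drops them outright. Each newly discovered range is one more entry in the list.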

    One can verify and validate one's robots.txt file as being syntactically correct, in which case a properly designed and well-behaved robot can obey it. Whether a robot is deployed with an agenda to obey robots.txt files remains a choice of the humans who configure and operate it; ultimately, therefore, it always comes down to the polarity of evil versus good of some human. Right on my "home" page, in the top right corner, I have provided a linking icon to a robots.txt validation service. If you've arrived at my website via http: then that link auto-validates my robots.txt file, but if you click it from the secure https: access to my website you have to manually type in the address of the file to validate. Sure enough, my robots.txt file validates as correct and should be "understood" by all robots.
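
For a rough idea of what a syntax validation involves, here is a small sketch of a line-by-line check. It is not the validation service linked from my home page and is far less thorough; the `check_robots_txt` name and the set of accepted fields (including the common `Crawl-delay` and `Sitemap` extensions) are assumptions made for this example.

```python
# Field names accepted by this sketch; Crawl-delay and Sitemap are
# common extensions rather than part of the core protocol.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "crawl-delay", "sitemap"}

# Fields that only make sense inside a User-agent group.
GROUP_FIELDS = {"disallow", "allow", "crawl-delay"}


def check_robots_txt(text):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, separator, _value = line.partition(":")
        field = field.strip().lower()
        if not separator:
            problems.append(f"line {number}: missing ':' separator")
        elif field not in KNOWN_FIELDS:
            problems.append(f"line {number}: unknown field '{field}'")
        elif field == "user-agent":
            seen_user_agent = True
        elif field in GROUP_FIELDS and not seen_user_agent:
            problems.append(f"line {number}: '{field}' before any User-agent line")
    return problems


print(check_robots_txt("User-agent: *\nDisallow: /private/\n"))
print(check_robots_txt("Disallow /private/\n"))
```

A clean result only says the file is well formed. It says nothing about whether any given robot will choose to obey it.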

    I must say that it is rather difficult not to let the evil misbehaviour of their national search engines taint my feelings about entire nations of people, e.g. the Chinese and the Russians. Simultaneously I feel sad for those entire nations of people who can no longer benefit from the contents of my website, as, at most, they will forevermore find only stale search engine results of my website. So, the moral of the story is: demand that your national search engine be well behaved, or your entire nation of people will suffer the enmity of all other peoples on the planet, and a paucity of truth in a world of lies.

    Update: 2016-09-14 22:32:47+00:00
    I just caught BingBot ignoring my robots.txt file and sucking up files it's prohibited from taking, therefore I have also banned it by IP address... so, Microsoft, true to character, again shows itself to be the deepest pure evil.
    Update: 2016-09-15 05:34:10+00:00
    BingBot is truly programmed to be a bad netizen.
    Now that it's receiving only 403 errors for each request, it has stepped up its rate of queries, bashing on my server more intensely.


    Obviously, one might think: "Gee, celeste, aren't you only hurting yourself by limiting your visibility in some of the planet's most popular search engines?"
    Absolutely, in terms of gross numbers of potential visitors. But then I have never been interested in attracting hordes of unconscious sheep, masses of minions of mammon, those committed to the reality of The Beast. Nobody uninterested in the highest levels of consciousness will read very far in my website.

