Lil' Holmes v2 Ideas
====================
2023-12-29

The original Lil' Holmes was a searx instance I set up to mimic the look of Little Isle (another searx instance at the time, run by a friend), which in turn mimicked the look of Wiby. I originally took down Lil' Holmes because searx wasn't packaged on OpenBSD like it was on FreeBSD, so I would have needed to install a bunch of its Python dependencies myself, which I did not like.

In the original Lil' Holmes, only a few search engines were enabled by default: pretty much just DuckDuckGo, Wiby, and I think a few of the Wikipedia-related sites like Wiktionary and Wikipedia proper. Many other searx instances seem to do the same thing, where most of the general results are outsourced to either Google or Bing (through proxies or not), and the rest are usually not as relevant to the search query I was using.

Usually, the results I am looking for are covered by the various StackExchanges. Other types of results I look for are project repositories hosted on SourceHut, Micro$oft^WGitHub, GitLab, and the various Gitea instances like Codeberg. Encyclopedias like Britannica and MediaWikis like Wikipedia and the various Fandoms contain more general information rather than answers to specific questions. Lastly, just browsing interesting sites via Wiby would be nice.

The StackExchanges have quarterly database dumps that make it easy to create a fully offline version of their site, and that is what I am making first, with the tentative name of `sefe' (StackExchange FrontEnd; please send better names). Afterward, I would like to create an offline frontend/interface for the MediaWikis, since they also have dumps, and see if something similar can be done for the Fandoms and encyclopedias like Britannica (digitally). I do not think something completely offline can be done for Wiby, since there is no publicly available database dump of it as far as I searched, nor is there one for public software repositories on Gitea and friends.
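As a rough sketch of what reading those dumps could look like — assuming the usual layout where a site's dump contains a `Posts.xml` file holding one `<row .../>` element per post, with fields like `Id` and `PostTypeId` as attributes — something like this streams the file without loading it all into memory:

```python
import xml.etree.ElementTree as ET

def iter_posts(path):
    """Stream post records from a StackExchange-style Posts.xml dump,
    yielding each <row> element's attributes as a dict."""
    for _event, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == "row":
            yield dict(elem.attrib)
            elem.clear()  # free the parsed element so memory stays flat

# Hypothetical usage: count questions (PostTypeId "1" marks a question)
# n_questions = sum(1 for p in iter_posts("Posts.xml")
#                   if p.get("PostTypeId") == "1")
```

This is only a sketch, not `sefe' itself; a real frontend would also index the rows (e.g. into SQLite) rather than rescan the XML per query.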
While I could just send many requests with random terms and collect all the results that come through, that is very inefficient, and I don't wish to send that much traffic to smaller sites like Wiby and Gitea instances; I would also likely get blocked on all of them unless I'm willing to wait a very long time due to rate limits. I think it wouldn't be too difficult with simpler code-hosting sites like stagit instances, but I can worry about that later.

Since I rambled a lot in this post, to sum up: I am planning on making offline frontends for sites that contain what I look for in search engines, namely the StackExchanges, the MediaWikis, software project repositories on Gitea and friends, and miscellaneous sites like Wiby.