Automatically crawl your website and add search-engine capability.

Orcinus Site Search

Brian Huisman

NOTE: This project is not yet officially released and thus probably contains bugs and other nasties. Please use at your own risk! If you do try it out, I would very much appreciate your feedback and issue reports.

The Orcinus Site Search PHP script is an all-in-one website crawler and search engine that extracts searchable content from XML, HTML and PDF files on one or more websites. It replaces third-party, remote search solutions such as Google.

Orcinus will crawl your website content on a schedule, or at your command via the admin UI or even by CLI/crontab. Crawler log output conveniently informs you of missing pages, links that redirect, and other errors that you, as a webmaster, can fix to keep your user experience tight. A full-featured, responsive administration GUI allows you to adjust crawl settings, view and edit all crawled pages, customize search results, and view a log of all search queries. You also have complete control over the appearance of your search results with a convenient templating system.

Optionally, Orcinus can generate a sitemap XML or XML.GZ file of your pages after every crawl, suitable for submission to Google Search Console. It can also export a JavaScript version of the entire search engine that works with offline mirrors, such as those generated by HTTrack.

Requirements:

  • PHP >= 7.2.x
  • MySQL / MariaDB

3rd Party Libraries:

Included:

Optional:

Getting Started

  1. Copy the orcinus directory to your root web directory.
  2. Fill out your SQL and desired credential details in the orcinus/config.ini.php file.
  3. Visit yourdomain.com/orcinus/admin.php in your favourite web browser and log in.
  4. Optionally follow the instructions in orcinus/geoip2/README.md to enable geolocation of search queries.
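
As a rough illustration of steps 1 and 2, the copy could be scripted along the following lines. This is only a sketch: the `install_orcinus` function is a hypothetical helper and not part of Orcinus, and the `/var/www/html` web root in the usage comment is an assumption that will differ between servers.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Orcinus): copies the orcinus
# directory into a web root and points at the config file to edit.
# Usage: install_orcinus <path-to-orcinus-dir> <web-root>
install_orcinus() {
  src="$1"
  root="$2"

  if [ ! -d "$src" ]; then
    echo "source directory not found: $src" >&2
    return 1
  fi
  if [ ! -d "$root" ]; then
    echo "web root not found: $root" >&2
    return 1
  fi
  # Refuse to overwrite an existing installation.
  if [ -e "$root/orcinus" ]; then
    echo "already exists: $root/orcinus" >&2
    return 1
  fi

  # Step 1: copy the orcinus directory to the web root.
  cp -R "$src" "$root/orcinus" || return 1

  # Step 2 is manual: the SQL and credential details go in config.ini.php.
  if [ -f "$root/orcinus/config.ini.php" ]; then
    echo "Copied. Now edit $root/orcinus/config.ini.php"
  else
    echo "warning: config.ini.php not found in $root/orcinus" >&2
  fi
}

# Example call (the web root path is an assumption; adjust for your server):
# install_orcinus ./orcinus /var/www/html
```

The function stops rather than overwriting an existing `orcinus` directory, so a configured `config.ini.php` is not clobbered by a second run.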

Examples of search interface integration are given in the example.php (online / PHP) and example.html (offline / JavaScript) files.