orcinus-search/README.md
2023-04-20 13:35:51 -04:00

# Orcinus Site Search
[Brian Huisman](https://greywyvern.com)

> **NOTE**: This project is not yet officially released and thus probably contains bugs and other nasties. Please use at your own risk! If you do try it out, I would very much appreciate your feedback and issue reports.

The **Orcinus Site Search** PHP script is an all-in-one website crawler and search engine that extracts searchable content from XML, HTML and PDF files on one or more websites. It replaces third-party, remote search solutions such as Google.

**Orcinus** will crawl your website content on a schedule, or at your command via the admin UI or even by CLI/crontab. Crawler log output informs you of missing pages, links that redirect, and other errors that you, as a webmaster, can fix to keep your user experience tight. A full-featured, responsive administration GUI allows you to adjust crawl settings, view and edit all crawled pages, customize search results, and view a log of all search queries. You also have complete control over the appearance of your search results with a [convenient templating system](https://mustache.github.io/).

Optionally, **Orcinus** can generate a [sitemap XML or XML.GZ](https://www.sitemaps.org) file of your pages after every crawl, suitable for submitting to search engines, for example through Google Search Console. It can also export a JavaScript version of the entire search engine that works with offline mirrors, such as those generated by [HTTrack](https://www.httrack.com).

### Requirements:
- PHP >= 7.2.x
- MySQL / MariaDB
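
If you want to confirm that a host meets these requirements before installing, a small standalone script along the following lines may help. This is only a sketch and not part of Orcinus: the function name `orcinus_check_requirements` is hypothetical, and it assumes that either the `mysqli` or the `pdo_mysql` extension is an acceptable MySQL / MariaDB driver, which this README does not specify.

```php
<?php
// Hypothetical pre-install check; this helper is NOT shipped with Orcinus.
function orcinus_check_requirements(string $phpVersion, array $extensions): array
{
    // Normalise extension names so the comparison is case-insensitive.
    $extensions = array_map('strtolower', $extensions);

    return [
        // The README asks for PHP 7.2 or newer.
        'PHP >= 7.2' => version_compare($phpVersion, '7.2.0', '>='),
        // Assumption: either mysqli or pdo_mysql is an acceptable driver.
        'MySQL / MariaDB driver' => (bool) array_intersect(['mysqli', 'pdo_mysql'], $extensions),
    ];
}

// Report the result for the PHP build that is running this script.
foreach (orcinus_check_requirements(PHP_VERSION, get_loaded_extensions()) as $label => $ok) {
    echo ($ok ? '[ok]   ' : '[FAIL] ') . $label . PHP_EOL;
}
```

Run it from the command line or through the web server you plan to use, since the two can be configured with different PHP builds and extensions.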
### 3rd Party Libraries:

Included:

- [PHPMailer](https://github.com/PHPMailer/PHPMailer)
- [PDFParser](https://github.com/smalot/pdfparser)
- [Mustache](https://github.com/bobthecow/mustache.php) / [Mustache.js](https://github.com/janl/mustache.js)
- [libcurlemu](https://github.com/m1k3lm/libcurlemu)

Optional:

- [Maxmind GeoIP2](https://github.com/maxmind/GeoIP2-php)

## Getting Started
1. Copy the `orcinus` directory to your web root directory.
2. Fill in your SQL connection details and desired login credentials in the `orcinus/config.ini.php` file.
3. Visit `yourdomain.com/orcinus/admin.php` in your favourite web browser and log in.
4. Optionally, follow the instructions in `orcinus/geoip2/README.md` to enable geolocation of search queries.

Examples of search interface integration are given in the `example.php` (online / PHP) and `example.html` (offline / JavaScript) files.