# Orcinus Site Search

The Orcinus Site Search PHP script is an all-in-one website crawler and search engine that extracts searchable content from XML, HTML and PDF files on one or more websites. It replaces third-party remote search solutions such as Google.

Orcinus crawls your website content on a schedule, on demand from the admin UI, or from the CLI/crontab. The crawler log reports missing pages, redirecting links, and other errors that you, as a webmaster, can fix to keep your user experience tight. Customize your search results by blocking URLs, unlisting pages, or raising or lowering their search priority. You have complete control over the appearance of your search results through a [convenient templating system](https://mustache.github.io/).

Orcinus can generate a [sitemap XML or XML.GZ](https://www.sitemaps.org) file of your pages after every crawl, suitable for submitting to Google Search Console. It can also export a JavaScript version of the entire search engine that works with offline mirrors, such as those generated by [HTTrack](https://www.httrack.com).
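
For orientation, a sitemap in the sitemaps.org format looks roughly like the sketch below. The URLs and dates are placeholders, and which optional elements Orcinus actually writes is not stated here, so treat this as an illustration of the file format rather than of Orcinus's exact output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: example.com URLs and dates are placeholders -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- <loc> is the only required child of <url> -->
    <loc>https://www.example.com/</loc>
    <!-- <lastmod> is optional in the sitemap protocol -->
    <lastmod>2023-04-20</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/docs/manual.pdf</loc>
  </url>
</urlset>
```

The XML.GZ variant is the same document compressed with gzip.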

Requires:
- PHP >= 7.2.x
- MySQL / MariaDB

3rd Party Libraries Included:
- [PHPMailer](https://github.com/PHPMailer/PHPMailer)
- [PDFParser](https://github.com/smalot/pdfparser)
- [Mustache](https://github.com/bobthecow/mustache.php) / [Mustache.js](https://github.com/janl/mustache.js)
- [libcurlemu](https://github.com/m1k3lm/libcurlemu)

Optional Libraries:
- [Maxmind GeoIP2](https://github.com/maxmind/GeoIP2-php)