Sosse
Selenium-based search engine and crawler with offline archiving.
Sosse is a self‑hosted utility that uses Selenium to crawl web pages, index their content, and store archived copies for offline browsing. It provides a web‑based administration interface where users can submit URLs to crawl, manage the crawl queue, and configure crawlers, collections, and RSS or Atom feeds. The system tracks documents, tags, domains, cookies, and MIME types, and it can exclude specific URLs or integrate external search engines.
The platform exposes a REST API and command‑line tools for interacting with the indexed data, supporting features such as profile history, screenshot capture, and analytics. Users can set up webhooks, permission rules, and AI‑driven automation for tasks like tagging or summarizing content, while configuration files allow fine‑grained control over the web server, crawler behavior, and other settings.
Sosse is released under the AGPL‑3.0 license, has no subscription or tiered pricing, and is considered stable for production use. It can be installed from Debian packages, with pip, or with Docker or Docker Compose, so it fits a range of deployment preferences.
Similar apps

Network & Connectivity
Fess
Powerful and easily deployable Enterprise Search Server.

Network & Connectivity
YaCy
Peer-based, decentralized search engine server.

File Management & Transfer
ArchiveBox
Create HTML & screenshot archives of sites from your bookmarks, browsing history, RSS feeds, or other sources (alternative to Wayback…

Network & Connectivity
SearXNG
Internet metasearch engine which aggregates results from various search services and databases (Fork of Searx).

API & Network Testing
Seodisias
SEO crawler that audits sites for technical issues and optimization opportunities.

Databases & Data Tools
Apache Solr
Enterprise search platform featuring full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, and rich…