VibeHunt

crawley

Unix-way web crawler.


The tool crawls web pages and prints every link it discovers, streaming unique URLs to standard output with fragments stripped. It parses HTML with a fast SAX parser, extracts URLs from JavaScript and CSS using lexical parsers, and can extract API endpoints as well as image, video, audio, and form resources. Users can control crawl depth, include or exclude specific HTML tags, ignore URLs matching patterns, and follow subdomains. Optional politeness comes from robots.txt handling and request delays.

It is a command-line application written in Go. It supports environment-based proxy configuration, custom cookies and headers in curl-compatible syntax, and a “brute” mode that also scans HTML comments. Additional modes include directory-only traversal, a “headless” mode that suppresses HEAD requests, and tag-based filtering for targeted scans. Binaries are available for Linux, FreeBSD, macOS, and Windows, with packages for Debian/Red Hat and Arch Linux.

The codebase is under 1500 source lines of idiomatic Go with more than 80% test coverage, making it a lightweight yet feature-rich option for developers who need to enumerate links, resources, or endpoints from a site without a full browser engine.
