Web Data & Ingestion MOC
This map organizes the tools and protocols for ingesting external web data into the vulture-nest.
🕸️ Firecrawl Integration
- firecrawl-scrape-capabilities - Single-page markdown extraction and cleaning.
- firecrawl-crawling-capabilities - Recursive site crawling and batch ingestion.
- firecrawl-map-capabilities - Domain discovery and endpoint inventory.
- handoff-firecrawl-openai-agents - Protocol for Firecrawl-to-agent data transfer.
- spec-firecrawl-pgvector-pipeline - Architecture for high-fidelity vector storage.
🛠️ Ingestion Tools & Specs
- poshwiki-tools - PowerShell-based wiki maintenance and ingestion scripts.
- protocol-source-ingestion-runbook - Canonical process for adding new sources.
- codex-supabase-schema-ingestion - Automating database schema synchronization.
- semantic-embedding-pipeline - Strategy for note embedding and RAG indexing.
References
Ingestion Tools & Specs
- openai-js-repl-integration - Bridging agentic reasoning with deterministic JS execution environments.