Grab the source with the right tool
Provenance pulls content from three source types, normalizes the result, scores trust with visible factors, and keeps the evidence attached so the output can be validated instead of guessed.
The backend endpoints powering this dashboard.
Explore all endpoints via interactive API documentation.
Server health check and basic system info.
High-level summary of the scraped data.
Paginated list of all scraped source documents.
Live scrape pipeline for standard URLs.
Use the arrow controls to move through each connected stage. The stage panel updates in sync so you can track what changed and why.
Blogs, YouTube, and PubMed each need different fetch logic. The system starts by keeping source-specific metadata intact.
Noise gets removed, language gets detected, tags get attached, and the output becomes comparable across source families.
Each score is computed from explicit factors, so every confidence claim is traceable to evidence.
The final record includes the score, factor breakdown, flags, and content chunks so every claim has a clear audit trail.
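The scoring and output stages can be sketched in a few lines. This is only an illustration: the factor names, point values, `missing:` flag prefix, and 200-word chunk size are assumptions, since the page does not list the factors Provenance actually uses. What it shows is the shape of the idea: the score is the sum of a visible breakdown, and the record carries its flags and chunks with it.

```python
from dataclasses import dataclass

# Hypothetical factors and point values; the real ones are not listed here.
FACTOR_POINTS = {
    "has_author": 30,
    "has_publish_date": 20,
    "uses_https": 20,
    "has_content": 30,
}


@dataclass
class ScoredRecord:
    url: str
    score: int      # 0-100, always the sum of the factor breakdown
    factors: dict   # factor name -> points awarded
    flags: list     # caller-supplied flags plus one per factor that earned nothing
    chunks: list    # content split into fixed-size word windows


def chunk_text(text, words_per_chunk=200):
    """Split text into chunks of at most `words_per_chunk` words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def score_source(url, author, publish_date, text, extra_flags=()):
    """Score one source and keep the evidence attached to the result."""
    checks = {
        "has_author": bool(author),
        "has_publish_date": bool(publish_date),
        "uses_https": url.startswith("https://"),
        "has_content": bool(text.strip()),
    }
    # Each factor contributes its full points or nothing, so the total is auditable.
    factors = {name: FACTOR_POINTS[name] if ok else 0 for name, ok in checks.items()}
    flags = list(extra_flags) + [f"missing:{name}" for name, ok in checks.items() if not ok]
    return ScoredRecord(url, sum(factors.values()), factors, flags, chunk_text(text))
```

Because the score is derived from `factors` rather than stored separately, a reader can recompute it from the breakdown and see exactly which check cost points.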
This is the exact method used for each source type, including fallbacks when metadata or transcripts are missing.
Blogs: read the author from .author or .byline. Take the body from <article>, <main>, or #content tags, falling back to all <p> tags if necessary. Strip out navs, ads, and footers.
YouTube: accept both URL formats (youtube.com/watch and shortened youtu.be). Use youtube-transcript-api to get the actual spoken words as the primary content payload. If no transcript can be retrieved, attach the transcript_unavailable risk flag.
Paste a URL, pick a source type, and the page will render the response with score, reasons, flags, and chunks.
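The blog and YouTube handling named above can be sketched with the Python standard library alone. This is a minimal sketch, not the project's implementation: `extract_blog_text`, `youtube_video_id`, and `fetch_youtube` are hypothetical names, the noise-tag list is an assumption, and the transcript fetcher is passed in as a plain callable (in practice it would wrap youtube-transcript-api) so the fallback logic can be read without depending on that library's exact interface.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse

SKIP_TAGS = {"nav", "footer", "aside", "script", "style"}  # assumed noise tags
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}   # tags with no end tag


class _TextCollector(HTMLParser):
    """Collect visible text inside elements accepted by `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.depth = 0   # open elements inside a wanted container
        self.skip = 0    # open noise elements
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag in SKIP_TAGS:
            self.skip += 1
        if self.depth or self.wanted(tag, dict(attrs)):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag in SKIP_TAGS and self.skip:
            self.skip -= 1
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and not self.skip and data.strip():
            self.parts.append(data.strip())


def extract_blog_text(html):
    """Try <article>, <main>, #content in order, then fall back to every <p>."""
    selectors = [
        lambda tag, attrs: tag == "article",
        lambda tag, attrs: tag == "main",
        lambda tag, attrs: attrs.get("id") == "content",
        lambda tag, attrs: tag == "p",
    ]
    for wanted in selectors:
        collector = _TextCollector(wanted)
        collector.feed(html)
        collector.close()
        if collector.parts:
            return " ".join(collector.parts)
    return ""


def youtube_video_id(url):
    """Return the video ID from a youtube.com/watch or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None
    if host == "youtube.com" and parsed.path == "/watch":
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


def fetch_youtube(url, fetch_transcript):
    """`fetch_transcript` is a caller-supplied callable: video_id -> transcript text."""
    video_id = youtube_video_id(url)
    if video_id is None:
        raise ValueError(f"not a recognized YouTube URL: {url}")
    record = {"video_id": video_id, "content": "", "flags": []}
    try:
        record["content"] = fetch_transcript(video_id)
    except Exception:
        # Keep the record but mark it, rather than failing the whole scrape.
        record["flags"].append("transcript_unavailable")
    return record
```

Injecting the fetcher keeps the fallback path easy to exercise: passing a function that raises is enough to confirm the record comes back flagged instead of the scrape aborting. The depth counting assumes reasonably well-formed HTML; a production scraper would more likely use a dedicated HTML library.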
The interface is designed to feel like a tool, not a landing page. The form stays simple, the output stays readable, and the result keeps its evidence attached.
Submit a URL to see the score, the breakdown, and the evidence behind it.