This repository powers The Longest-Trending Buzzfeed News Articles Since Nov 2018 Overall, by Year, and by Month.
The source data for this project was published by Jeremy Singer-Vine in this GitHub repo.
This site uses Evidence, a static site generator purpose-built for reporting & BI use cases.
You must have DuckDB & npm installed on your machine.
make will process data, move it to evidence/, and build the Evidence site.
make dev will process data, move it to evidence/, and run Evidence in dev mode.
make duckdb will process data and move it to evidence/.
There are three files in source-data/:
bfn-trending-strip-deduped.tsv: This is Jeremy's dataset from this GitHub repo.
article-metadata.jsonl: This is metadata I scraped from Buzzfeed News using Jeremy's dataset. Note that there are duplicates in the original dataset because each snapshot references the URLs in use at that time. If the URLs changed, Buzzfeed News redirected the original URLs. The article metadata contains duplicate rows if an article showed up multiple times with different URLs, but each record also contains redirect_url, so you know what the canonical URL is.
authors.jsonl: This is metadata I scraped from Buzzfeed News for each reporter I found in article-metadata.jsonl.
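The duplicate rows described for article-metadata.jsonl can be collapsed by grouping on redirect_url. The following is a minimal sketch, not part of this repository's pipeline: only redirect_url is a field name taken from this README, while url, title, the example.com addresses, and the helper name dedupe_by_canonical_url are hypothetical and used purely for illustration.

```python
import io
import json

# Hypothetical sample rows. Only `redirect_url` is a field named in this README;
# `url` and `title` are assumed names, and the URLs are placeholders.
SAMPLE_JSONL = """\
{"url": "https://example.com/old-slug", "redirect_url": "https://example.com/new-slug", "title": "A"}
{"url": "https://example.com/new-slug", "redirect_url": "https://example.com/new-slug", "title": "A"}
{"url": "https://example.com/other", "redirect_url": "https://example.com/other", "title": "B"}
"""


def dedupe_by_canonical_url(lines):
    """Keep the first record seen for each canonical (redirect) URL."""
    seen = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # setdefault stores the record only if this redirect_url is new
        seen.setdefault(record["redirect_url"], record)
    return list(seen.values())


unique_articles = dedupe_by_canonical_url(io.StringIO(SAMPLE_JSONL))
print(len(unique_articles))
```

To try this against the real data, pass an open file handle for source-data/article-metadata.jsonl instead of the in-memory sample, after adjusting any assumed field names to match the actual records.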
See an error? Please open a PR.