December 4, 2024

Personal Daily News Aggregator

I used a mix of Python packages (Beautiful Soup and Selenium for scraping, Pandas for data formatting) to aggregate the top 5-10 daily headlines from each of my chosen sources into a single CSV.

Scraping like this requires:

  1. Inspecting the target sites to understand how their HTML is structured.
  2. Telling Python exactly which HTML elements, and which parts of those elements, to fetch.
  3. Using Python to convert the fetched data into a readable CSV, with links included.
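
As a rough illustration of those three steps, here is a minimal sketch. It is not the code from the repo. The sample markup, the `h2.headline` structure, the `example.com` base URL, and the function names `extract_headlines` and `headlines_to_csv` are all assumptions made up for the example, and it parses a hard-coded HTML string rather than a live page. A real site would need its own selectors, and a JavaScript-rendered page would likely need Selenium to load the HTML first.

```python
from urllib.parse import urljoin

import pandas as pd
from bs4 import BeautifulSoup

# Step 1: hypothetical markup standing in for what you would find
# when inspecting a news site's front page in the browser.
SAMPLE_HTML = """
<html><body>
  <article><h2 class="headline"><a href="/world/story-one">Story one</a></h2></article>
  <article><h2 class="headline"><a href="/tech/story-two">Story two</a></h2></article>
  <article><h2 class="headline"><a href="https://example.com/biz/story-three">Story three</a></h2></article>
</body></html>
"""


def extract_headlines(html, base_url, source, limit=10):
    """Step 2: pick out the headline text and link from each matching element."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # Assumed structure: each headline is an <h2 class="headline"> wrapping an <a>.
    for heading in soup.find_all("h2", class_="headline", limit=limit):
        link = heading.find("a")
        if link is None or not link.get("href"):
            continue  # skip headings that carry no usable link
        rows.append(
            {
                "source": source,
                "headline": link.get_text(strip=True),
                # Resolve relative hrefs against the site's base URL.
                "url": urljoin(base_url, link["href"]),
            }
        )
    return rows


def headlines_to_csv(rows, path=None):
    """Step 3: format the rows with Pandas; returns CSV text when path is None."""
    frame = pd.DataFrame(rows, columns=["source", "headline", "url"])
    return frame.to_csv(path, index=False)


if __name__ == "__main__":
    rows = extract_headlines(SAMPLE_HTML, "https://example.com", "Example News")
    print(headlines_to_csv(rows))
```

Running one such extractor per source and concatenating the resulting rows before calling `headlines_to_csv` would give a single combined CSV.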
Sample Output CSV | GitHub Repo

A video of the scrapers crawling and aggregating headline data, and a snapshot of the output.