eamco_scraper
FastAPI microservice for scraping heating oil prices from New England Oil and storing historical pricing data.
Overview
This service scrapes oil company pricing data from the New England Oil website (Zone 10 - Central Massachusetts) and stores it in a PostgreSQL database for historical tracking and trend analysis.
Features
- Web Scraping: Automated scraping of oil prices using BeautifulSoup4
- Historical Tracking: Stores all price records (no updates, only inserts) for trend analysis
- Cron-Friendly: Single GET request triggers scrape and storage
- Health Checks: Built-in health endpoint for monitoring
- Docker Ready: Production and development Docker configurations
API Endpoints
GET /health
Health check endpoint with database connectivity status.
Response:
{
"status": "healthy",
"db_connected": true
}
GET /scraper/newenglandoil/latestprice
Triggers a scrape of New England Oil Zone 10 prices, stores the results in the database, and returns them.
Response:
{
"status": "success",
"message": "Successfully scraped and stored 30 prices",
"prices_scraped": 30,
"prices_stored": 30,
"scrape_timestamp": "2026-02-07T22:00:00",
"prices": [
{
"company_name": "AUBURN OIL",
"town": "Auburn",
"price_decimal": 2.599,
"scrape_date": "2026-02-07",
"zone": "zone10"
}
]
}
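A client consuming this response could convert it into typed records along the following lines. This is a minimal sketch that assumes the response shape shown above; the `Price` dataclass and `parse_scrape_response` function are hypothetical names for illustration and are not part of the service.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Price:
    """One entry of the "prices" array (hypothetical client-side type)."""
    company_name: str
    town: str
    price_decimal: Decimal
    scrape_date: date
    zone: str


def parse_scrape_response(body: str) -> list[Price]:
    """Turn the JSON body of /scraper/newenglandoil/latestprice into Price records."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"scrape failed: {payload.get('message')}")
    return [
        Price(
            company_name=item["company_name"],
            town=item["town"],
            # Convert via str() so 2.599 keeps its three decimal places exactly.
            price_decimal=Decimal(str(item["price_decimal"])),
            scrape_date=date.fromisoformat(item["scrape_date"]),
            zone=item["zone"],
        )
        for item in payload["prices"]
    ]
```

The body can come from any HTTP client, for example `urllib.request.urlopen(url).read().decode()`.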
Database Schema
company_prices Table
| Column | Type | Description |
|---|---|---|
| id | SERIAL | Primary key |
| company_name | VARCHAR(255) | Oil company name |
| town | VARCHAR(100) | Town/city |
| price_decimal | DECIMAL(6,3) | Price per gallon |
| scrape_date | DATE | Date price was listed |
| zone | VARCHAR(50) | Geographic zone (default: zone10) |
| created_at | TIMESTAMP | Record creation timestamp |
Indexes:
- idx_company_prices_company on company_name
- idx_company_prices_scrape_date on scrape_date
- idx_company_prices_zone on zone
- idx_company_prices_company_date on (company_name, scrape_date)
- idx_company_prices_zone_date on (zone, scrape_date)
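To illustrate the insert-only design and the kind of query the (company_name, scrape_date) index is meant for, here is a small sketch. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so it runs without a server; the column types are SQLite approximations, only one of the indexes is created, and `record_price` and `price_history` are hypothetical helpers, not the service's actual code.

```python
import sqlite3

# SQLite approximation of the company_prices table (assumption: SERIAL maps
# to INTEGER PRIMARY KEY, DECIMAL(6,3) to NUMERIC, DATE/TIMESTAMP to TEXT).
SCHEMA = """
CREATE TABLE company_prices (
    id INTEGER PRIMARY KEY,
    company_name TEXT,
    town TEXT,
    price_decimal NUMERIC,
    scrape_date TEXT,
    zone TEXT DEFAULT 'zone10',
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_company_prices_company_date
    ON company_prices (company_name, scrape_date);
"""


def record_price(conn, company_name, town, price, scrape_date, zone="zone10"):
    """Append one row; existing rows are never updated."""
    conn.execute(
        "INSERT INTO company_prices "
        "(company_name, town, price_decimal, scrape_date, zone) "
        "VALUES (?, ?, ?, ?, ?)",
        (company_name, town, price, scrape_date, zone),
    )


def price_history(conn, company_name):
    """Return (scrape_date, price) pairs for one company, oldest first."""
    rows = conn.execute(
        "SELECT scrape_date, price_decimal FROM company_prices "
        "WHERE company_name = ? ORDER BY scrape_date",
        (company_name,),
    )
    return rows.fetchall()
```

Because every scrape appends new rows, a company's price trend is simply its rows ordered by scrape_date.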
Development
Local Setup
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Copy environment file
cp .env.example .env.local
# Edit .env.local with your database credentials
# Run the application
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Docker Local
cd /mnt/code/oil/eamco/eamco_deploy
docker-compose -f docker-compose.local.yml up scraper_local
Access at: http://localhost:9619
Production
Docker Production
cd /mnt/code/oil/eamco/eamco_deploy
docker-compose -f docker-compose.prod.yml up -d scraper_prod
Access at: http://192.168.1.204:9519
Cron Integration
Add to Unraid cron or system crontab:
# Scrape prices daily at 6 AM
0 6 * * * curl -s http://192.168.1.204:9519/scraper/newenglandoil/latestprice > /dev/null 2>&1
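The curl entry discards all output, so a failed scrape goes unnoticed. As one possible alternative, a small Python script could check /health first and report failure through its exit code. This is only a sketch: `fetch_json` and `run_scrape` are hypothetical names, the 30-second timeout is an assumption, and the base URL is the production address listed in this README.

```python
import json
import urllib.request

BASE_URL = "http://192.168.1.204:9519"  # production address from this README


def fetch_json(url, timeout=30.0):
    """GET a URL and decode the JSON body (timeout value is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def run_scrape(fetch=fetch_json, base_url=BASE_URL):
    """Return 0 if the service is healthy and the scrape succeeded, else 1."""
    try:
        health = fetch(f"{base_url}/health")
        if not health.get("db_connected"):
            return 1
        result = fetch(f"{base_url}/scraper/newenglandoil/latestprice")
    except (OSError, ValueError):
        # Network errors are OSError subclasses; malformed JSON is a ValueError.
        return 1
    return 0 if result.get("status") == "success" else 1
```

Calling `sys.exit(run_scrape())` from a script lets cron or a monitoring wrapper react to a non-zero exit status.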
Environment Variables
| Variable | Description | Default |
|---|---|---|
| MODE | Application mode (LOCAL/PRODUCTION) | LOCAL |
| POSTGRES_USERNAME | Database username | postgres |
| POSTGRES_PW | Database password | password |
| POSTGRES_SERVER | Database server | 192.168.1.204 |
| POSTGRES_PORT | Database port | 5432 |
| POSTGRES_DBNAME | Database name | eamco |
| LOG_LEVEL | Logging level | INFO |
| SCRAPER_DELAY | Delay between requests (seconds) | 2.0 |
| SCRAPER_TIMEOUT | Request timeout (seconds) | 10 |
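The variables above could be read into a single settings object roughly as follows. This is a sketch under stated assumptions: the `Settings` dataclass, the `load_settings` function, and the `postgresql://` URL form are illustrative choices and may differ from what the application actually does; the defaults are taken from the table.

```python
import os
from dataclasses import dataclass
from urllib.parse import quote_plus


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the environment variables in the table."""
    mode: str
    database_url: str
    log_level: str
    scraper_delay: float
    scraper_timeout: float


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default) with table defaults."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters like '@' or '/' keep the URL valid.
    user = quote_plus(env.get("POSTGRES_USERNAME", "postgres"))
    password = quote_plus(env.get("POSTGRES_PW", "password"))
    server = env.get("POSTGRES_SERVER", "192.168.1.204")
    port = int(env.get("POSTGRES_PORT", "5432"))
    dbname = env.get("POSTGRES_DBNAME", "eamco")
    return Settings(
        mode=env.get("MODE", "LOCAL"),
        database_url=f"postgresql://{user}:{password}@{server}:{port}/{dbname}",
        log_level=env.get("LOG_LEVEL", "INFO"),
        scraper_delay=float(env.get("SCRAPER_DELAY", "2.0")),
        scraper_timeout=float(env.get("SCRAPER_TIMEOUT", "10")),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in isolation.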
Architecture
- Framework: FastAPI 0.109+
- Database: PostgreSQL 15+ with SQLAlchemy 2.0
- Scraping: BeautifulSoup4 + lxml + requests
- Server: Uvicorn with 2 workers (production)
Ports
- Local Development: 9619
- Production: 9519
Future Enhancements
- Frontend display on Home page (table or cards)
- Price change alerts/notifications
- Support for additional zones
- Price trend graphs and analytics