EAMCO Address Checker
eamco_address_checker is a FastAPI microservice for batch verification and correction of customer addresses. It uses the Nominatim geocoding service and a ReAct-inspired agent to improve address data quality.
The service is designed to be triggered as a scheduled job (e.g., via cron) to process addresses in batches, making it ideal for maintaining data hygiene in large databases without disrupting real-time operations.
Core Features
- Batch Address Verification: Geocodes customer addresses in configurable batches to find their precise latitude and longitude.
- Fuzzy Matching Correction: If an address fails to geocode, the agent attempts to correct misspellings by comparing it against a local database of known street names for that town.
- Street Data Population: Includes an endpoint to fetch and store all street names for a given town/state from OpenStreetMap, building the reference data needed for corrections.
- Resilient Agentic Workflow: An AddressVerificationAgent follows a Think-Act-Observe cycle, ensuring that failures on individual records do not halt the entire batch.
- Rate Limiting: Automatically respects the Nominatim API's rate limits (1 request/second) to prevent service blockage.
- Environment-based Configuration: Easily configured for different environments (development, production) using .env files.
- Containerized: Comes with Dockerfiles for easy deployment.
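The rate-limiting feature can be pictured with a small sketch. This is not the service's actual code: the class name `MinIntervalLimiter` and its parameters are hypothetical, and the only detail taken from this README is the 1 request/second limit. The clock and sleep functions are injectable so the behaviour can be checked without real delays.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Keeps consecutive calls at least `min_interval` seconds apart.

    Hypothetical helper for illustration; not part of the service's API.
    """

    def __init__(
        self,
        min_interval: float = 1.0,  # 1 request/second, as stated above
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Sleep if needed before the next request; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            # Time still owed before the next call is allowed.
            delay = max(0.0, self.min_interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        # Record when this call is considered to have happened.
        self._last_call = now + delay
        return delay
```

A caller would invoke `limiter.wait()` immediately before each geocoding request, so the pause is only as long as needed to keep requests one second apart.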
How It Works
The service processes each batch in four stages:
- Trigger: A POST request to /verify-addresses kicks off a batch run.
- Plan: The agent queries the database for records that have not been verified or were marked as incorrect.
- Execute: For each address, the agent performs the following steps:
  a. Attempt Geocoding: Tries to get a location from Nominatim.
  b. Fuzzy Match on Failure: If the initial attempt fails, it uses rapidfuzz to find the closest matching street name from the street_reference table.
  c. Retry Geocoding: If a confident match is found, it retries geocoding with the corrected address.
  d. Update Record: The database record is updated with the latitude/longitude and a correct_address flag.
- Reflect: The service returns a detailed summary of the batch run, including how many addresses were processed, updated, corrected, or failed.
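The Execute and Reflect stages above can be sketched roughly as follows. This is an illustration under several assumptions, not the agent's real implementation: `Record`, `Outcome`, `verify_one`, and `verify_batch` are hypothetical names, the geocoder is passed in as a plain function, the database update in step d is reduced to a counter, and the standard library's `difflib` stands in for rapidfuzz so the sketch has no third-party dependencies.

```python
from dataclasses import dataclass
from difflib import get_close_matches
from typing import Callable, Dict, List, Optional, Tuple

Coordinates = Tuple[float, float]
Geocoder = Callable[[str], Optional[Coordinates]]


@dataclass
class Record:
    # Hypothetical record shape; the real ORM models live in app/models.py.
    house_number: str
    street: str
    town: str
    state: str


@dataclass
class Outcome:
    coordinates: Optional[Coordinates] = None
    corrected_street: Optional[str] = None


def format_address(rec: Record, street: str) -> str:
    return f"{rec.house_number} {street}, {rec.town}, {rec.state}"


def verify_one(
    rec: Record, geocode: Geocoder, known_streets: List[str], cutoff: float = 0.8
) -> Outcome:
    # a. Attempt geocoding with the address as stored.
    coords = geocode(format_address(rec, rec.street))
    if coords is not None:
        return Outcome(coordinates=coords)
    # b. Fuzzy match against the town's known streets
    #    (difflib here; the service itself uses rapidfuzz).
    matches = get_close_matches(rec.street, known_streets, n=1, cutoff=cutoff)
    if not matches:
        return Outcome()
    # c. Retry geocoding with the corrected street name.
    coords = geocode(format_address(rec, matches[0]))
    if coords is None:
        return Outcome()
    return Outcome(coordinates=coords, corrected_street=matches[0])


def verify_batch(
    records: List[Record], geocode: Geocoder, known_streets: List[str]
) -> Dict[str, int]:
    summary = {"processed": 0, "updated": 0, "corrected": 0, "failed": 0}
    for rec in records:
        summary["processed"] += 1
        try:
            outcome = verify_one(rec, geocode, known_streets)
        except Exception:
            # One bad record must not halt the rest of the batch.
            summary["failed"] += 1
            continue
        if outcome.coordinates is None:
            summary["failed"] += 1
            continue
        # d. A real implementation would write the coordinates and the
        #    correct_address flag to the database here.
        summary["updated"] += 1
        if outcome.corrected_street is not None:
            summary["corrected"] += 1
    return summary
```

Wrapping each record in its own `try`/`except` is what keeps a single geocoding error from aborting the run; the returned dictionary plays the role of the batch summary.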
API Endpoints
Health
GET /health
- Description: Checks the service status and database connectivity.
- Response:
{"status": "healthy", "db_connected": true}
Verification
- POST /verify-addresses
  - Description: Triggers a synchronous batch job to verify a new set of addresses. The batch size is determined by the BATCH_SIZE environment variable.
  - Response: A detailed JSON object with statistics of the batch run.
- POST /reset-verifications
  - Description: Use with caution. This endpoint resets the verification status (correct_address, verified_at, etc.) for all customer records, making them eligible for re-verification.
  - Response: A confirmation with the number of records reset.
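Since the service is meant to be triggered by a scheduled job, a small client script is one way to call the verification endpoint from cron. The sketch below uses only the standard library. The function names are hypothetical, it assumes the endpoint accepts an empty request body, and it returns the JSON response as-is because the exact response fields are not documented here.

```python
import json
import urllib.request


def build_trigger_request(base_url: str) -> urllib.request.Request:
    """Build the POST request that starts a batch run."""
    url = base_url.rstrip("/") + "/verify-addresses"
    # Assumption: the endpoint needs no request body, so an empty one is sent.
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"Accept": "application/json"}
    )


def trigger_batch(base_url: str, timeout: float = 3600.0) -> dict:
    """Start a batch run and return the summary the service sends back.

    The call is synchronous, so the timeout must cover the whole batch.
    """
    request = build_trigger_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Because the endpoint is synchronous and geocoding is limited to roughly one request per second, the generous default timeout is deliberate: a batch of a few hundred addresses can take several minutes to return.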
Street Data
- POST /streets/{town}/{state}
  - Description: Fetches all named streets for a given town and state from the OpenStreetMap Overpass API and stores them in the street_reference table. This is essential for the fuzzy matching feature.
  - Example: curl -X POST http://localhost:8000/streets/Boston/MA
  - Response: A summary of streets added or updated.
- GET /streets/{town}/{state}
  - Description: Returns the number of reference streets stored locally for a given town and state.
  - Example: curl http://localhost:8000/streets/Boston/MA
  - Response: A JSON object with the street count.
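Town names containing spaces or other special characters need to be percent-encoded before they are placed in the path. The helper below is a hypothetical client-side sketch (the name `streets_url` is not part of the service) showing one way to build the URL for either street endpoint.

```python
from urllib.parse import quote


def streets_url(base_url: str, town: str, state: str) -> str:
    """Build the /streets/{town}/{state} URL with safely encoded path segments."""
    # safe="" also encodes "/", so a town name can never add a path segment.
    town_part = quote(town, safe="")
    state_part = quote(state, safe="")
    return f"{base_url.rstrip('/')}/streets/{town_part}/{state_part}"
```

For example, `streets_url("http://localhost:8000", "New Bedford", "MA")` yields `http://localhost:8000/streets/New%20Bedford/MA`, which can be used with either the POST or the GET request.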
Getting Started
Prerequisites
- Python 3.10+
- PostgreSQL database
- A User-Agent string that identifies your application to the OpenStreetMap Nominatim service (set in your configuration)
Installation
- Clone the repository:
    git clone http://192.168.1.204:3017/Eamco/eamco_address_checker.git
    cd eamco_address_checker
- Create a virtual environment and install dependencies:
    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
- Configure your environment:
  - Copy .env.example to .env.local:
      cp .env.example .env.local
  - Edit .env.local with your database credentials and other settings. Pay special attention to the POSTGRES_DBNAME and CURRENT_SETTINGS variables.
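To make the role of these variables concrete, here is a minimal sketch of reading them from the process environment. It is not the contents of app/config.py: the `Settings` class, the `load_settings` function, and the fallback batch size of 100 are assumptions for illustration, and the accepted values of CURRENT_SETTINGS are not specified in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    postgres_dbname: str
    current_settings: str
    batch_size: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables named in this README from an environment mapping."""
    required = ("POSTGRES_DBNAME", "CURRENT_SETTINGS")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail fast so a misconfigured deployment is caught at startup.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        postgres_dbname=env["POSTGRES_DBNAME"],
        current_settings=env["CURRENT_SETTINGS"],
        # The default of 100 is a placeholder, not the service's documented default.
        batch_size=int(env.get("BATCH_SIZE", "100")),
    )
```

Failing at startup when a required variable is absent tends to be easier to diagnose than a database connection error in the middle of a batch run.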
Running the Service
For Development
You can run the service directly with Uvicorn, which provides live reloading.
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
The API will be available at http://localhost:8000, and interactive documentation can be found at http://localhost:8000/docs.
Using Docker
The project includes Dockerfiles for containerized deployment.
- Build the image:
    docker build -t eamco_address_checker .
- Run the container:
    docker run -d -p 8000:8000 --env-file .env.local --name address-checker eamco_address_checker
Project Structure
eamco_address_checker/
├── app/
│ ├── __init__.py
│ ├── agent.py # The core ReAct-style verification agent
│ ├── config.py # Application configuration from environment variables
│ ├── main.py # FastAPI application, endpoints, and startup logic
│ ├── models.py # SQLAlchemy ORM models
│ ├── streets.py # Logic for fetching and correcting street names
│ └── tools.py # Modular tools used by the agent (geocoding, validation, etc.)
├── .env.example # Example environment variables
├── Dockerfile # Dockerfile for production builds
├── requirements.txt # Python dependencies
└── README.md # This file