# EAMCO Address Checker

**eamco_address_checker** is a robust and resilient FastAPI microservice designed for batch verification and correction of customer addresses. It leverages the Nominatim geocoding service and an intelligent, ReAct-inspired agent to ensure address data quality.

The service is designed to be triggered as a scheduled job (e.g., via cron) to process addresses in batches, making it ideal for maintaining data hygiene in large databases without disrupting real-time operations.

[![Language](https://img.shields.io/badge/Language-Python%203.11-blue)](https://www.python.org/)
[![Framework](https://img.shields.io/badge/Framework-FastAPI-green)](https://fastapi.tiangolo.com/)
[![Database](https://img.shields.io/badge/Database-PostgreSQL-blue)](https://www.postgresql.org/)

---

## Core Features

- **Batch Address Verification**: Geocodes customer addresses in configurable batches to find their precise latitude and longitude.
- **Fuzzy Matching Correction**: If an address fails to geocode, the agent attempts to correct misspellings by comparing it against a local database of known street names for that town.
- **Street Data Population**: Includes an endpoint to fetch and store all street names for a given town/state from OpenStreetMap, building the reference data needed for corrections.
- **Resilient Agentic Workflow**: An `AddressVerificationAgent` follows a Think-Act-Observe cycle, ensuring that failures on individual records do not halt the entire batch.
- **Rate Limiting**: Automatically respects the Nominatim API's rate limits (1 request/second) to prevent service blockage.
- **Environment-based Configuration**: Easily configured for different environments (development, production) using `.env` files.
- **Containerized**: Comes with Dockerfiles for easy deployment.

## How It Works

The service operates through a simple but powerful workflow:

1. **Trigger**: A `POST` request to `/verify-addresses` kicks off a batch run.
2. **Plan**: The agent queries the database for records that have not been verified or were marked as incorrect.
3. **Execute**: For each address, the agent performs the following steps:
   a. **Attempt Geocoding**: Tries to get a location from Nominatim.
   b. **Fuzzy Match on Failure**: If the initial attempt fails, it uses `rapidfuzz` to find the closest matching street name from the `street_reference` table.
   c. **Retry Geocoding**: If a confident match is found, it retries geocoding with the corrected address.
   d. **Update Record**: The database record is updated with the latitude/longitude and a `correct_address` flag.
4. **Reflect**: The service returns a detailed summary of the batch run, including how many addresses were processed, updated, corrected, or failed.

---

## API Endpoints

### Health

- `GET /health`
  - **Description**: Checks the service status and database connectivity.
  - **Response**: `{"status": "healthy", "db_connected": true}`

### Verification

- `POST /verify-addresses`
  - **Description**: Triggers a synchronous batch job to verify a new set of addresses. The batch size is determined by the `BATCH_SIZE` environment variable.
  - **Response**: A detailed JSON object with statistics of the batch run.
- `POST /reset-verifications`
  - **Description**: **Use with caution.** This endpoint resets the verification status (`correct_address`, `verified_at`, etc.) for all customer records, making them eligible for re-verification.
  - **Response**: A confirmation with the number of records reset.

### Street Data

- `POST /streets/{town}/{state}`
  - **Description**: Fetches all named streets for a given town and state from the OpenStreetMap Overpass API and stores them in the `street_reference` table. This is essential for the fuzzy matching feature.
  - **Example**: `curl -X POST http://localhost:8000/streets/Boston/MA`
  - **Response**: A summary of streets added or updated.
- `GET /streets/{town}/{state}`
  - **Description**: Returns the number of reference streets stored locally for a given town and state.
  - **Example**: `curl http://localhost:8000/streets/Boston/MA`
  - **Response**: A JSON object with the street count.

---

## Getting Started

### Prerequisites

- Python 3.10+
- PostgreSQL database
- An OpenStreetMap Nominatim user agent (set in your configuration)

### Installation

1. **Clone the repository:**
   ```bash
   git clone http://192.168.1.204:3017/Eamco/eamco_address_checker.git
   cd eamco_address_checker
   ```

2. **Create a virtual environment and install dependencies:**
   ```bash
   python -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```

3. **Configure your environment:**
   - Copy `.env.example` to `.env.local`:
     ```bash
     cp .env.example .env.local
     ```
   - Edit `.env.local` with your database credentials and other settings. Pay special attention to the `POSTGRES_DBNAME` and `CURRENT_SETTINGS` variables.

### Running the Service

#### For Development

You can run the service directly with Uvicorn, which provides live reloading.

```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

The API will be available at `http://localhost:8000`, and interactive documentation can be found at `http://localhost:8000/docs`.

#### Using Docker

The project includes Dockerfiles for containerized deployment.

- **Build the image:**
  ```bash
  docker build -t eamco_address_checker .
  ```
- **Run the container:**
  ```bash
  docker run -d -p 8000:8000 --env-file .env.local --name address-checker eamco_address_checker
  ```

---

## Project Structure

```
eamco_address_checker/
├── app/
│   ├── __init__.py
│   ├── agent.py            # The core ReAct-style verification agent
│   ├── config.py           # Application configuration from environment variables
│   ├── main.py             # FastAPI application, endpoints, and startup logic
│   ├── models.py           # SQLAlchemy ORM models
│   ├── streets.py          # Logic for fetching and correcting street names
│   └── tools.py            # Modular tools used by the agent (geocoding, validation, etc.)
├── .env.example            # Example environment variables
├── Dockerfile              # Dockerfile for production builds
├── requirements.txt        # Python dependencies
└── README.md               # This file
```
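
---

## Sketch: Geocode, Correct, Retry

The per-address flow described in the **Execute** step of How It Works (attempt geocoding, fuzzy match on failure, retry) can be illustrated with a small standalone sketch. This is not the code in `app/agent.py`. The names `correct_street`, `verify_address`, `fake_geocode`, and `KNOWN_STREETS` are hypothetical, the 0.8 similarity cutoff is an arbitrary assumption, and the standard library's `difflib` is swapped in for `rapidfuzz` so the example runs without extra dependencies. The stub geocoder stands in for Nominatim and returns placeholder coordinates.

```python
import difflib
from typing import Callable, Dict, Optional, Sequence, Tuple

Coords = Tuple[float, float]

# Hypothetical stand-in for rows of the street_reference table for one town.
KNOWN_STREETS = ["Beacon Street", "Boylston Street", "Tremont Street"]


def fake_geocode(query: str) -> Optional[Coords]:
    """Stub geocoder (assumption): a real run would call Nominatim instead."""
    known: Dict[str, Coords] = {
        # Placeholder coordinates, not the result of a real lookup.
        "12 Beacon Street, Boston, MA": (42.0, -71.0),
    }
    return known.get(query)


def correct_street(
    street_name: str, known_streets: Sequence[str], cutoff: float = 0.8
) -> Optional[str]:
    """Return the closest known street name, or None if nothing is similar enough."""
    by_lower = {name.lower(): name for name in known_streets}
    matches = difflib.get_close_matches(
        street_name.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


def verify_address(
    house_number: str,
    street_name: str,
    town: str,
    state: str,
    geocode: Callable[[str], Optional[Coords]],
    known_streets: Sequence[str],
) -> dict:
    """Attempt geocoding; on failure, fuzzy-correct the street and retry once."""
    coords = geocode(f"{house_number} {street_name}, {town}, {state}")
    if coords is not None:
        return {"status": "verified", "street": street_name, "coords": coords}

    corrected = correct_street(street_name, known_streets)
    # Without a confident, different candidate there is nothing to retry with.
    if corrected is None or corrected.lower() == street_name.lower():
        return {"status": "failed", "street": street_name, "coords": None}

    coords = geocode(f"{house_number} {corrected}, {town}, {state}")
    if coords is None:
        return {"status": "failed", "street": street_name, "coords": None}
    return {"status": "corrected", "street": corrected, "coords": coords}


if __name__ == "__main__":
    result = verify_address(
        "12", "Beacon Stret", "Boston", "MA", fake_geocode, KNOWN_STREETS
    )
    print(result["status"], result["street"])
```

Because a failure is returned as a result rather than raised, a batch loop built on a function like this can record the outcome and move on to the next address. A real geocoder passed in as `geocode` would also need to throttle itself to the 1 request/second limit noted under Core Features.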