diff --git a/README.md b/README.md
index a5f2f72..8e1ae40 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,143 @@
 # EAMCO Address Checker
 
-This service checks addresses.
+**eamco_address_checker** is a robust and resilient FastAPI microservice designed for batch verification and correction of customer addresses. It leverages the Nominatim geocoding service and an intelligent, ReAct-inspired agent to ensure address data quality.
+
+The service is designed to be triggered as a scheduled job (e.g., via cron) to process addresses in batches, making it ideal for maintaining data hygiene in large databases without disrupting real-time operations.
+
+[![Language](https://img.shields.io/badge/Language-Python%203.11-blue)](https://www.python.org/)
+[![Framework](https://img.shields.io/badge/Framework-FastAPI-green)](https://fastapi.tiangolo.com/)
+[![Database](https://img.shields.io/badge/Database-PostgreSQL-blue)](https://www.postgresql.org/)
+
+---
+
+## Core Features
+
+- **Batch Address Verification**: Geocodes customer addresses in configurable batches to find their precise latitude and longitude.
+- **Fuzzy Matching Correction**: If an address fails to geocode, the agent attempts to correct misspellings by comparing it against a local database of known street names for that town.
+- **Street Data Population**: Includes an endpoint to fetch and store all street names for a given town/state from OpenStreetMap, building the reference data needed for corrections.
+- **Resilient Agentic Workflow**: An `AddressVerificationAgent` follows a Think-Act-Observe cycle, ensuring that failures on individual records do not halt the entire batch.
+- **Rate Limiting**: Automatically respects the Nominatim API's rate limits (1 request/second) to prevent service blockage.
+- **Environment-based Configuration**: Easily configured for different environments (development, production) using `.env` files.
+- **Containerized**: Comes with Dockerfiles for easy deployment.
+
+## How It Works
+
+The service operates through a simple but powerful workflow:
+
+1. **Trigger**: A `POST` request to `/verify-addresses` kicks off a batch run.
+2. **Plan**: The agent queries the database for records that have not been verified or were marked as incorrect.
+3. **Execute**: For each address, the agent performs the following steps:
+   a. **Attempt Geocoding**: Tries to get a location from Nominatim.
+   b. **Fuzzy Match on Failure**: If the initial attempt fails, it uses `rapidfuzz` to find the closest matching street name from the `street_reference` table.
+   c. **Retry Geocoding**: If a confident match is found, it retries geocoding with the corrected address.
+   d. **Update Record**: The database record is updated with the latitude/longitude and a `correct_address` flag.
+4. **Reflect**: The service returns a detailed summary of the batch run, including how many addresses were processed, updated, corrected, or failed.
+
+---
+
+## API Endpoints
+
+### Health
+
+- `GET /health`
+  - **Description**: Checks the service status and database connectivity.
+  - **Response**: `{"status": "healthy", "db_connected": true}`
+
+### Verification
+
+- `POST /verify-addresses`
+  - **Description**: Triggers a synchronous batch job to verify a new set of addresses. The batch size is determined by the `BATCH_SIZE` environment variable.
+  - **Response**: A detailed JSON object with statistics of the batch run.
+
+- `POST /reset-verifications`
+  - **Description**: **Use with caution.** This endpoint resets the verification status (`correct_address`, `verified_at`, etc.) for all customer records, making them eligible for re-verification.
+  - **Response**: A confirmation with the number of records reset.
+
+### Street Data
+
+- `POST /streets/{town}/{state}`
+  - **Description**: Fetches all named streets for a given town and state from the OpenStreetMap Overpass API and stores them in the `street_reference` table. This is essential for the fuzzy matching feature.
+  - **Example**: `curl -X POST http://localhost:8000/streets/Boston/MA`
+  - **Response**: A summary of streets added or updated.
+
+- `GET /streets/{town}/{state}`
+  - **Description**: Returns the number of reference streets stored locally for a given town and state.
+  - **Example**: `curl http://localhost:8000/streets/Boston/MA`
+  - **Response**: A JSON object with the street count.
+
+---
+
+## Getting Started
+
+### Prerequisites
+
+- Python 3.10+
+- PostgreSQL database
+- An OpenStreetMap Nominatim user agent (set in your configuration)
+
+### Installation
+
+1. **Clone the repository:**
+   ```bash
+   git clone http://192.168.1.204:3017/Eamco/eamco_address_checker.git
+   cd eamco_address_checker
+   ```
+
+2. **Create a virtual environment and install dependencies:**
+   ```bash
+   python -m venv venv
+   source venv/bin/activate
+   pip install -r requirements.txt
+   ```
+
+3. **Configure your environment:**
+   - Copy `.env.example` to `.env.local`:
+     ```bash
+     cp .env.example .env.local
+     ```
+   - Edit `.env.local` with your database credentials and other settings. Pay special attention to the `POSTGRES_DBNAME` and `CURRENT_SETTINGS` variables.
+
+### Running the Service
+
+#### For Development
+
+You can run the service directly with Uvicorn, which provides live reloading.
+
+```bash
+uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
+```
+
+The API will be available at `http://localhost:8000`, and interactive documentation can be found at `http://localhost:8000/docs`.
+
+#### Using Docker
+
+The project includes Dockerfiles for containerized deployment.
+
+- **Build the image:**
+  ```bash
+  docker build -t eamco_address_checker .
+  ```
+- **Run the container:**
+  ```bash
+  docker run -d -p 8000:8000 --env-file .env.local --name address-checker eamco_address_checker
+  ```
+
+---
+
+## Project Structure
+
+```
+eamco_address_checker/
+├── app/
+│   ├── __init__.py
+│   ├── agent.py          # The core ReAct-style verification agent
+│   ├── config.py         # Application configuration from environment variables
+│   ├── main.py           # FastAPI application, endpoints, and startup logic
+│   ├── models.py         # SQLAlchemy ORM models
+│   ├── streets.py        # Logic for fetching and correcting street names
+│   └── tools.py          # Modular tools used by the agent (geocoding, validation, etc.)
+├── .env.example          # Example environment variables
+├── Dockerfile            # Dockerfile for production builds
+├── requirements.txt      # Python dependencies
+└── README.md             # This file
+```
\ No newline at end of file
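
The Execute step in the README's "How It Works" section (geocode, fuzzy match on failure, retry, update) can be illustrated with a minimal sketch. It is not the service's actual code: `verify_address`, `correct_street`, and `fake_geocode` are hypothetical names, the geocoder is passed in as a plain callable instead of calling Nominatim, and the standard library's `difflib.get_close_matches` stands in for `rapidfuzz` so the example runs without extra dependencies. The 0.8 cutoff is likewise an assumed threshold for a "confident match".

```python
from difflib import get_close_matches
from typing import Callable, Optional, Sequence, Tuple

Coordinates = Tuple[float, float]
# Hypothetical geocoder interface: (street, town) -> (lat, lon), or None on failure.
Geocoder = Callable[[str, str], Optional[Coordinates]]


def correct_street(street: str, known_streets: Sequence[str],
                   cutoff: float = 0.8) -> Optional[str]:
    """Return the closest known street name, or None if none clears the cutoff."""
    # Match case-insensitively, but hand back the reference spelling.
    by_lower = {name.lower(): name for name in known_streets}
    matches = get_close_matches(street.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[matches[0]] if matches else None


def verify_address(street: str, town: str, known_streets: Sequence[str],
                   geocode: Geocoder) -> dict:
    """Run steps a-d for one address and return the fields to store."""
    coords = geocode(street, town)  # a. attempt geocoding
    corrected = None
    if coords is None:
        # b. fuzzy match against the reference streets for this town
        candidate = correct_street(street, known_streets)
        if candidate is not None:
            coords = geocode(candidate, town)  # c. retry with the correction
            if coords is not None:
                corrected = candidate
    latitude, longitude = coords if coords is not None else (None, None)
    # d. values a caller would write back to the customer record
    return {
        "latitude": latitude,
        "longitude": longitude,
        "correct_address": coords is not None,
        "corrected_street": corrected,
    }


def fake_geocode(street: str, town: str) -> Optional[Coordinates]:
    """Stand-in for a Nominatim lookup so the sketch runs offline."""
    known = {("Main Street", "Boston"): (42.36, -71.06)}
    return known.get((street, town))


reference = ["Main Street", "Elm Street"]
result = verify_address("Mian Stret", "Boston", reference, fake_geocode)
missing = verify_address("Zzzz", "Boston", reference, fake_geocode)
print(result)
print(missing)
```

Because a failed lookup is returned as `correct_address: False` instead of raising, a batch loop built on this shape can record the failure and move on to the next address, which matches the README's claim that one bad record does not halt the batch.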