# EAMCO Address Checker

**eamco_address_checker** is a FastAPI microservice for batch verification and correction of customer addresses. It uses the Nominatim geocoding service and a ReAct-inspired agent to improve address data quality.

The service is designed to be triggered as a scheduled job (e.g., via cron) and processes addresses in batches, so large databases can be kept clean without disrupting real-time operations.

[Python](https://www.python.org/)
[FastAPI](https://fastapi.tiangolo.com/)
[PostgreSQL](https://www.postgresql.org/)

---

## Core Features

- **Batch Address Verification**: Geocodes customer addresses in configurable batches to obtain their latitude and longitude.
- **Fuzzy Matching Correction**: If an address fails to geocode, the agent attempts to correct misspellings by comparing the street name against a local table of known street names for that town.
- **Street Data Population**: Includes an endpoint to fetch and store all street names for a given town/state from OpenStreetMap, building the reference data needed for corrections.
- **Resilient Agentic Workflow**: An `AddressVerificationAgent` follows a Think-Act-Observe cycle and isolates failures, so an error on an individual record does not halt the batch.
- **Rate Limiting**: Respects the Nominatim API's rate limit (1 request/second) to avoid being blocked.
- **Environment-based Configuration**: Configured per environment (development, production) using `.env` files.
- **Containerized**: Comes with Dockerfiles for easy deployment.

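
The 1 request/second limit can be enforced with a small throttle called before every geocoding request. The following is a minimal sketch under that assumption, not the service's actual implementation; the `Throttle` class and its parameters are hypothetical names introduced for illustration.

```python
import time


class Throttle:
    """Space successive calls at least `min_interval` seconds apart.

    Hypothetical helper for illustration; not taken from the project's code.
    """

    def __init__(self, min_interval: float = 1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last_call = None   # time of the previous call; None before the first one

    def wait(self) -> float:
        """Block until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            # Sleep only for the part of the interval that has not yet elapsed.
            delay = max(0.0, self.min_interval - (now - self._last_call))
            if delay > 0:
                self._sleep(delay)
        self._last_call = now + delay
        return delay
```

Calling `throttle.wait()` immediately before each Nominatim request keeps a single worker under the limit; several workers running in parallel would need a shared limiter instead.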
## How It Works

Each batch run goes through four stages:

1. **Trigger**: A `POST` request to `/verify-addresses` kicks off a batch run.
2. **Plan**: The agent queries the database for records that have not been verified or were marked as incorrect.
3. **Execute**: For each address, the agent performs the following steps:
   1. **Attempt Geocoding**: Tries to get a location from Nominatim.
   2. **Fuzzy Match on Failure**: If the initial attempt fails, it uses `rapidfuzz` to find the closest matching street name in the `street_reference` table.
   3. **Retry Geocoding**: If a confident match is found, it retries geocoding with the corrected address.
   4. **Update Record**: The database record is updated with the latitude/longitude and a `correct_address` flag.
4. **Reflect**: The service returns a detailed summary of the batch run, including how many addresses were processed, updated, corrected, or failed.

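
The Execute and Reflect stages can be sketched as a loop that geocodes each record, retries once with a fuzzy-corrected street name, and counts outcomes. This is an illustrative sketch only: it uses the standard library's `difflib` as a stand-in for `rapidfuzz` so that it runs without extra dependencies, and the names `correct_street`, `verify_batch`, the record fields, the `0.8` cutoff, and the summary keys are all assumptions rather than the project's real API. The `geocode` argument is any caller-supplied function that returns `(lat, lon)` or `None`.

```python
from difflib import get_close_matches
from typing import Callable, Optional

# A geocoder takes an address string and returns (lat, lon), or None when nothing matches.
Geocoder = Callable[[str], Optional[tuple[float, float]]]


def correct_street(street: str, known_streets: list[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the closest known street name, or None if nothing is similar enough.

    difflib is a stand-in here; the service itself is described as using rapidfuzz.
    """
    lookup = {name.lower(): name for name in known_streets}
    matches = get_close_matches(street.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


def verify_batch(records: list[dict], known_streets: list[str], geocode: Geocoder) -> dict:
    """Geocode each record, retrying once with a corrected street name on failure."""
    summary = {"processed": 0, "updated": 0, "corrected": 0, "failed": 0}
    for record in records:
        summary["processed"] += 1
        try:
            town = record["town"]
            location = geocode(f"{record['street']}, {town}")
            if location is None:
                fixed = correct_street(record["street"], known_streets)
                if fixed is not None and fixed != record["street"]:
                    location = geocode(f"{fixed}, {town}")
                    if location is not None:
                        record["street"] = fixed
                        summary["corrected"] += 1
            if location is None:
                record["correct_address"] = False
                summary["failed"] += 1
            else:
                record["latitude"], record["longitude"] = location
                record["correct_address"] = True
                summary["updated"] += 1
        except Exception:
            # A malformed record is counted as failed; the rest of the batch continues.
            summary["failed"] += 1
    return summary
```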

---

## API Endpoints

### Health

- `GET /health`
  - **Description**: Checks the service status and database connectivity.
  - **Response**: `{"status": "healthy", "db_connected": true}`

### Verification

- `POST /verify-addresses`
  - **Description**: Triggers a synchronous batch job to verify a new set of addresses. The batch size is determined by the `BATCH_SIZE` environment variable.
  - **Response**: A detailed JSON object with statistics of the batch run.

- `POST /reset-verifications`
  - **Description**: **Use with caution.** This endpoint resets the verification status (`correct_address`, `verified_at`, etc.) for all customer records, making them eligible for re-verification.
  - **Response**: A confirmation with the number of records reset.

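
Because batch runs are meant to be started by a scheduler, a cron job could call a small script such as the one below. It is a sketch under assumptions: the function names are hypothetical, the base URL is the local development address used elsewhere in this README, the endpoint is assumed to accept an empty request body, and the one-hour timeout is an arbitrary allowance for a synchronous run paced at 1 request/second.

```python
import json
import urllib.request


def build_verify_request(base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST request that starts one batch run."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/verify-addresses",
        data=b"",       # empty body (assumption: the endpoint needs no payload)
        method="POST",
    )


def trigger_batch(base_url: str = "http://localhost:8000", timeout: float = 3600.0) -> dict:
    """Send the request and return the parsed JSON summary of the run."""
    request = build_verify_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A scheduled script would call `trigger_batch()` and log the returned summary; `urlopen` raises `urllib.error.HTTPError` on a non-2xx status, which a scheduler can treat as a failed run.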
### Street Data

- `POST /streets/{town}/{state}`
  - **Description**: Fetches all named streets for a given town and state from the OpenStreetMap Overpass API and stores them in the `street_reference` table. This is essential for the fuzzy matching feature.
  - **Example**: `curl -X POST http://localhost:8000/streets/Boston/MA`
  - **Response**: A summary of streets added or updated.

- `GET /streets/{town}/{state}`
  - **Description**: Returns the number of reference streets stored locally for a given town and state.
  - **Example**: `curl http://localhost:8000/streets/Boston/MA`
  - **Response**: A JSON object with the street count.


---

## Getting Started

### Prerequisites

- Python 3.10+
- PostgreSQL database
- A User-Agent string that identifies your application to OpenStreetMap Nominatim (set in your configuration)

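
To show where the User-Agent prerequisite comes into play, here is a minimal sketch of a Nominatim search request built with the standard library. It targets the public Nominatim `/search` endpoint with its `q`, `format`, and `limit` parameters; the function name and the example User-Agent value are hypothetical, and the service itself may well use a client library rather than raw HTTP.

```python
import urllib.parse
import urllib.request

NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"


def build_geocode_request(address: str, user_agent: str) -> urllib.request.Request:
    """Build a free-form search request that identifies the calling application."""
    query = urllib.parse.urlencode({"q": address, "format": "jsonv2", "limit": 1})
    return urllib.request.Request(
        f"{NOMINATIM_SEARCH_URL}?{query}",
        headers={"User-Agent": user_agent},  # value comes from your configuration
    )


# Example (hypothetical User-Agent value; the request is built but not sent):
request = build_geocode_request(
    "1 Main St, Boston, MA",
    "eamco-address-checker/1.0 (ops@example.com)",
)
```

Sending the request with `urllib.request.urlopen(request)` returns a JSON array; when a match exists, its first element carries `lat` and `lon` as strings.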
### Installation

1. **Clone the repository:**
   ```bash
   git clone http://192.168.1.204:3017/Eamco/eamco_address_checker.git
   cd eamco_address_checker
   ```

2. **Create a virtual environment and install dependencies:**
   ```bash
   python -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```

3. **Configure your environment:**
   - Copy `.env.example` to `.env.local`:
     ```bash
     cp .env.example .env.local
     ```
   - Edit `.env.local` with your database credentials and other settings. Pay special attention to the `POSTGRES_DBNAME` and `CURRENT_SETTINGS` variables.

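
As a rough illustration of how these variables might be consumed, the sketch below reads them from an environment mapping and fails fast when a required one is missing. Only the variable names `POSTGRES_DBNAME`, `CURRENT_SETTINGS`, and `BATCH_SIZE` come from this README; the `Settings` class, the `load_settings` function, treating the first two as required, and the default batch size of 100 are assumptions, and the project's real `app/config.py` may look quite different.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the settings named in this README."""
    postgres_dbname: str
    current_settings: str
    batch_size: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, failing fast on missing values."""
    required = ("POSTGRES_DBNAME", "CURRENT_SETTINGS")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return Settings(
        postgres_dbname=env["POSTGRES_DBNAME"],
        current_settings=env["CURRENT_SETTINGS"],
        batch_size=int(env.get("BATCH_SIZE", "100")),  # default of 100 is an assumption
    )
```

Passing the mapping as a parameter keeps the function easy to test with a plain dictionary, while production code simply calls `load_settings()` after the `.env` file has been loaded into the process environment.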
### Running the Service

#### For Development

You can run the service directly with Uvicorn, which provides live reloading.

```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

The API will be available at `http://localhost:8000`, and interactive documentation can be found at `http://localhost:8000/docs`.

#### Using Docker

The project includes Dockerfiles for containerized deployment.

- **Build the image:**
  ```bash
  docker build -t eamco_address_checker .
  ```
- **Run the container:**
  ```bash
  docker run -d -p 8000:8000 --env-file .env.local --name address-checker eamco_address_checker
  ```


---

## Project Structure

```
eamco_address_checker/
├── app/
│   ├── __init__.py
│   ├── agent.py          # The core ReAct-style verification agent
│   ├── config.py         # Application configuration from environment variables
│   ├── main.py           # FastAPI application, endpoints, and startup logic
│   ├── models.py         # SQLAlchemy ORM models
│   ├── streets.py        # Logic for fetching and correcting street names
│   └── tools.py          # Modular tools used by the agent (geocoding, validation, etc.)
├── .env.example          # Example environment variables
├── Dockerfile            # Dockerfile for production builds
├── requirements.txt      # Python dependencies
└── README.md             # This file
```