New Hires Reporting System

AI-powered validation and correction system for new hire reporting files using AWS Bedrock.

Overview

The New Hires Reporting System is a production-ready application that automatically validates and corrects fixed-width formatted files using AWS Bedrock AI (Claude Sonnet 4.5). Built with FastAPI, React, PostgreSQL, and background workers, it dramatically reduces manual error correction work through intelligent AI-powered analysis.

Key Features

  • AI-Powered Corrections - AWS Bedrock (Claude Sonnet 4.5) provides intelligent error corrections
  • Background Processing - Worker-based architecture for scalable job processing
  • Multi-State Support - 10 US states supported (AZ, CO, DE, IL, KY, LA, MD, MI, OH, TX)
  • Context-Aware Batching - Sends surrounding records to AI for better accuracy
  • Retry Logic - Automatic retries with exponential backoff for failed corrections
  • Modern UI - React 19 frontend with real-time job status
  • Production Ready - Docker-based deployment with PostgreSQL persistence
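The context-aware batching above can be sketched as a small helper. This is an illustrative sketch, not the application's actual code; `context_window` and its `radius` parameter are hypothetical names:

```python
def context_window(records, index, radius=3):
    """Return up to `radius` records before and after the record at `index`,
    clamped to the start and end of the file, so the AI sees surrounding
    valid records alongside the invalid one."""
    start = max(0, index - radius)
    end = min(len(records), index + radius + 1)
    return records[start:end]

# Example: for a 10-record file, the window around record 5
# contains records 2 through 8 inclusive.
records = [f"record-{i}" for i in range(10)]
window = context_window(records, 5)
```

Clamping at the file boundaries means the first and last records still get as much context as is available rather than failing on an out-of-range slice.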

Supported States

| State     | Abbreviation | Record Length   | Format           |
|-----------|--------------|-----------------|------------------|
| Arizona   | AZ           | 801 characters  | Fixed-width      |
| Colorado  | CO           | 860 characters  | Fixed-width      |
| Delaware  | DE           | 300 characters  | Fixed-width      |
| Illinois  | IL           | 801 characters  | Fixed-width      |
| Kentucky  | KY           | 900 characters  | Fixed-width      |
| Louisiana | LA           | 1132 characters | Fixed-width      |
| Maryland  | MD           | 860 characters  | Fixed-width      |
| Michigan  | MI           | Tab-delimited   | Delimited format |
| Ohio      | OH           | 815 characters  | Fixed-width      |
| Texas     | TX           | 801 characters  | Fixed-width      |
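As an illustration of how these per-state lengths might be checked, here is a minimal sketch. The `RECORD_LENGTHS` values are transcribed from the table above; `validate_line` is a hypothetical helper, not the system's actual validator:

```python
# Fixed-width record lengths per state, transcribed from the table above.
# Michigan (MI) is tab-delimited, so it has no fixed length.
RECORD_LENGTHS = {
    "AZ": 801, "CO": 860, "DE": 300, "IL": 801, "KY": 900,
    "LA": 1132, "MD": 860, "OH": 815, "TX": 801,
}

def validate_line(state: str, line: str) -> bool:
    """Check a single record against the state's expected fixed width.

    Delimited states (e.g. MI) are not length-checked here, since
    their record length varies by content.
    """
    expected = RECORD_LENGTHS.get(state)
    if expected is None:
        return True  # delimited format: length varies
    return len(line) == expected
```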

Architecture

Services

The system consists of four services:

  1. PostgreSQL Database - Stores correction jobs and job queue
  2. Backend API (FastAPI) - REST API for file operations
  3. Workers - Background processors using AWS Bedrock
  4. Frontend (React + Nginx) - Modern web interface

Technology Stack

  • Backend: Python 3.11 + FastAPI + SQLAlchemy + Alembic
  • Frontend: React 19 + TypeScript + Vite + Nginx
  • Database: PostgreSQL 16
  • AI: AWS Bedrock (Claude Sonnet 4.5 / Meta Llama 4 Scout)
  • Deployment: Docker Compose + AWS ECR

Quick Start

Get started in under 10 minutes:

Prerequisites

  • Docker 20.10+ and Docker Compose 2.0+
  • AWS account with Bedrock access enabled
  • AWS IAM credentials with Bedrock permissions

Deploy

# 1. Create .env file
cat > .env << 'EOF'
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
IMAGE_TAG=sha-latest
POSTGRES_PASSWORD=change_this_password
EOF

# 2. Authenticate to AWS ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  878796852397.dkr.ecr.us-east-1.amazonaws.com

# 3. Pull and start services
docker-compose -f docker-compose.prod.yml pull
docker-compose -f docker-compose.prod.yml up -d

# 4. Initialize database
docker exec newhires-backend alembic upgrade head

Access:

  • Web Interface: http://localhost:8080
  • Backend API: http://localhost:8000/docs

For detailed setup instructions, see Docker Compose Deployment.

Documentation Structure

  • Getting Started
  • Deployment
  • Usage
  • Troubleshooting
  • Operations

AWS Bedrock Configuration

Required IAM Policy

Workers need this IAM policy to invoke Bedrock models:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockModelInvocation",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/us.meta.llama4-scout-17b-instruct-v1:0"
      ]
    }
  ]
}

See AWS Bedrock Setup for complete instructions.
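Once the policy is attached, workers invoke the model through the `bedrock-runtime` API. The sketch below builds the request body for a Claude model using the Anthropic-on-Bedrock messages format; `build_invoke_request` is a hypothetical helper, and the prompt and `max_tokens` value are illustrative:

```python
import json

def build_invoke_request(model_id: str, prompt: str, max_tokens: int = 1024) -> dict:
    """Build keyword arguments for boto3's bedrock-runtime invoke_model call.

    Claude models on Bedrock use the Anthropic messages format with the
    "bedrock-2023-05-31" version string in the request body.
    """
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return {
        "modelId": model_id,
        "contentType": "application/json",
        "body": json.dumps(body),
    }

# Usage (requires AWS credentials with the policy above):
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# response = client.invoke_model(**build_invoke_request(
#     "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
#     "Correct this record: ...",
# ))
```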

Performance

AI Correction Capabilities

  • Context-Aware: Sends 3 records before + invalid record + 3 records after to AI
  • Batch Processing: Processes multiple errors per API call for efficiency
  • Retry Logic: Automatic retries with exponential backoff
  • Throttling Handling: Smart backoff when AWS rate limits are hit
  • Token Tracking: Logs input/output tokens for cost monitoring
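The retry and backoff behavior above can be sketched as a generic helper. The delays and attempt counts here are illustrative, not the workers' actual configuration; `with_backoff` is a hypothetical name:

```python
import time

def with_backoff(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on exception with exponential backoff.

    Delays grow as base_delay * 2**attempt (1s, 2s, 4s, ...); the last
    failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

The `sleep` parameter is injected so the delay strategy can be swapped out (or stubbed in tests) without changing the retry logic.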

Scalability

  • Worker Scaling: Scale horizontally with --scale workers=N
  • Concurrent Calls: Configurable concurrent Bedrock API calls per worker
  • Database-Backed Queue: Reliable job queue with PostgreSQL
  • Background Processing: Non-blocking UI with async job processing

System Status

Production Ready

  ✅ Complete refactored architecture
  ✅ AWS Bedrock integration with Claude Sonnet 4.5
  ✅ PostgreSQL database with migrations
  ✅ Background worker processing
  ✅ Modern React frontend
  ✅ Multi-architecture Docker images (AMD64 + ARM64)
  ✅ GitHub Actions CI/CD pipeline
  ✅ Complete deployment documentation

Support & Monitoring

Health Checks

# Backend API health
curl http://localhost:8000/health

# Frontend health
curl -I http://localhost:8080

# View all service status
docker-compose -f docker-compose.prod.yml ps

View Logs

# All services
docker-compose -f docker-compose.prod.yml logs -f

# Specific service
docker-compose -f docker-compose.prod.yml logs -f workers
docker-compose -f docker-compose.prod.yml logs -f backend

Database Operations

# Access database CLI
docker exec -it newhires-db psql -U newhires -d newhires

# Backup database
docker exec newhires-db pg_dump -U newhires newhires > backup.sql

# Check job status
docker exec newhires-db psql -U newhires -d newhires -c \
  "SELECT status, COUNT(*) FROM correction_jobs GROUP BY status;"

Common Operations

# Restart services
docker-compose -f docker-compose.prod.yml restart

# Scale workers for higher throughput
docker-compose -f docker-compose.prod.yml up -d --scale workers=3

# Update to new version
docker-compose -f docker-compose.prod.yml pull
docker-compose -f docker-compose.prod.yml up -d
docker exec newhires-backend alembic upgrade head

# View worker activity
docker logs -f newhires-workers

Cost Considerations

AWS Bedrock Pricing

Approximate costs per correction job:

  • Claude Sonnet 4.5: ~$0.045 per job (higher accuracy)
  • Meta Llama 4 Scout: ~$0.015 per job (budget option)

Tuning Tips:

  • Adjust MAX_CONCURRENT_BEDROCK_CALLS to control throughput and costs
  • Use MAX_AI_ATTEMPTS to limit retries on difficult corrections
  • Monitor token usage in worker logs
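For reference, the tuning variables above might appear in a `.env` file as follows; the values are illustrative starting points, not recommendations:

```shell
# Concurrent Bedrock API calls per worker (throughput vs. throttling trade-off)
MAX_CONCURRENT_BEDROCK_CALLS=4
# Maximum AI correction attempts per record before giving up
MAX_AI_ATTEMPTS=3
```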

Next Steps

  1. AWS Bedrock Setup - Get your AWS credentials and IAM policy
  2. Docker Compose Deployment - Deploy the application
  3. Environment Variables - Configure your environment
  4. Health Monitoring - Monitor your deployment

Ready to process new hire files with AI-powered corrections! 🚀