AWS Bedrock Error Troubleshooting

Resolve AWS Bedrock AI-related issues with workers and correction jobs.

Quick Diagnosis

Run these commands to quickly identify the issue:

# Check worker logs for Bedrock errors
docker logs newhires-workers --tail=50 2>&1 | grep -i "bedrock\|aws\|error"

# Verify AWS credentials are set
docker exec newhires-workers env | grep AWS

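# Check that boto3 can build a Bedrock client in the worker
# (no API call is made, so this does not validate credentials or permissions)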
docker exec newhires-workers python3 -c "
import boto3
client = boto3.client('bedrock-runtime', region_name='us-east-1')
print('Bedrock client created successfully')
"

Common Bedrock Errors

"AccessDeniedException" or "UnrecognizedClientException"

Symptom: Worker logs show:

botocore.exceptions.ClientError: An error occurred (AccessDeniedException)

or

botocore.exceptions.ClientError: An error occurred (UnrecognizedClientException)

Cause: Invalid or missing AWS credentials

Solutions:

Verify credentials in .env:

grep AWS_ .env
# Should show:
# AWS_ACCESS_KEY_ID=AKIA...
# AWS_SECRET_ACCESS_KEY=...
# AWS_REGION=us-east-1

Check credentials format:
Access Key ID should start with AKIA (long-term IAM user keys; temporary STS keys start with ASIA)
Secret Access Key is 40 characters
No quotes around values in .env
Verify credentials in AWS Console:
Go to AWS Console → IAM → Users → Your User → Security credentials
Check if access key is Active
If inactive or deleted, create new one

Test credentials:

export AWS_ACCESS_KEY_ID="your_key"
export AWS_SECRET_ACCESS_KEY="your_secret"
aws sts get-caller-identity
# Should return your AWS account info

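Recreate workers so they load the corrected credentials (a plain restart reuses the existing container environment and does not re-read .env):

docker-compose -f docker-compose.prod.yml up -d --force-recreate workers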
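
The format checks listed above (AKIA prefix, 40-character secret, no quotes) can be automated before recreating the workers. The following is a minimal sketch, assuming a plain KEY=VALUE .env file; the function name check_env_credentials is hypothetical and not part of the project.

```python
def check_env_credentials(env_text):
    """Return a list of format problems found in .env-style text.

    Assumption: the file uses simple KEY=VALUE lines with '#' comments.
    """
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()

    problems = []

    # All three variables are listed as required in this guide.
    for name in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"):
        if not values.get(name):
            problems.append(f"{name} is missing or empty")

    # Quotes around values are called out as a mistake in this guide.
    for name, value in values.items():
        if name.startswith("AWS_") and value[:1] in ("'", '"'):
            problems.append(f"{name} is wrapped in quotes")

    key_id = values.get("AWS_ACCESS_KEY_ID", "").strip("'\"")
    if key_id and not key_id.startswith("AKIA"):
        problems.append("AWS_ACCESS_KEY_ID does not start with AKIA")

    secret = values.get("AWS_SECRET_ACCESS_KEY", "").strip("'\"")
    if secret and len(secret) != 40:
        problems.append("AWS_SECRET_ACCESS_KEY is not 40 characters")

    return problems
```

Passing the contents of .env to check_env_credentials returns an empty list when the format looks right. A clean result only means the values are well-formed; aws sts get-caller-identity is still the test that the key is actually valid.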

"AccessDeniedException: User is not authorized"

Symptom: Worker logs show:

AccessDeniedException: User: arn:aws:iam::XXXX:user/newhires is not authorized
to perform: bedrock:InvokeModel on resource: arn:aws:bedrock:...

Cause: IAM user lacks bedrock:InvokeModel permission

Solutions:

Attach correct IAM policy to your user:

Go to AWS Console → IAM → Users → Your User → Add permissions → Attach policies

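Create/attach a policy with this JSON. The us. prefix in the model IDs denotes a cross-region inference profile, so the policy must allow both the inference profile (which lives in your account; replace ACCOUNT_ID with your 12-digit AWS account ID) and the underlying foundation models in every region the profile can route to:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1:ACCOUNT_ID:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "arn:aws:bedrock:us-east-1:ACCOUNT_ID:inference-profile/us.meta.llama4-scout-17b-instruct-v1:0",
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0",
        "arn:aws:bedrock:*::foundation-model/meta.llama4-scout-17b-instruct-v1:0"
      ]
    }
  ]
}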

Wait 1-2 minutes for IAM changes to propagate

Restart workers:

docker-compose -f docker-compose.prod.yml restart workers

See Also: AWS Bedrock Setup Guide for complete IAM setup

"ResourceNotFoundException: Could not resolve foundation model"

Symptom: Worker logs show:

ResourceNotFoundException: Could not resolve the foundation model from the model identifier

Cause: Model access not enabled in AWS Bedrock console

Solutions:

Enable model access in AWS Console:
Go to: https://console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess
Click "Manage model access"
Check "Claude Sonnet 4.5" (and optionally "Llama 4 Scout")
Click "Request model access"
Wait for status to change from "Pending" to "Access granted"

Verify model ID is correct in .env:

grep BEDROCK_MODEL_ID .env
# Should be empty (uses default) or:
# BEDROCK_MODEL_ID=us.anthropic.claude-sonnet-4-5-20250929-v1:0

Check model is available in us-east-1:

aws bedrock list-foundation-models --region us-east-1 --query 'modelSummaries[?contains(modelId, `claude`)].modelId'

Recreate workers (required to pick up any .env change; a plain restart does not re-read it):

docker-compose -f docker-compose.prod.yml up -d --force-recreate workers

"ThrottlingException: Rate exceeded"

Symptom: Worker logs show:

ThrottlingException: Rate exceeded for operation InvokeModel

Cause: Too many Bedrock API calls too quickly (hitting rate limits)

Solutions:

Reduce concurrent calls in .env:

# Edit .env: reduce from 2 to 1
# (keep comments on their own line; some env-file parsers treat trailing text as part of the value)
MAX_CONCURRENT_BEDROCK_CALLS=1

Increase poll interval to reduce pressure:

# Edit .env: increase from 5 to 10 seconds
POLL_INTERVAL=10

Scale down workers if running multiple:

docker-compose -f docker-compose.prod.yml up -d --scale workers=1

Wait and retry:
Throttling is temporary
Workers will automatically retry with exponential backoff
Check logs: docker logs newhires-workers --tail=50
Request higher limits (if persistent):
Go to AWS Service Quotas console
Request quota increase for Bedrock InvokeModel

Note: Default Claude Sonnet 4.5 limits (requests per minute and tokens per minute) vary by account.
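
The retry behaviour described above (exponential backoff on throttling) can be illustrated with a short sketch. This is not the workers' actual implementation, which is not shown in this guide; call_with_backoff, is_throttled, and the delay values are assumptions chosen for illustration.

```python
import random
import time


def call_with_backoff(func, is_throttled, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff and jitter when throttled.

    func:         zero-argument callable that performs the API call
    is_throttled: callable(exception) -> bool, True if the error is retryable
    sleep:        injectable for testing; defaults to time.sleep
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_throttled(exc):
                raise
            # The delay ceiling doubles each attempt (1s, 2s, 4s, ...),
            # capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Random delay in [0, ceiling] spreads out retries from
            # several workers so they do not all retry at once.
            sleep(random.uniform(0, ceiling))
```

With boto3, is_throttled could check whether exc.response["Error"]["Code"] equals "ThrottlingException" on a botocore ClientError. botocore also ships built-in retry handling, configurable through botocore.config.Config(retries={"max_attempts": 5, "mode": "adaptive"}), which may be preferable to a hand-written loop.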

"Could not connect to the endpoint URL"

Symptom: Worker logs show:

Could not connect to the endpoint URL: "https://bedrock-runtime.us-east-1.amazonaws.com/"

Cause: Network connectivity or wrong AWS region

Solutions:

Verify AWS region in .env:

grep AWS_REGION .env
# Should be: AWS_REGION=us-east-1

Test network connectivity:

# From host (any HTTP status, even 403 or 404, means the endpoint is reachable)
curl -I https://bedrock-runtime.us-east-1.amazonaws.com

# From worker container
docker exec newhires-workers curl -I https://bedrock-runtime.us-east-1.amazonaws.com

Check firewall/proxy settings:
Ensure outbound HTTPS (port 443) is allowed
Check if corporate firewall blocks AWS services
Configure proxy if needed

Recreate workers (required to pick up any .env change; a plain restart does not re-read it):

docker-compose -f docker-compose.prod.yml up -d --force-recreate workers
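
If curl is not installed in the worker image, the same connectivity check can be done with Python's standard library. The sketch below only opens a TCP connection on port 443 and separates DNS failures from refused or timed-out connections; can_reach is a hypothetical helper name.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Return (True, None) if a TCP connection opens, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except socket.gaierror as exc:
        # Name resolution failed: often a wrong region in the hostname or a DNS problem.
        return False, f"DNS lookup failed: {exc}"
    except OSError as exc:
        # Refused, timed out, or unreachable: often a firewall or proxy issue.
        return False, f"connection failed: {exc}"
```

For example, can_reach("bedrock-runtime.us-east-1.amazonaws.com") run inside the worker container shows whether outbound HTTPS to the endpoint is possible. It does not test TLS or credentials.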

"ValidationException: The provided model identifier is invalid"

Symptom: Worker logs show:

ValidationException: The provided model identifier is invalid

Cause: Incorrect BEDROCK_MODEL_ID in .env

Solutions:

Use correct model ID:

Valid options:
- us.anthropic.claude-sonnet-4-5-20250929-v1:0 (Claude Sonnet 4.5)
- us.meta.llama4-scout-17b-instruct-v1:0 (Llama 4 Scout)

Fix .env file:

# Edit .env and set:
BEDROCK_MODEL_ID=us.anthropic.claude-sonnet-4-5-20250929-v1:0

# Or remove the line to use default (Claude Sonnet 4.5)

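List available models (this returns base model IDs without the us. prefix; the us.-prefixed IDs are cross-region inference profiles, which aws bedrock list-inference-profiles lists):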

aws bedrock list-foundation-models --region us-east-1 \
  --query 'modelSummaries[*].[modelId,modelName]' --output table

Recreate workers so the updated .env is loaded (a plain restart does not re-read it):

docker-compose -f docker-compose.prod.yml up -d --force-recreate workers

Workers Not Processing Jobs

Symptom

Jobs stuck in pending status, workers not picking them up.

Diagnosis

# Check worker is running
docker ps | grep workers

# Check worker logs
docker logs newhires-workers --tail=50

# Check job queue
docker exec newhires-db psql -U newhires -d newhires -c \
  "SELECT id, status, created_at FROM job_queue WHERE status='pending' LIMIT 10;"

Solutions

Verify worker is running:

docker-compose -f docker-compose.prod.yml ps workers
# Should show "Up"

Check for AWS errors in worker logs:

docker logs newhires-workers 2>&1 | grep -i "error\|exception"

Verify database connection:

docker exec newhires-workers env | grep DATABASE_URL

Restart workers:

docker-compose -f docker-compose.prod.yml restart workers

Scale up workers if needed:

docker-compose -f docker-compose.prod.yml up -d --scale workers=2

High AWS Costs

Symptom

Unexpectedly high AWS Bedrock charges.

Diagnosis

# Check token usage in worker logs
docker logs newhires-workers 2>&1 | grep "tokens used"

# Count jobs processed
docker exec newhires-db psql -U newhires -d newhires -c \
  "SELECT status, COUNT(*) FROM correction_jobs GROUP BY status;"

Solutions

Review cost settings in .env:

grep -E "MAX_CONCURRENT|MAX_AI_ATTEMPTS|BEDROCK_MODEL" .env

Reduce concurrent calls:

# Edit .env: reduce from 2
MAX_CONCURRENT_BEDROCK_CALLS=1

Reduce retry attempts:

# Edit .env: reduce from 5
MAX_AI_ATTEMPTS=3

Switch to cheaper model:

# Edit .env: Llama 4 Scout is roughly 67% cheaper per job than Claude Sonnet 4.5
BEDROCK_MODEL_ID=us.meta.llama4-scout-17b-instruct-v1:0

Set up billing alerts in AWS Console:
Go to AWS Billing → Budgets
Create budget alert for Bedrock usage

Recreate workers after changes (a plain restart does not re-read .env):

docker-compose -f docker-compose.prod.yml up -d --force-recreate workers

Cost comparison per job:
- Claude Sonnet 4.5: ~$0.045/job
- Llama 4 Scout: ~$0.015/job
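
Using the approximate per-job figures quoted in this section, a rough monthly estimate can be worked out as below. The job volume of 10,000 per month is a made-up example, and the dictionary keys are illustrative labels rather than real model IDs.

```python
# Approximate per-job costs in USD, taken from this guide.
COST_PER_JOB_USD = {
    "claude-sonnet-4.5": 0.045,
    "llama-4-scout": 0.015,
}


def monthly_cost(jobs_per_month, model):
    """Estimated monthly spend in USD for a given job volume."""
    return round(jobs_per_month * COST_PER_JOB_USD[model], 2)


def savings_percent(from_model, to_model):
    """Relative per-job saving, in percent, when switching models."""
    before = COST_PER_JOB_USD[from_model]
    after = COST_PER_JOB_USD[to_model]
    return round(100 * (before - after) / before, 1)


print(monthly_cost(10_000, "claude-sonnet-4.5"))               # 450.0
print(monthly_cost(10_000, "llama-4-scout"))                   # 150.0
print(savings_percent("claude-sonnet-4.5", "llama-4-scout"))   # 66.7
```

Retries multiply these figures: each failed attempt counted by MAX_AI_ATTEMPTS is another billed call, so actual spend depends on the retry rate as well as job volume.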

Debugging Workflow

Step 1: Check Worker Logs

docker logs newhires-workers --tail=100 -f

Look for:
- INFO: Worker polling for jobs - Worker is running
- INFO: Processing correction job - Job picked up
- INFO: Calling AWS Bedrock - API call started
- INFO: Bedrock tokens used: input=XXX, output=YYY - Success
- ERROR: - Problems
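
The token lines in the log can be totalled to see how much each run consumes. The sketch below assumes the message format shown above, with XXX and YYY being plain integers; summarize_tokens is a hypothetical helper.

```python
import re

# Matches the success line format listed in this guide (assumed to carry integers).
TOKEN_LINE = re.compile(r"Bedrock tokens used: input=(\d+), output=(\d+)")


def summarize_tokens(log_lines):
    """Count Bedrock calls and sum input/output tokens across log lines."""
    calls = input_tokens = output_tokens = 0
    for line in log_lines:
        match = TOKEN_LINE.search(line)
        if match:
            calls += 1
            input_tokens += int(match.group(1))
            output_tokens += int(match.group(2))
    return {
        "calls": calls,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
    }
```

The function accepts any iterable of lines, so it can be fed sys.stdin from a pipe or an open log file saved with docker logs.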

Step 2: Verify AWS Credentials

# Check env vars
docker exec newhires-workers env | grep AWS

# Test AWS access
docker exec newhires-workers python3 -c "
import boto3
print(boto3.client('sts').get_caller_identity())
"

Step 3: Test Bedrock Directly

# Test from worker container
docker exec newhires-workers python3 -c "
import boto3
import json

client = boto3.client('bedrock-runtime', region_name='us-east-1')

body = json.dumps({
    'anthropic_version': 'bedrock-2023-05-31',
    'messages': [{'role': 'user', 'content': [{'type': 'text', 'text': 'Hello'}]}],
    'max_tokens': 100,
    'temperature': 0.1
})

response = client.invoke_model(
    modelId='us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    body=body
)

print('Success!')
"
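
When the direct test fails, the AWS error code points to one of the sections in this guide. The sketch below maps codes to the causes listed here; the function names are hypothetical, and error_code_of relies on the response["Error"]["Code"] field that botocore's ClientError carries.

```python
# Causes as described in the "Common Bedrock Errors" sections of this guide.
DIAGNOSES = {
    "UnrecognizedClientException": "Invalid or missing AWS credentials",
    "AccessDeniedException": "Invalid credentials, or the IAM user lacks bedrock:InvokeModel",
    "ResourceNotFoundException": "Model access not enabled in the Bedrock console",
    "ThrottlingException": "Too many Bedrock API calls; rate limit hit",
    "ValidationException": "Incorrect BEDROCK_MODEL_ID in .env",
}


def error_code_of(exc):
    """Extract the AWS error code from a ClientError-like exception, or ''."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code", "")


def diagnose(error_code):
    """Return the likely cause for an AWS error code."""
    return DIAGNOSES.get(
        error_code, "Not covered in this guide; check the full worker log"
    )
```

Wrapping the invoke_model call in try/except and printing diagnose(error_code_of(exc)) gives a one-line hint. Connection failures such as "Could not connect to the endpoint URL" are not ClientError instances and carry no error code, so they fall through to the default message.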

Step 4: Check Database Connection

# Verify workers can reach database
docker exec newhires-workers python3 -c "
import psycopg2
import os

conn = psycopg2.connect(os.environ['DATABASE_URL'])
print('Database connection successful')
conn.close()
"

Getting Help

If issues persist after trying these solutions:

Collect diagnostic info:

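# Save logs (container stderr is replayed on stderr, so capture both streams)
docker logs newhires-workers > worker_logs.txt 2>&1

# Check env (drop lines holding secrets, keys, passwords, or connection strings before sharing)
docker exec newhires-workers env | grep -v -i -E "SECRET|KEY|PASSWORD|TOKEN|DATABASE_URL" > worker_env.txt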

# Check job status
docker exec newhires-db psql -U newhires -d newhires -c \
  "SELECT status, COUNT(*) FROM correction_jobs GROUP BY status;" > job_status.txt

Review documentation:
AWS Bedrock Setup
Environment Variables
Common Issues
Check AWS Service Health:
https://health.aws.amazon.com/health/status
Filter by Bedrock and us-east-1
Contact Support with collected logs and diagnostic info
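
Filtering the environment dump with grep drops whole lines, which also hides which variables are set. As an alternative, the sketch below keeps every variable name and masks only the sensitive values. The name patterns are an assumption about what counts as sensitive, and redact_env is a hypothetical helper.

```python
import re

# Variable-name fragments treated as sensitive (an assumption; extend as needed).
SENSITIVE = re.compile(r"SECRET|PASSWORD|TOKEN|KEY|DATABASE_URL", re.IGNORECASE)


def redact_env(lines):
    """Mask values of sensitive NAME=value lines, leaving other lines unchanged."""
    redacted = []
    for line in lines:
        line = line.rstrip("\n")
        name, sep, _value = line.partition("=")
        if sep and SENSITIVE.search(name):
            redacted.append(f"{name}=<redacted>")
        else:
            redacted.append(line)
    return redacted
```

Review the output by hand before sharing it; pattern matching on names cannot catch a secret stored under an unexpected variable name.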

Prevention

Best Practices

Monitor worker logs regularly:
```
docker logs newhires-workers --tail=50
```
Set up CloudWatch for Bedrock API monitoring (optional)
Use staging environment to test changes before production
Keep credentials secure:
Rotate AWS access keys every 90 days
Never commit .env to git
Use IAM roles when possible (EC2/ECS)
Monitor costs:
Set up AWS billing alerts
Review Bedrock usage monthly
Adjust MAX_CONCURRENT_BEDROCK_CALLS based on budget

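Test Bedrock access after any AWS changes (this call needs the bedrock:ListFoundationModels permission, which an InvokeModel-only policy does not grant):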

aws bedrock list-foundation-models --region us-east-1

Quick Reference

Environment Variables

Variable	Purpose	Impact on Bedrock
`AWS_ACCESS_KEY_ID`	AWS authentication	Required
`AWS_SECRET_ACCESS_KEY`	AWS authentication	Required
`AWS_REGION`	AWS region	Required (us-east-1)
`BEDROCK_MODEL_ID`	Model selection	Optional (default: Claude)
`MAX_CONCURRENT_BEDROCK_CALLS`	API concurrency	Higher = more cost
`MAX_AI_ATTEMPTS`	Retry limit	Higher = more cost

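Common Commands

# Restart workers
docker-compose -f docker-compose.prod.yml restart workers

# Recreate workers after editing .env (restart alone does not reload it)
docker-compose -f docker-compose.prod.yml up -d --force-recreate workers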

# View worker logs
docker logs newhires-workers -f

# Check AWS credentials
docker exec newhires-workers env | grep AWS

# Test Bedrock access
aws bedrock list-foundation-models --region us-east-1

# Check job queue
docker exec newhires-db psql -U newhires -d newhires -c \
  "SELECT status, COUNT(*) FROM correction_jobs GROUP BY status;"