batch-transform¶
Run batch predictions on AWS SageMaker.
Synopsis¶
easy_sm [--docker-tag TAG] batch-transform --s3-model-location S3_PATH \
--s3-input-location S3_PATH --s3-output-location S3_PATH \
--num-instances COUNT --ec2-type TYPE [OPTIONS]
Description¶
The batch-transform command runs batch inference on SageMaker, processing large datasets without deploying a persistent endpoint. It's cost-effective for:
- Periodic batch predictions
- Large-scale inference jobs
- One-time predictions on datasets
- Offline scoring
Unlike endpoints, batch transform jobs:
- Process data in S3 and write results back to S3
- Run once and terminate (no ongoing costs)
- Handle large files automatically
- Don't require endpoint management
Options¶
| Option | Short | Type | Required | Default | Description |
|---|---|---|---|---|---|
| --s3-model-location | -m | string | Yes | - | S3 location of the model tar.gz |
| --s3-input-location | -i | string | Yes | - | S3 location of input data files |
| --s3-output-location | -o | string | Yes | - | S3 location to save predictions |
| --num-instances | - | integer | Yes | - | Number of EC2 instances |
| --ec2-type | -e | string | Yes | - | EC2 instance type (e.g., ml.m5.large) |
| --iam-role-arn | -r | string | No | From SAGEMAKER_ROLE | AWS IAM role ARN |
| --app-name | -a | string | No | Auto-detected | App name for configuration |
| --wait | -w | boolean | No | false | Wait until job completes |
| --job-name | -n | string | No | Auto-generated | Custom job name |
| --docker-tag | -t | string | No | latest | Docker image tag (global option) |
Examples¶
Basic batch transform¶
export SAGEMAKER_ROLE=arn:aws:iam::123456789012:role/SageMakerRole
easy_sm batch-transform \
-e ml.m5.large \
--num-instances 1 \
-m s3://my-bucket/models/model.tar.gz \
-i s3://my-bucket/input-data \
-o s3://my-bucket/predictions
Large-scale batch job¶
Process large datasets with multiple instances:
easy_sm batch-transform \
-e ml.m5.xlarge \
--num-instances 5 \
-m s3://bucket/model.tar.gz \
-i s3://bucket/large-dataset \
-o s3://bucket/predictions
Wait for completion¶
Block until the job finishes:
easy_sm batch-transform \
-e ml.m5.large \
--num-instances 1 \
-m s3://bucket/model.tar.gz \
-i s3://bucket/data \
-o s3://bucket/output \
--wait
Output: the final job status is printed (e.g., Completed).
If the job failed, Failed is printed and the command exits with code 1.
Custom job name¶
easy_sm batch-transform \
-n my-batch-job-2024-01 \
-e ml.m5.large \
--num-instances 2 \
-m s3://bucket/model.tar.gz \
-i s3://bucket/data \
-o s3://bucket/output
With specific Docker tag¶
easy_sm -t v1.0.0 batch-transform \
-e ml.m5.large \
--num-instances 1 \
-m s3://bucket/model.tar.gz \
-i s3://bucket/data \
-o s3://bucket/output
Output Format¶
Without --wait:
- No output (job submitted asynchronously)

With --wait:
- Outputs the final job status: Completed, Failed, or Stopped
- Exit code 0 for success, 1 for failure
Prerequisites¶
- Trained model in S3 (from the train command)
- Docker image pushed to ECR
- Input data files in S3
- IAM role with SageMaker and S3 permissions
- Inference code in prediction/serve
Input Data Format¶
File Structure¶
Place input files in S3:
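For example, a hypothetical layout (bucket and file names are purely illustrative):

s3://my-bucket/input-data/
├── batch_001.csv
├── batch_002.csv
└── batch_003.csv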
SageMaker processes each file and creates corresponding output files:
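With the hypothetical input above, the output location ends up with one .out file per input file:

s3://my-bucket/predictions/
├── batch_001.csv.out
├── batch_002.csv.out
└── batch_003.csv.out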
Input File Formats¶
Batch transform supports various formats:
CSV:
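For example, records with four numeric features each (values chosen to mirror the JSONL example below):

1.0,2.0,3.0,4.0
5.0,6.0,7.0,8.0
9.0,10.0,11.0,12.0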
JSON Lines (JSONL):
{"features": [1.0, 2.0, 3.0, 4.0]}
{"features": [5.0, 6.0, 7.0, 8.0]}
{"features": [9.0, 10.0, 11.0, 12.0]}
Binary formats (if your serving code supports it)
Serving Code Requirements¶
Your prediction/serve code must implement the inference functions:
import joblib
import os
import json
import numpy as np


def model_fn(model_dir):
    """Load model."""
    return joblib.load(os.path.join(model_dir, 'model.mdl'))


def input_fn(request_body, content_type):
    """
    Parse input for batch transform.

    Args:
        request_body: Raw input line/record
        content_type: Input format (e.g., 'text/csv')

    Returns:
        Parsed input ready for prediction
    """
    if content_type == 'text/csv':
        # Parse CSV line
        values = [float(x) for x in request_body.strip().split(',')]
        return np.array(values).reshape(1, -1)
    elif content_type == 'application/json':
        # Parse JSON
        data = json.loads(request_body)
        return np.array(data['features']).reshape(1, -1)
    else:
        raise ValueError(f"Unsupported content type: {content_type}")


def predict_fn(input_data, model):
    """Make prediction."""
    return model.predict(input_data)


def output_fn(prediction, accept):
    """
    Format output for batch transform.

    Args:
        prediction: Model prediction
        accept: Output format

    Returns:
        Formatted prediction string
    """
    if accept == 'application/json':
        return json.dumps({"prediction": prediction.tolist()})
    else:
        return str(prediction[0])
How Batch Transform Works¶
- Split input: SageMaker splits input data across instances
- Process in parallel: Each instance processes its chunk
- Generate predictions: Calls your inference code for each record
- Write output: Saves predictions to S3 output location
- Terminate: Instances shut down when complete
S3 Input Files
↓
Split across instances
↓
Instance 1: batch1.csv → predictions
Instance 2: batch2.csv → predictions
Instance 3: batch3.csv → predictions
↓
S3 Output Files
Output Files¶
For each input file, SageMaker creates an output file:
| Input | Output |
|---|---|
| data.csv | data.csv.out |
| input.json | input.json.out |
| batch_001.txt | batch_001.txt.out |
Output format matches your output_fn implementation.
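As an illustration, with the output_fn shown earlier and accept set to application/json, each record in a .out file would look something like this (the value is hypothetical):

{"prediction": [0.87]}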
Performance Optimization¶
Multiple Instances¶
For large datasets, use multiple instances:
# Single instance: processes files sequentially
easy_sm batch-transform --num-instances 1 ...
# Multiple instances: parallel processing
easy_sm batch-transform --num-instances 10 ...
Instance Types¶
Choose based on workload:
| Instance Type | vCPUs | Memory | Best For |
|---|---|---|---|
| ml.m5.large | 2 | 8 GB | Small jobs |
| ml.m5.xlarge | 4 | 16 GB | Medium jobs |
| ml.c5.2xlarge | 8 | 16 GB | CPU-intensive |
| ml.p3.2xlarge | 8 | 61 GB + GPU | Deep learning |
File Size Considerations¶
- Small files (<10 MB): Use fewer instances, increase parallelism per instance
- Large files (>100 MB): Use more instances for parallel processing
- Many small files: SageMaker distributes across instances automatically
Monitoring¶
AWS Console¶
Monitor in the SageMaker Console:
- SageMaker → Inference → Batch transform jobs
- View progress, logs, and metrics
AWS CLI¶
# Describe job
aws sagemaker describe-transform-job \
--transform-job-name my-batch-job
# List jobs
aws sagemaker list-transform-jobs \
--sort-by CreationTime \
--sort-order Descending
# View CloudWatch logs
aws logs tail /aws/sagemaker/TransformJobs \
--follow
Check Output¶
After completion, verify output in S3:
aws s3 ls s3://my-bucket/predictions/
# Download predictions
aws s3 cp s3://my-bucket/predictions/ ./predictions/ --recursive
Cost Comparison¶
Batch Transform vs Endpoint¶
Scenario: 10,000 predictions, once per day
Batch Transform (ml.m5.large, 10 minutes):
- Cost: 10 min × $0.119/hour ÷ 60 min/hour = $0.0198 per day
- Monthly: ~$0.60

Provisioned Endpoint (ml.m5.large, 24/7):
- Cost: 24 hours × $0.119/hour = $2.86 per day
- Monthly: ~$85
Batch transform is 140x cheaper for periodic jobs!
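As a quick sanity check of these figures, a minimal shell sketch (it assumes the $0.119/hour on-demand rate quoted above and that bc is installed):

echo "10 / 60 * 0.119" | bc -l    # batch:    ~0.0198 per day
echo "24 * 0.119" | bc -l         # endpoint: ~2.856 per day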
When to Use Batch Transform¶
✅ Use batch transform for:
- Periodic predictions (daily, weekly, monthly)
- Large datasets processed offline
- One-time scoring jobs
- Cost-sensitive workloads
- Non-latency-critical applications

❌ Use endpoints for:
- Real-time predictions
- Interactive applications
- Low-latency requirements
- Continuous traffic
Use Cases¶
1. Daily Customer Scoring¶
#!/bin/bash
# daily_scoring.sh
export SAGEMAKER_ROLE=arn:aws:iam::123456789012:role/SageMakerRole
# Export customer data daily
# (assume data exported to S3)
# Run batch predictions
easy_sm batch-transform \
-e ml.m5.large \
--num-instances 3 \
-m s3://models/churn-model/model.tar.gz \
-i s3://data/customers/$(date +%Y-%m-%d) \
-o s3://predictions/churn/$(date +%Y-%m-%d) \
--wait
if [ $? -eq 0 ]; then
echo "Predictions complete, loading to database..."
# Load predictions to database
else
echo "Prediction job failed!"
exit 1
fi
2. Large-Scale Offline Scoring¶
# Score 1 million records
easy_sm batch-transform \
-e ml.c5.2xlarge \
--num-instances 20 \
-m s3://bucket/model.tar.gz \
-i s3://bucket/million-records \
-o s3://bucket/scores \
--wait
3. Monthly Financial Forecasts¶
# Run monthly
easy_sm batch-transform \
-n forecast-$(date +%Y-%m) \
-e ml.m5.xlarge \
--num-instances 5 \
-m s3://models/forecast/model.tar.gz \
-i s3://data/historical/$(date +%Y-%m) \
-o s3://forecasts/$(date +%Y-%m) \
--wait
Troubleshooting¶
Job fails immediately¶
Problem: Job goes directly to "Failed" status.
Solution: Check CloudWatch logs:
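For example, tail the same log group shown in the Monitoring section above:

aws logs tail /aws/sagemaker/TransformJobs --follow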
Common issues:
- Model file not found in S3
- Docker image missing in ECR
- Serving code errors
Input/output mismatch¶
Problem: Output format doesn't match expected.
Solution: Check your output_fn implementation:
def output_fn(prediction, accept):
    # Return consistent format
    return json.dumps({"prediction": prediction.tolist()})
Out of memory errors¶
Problem: Instance runs out of memory during inference.
Solution: Use larger instance type:
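For example, moving from ml.m5.large up to ml.m5.xlarge (see the instance table above for memory sizes):

easy_sm batch-transform -e ml.m5.xlarge ...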
Job takes too long¶
Problem: Slow processing.
Solution: Increase parallelism:
# Add more instances
easy_sm batch-transform --num-instances 10 ...
# Or use faster instances
easy_sm batch-transform -e ml.c5.2xlarge ...
Missing output files¶
Problem: Some input files don't have corresponding output files.
Solution: Check logs for errors on specific files. Ensure all input files are valid.
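One way to spot gaps is to compare input and output file counts (bucket paths here are illustrative):

aws s3 ls s3://my-bucket/input-data/ | wc -l
aws s3 ls s3://my-bucket/predictions/ | wc -l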
"ModelError" in logs¶
Problem: Serving code throws exceptions.
Solution: Test locally first:
easy_sm local deploy
curl -X POST http://localhost:8080/invocations \
-H 'Content-Type: text/csv' \
--data-binary @test_input.csv
Complete Batch Transform Workflow¶
export SAGEMAKER_ROLE=arn:aws:iam::123456789012:role/SageMakerRole
# 1. Train model
MODEL=$(easy_sm train \
-n training-job \
-e ml.m5.large \
-i s3://bucket/training-data \
-o s3://bucket/models)
# 2. Prepare batch input data
easy_sm upload-data \
-i ./batch-input \
-t s3://bucket/batch-input
# 3. Run batch transform
easy_sm batch-transform \
-e ml.m5.large \
--num-instances 3 \
-m $MODEL \
-i s3://bucket/batch-input \
-o s3://bucket/predictions \
--wait
# 4. Download predictions
aws s3 cp s3://bucket/predictions/ ./predictions/ --recursive
# 5. Process predictions
python process_predictions.py ./predictions/
Automated Batch Scoring Pipeline¶
#!/bin/bash
# batch_pipeline.sh
set -e
export SAGEMAKER_ROLE=arn:aws:iam::123456789012:role/SageMakerRole
DATE=$(date +%Y-%m-%d)
echo "Starting batch scoring pipeline for $DATE"
# 1. Export data (your data export logic)
echo "Exporting data..."
python export_data.py --date $DATE --output s3://bucket/daily-data/$DATE/
# 2. Run batch transform
echo "Running batch predictions..."
easy_sm batch-transform \
-n batch-job-$DATE \
-e ml.m5.xlarge \
--num-instances 5 \
-m s3://bucket/models/latest/model.tar.gz \
-i s3://bucket/daily-data/$DATE \
-o s3://bucket/predictions/$DATE \
--wait
# 3. Validate predictions
echo "Validating predictions..."
python validate_predictions.py \
--input s3://bucket/predictions/$DATE \
--output s3://bucket/validated/$DATE
# 4. Load to database
echo "Loading predictions to database..."
python load_to_db.py s3://bucket/validated/$DATE
echo "Pipeline completed successfully!"
Related Commands¶
- train - Train models for batch inference
- deploy - Alternative: deploy an endpoint for real-time inference
- upload-data - Upload input data to S3