Configuration¶
easy_sm uses JSON configuration files and environment variables to manage project settings.
Configuration File¶
Each project has a JSON configuration file named {app-name}.json in the project root.
Example Configuration¶
{
"image_name": "my-ml-app",
"aws_profile": "dev",
"aws_region": "eu-west-1",
"python_version": "3.13",
"easy_sm_module_dir": "my-ml-app",
"requirements_dir": "requirements.txt"
}
Configuration Fields¶
| Field | Description | Example |
|---|---|---|
| image_name | Docker image name (used for ECR) | my-ml-app |
| aws_profile | AWS CLI profile name | dev, prod |
| aws_region | AWS region for SageMaker operations | eu-west-1, us-east-1 |
| python_version | Python version for Docker image | 3.13, 3.12 |
| easy_sm_module_dir | Directory containing easy_sm_base/ | my-ml-app |
| requirements_dir | Path to requirements file | requirements.txt |
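Because easy_sm expects every field in the table to be present (see Configuration Validation), a quick pre-check can catch a missing key early. The sketch below is a plain text search rather than real JSON parsing, and missing_config_fields is a hypothetical helper, not an easy_sm command.

```shell
# missing_config_fields: hypothetical helper, not part of easy_sm.
# Prints each expected key that does not appear in the given config file.
# This is a text search for "key":, not a JSON parser.
missing_config_fields() {
    file="$1"
    for field in image_name aws_profile aws_region python_version \
                 easy_sm_module_dir requirements_dir; do
        if ! grep -q "\"$field\"[[:space:]]*:" "$file"; then
            echo "$field"
        fi
    done
}

# Example: report missing keys in my-ml-app.json, if that file exists.
if [ -f my-ml-app.json ]; then
    missing_config_fields my-ml-app.json
fi
```

No output means all six keys were found.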
Auto-Detection¶
Most commands auto-detect the configuration file:
# Automatically finds my-ml-app.json in current directory
easy_sm build
easy_sm train -n job-name -e ml.m5.large -i s3://... -o s3://...
You can override auto-detection with the -a/--app-name flag:
# Use my-ml-app.json explicitly
easy_sm build -a my-ml-app
Multiple Config Files
If multiple *.json files exist in the current directory, easy_sm cannot auto-detect the configuration and will fail. Either:
- Remove the extra JSON files
- Use the -a flag to specify which app to use
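Before running a command, you can check how many JSON files sit in the working directory. This is a sketch that assumes auto-detection considers every *.json file there, as the note above describes; count_json_files is a hypothetical helper, not an easy_sm command.

```shell
# count_json_files: hypothetical helper, not part of easy_sm.
# Prints the number of *.json files in a directory (default: current).
count_json_files() {
    dir="${1:-.}"
    count=0
    for f in "$dir"/*.json; do
        # An unmatched glob is left as literal text, so check the file exists.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

if [ "$(count_json_files)" -gt 1 ]; then
    echo "More than one JSON file here; pass -a to pick the app." >&2
fi
```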
Environment Variables¶
SAGEMAKER_ROLE (Required)¶
Set SAGEMAKER_ROLE to the IAM role ARN used for SageMaker operations.
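Setting SAGEMAKER_ROLE for the current shell session might look like this. The account ID and role name are placeholders, not values from this project.

```shell
# Placeholder ARN: substitute your own account ID and role name.
export SAGEMAKER_ROLE="arn:aws:iam::123456789012:role/MySageMakerRole"

# Confirm the variable is visible to child processes.
sh -c 'echo "$SAGEMAKER_ROLE"'
```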
Persist Across Sessions
To keep SAGEMAKER_ROLE set in new shells, add an export line for it to ~/.bashrc or ~/.zshrc.
You can override the environment variable with the -r/--iam-role-arn flag.
AWS Credentials¶
easy_sm uses the standard AWS credential chain:
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
- AWS CLI credentials file (~/.aws/credentials)
- IAM role (when running on EC2/ECS)
The aws_profile in the config file specifies which profile to use from ~/.aws/credentials.
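To confirm that the profile named in aws_profile actually exists, you can look for its section header in the credentials file. A minimal sketch, assuming the profile is defined in ~/.aws/credentials as a [name] section (profiles defined only in ~/.aws/config are not checked); has_aws_profile is a hypothetical helper.

```shell
# has_aws_profile: hypothetical helper, not part of easy_sm or the AWS CLI.
# Usage: has_aws_profile PROFILE [CREDENTIALS_FILE]
# Succeeds if the file contains a line that is exactly "[PROFILE]".
has_aws_profile() {
    profile="$1"
    file="${2:-$HOME/.aws/credentials}"
    [ -f "$file" ] && grep -Fqx "[$profile]" "$file"
}

if has_aws_profile dev; then
    echo "profile 'dev' found"
else
    echo "profile 'dev' not found in credentials file" >&2
fi
```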
Project Structure¶
After running easy_sm init, your project structure looks like:
my-project/
├── my-app.json                  # Configuration
├── requirements.txt             # Python dependencies
└── my-app/                      # Module directory
    └── easy_sm_base/            # Template directory
        ├── Dockerfile           # Customize if needed
        ├── training/
        │   ├── train            # Entry point (shell script)
        │   └── training.py      # Your training code
        ├── prediction/
        │   └── serve            # Your serving code
        ├── processing/          # Processing scripts
        └── local_test/
            └── test_dir/        # Test data for local runs
                ├── input/       # Input data
                │   └── data/
                │       └── training/
                └── model/       # Model output
Dockerfile Customization¶
The default Dockerfile is generated during easy_sm init. You can customize it for:
- Installing system dependencies
- Adding custom build steps
- Configuring environment variables
Example customization:
FROM python:3.13
# Install system dependencies
RUN apt-get update && apt-get install -y \
    libgomp1 \
    && rm -rf /var/lib/apt/lists/*
# Copy requirements and install
COPY requirements.txt /opt/program/requirements.txt
RUN pip install --no-cache-dir -r /opt/program/requirements.txt
# Copy code
COPY training /opt/program/training
COPY prediction /opt/program/prediction
COPY processing /opt/program/processing
# Set environment variables
ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
WORKDIR /opt/program
Maintain Entry Points
Keep the original entry point scripts (training/train, prediction/serve). SageMaker relies on them to start training and serving, so removing or renaming them breaks the container.
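After editing the Dockerfile or the template directory, a small check can confirm that both entry points are still in place. This is a sketch only: check_entry_points is a hypothetical helper, and it tests for the files' existence, nothing more.

```shell
# check_entry_points: hypothetical helper, not part of easy_sm.
# Usage: check_entry_points PATH_TO_easy_sm_base
# Returns non-zero if training/train or prediction/serve is missing.
check_entry_points() {
    base="$1"
    status=0
    for script in training/train prediction/serve; do
        if [ ! -f "$base/$script" ]; then
            echo "missing entry point: $base/$script" >&2
            status=1
        fi
    done
    return "$status"
}

# Example (paths follow the project structure shown on this page):
# check_entry_points my-app/easy_sm_base && easy_sm build
```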
Requirements File¶
The requirements.txt file lists the Python dependencies that are installed into the Docker image.
Docker Tags¶
Control Docker image versions with the --docker-tag flag:
# Build with custom tag
easy_sm --docker-tag v1.0 build
# Use tagged image for training
easy_sm --docker-tag v1.0 local train
# Push tagged image
easy_sm --docker-tag v1.0 push
The default tag is latest.
Versioning Strategy
Use semantic versioning (MAJOR.MINOR.PATCH) for production tags, for example v1.2.0.
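One way to enforce this is to check a tag's shape before building. The sketch below assumes tags of the form vMAJOR.MINOR.PATCH; the v prefix follows the v1.0 style used in the examples on this page and is a convention, not an easy_sm requirement. is_semver_tag is a hypothetical helper.

```shell
# is_semver_tag: hypothetical helper, not part of easy_sm.
# Succeeds for tags of the form vMAJOR.MINOR.PATCH, e.g. v1.2.3.
is_semver_tag() {
    printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

tag="v1.2.3"
if is_semver_tag "$tag"; then
    echo "tag $tag looks like a semantic version"
    # e.g. easy_sm --docker-tag "$tag" build
else
    echo "tag $tag is not of the form vMAJOR.MINOR.PATCH" >&2
fi
```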
Multiple Environments¶
Manage multiple environments (dev, staging, prod) with separate config files:
my-project/
├── my-app-dev.json # Dev environment
├── my-app-staging.json # Staging environment
├── my-app-prod.json # Production environment
└── my-app/
    └── easy_sm_base/
Use the -a flag to select environment:
# Dev
easy_sm build -a my-app-dev
easy_sm train -a my-app-dev -n dev-job ...
# Production
easy_sm build -a my-app-prod
easy_sm train -a my-app-prod -n prod-job ...
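In scripts, the environment can be mapped to the matching app name in one place. A minimal sketch, assuming the my-app-<env> naming shown above; app_name_for_env and the DEPLOY_ENV variable are hypothetical names, not part of easy_sm.

```shell
# app_name_for_env: hypothetical helper, not part of easy_sm.
# Maps an environment name to the matching config name (without .json).
app_name_for_env() {
    case "$1" in
        dev|staging|prod) echo "my-app-$1" ;;
        *) echo "unknown environment: $1" >&2; return 1 ;;
    esac
}

# DEPLOY_ENV is an assumed variable name; fall back to dev when it is unset.
if app_name=$(app_name_for_env "${DEPLOY_ENV:-dev}"); then
    echo "would run: easy_sm build -a $app_name"
fi
```

Rejecting unknown names up front avoids passing a mistyped environment to the -a flag.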
Configuration Validation¶
easy_sm validates configuration on each command:
- App name: Alphanumeric, hyphens, underscores only (prevents path traversal)
- Config file: Must exist and contain valid JSON
- Required fields: All fields must be present
- IAM role: Must be set via the SAGEMAKER_ROLE environment variable or the -r flag
If validation fails, you'll see an error message:
Error: Configuration file 'my-app.json' not found
Error: Invalid app name: '../../../etc/passwd'
Error: SAGEMAKER_ROLE environment variable not set
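The app-name rule can be approximated in a few lines of shell. This is a sketch of the stated rule (letters, digits, hyphens, underscores), not easy_sm's actual implementation; is_valid_app_name is a hypothetical helper, and treating an empty name as invalid is an assumption.

```shell
# is_valid_app_name: hypothetical helper, not part of easy_sm.
# Succeeds only if the name is non-empty and contains nothing but
# letters, digits, hyphens, and underscores.
is_valid_app_name() {
    case "$1" in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

for name in my-app '../../../etc/passwd'; do
    if is_valid_app_name "$name"; then
        echo "ok: $name"
    else
        echo "invalid app name: $name" >&2
    fi
done
```

Because dots and slashes are rejected, a name that passes this check cannot point outside the current directory.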
Next Steps¶
- Learn about local development
- Explore cloud deployment
- See AWS setup requirements