Friday, May 30, 2025

Amazon SageMaker Studio

Amazon SageMaker Studio is an integrated development environment (IDE) for machine learning that provides everything data scientists and developers need to build, train, and deploy ML models at scale in a unified web-based visual interface.


Core Capabilities of SageMaker Studio

  • Unified Interface – One web interface for all stages: data prep, experimentation, training, tuning, deployment, and monitoring.
  • SageMaker Notebooks – Jupyter-based, one-click notebooks with elastic compute; kernel and instance lifecycle management.
  • SageMaker Data Wrangler – Visual UI to prepare data from various sources with transformations, joins, filters, and analysis.
  • SageMaker Autopilot – AutoML functionality to automatically build, train, and tune ML models.
  • SageMaker Experiments – Track, compare, and visualize ML experiments easily.
  • SageMaker Pipelines – CI/CD orchestration for ML workflows using pre-built or custom steps.
  • SageMaker Debugger & Profiler – Debug, optimize, and profile training jobs.
  • Model Registry – Centralized model catalog to register, manage, and version models for deployment.
  • Real-time & Batch Inference – Support for real-time endpoints or batch transform jobs.
  • SageMaker JumpStart – Access to pre-built models and solutions for common ML use cases.


Customer Onboarding Use Case: Intelligent Identity Verification

Objective

Automate the onboarding of customers for a financial institution, including identity verification, fraud detection, and customer classification using ML.

Steps in the Workflow

  1. Document Upload (via S3 or App API)
    • Customer uploads a government-issued ID (e.g., passport) and proof of address.

  2. Data Ingestion & Preparation
    • S3 receives the files → SageMaker Studio (via Data Wrangler) cleans and normalizes the data.
    • OCR and image preprocessing are done using Amazon Textract or custom image processing.

  3. Model Training
    • SageMaker Studio Notebooks are used to build three models:
      • Fraud Detection Model (binary classification)
      • Document Authenticity Model (vision model using PyTorch/TensorFlow)
      • Customer Tier Classification (multi-class ML model)

  4. Model Orchestration
    • SageMaker Pipelines orchestrate preprocessing, training, evaluation, and registration.

  5. Model Deployment
    • Real-time inference endpoints are deployed using SageMaker Hosting Services.
    • Models are registered in the Model Registry with an approval workflow.

  6. Inference & Feedback Loop
    • API calls are made from the customer portal to SageMaker Endpoints (see the sketch after this list).
    • Predictions drive automated or manual customer approvals.
    • Results are sent to Amazon EventBridge or SNS for notification or audit logging.
    • Feedback is ingested into the training dataset to improve model accuracy over time.
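
The last step can be wired together with a few SDK calls. The snippet below is a minimal boto3 sketch of a portal backend invoking a SageMaker endpoint and publishing the result to SNS for notification or audit logging; the endpoint name, topic ARN, and JSON payload format are illustrative assumptions rather than part of the original solution.

import json
import boto3

# The endpoint name and topic ARN below are placeholders for illustration only.
ENDPOINT_NAME = "fraud-detection-endpoint"
SNS_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:onboarding-decisions"

runtime = boto3.client("sagemaker-runtime")
sns = boto3.client("sns")

def score_customer(features):
    """Call the real-time endpoint, then publish the decision for notification/audit."""
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(features),
    )
    prediction = json.loads(response["Body"].read())

    # Forward the result so downstream consumers (audit log, case workers) are notified.
    sns.publish(
        TopicArn=SNS_TOPIC_ARN,
        Subject="Customer onboarding decision",
        Message=json.dumps({"features": features, "prediction": prediction}),
    )
    return prediction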


SageMaker Studio Integrations

  • Amazon S3 – Data lake and model/artifact storage
  • AWS Glue / DataBrew – Data cataloging and ETL
  • Amazon Redshift – Structured analytics and model input
  • Amazon Athena – Serverless querying of S3 data
  • Amazon Textract / Comprehend – NLP/OCR support
  • Amazon ECR – Container storage for custom algorithms
  • AWS Lambda – Event-driven triggers and preprocessing
  • Amazon EventBridge / SNS / SQS – Eventing and pipeline notifications
  • AWS CloudWatch & CloudTrail – Monitoring, logging, auditing
  • AWS KMS & IAM – Security, encryption, and fine-grained access control
  • Lake Formation (optional) – Data lake governance and fine-grained data access
  • Step Functions – Workflow orchestration beyond ML pipelines


Data Dissemination in SageMaker Studio

Data dissemination refers to how data flows through the ML lifecycle—from ingestion to preprocessing, modeling, inference, and feedback.

SageMaker Studio Dissemination Pipeline:

  1. Ingest data via S3, Redshift, Glue, or JDBC sources.
  2. Transform using Data Wrangler or custom notebook-based processing.
  3. Train/Validate using custom models or Autopilot.
  4. Store outputs in S3, Model Registry, or downstream DBs.
  5. Distribute predictions through REST APIs (real-time), batch outputs (to S3), or events (SNS/SQS).
  6. Feedback loop enabled via pipelines and ingestion of labeled results.
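
Steps 4 and 5 can also be exercised in batch: a transform job reads a dataset from S3 and writes predictions back to S3 for downstream consumers. The following is a minimal boto3 sketch; the job name, model name, and S3 URIs are hypothetical placeholders.

import boto3

sm = boto3.client("sagemaker")

# Job name, model name, and S3 URIs are hypothetical placeholders.
sm.create_transform_job(
    TransformJobName="customer-tier-batch-001",
    ModelName="customer-tier-model",
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://onboarding-data/batch-input/",
            }
        },
        "ContentType": "text/csv",
    },
    TransformOutput={"S3OutputPath": "s3://onboarding-data/batch-predictions/"},
    TransformResources={"InstanceType": "ml.m5.large", "InstanceCount": 1},
)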


SageMaker Studio vs AWS Lake Formation

  • Primary Purpose – SageMaker Studio: ML development & ops. Lake Formation: secure data lake governance.
  • Target Users – SageMaker Studio: data scientists, ML engineers. Lake Formation: data engineers, analysts, compliance teams.
  • UI Capabilities – SageMaker Studio: Jupyter-based, ML-centric IDE. Lake Formation: lake-centric access management.
  • Data Access Control – SageMaker Studio: IAM-based permissions. Lake Formation: fine-grained, column/row-level security.
  • Workflow Capabilities – SageMaker Studio: ML pipelines, experiments. Lake Formation: data ingestion, transformation, sharing.
  • Integration – SageMaker Studio: ML & AI tools (e.g., Textract, Comprehend). Lake Formation: analytics tools (e.g., Athena, Redshift).
  • Security Focus – SageMaker Studio: notebook and model access, endpoint policies. Lake Formation: encryption, data lake permissions, audit.
  • Data Dissemination – SageMaker Studio: orchestrated within ML pipelines. Lake Formation: governed through data lake policies.
  • Ideal Use Case – SageMaker Studio: building, training, deploying models. Lake Formation: creating secure, centralized data lakes.

Summary:

  1. Use SageMaker Studio when your goal is ML model development and operationalization.
  2. Use Lake Formation when your focus is centralized data governance, cross-account sharing, and secure access control for large datasets.


Conclusion

AWS SageMaker Studio empowers ML teams to work faster and more collaboratively by bringing together every piece of the ML lifecycle under one roof. From rapid prototyping to secure, scalable deployment, SageMaker Studio accelerates innovation. When combined with services like Lake Formation and Glue, it enables a secure, end-to-end AI/ML platform that can power modern, intelligent applications such as automated customer onboarding.

If your enterprise aims to bring AI into production with auditability, repeatability, and governance, SageMaker Studio is a foundational element in your AWS-based data science strategy.

Wednesday, May 7, 2025

Deploying Apache Tomcat and Running a WAR File on AWS ECS


Author: Harvinder Singh Saluja
Tags: #AWS #ECS #Tomcat #DevOps #Java #WARDeployment


Modern applications demand scalable and resilient infrastructure. Apache Tomcat, a popular Java servlet container, can be containerized and deployed on AWS ECS (Elastic Container Service) for high availability and manageability. In this blog, we walk through the end-to-end process of containerizing Tomcat with a custom WAR file and deploying it on AWS ECS using Fargate.


Objective

To deploy a Java .war file under Tomcat on AWS ECS Fargate and access the web application through an Application Load Balancer (ALB).


Prerequisites

  • AWS Account

  • Docker installed locally

  • AWS CLI configured

  • An existing .war file (e.g., myapp.war)

  • Basic understanding of ECS, Docker, and networking on AWS


Step 1: Create Dockerfile for Tomcat + WAR

Create a Dockerfile to extend the official Tomcat image and copy the WAR file into the webapps directory.

# Use official Tomcat base image
FROM tomcat:9.0

# Remove default ROOT webapp
RUN rm -rf /usr/local/tomcat/webapps/ROOT

# Copy custom WAR file
COPY myapp.war /usr/local/tomcat/webapps/ROOT.war

# Expose port
EXPOSE 8080

# Start Tomcat
CMD ["catalina.sh", "run"]

Place this Dockerfile alongside your myapp.war.


Step 2: Build and Push Docker Image to Amazon ECR

  1. Create ECR Repository

aws ecr create-repository --repository-name tomcat-myapp
  2. Authenticate Docker with ECR

aws ecr get-login-password | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.<region>.amazonaws.com
  3. Build and Push Docker Image

docker build -t tomcat-myapp .
docker tag tomcat-myapp:latest <aws_account_id>.dkr.ecr.<region>.amazonaws.com/tomcat-myapp:latest
docker push <aws_account_id>.dkr.ecr.<region>.amazonaws.com/tomcat-myapp:latest

Step 3: Set Up ECS Cluster and Fargate Service

  1. Create ECS Cluster

aws ecs create-cluster --cluster-name tomcat-cluster
  2. Create Task Definition JSON

Example: task-def.json

{
  "family": "tomcat-task",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "tomcat-container",
      "image": "<aws_account_id>.dkr.ecr.<region>.amazonaws.com/tomcat-myapp:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "hostPort": 8080,
          "protocol": "tcp"
        }
      ],
      "essential": true
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::<account_id>:role/ecsTaskExecutionRole"
}
  3. Register Task Definition

aws ecs register-task-definition --cli-input-json file://task-def.json
  4. Create Security Group & ALB

    • Create a security group allowing HTTP (port 80) and custom port 8080.

    • Create an Application Load Balancer with a target group pointing to port 8080.

  5. Run ECS Fargate Service

aws ecs create-service \
  --cluster tomcat-cluster \
  --service-name tomcat-service \
  --task-definition tomcat-task \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration '{
      "awsvpcConfiguration": {
          "subnets": ["subnet-xxxxxxx"],
          "securityGroups": ["sg-xxxxxxx"],
          "assignPublicIp": "ENABLED"
      }
  }' \
  --load-balancers '[
      {
          "targetGroupArn": "arn:aws:elasticloadbalancing:<region>:<account_id>:targetgroup/<target-group-name>",
          "containerName": "tomcat-container",
          "containerPort": 8080
      }
  ]'

Step 4: Access the Deployed App

Once the ECS service stabilizes, navigate to the DNS name of the ALB (e.g., http://<alb-dns-name>) to access your Java application running on Tomcat.
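
If you prefer to script this check, the snippet below is a minimal boto3 sketch that waits for the service to stabilize and then prints the ALB URL; the load balancer name tomcat-alb is an assumption, so substitute whatever name you gave your ALB.

import boto3

ecs = boto3.client("ecs")
elbv2 = boto3.client("elbv2")

# Wait until the ECS service reaches a steady state (cluster/service names from the steps above).
ecs.get_waiter("services_stable").wait(cluster="tomcat-cluster", services=["tomcat-service"])

# Look up the ALB DNS name; "tomcat-alb" is a placeholder for your load balancer's name.
lbs = elbv2.describe_load_balancers(Names=["tomcat-alb"])
print("Application URL: http://" + lbs["LoadBalancers"][0]["DNSName"])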


Troubleshooting Tips

  • WAR not deploying? Make sure it's named ROOT.war if you want it accessible directly at /.

  • Service unhealthy? Confirm security group rules allow traffic on port 8080.

  • Task failing? Check ECS task logs in CloudWatch.
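
For the last point, a quick way to see why tasks keep dying is to pull the stopped reason before digging into CloudWatch. A minimal boto3 sketch, assuming the cluster name used above:

import boto3

ecs = boto3.client("ecs")

# List recently stopped tasks in the cluster and print why ECS stopped them.
stopped = ecs.list_tasks(cluster="tomcat-cluster", desiredStatus="STOPPED")
if stopped["taskArns"]:
    details = ecs.describe_tasks(cluster="tomcat-cluster", tasks=stopped["taskArns"])
    for task in details["tasks"]:
        print(task["taskArn"], "->", task.get("stoppedReason", "no reason reported"))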


The CloudFormation Template (CFT) below deploys Apache Tomcat on ECS Fargate using private subnets. In this version:

  • Tomcat runs in private subnets

  • Application Load Balancer (ALB) resides in public subnets

  • NAT Gateway is used to allow ECS tasks to access the internet (e.g., for downloading updates)

  • WAR file is pre-packaged into the Docker image

  • Load Balancer forwards traffic to ECS service running in private subnets


tomcat-on-ecs-private.yaml

AWSTemplateFormatVersion: '2010-09-09'
Description: Deploy Tomcat in ECS Fargate with WAR file under private subnets and public-facing ALB

Parameters:
  VpcCidr:
    Type: String
    Default: 10.0.0.0/16
  PublicSubnet1Cidr:
    Type: String
    Default: 10.0.1.0/24
  PublicSubnet2Cidr:
    Type: String
    Default: 10.0.2.0/24
  PrivateSubnet1Cidr:
    Type: String
    Default: 10.0.3.0/24
  PrivateSubnet2Cidr:
    Type: String
    Default: 10.0.4.0/24
  ImageUrl:
    Type: String
    Description: ECR image URL for the Tomcat + WAR image

Resources:

  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcCidr
      EnableDnsSupport: true
      EnableDnsHostnames: true

  # Subnets
  PublicSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: !Ref PublicSubnet1Cidr
      AvailabilityZone: !Select [ 0, !GetAZs '' ]
      MapPublicIpOnLaunch: true

  PublicSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: !Ref PublicSubnet2Cidr
      AvailabilityZone: !Select [ 1, !GetAZs '' ]
      MapPublicIpOnLaunch: true

  PrivateSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: !Ref PrivateSubnet1Cidr
      AvailabilityZone: !Select [ 0, !GetAZs '' ]

  PrivateSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: !Ref PrivateSubnet2Cidr
      AvailabilityZone: !Select [ 1, !GetAZs '' ]

  InternetGateway:
    Type: AWS::EC2::InternetGateway

  AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway

  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC

  PublicRoute:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway

  PublicRouteAssoc1:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet1
      RouteTableId: !Ref PublicRouteTable

  PublicRouteAssoc2:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet2
      RouteTableId: !Ref PublicRouteTable

  # NAT Gateway setup for private subnets
  EIP:
    Type: AWS::EC2::EIP

  NATGateway:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt EIP.AllocationId
      SubnetId: !Ref PublicSubnet1

  PrivateRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC

  PrivateRoute:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NATGateway

  PrivateRouteAssoc1:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet1
      RouteTableId: !Ref PrivateRouteTable

  PrivateRouteAssoc2:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet2
      RouteTableId: !Ref PrivateRouteTable

  # Security Group
  ECSSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow inbound traffic from ALB
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 8080
          ToPort: 8080
          SourceSecurityGroupId: !Ref ALBSecurityGroup

  ALBSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow HTTP from internet
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0

  # ALB
  LoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Subnets: [!Ref PublicSubnet1, !Ref PublicSubnet2]
      SecurityGroups: [!Ref ALBSecurityGroup]
      Scheme: internet-facing
      Type: application

  TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      Port: 8080
      Protocol: HTTP
      VpcId: !Ref VPC
      TargetType: ip
      HealthCheckPath: /
      HealthCheckPort: 8080

  Listener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref TargetGroup
      LoadBalancerArn: !Ref LoadBalancer
      Port: 80
      Protocol: HTTP

  ECSCluster:
    Type: AWS::ECS::Cluster

  ECSTaskExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy

  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: tomcat-task
      Cpu: 512
      Memory: 1024
      NetworkMode: awsvpc
      RequiresCompatibilities: [FARGATE]
      ExecutionRoleArn: !GetAtt ECSTaskExecutionRole.Arn
      ContainerDefinitions:
        - Name: tomcat-container
          Image: !Ref ImageUrl
          PortMappings:
            - ContainerPort: 8080
          Essential: true

  ECSService:
    Type: AWS::ECS::Service
    DependsOn: Listener
    Properties:
      Cluster: !Ref ECSCluster
      DesiredCount: 1
      LaunchType: FARGATE
      TaskDefinition: !Ref TaskDefinition
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: DISABLED
          Subnets: [!Ref PrivateSubnet1, !Ref PrivateSubnet2]
          SecurityGroups: [!Ref ECSSecurityGroup]
      LoadBalancers:
        - TargetGroupArn: !Ref TargetGroup
          ContainerName: tomcat-container
          ContainerPort: 8080

Outputs:
  ALBDNS:
    Description: DNS of the Application Load Balancer
    Value: !GetAtt LoadBalancer.DNSName

Deploy Instructions

  1. Save this file as tomcat-on-ecs-private.yaml

  2. Deploy using AWS CLI:

aws cloudformation deploy \
  --template-file tomcat-on-ecs-private.yaml \
  --stack-name tomcat-private-ecs \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides ImageUrl=<your-ecr-image-url>
  3. Once stack creation is complete, access the application via the ALB DNS output (see the sketch below for fetching it programmatically).
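
To retrieve the ALBDNS output without opening the console, here is a minimal boto3 sketch against the stack name used above:

import boto3

cfn = boto3.client("cloudformation")

# Read the ALBDNS output exported by the stack deployed above.
stack = cfn.describe_stacks(StackName="tomcat-private-ecs")["Stacks"][0]
outputs = {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}
print("Application URL: http://" + outputs["ALBDNS"])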




Conclusion

You’ve now deployed a containerized Tomcat server running a WAR application to AWS ECS using Fargate. This setup abstracts away server management, allowing you to focus on your application logic while AWS handles the infrastructure.


 

Friday, May 2, 2025

Case study and tutorial for Amazon SageMaker Studio (Unified Experience)

This case study and tutorial for Amazon SageMaker Studio (Unified Experience) is designed to help enterprise teams, data scientists, and ML engineers understand its capabilities, features, and implementation through a real-world example.


Case Study: Predicting Loan Defaults Using Amazon SageMaker Studio (Unified Experience)

Client Profile

Company: Confidential
Industry: Financial Services
Objective: To build an end-to-end machine learning pipeline to predict loan default risks using Amazon SageMaker Studio Unified Experience.


Business Challenge

The company needed:

  • A collaborative, scalable, and secure ML environment

  • Model versioning and experimentation tracking

  • Integration with RDS, S3, and CI/CD workflows

  • Compliance with data governance and role-based access control (RBAC)


✅ Why Amazon SageMaker Studio (Unified Experience)?

  • Unified interface for data wrangling, experimentation, model building, deployment, and monitoring

  • Built-in JupyterLab & SageMaker JumpStart

  • MLOps integration with SageMaker Pipelines, Model Registry

  • Custom image support for enterprise tools like scikit-learn, PyTorch, TensorFlow

  • IAM-based access controls via SageMaker Domain


Architecture Overview

              +-------------------------+
              |      Amazon S3          | <-- Raw Loan Data
              +-------------------------+
                         |
                         v
               +--------------------+
               | Amazon SageMaker   |
               |  Studio (Unified)  |
               +--------------------+
                  |     |      |
   +--------------+     |      +---------------------+
   |                    |                            |
Data Wrangler     SageMaker Pipelines        SageMaker Experiments
(Data Prep)       (ETL + Train + Deploy)      (Track Models & Metrics)
   |                    |                            |
   +--------------------+----------------------------+
                         |
                         v
               +---------------------------+
               |  SageMaker Model Registry |
               +---------------------------+
                         |
                         v
               +---------------------+
               | SageMaker Endpoints|
               +---------------------+
                         |
                         v
                +------------------+
                | Client App (UI)  |
                +------------------+

Step-by-Step Tutorial: ML Pipeline with SageMaker Studio

1. Set Up SageMaker Studio

  1. Go to the AWS Console → SageMaker → “SageMaker Domain” → Create Domain

  2. Use IAM authentication, enable default SageMaker Studio settings

  3. Create a User Profile with execution roles attached (e.g., AmazonSageMakerFullAccess, AmazonS3FullAccess, AmazonRDSReadOnlyAccess)
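
The console steps above can also be scripted. Below is a minimal boto3 sketch; the domain name, VPC/subnet IDs, execution role ARN, and user profile name are placeholder assumptions.

import boto3

sm = boto3.client("sagemaker")

# Domain name, VPC/subnet IDs, and role ARN are placeholders for illustration.
domain = sm.create_domain(
    DomainName="trustfund-ml",
    AuthMode="IAM",
    DefaultUserSettings={"ExecutionRole": "arn:aws:iam::123456789012:role/SageMakerExecutionRole"},
    SubnetIds=["subnet-0abc1234"],
    VpcId="vpc-0abc1234",
)

# The domain ID is the last path element of the DomainArn (e.g., "d-xxxxxxxx").
domain_id = domain["DomainArn"].split("/")[-1]
sm.create_user_profile(DomainId=domain_id, UserProfileName="data-scientist-1")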


2. Launch SageMaker Studio

  1. Select the created user → “Launch Studio”

  2. Choose Kernel → Python 3 (Data Science)

  3. Start a new Jupyter notebook


3. Data Ingestion & Exploration

import pandas as pd

# Load from S3 (reading an s3:// URI directly requires the s3fs package)
s3_uri = 's3://trustfund-data/loan-defaults.csv'
df = pd.read_csv(s3_uri)

# Quick stats
df.describe()
df['default'].value_counts()

4. Data Preparation with SageMaker Data Wrangler

  1. Open Data Wrangler from Studio UI

  2. Import S3 dataset → Profile the data

  3. Add transforms: handle nulls, encode categorical, normalize

  4. Export flow to SageMaker Pipeline (generates .flow and .pipeline.py)


5. Build Training Script (train.py)

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
import joblib
import pandas as pd

df = pd.read_csv('loan-defaults.csv')
X = df.drop('default', axis=1)
y = df['default']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)

joblib.dump(model, 'model.joblib')

6. Create and Run a SageMaker Pipeline

from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep, TrainingStep
from sagemaker.workflow.model_step import ModelStep  # ModelStep lives in its own module in SDK v2
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.sklearn.estimator import SKLearn

# Setup processor
sklearn_processor = SKLearnProcessor(
    framework_version='0.23-1',
    role='SageMakerRole',
    instance_type='ml.m5.xlarge',
    instance_count=1
)

# Define pipeline steps (arguments elided here; supply the processing inputs/outputs,
# the SKLearn estimator for training, and the model registration details)
step_process = ProcessingStep(...)
step_train = TrainingStep(...)
step_register = ModelStep(...)

pipeline = Pipeline(
    name="LoanDefaultPipeline",
    steps=[step_process, step_train, step_register]
)
pipeline.upsert(role_arn="SageMakerRole")
pipeline.start()

7. Deploy Model to Endpoint

from sagemaker.model import Model

# Note: a container image (image_uri) or a framework-specific Model class
# (e.g., SKLearnModel) is also required; the generic Model is shown for brevity.
model = Model(
    model_data='s3://.../model.tar.gz',
    role='SageMakerRole',
    entry_point='inference.py'
)

predictor = model.deploy(instance_type='ml.m5.large', initial_instance_count=1)
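
Once the endpoint is live you can call it directly from the notebook. A minimal sketch, assuming the inference container accepts CSV input and returns JSON (adjust to whatever inference.py actually expects):

from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import JSONDeserializer

# Wire formats are assumptions; they must match what inference.py implements.
predictor.serializer = CSVSerializer()
predictor.deserializer = JSONDeserializer()

# Hypothetical feature vector; column order must match the training data.
sample = [52000, 0.32, 712, 3]
print(predictor.predict([sample]))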

8. Monitor and Retrain

Use:

  • SageMaker Model Monitor for drift detection

  • SageMaker Pipelines to automate retraining on new data


Results

  • AUC: 0.91
  • Accuracy: 88.4%
  • Training Time: ~3 minutes
  • Retrain Schedule: Weekly

Security & Governance

  • IAM roles enforced per user profile

  • Audit trail via CloudTrail + SageMaker lineage tracking

  • Data encryption at rest and in transit (KMS)


Summary

Amazon SageMaker Studio Unified Experience empowers enterprises to:

  • Consolidate ML workflows in one secure UI

  • Integrate data prep, experimentation, model registry, and CI/CD

  • Boost productivity with reusable components


Wednesday, April 2, 2025

Terraform Infrastructure as Code (IaC) for AWS

 Terraform is an Infrastructure as Code (IaC) tool that enables you to provision and manage AWS infrastructure using a declarative configuration language (HCL - HashiCorp Configuration Language). A well-structured Terraform setup for provisioning AWS resources typically follows a modular, organized layout to promote reusability, maintainability, and scalability.

Here’s a high-level structure of a typical Terraform project to provision AWS infrastructure:


1. Directory Structure


terraform-aws-infra/
│
├── main.tf              # Entry point, includes root resources and module calls
├── variables.tf         # Input variable definitions
├── outputs.tf           # Output values to export useful information
├── providers.tf         # AWS provider configuration and backend settings
├── terraform.tfvars     # Actual variable values for a specific environment
├── versions.tf          # Terraform and provider version constraints
│
├── modules/             # Reusable modules (VPC, EC2, RDS, S3, etc.)
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── ec2/
│   ├── rds/
│   └── s3/
│
└── envs/                # Environment-specific configuration (dev, prod, etc.)
    ├── dev/
    │   ├── main.tf
    │   └── terraform.tfvars
    └── prod/
        ├── main.tf
        └── terraform.tfvars

2. Key Files Explained

main.tf

  • Defines AWS resources or calls reusable modules.

  • Example:

module "vpc" {
  source     = "./modules/vpc"
  cidr_block = var.vpc_cidr
  region     = var.aws_region
}

variables.tf

  • Defines inputs used across resources/modules.

variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-west-2"
}

outputs.tf

  • Defines values to export (e.g., VPC ID, public IP).

output "vpc_id" {
  value = module.vpc.vpc_id
}

providers.tf

  • Sets up the AWS provider and optionally backend for state management.

provider "aws" {
  region = var.aws_region
}

terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "dev/terraform.tfstate"
    region = "us-west-2"
  }
}

terraform.tfvars

  • Provides real values for declared variables (ideally not committed to Git).

aws_region = "us-west-2"
vpc_cidr   = "10.0.0.0/16"

versions.tf

  • Locks Terraform and provider versions for consistency.

terraform {
  required_version = ">= 1.5.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

3. Modules

Modules help you encapsulate related resources and reuse them.

Example: modules/vpc/main.tf

resource "aws_vpc" "main" {
  cidr_block = var.cidr_block

  tags = {
    Name = "main-vpc"
  }
}

modules/vpc/variables.tf

variable "cidr_block" {
  type = string
}

modules/vpc/outputs.tf

output "vpc_id" {
  value = aws_vpc.main.id
}

4. Environments (Optional)

Use separate folders under envs/ to customize configurations for dev, staging, or prod.


✅ 5. Best Practices

  • Use remote backend (like S3 + DynamoDB) for state file management.

  • Use .tfvars and terraform.workspace for environment separation.

  • Keep secrets in AWS Secrets Manager or use sops/Vault.

  • Format and validate regularly: terraform fmt and terraform validate.

  • Use terraform plan before apply.

Friday, March 7, 2025

Python script to read files from a drive mapped from AWS Storage Gateway.

This Python script reads files from a drive mapped from AWS Storage Gateway. Assuming the drive is mapped to a local directory (e.g., Z:/ on Windows or /mnt/storage_gateway/ on Linux), the script lists the files and reads their contents.

Steps:

  1. Ensure your mapped drive is accessible.
  2. Update the MAPPED_DRIVE_PATH variable accordingly.
  3. Run the script locally.

Python Code:

import os

# Set the mapped drive path (Update this based on your system)
MAPPED_DRIVE_PATH = "Z:/"  # Example for Windows
# MAPPED_DRIVE_PATH = "/mnt/storage_gateway/"  # Example for Linux

def list_files_in_directory(directory):
    """List all files in the given directory."""
    try:
        files = os.listdir(directory)
        print(f"Files in '{directory}':")
        for file in files:
            print(file)
        return files
    except Exception as e:
        print(f"Error listing files in directory '{directory}': {e}")
        return []

def read_file_content(file_path):
    """Read and print the content of a file."""
    try:
        with open(file_path, "r", encoding="utf-8") as file:
            content = file.read()
        print(f"\nContent of {file_path}:\n{content}")
    except Exception as e:
        print(f"Error reading file '{file_path}': {e}")

def main():
    """Main function to list and read files from the mapped drive."""
    if not os.path.exists(MAPPED_DRIVE_PATH):
        print(f"Error: The mapped drive '{MAPPED_DRIVE_PATH}' is not accessible.")
        return

    files = list_files_in_directory(MAPPED_DRIVE_PATH)

    # Read the first file as a sample (Modify as needed)
    if files:
        first_file = os.path.join(MAPPED_DRIVE_PATH, files[0])
        if os.path.isfile(first_file):
            read_file_content(first_file)
        else:
            print(f"'{first_file}' is not a file.")

if __name__ == "__main__":
    main()

How It Works:

  • Lists files in the mapped drive directory.
  • Reads and prints the content of the first file (modify as needed).
  • Handles errors gracefully if the drive is inaccessible or the file cannot be read.

Dependencies:

  • Ensure the mapped drive is accessible before running the script.
  • This script reads text files (.txt, .csv, etc.). For binary files, modify the read_file_content function (a minimal variant is sketched below).
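
For completeness, here is one hedged way the binary variant could look; opening in "rb" mode returns bytes, so this sketch reports the size rather than printing contents:

def read_binary_file(file_path, chunk_size=1024 * 1024):
    """Read a binary file in chunks and report its total size in bytes."""
    total = 0
    try:
        with open(file_path, "rb") as file:
            while True:
                chunk = file.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
        print(f"Read {total} bytes from {file_path}")
    except Exception as e:
        print(f"Error reading binary file '{file_path}': {e}")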

Sunday, January 5, 2025

Use SSH Keys to Clone a Git Repository over SSH

 

1. Generate a New SSH Key Pair

ssh-keygen -t rsa -b 4096 -C "HSingh@MindTelligent.com"
  • -t rsa specifies the type of key (RSA in this case).
  • -b 4096 sets the number of bits for the key length (4096 is more secure).
  • -C "HSingh@MindTelligent.com" adds a comment (usually your email) to help identify the key.

2. Save the Key Files

  • You will be prompted to enter a file name and location to save the key pair:
    Enter file in which to save the key (/home/user/.ssh/id_rsa):
    • Press Enter to save it in the default location (~/.ssh/id_rsa).
    • Or specify a custom path if you want multiple keys.

3. Set a Passphrase (Optional)

You will be asked:



Enter passphrase (empty for no passphrase):
  • Enter a passphrase for extra security, or press Enter for no passphrase.

4. View the Public Key

cat ~/.ssh/id_rsa.pub

This will display the public key, which you can copy to add to remote servers or platforms like GitHub, GitLab, or AWS.


5. Add Key to SSH Agent (Optional for Convenience)

eval "$(ssh-agent -s)" # Start the SSH agent
ssh-add ~/.ssh/id_rsa # Add the key to the agent

 

6. Add SSH Key to Git Hosting Provider

  • GitHub: Go to Settings → SSH and GPG keys → New SSH Key and paste the contents of your public key:
cat ~/.ssh/id_rsa.pub
  • GitLab/Bitbucket: Follow similar steps under SSH Keys settings.

7. Test SSH Connection

Test the SSH connection to your hosting provider:

  • For GitHub:
ssh -T git@github.com
  • For GitLab:
ssh -T git@gitlab.com

You should see a success message, e.g.:

Hi username! You've successfully authenticated.

8. Clone Repository Using SSH

Copy the SSH URL of the repository from the hosting provider. It looks like:

git@github.com:username/repo.git

Then, clone it:

git clone git@github.com:username/repo.git

Tuesday, December 24, 2024

Microsoft Dynamics Deployment on AWS

 Microsoft Dynamics Deployment on AWS provides businesses with a scalable and secure cloud infrastructure to host and manage Microsoft Dynamics 365 applications. These applications include Business Central, Finance, Supply Chain Management, and Sales, among others. AWS offers flexibility for both Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) models for deployment.


1. Deployment Models for Microsoft Dynamics on AWS

  1. Infrastructure-as-a-Service (IaaS):

    • Use Amazon EC2 instances to host Microsoft Dynamics applications.
    • Configure Windows Server and SQL Server databases.
    • Ideal for businesses that need full control of the infrastructure.
  2. Hybrid Cloud Deployment:

    • Integrates AWS-hosted Dynamics with on-premises systems.
    • Useful for businesses transitioning to the cloud in phases.
  3. Fully Cloud-Native Deployment:

    • Deploy Dynamics 365 using AWS services like RDS, Elastic Load Balancer (ELB), Auto Scaling, and CloudFormation.
    • Best for scalability and operational efficiency.

2. Key AWS Services for Microsoft Dynamics 365 Deployment

  • Compute:

    • Amazon EC2 – Hosts Windows Server and Dynamics services.
    • AWS Auto Scaling – Automatically scales EC2 instances based on demand.
  • Database:

    • Amazon RDS for SQL Server – Fully managed relational database service.
    • Amazon Aurora – For higher performance with SQL compatibility.
  • Storage:

    • Amazon S3 – For backups, logs, and files.
    • Amazon FSx for Windows File Server – Fully managed shared file systems.
  • Networking:

    • Amazon VPC – Isolates Dynamics deployments in a private network.
    • AWS Direct Connect – Ensures secure, low-latency connectivity with on-premises environments.
  • Security:

    • AWS IAM – Manages access permissions and user roles.
    • AWS Shield and WAF – Protects against DDoS and web attacks.
  • Monitoring and Analytics:

    • Amazon CloudWatch – Monitors system performance.
    • AWS CloudTrail – Tracks API usage and logs user activity.
    • Amazon QuickSight – BI tool for reporting and dashboards.

3. Steps to Deploy Microsoft Dynamics 365 on AWS

Step 1: Infrastructure Preparation

  1. Create VPC: Configure subnets, routing tables, and security groups.
  2. Set Up IAM Roles: Define access controls and permissions for EC2 instances.
  3. Provision EC2 Instances (see the boto3 sketch after this list):
    • Choose Windows Server with required configurations.
    • Attach EBS volumes for storage.
  4. Configure Network Security:
    • Enable firewalls and VPN connections.
    • Use AWS Direct Connect for hybrid setups.
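
To make Step 1 concrete, here is a minimal boto3 sketch of the core provisioning calls; the CIDR blocks, AMI ID, instance type, and volume size are placeholder assumptions that a real Dynamics deployment would size properly.

import boto3

ec2 = boto3.client("ec2")

# CIDRs, AMI ID, instance type, and volume size are placeholders for illustration.
vpc = ec2.create_vpc(CidrBlock="10.20.0.0/16")["Vpc"]
subnet = ec2.create_subnet(VpcId=vpc["VpcId"], CidrBlock="10.20.1.0/24")["Subnet"]

sg = ec2.create_security_group(
    GroupName="dynamics-app-sg",
    Description="Dynamics application servers",
    VpcId=vpc["VpcId"],
)

# Launch a Windows Server instance for the Dynamics application tier.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical Windows Server AMI ID
    InstanceType="m5.xlarge",
    MinCount=1,
    MaxCount=1,
    SubnetId=subnet["SubnetId"],
    SecurityGroupIds=[sg["GroupId"]],
    BlockDeviceMappings=[
        {"DeviceName": "/dev/sda1", "Ebs": {"VolumeSize": 200, "VolumeType": "gp3"}}
    ],
)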

Step 2: Database Setup

  1. Use Amazon RDS for SQL Server or EC2-hosted SQL Server.
  2. Configure failover clusters for high availability.
  3. Optimize performance with Read Replicas and ElastiCache for caching.

Step 3: Application Deployment

  1. Install Microsoft Dynamics 365 Software:

    • Transfer installation media to AWS instances using S3 or AWS Transfer Family.
    • Connect to the server via RDP and install the application.
  2. Connect to SQL Database:

    • Configure Dynamics to communicate with SQL Server hosted on RDS or EC2.
  3. Configure Load Balancers:

    • Deploy Elastic Load Balancer (ELB) for distributing traffic.
  4. Enable Scaling:

    • Use Auto Scaling Groups to handle dynamic workloads.

Step 4: Testing and Optimization

  1. Validate the setup with test transactions.
  2. Use AWS CloudWatch to monitor performance and detect anomalies.
  3. Optimize configurations for network latency and response times.

Step 5: Security and Compliance

  1. Enable AWS Key Management Service (KMS) for encryption.
  2. Apply Security Groups and WAF rules to control access.
  3. Configure Multi-Factor Authentication (MFA) for admin users.
  4. Enable auditing via CloudTrail for compliance tracking.

Step 6: Backup and Disaster Recovery

  1. Schedule automatic backups using AWS Backup.
  2. Enable cross-region replication for disaster recovery.
  3. Implement snapshot-based backups for databases (see the sketch below).
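
As one example of the snapshot approach, a manual RDS snapshot takes a single boto3 call; the instance and snapshot identifiers below are placeholders.

import boto3
from datetime import datetime, timezone

rds = boto3.client("rds")

# Identifiers are placeholders; use names that match your environment.
snapshot_id = "dynamics-sql-" + datetime.now(timezone.utc).strftime("%Y%m%d-%H%M")
rds.create_db_snapshot(
    DBSnapshotIdentifier=snapshot_id,
    DBInstanceIdentifier="dynamics-sqlserver-prod",
)

# Block until the snapshot is available before relying on it for recovery.
rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=snapshot_id)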

4. Integration with Microsoft Services

  • Azure Active Directory:
    • Use AWS Directory Service or integrate with Azure AD for authentication.
  • Office 365 Integration:
    • Connect Dynamics with Microsoft 365 apps like Outlook and Teams.
  • Power BI:
    • Connect Dynamics data to Power BI for advanced analytics and reporting.

5. High Availability and Scalability

  • Use Multi-AZ RDS for SQL Server to ensure database failover.
  • Deploy instances across multiple AWS Availability Zones (AZs).
  • Enable Auto Scaling for application and web servers.

6. Licensing Options

  1. Bring Your Own License (BYOL):

    • Use existing Microsoft licenses on AWS.
  2. License Included Instances:

    • AWS provides pre-configured Windows and SQL Server licenses.
  3. AWS Marketplace Subscriptions:

    • Access Dynamics configurations directly from the AWS Marketplace.

7. Monitoring and Maintenance

  • Use AWS Systems Manager to automate patching and updates.
  • Monitor system performance via Amazon CloudWatch.
  • Schedule regular maintenance windows for updates and optimizations.

8. Benefits of Deploying Dynamics 365 on AWS

  1. Flexibility and Scalability:

    • Easily scale resources based on workload demands.
  2. High Availability and Reliability:

    • Multi-AZ support ensures fault tolerance.
  3. Security and Compliance:

    • Built-in encryption, IAM roles, and auditing features.
  4. Cost Optimization:

    • Pay-as-you-go pricing reduces capital expenses.
  5. Integration with AWS Services:

    • Leverage Lambda, S3, and Redshift for extended functionality.

Conclusion

AWS provides a robust and scalable platform to deploy Microsoft Dynamics 365 with flexible deployment models, high availability, and advanced security features. Businesses can use AWS to integrate Dynamics with other AWS services, ensuring seamless performance, monitoring, and disaster recovery while optimizing costs.
