
Advanced Python Core Concepts and Applications in AI: A Comprehensive Guide for AI Engineers

Table of Contents
- Introduction to Python in AI
- Python Fundamentals for AI
- Advanced Python Core Concepts
- Object-Oriented Programming in Python
- NumPy: Numerical Computing Foundation
- Pandas: Data Manipulation and Analysis
- Matplotlib: Data Visualization
- PyTorch: Deep Learning Framework
- Python Design Patterns in AI
- Best Practices and Industry Standards
1. Introduction to Python in AI
1.1 Why Python Dominates AI/ML/DL
Python has become the de facto language for artificial intelligence, machine learning, and data science for several compelling reasons:
Simplicity and Readability: Python’s syntax resembles natural language, making it accessible to researchers who can focus on algorithms rather than language complexity. This allows rapid prototyping and experimentation, which is crucial in research environments.
Rich Ecosystem: The Python Package Index (PyPI) hosts over 400,000 packages, with specialized libraries for every aspect of AI development. This ecosystem means you rarely need to build from scratch.
Community and Industry Support: Major tech companies (Google, Facebook, Microsoft, OpenAI) have invested heavily in Python-based AI tools. TensorFlow, PyTorch, scikit-learn, and Hugging Face Transformers are all Python-first.
Interoperability: Python seamlessly integrates with C/C++ for performance-critical operations, allowing high-level ease with low-level speed when needed.
1.2 The AI Development Workflow
Understanding the typical AI development pipeline helps contextualize where Python fits:
Data Collection → Data Preprocessing → Feature Engineering →
Model Selection → Training → Evaluation → Deployment → Monitoring
Python excels at each stage (a minimal end-to-end sketch follows this list):
- Data Collection: Web scraping (BeautifulSoup, Scrapy), API integration (requests)
- Preprocessing: Pandas for cleaning, NumPy for numerical operations
- Feature Engineering: Scikit-learn pipelines, custom transformations
- Model Development: PyTorch, TensorFlow, scikit-learn
- Deployment: Flask, FastAPI, Docker integration
- Monitoring: MLflow, Weights & Biases
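A minimal end-to-end sketch of this pipeline, using scikit-learn's bundled Iris dataset so it runs without external files (model choice and parameters are illustrative):
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
X, y = load_iris(return_X_y=True)                                    # data collection
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler().fit(X_train)                               # preprocessing / feature scaling
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)      # model training
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))  # evaluation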

1.3 Setting Up Your Environment
Professional AI development requires a properly configured environment:
# Using conda for environment management
conda create -n ai_env python=3.10
conda activate ai_env
# Install core packages
pip install numpy pandas matplotlib seaborn
pip install scikit-learn torch torchvision
pip install jupyter notebook ipython
# For GPU support (CUDA-enabled machines)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
Best Practice: Always use virtual environments to isolate project dependencies and maintain reproducibility.
2. Python Fundamentals for AI
2.1 Data Types and Structures
Python’s built-in data structures are the foundation for more complex AI operations.
2.1.1 Lists: Dynamic Arrays
Lists are mutable, ordered collections that form the basis of many data processing operations.
# Creating and manipulating lists
features = [1.2, 3.4, 5.6, 7.8]
labels = ['cat', 'dog', 'bird']
# List comprehensions – efficient and Pythonic
squared = [x**2 for x in range(10)]
filtered = [x for x in features if x > 3.0]
# Nested list comprehensions for matrix operations
matrix = [[i*j for j in range(5)] for i in range(5)]
AI Application: Lists are used for batch processing, storing training samples, and collecting predictions.
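A small illustration of the batch-processing use case, slicing a list of samples into fixed-size batches (the batch size here is arbitrary):
samples = [0.1, 0.5, 0.9, 1.3, 1.7]
batch_size = 2
batches = [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]
print(batches)  # [[0.1, 0.5], [0.9, 1.3], [1.7]]; the last batch may be smaller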
2.1.2 Dictionaries: Key-Value Mappings
Dictionaries provide O(1) average-case lookup, essential for caching and configuration management.
# Model configuration dictionary
model_config = {
    'learning_rate': 0.001,
    'batch_size': 32,
    'epochs': 100,
    'optimizer': 'adam',
    'layers': [128, 64, 32]
}
# Dictionary comprehensions
squared_dict = {x: x**2 for x in range(10)}
# Nested dictionaries for experiment tracking
experiments = {
‘exp_001’: {
‘accuracy’: 0.95,
‘loss’: 0.05,
‘hyperparams’: model_config
}
}
AI Application: Hyperparameter storage, model checkpoints, JSON API responses.
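For example, the model_config dictionary above can be written to JSON and reloaded later, a common way to keep experiments reproducible (the file name is illustrative):
import json
with open('model_config.json', 'w') as f:
    json.dump(model_config, f, indent=2)   # persist hyperparameters
with open('model_config.json', 'r') as f:
    restored_config = json.load(f)         # reload for a later run
print(restored_config['batch_size'])       # 32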
2.1.3 Tuples and Sets
# Tuples: Immutable sequences (useful for constants)
image_shape = (224, 224, 3) # Height, Width, Channels
train_test_split = (0.8, 0.2)
# Sets: Unique elements (useful for vocabulary)
vocab = set([‘hello’, ‘world’, ‘ai’, ‘machine’, ‘learning’])
unique_labels = set(labels)
2.2 Functions and Lambda Expressions
Functions are the building blocks of modular, reusable code.
2.2.1 Function Definitions
def preprocess_image(image, target_size=(224, 224)):
    """
    Preprocess an image for model input.
    Args:
        image: Input image array
        target_size: Tuple of (height, width)
    Returns:
        Preprocessed image array
    """
    # Resize, normalize, etc. (placeholder; real code might use PIL or OpenCV here)
    processed_image = image
    return processed_image
# Type hints for better code documentation
def calculate_accuracy(predictions: list, targets: list) -> float:
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)
2.2.2 Lambda Functions
Lambda functions are anonymous functions useful for simple operations.
# Sorting by custom key
students = [(‘Alice’, 85), (‘Bob’, 92), (‘Charlie’, 78)]
sorted_students = sorted(students, key=lambda x: x[1], reverse=True)
# Map, filter, reduce patterns
data = [1, 2, 3, 4, 5]
normalized = list(map(lambda x: x / max(data), data))
even_only = list(filter(lambda x: x % 2 == 0, data))
AI Application: Custom loss functions, data transformations, callback functions.
2.3 Control Flow and Iteration
2.3.1 Advanced Iteration Patterns
# Enumerate for index tracking
for idx, value in enumerate(training_data):
print(f”Processing sample {idx}: {value}”)
# Zip for parallel iteration
features = [1, 2, 3]
labels = [‘a’, ‘b’, ‘c’]
for feat, label in zip(features, labels):
print(f”Feature: {feat}, Label: {label}”)
# Itertools for advanced iteration
from itertools import combinations, product
# Generate all pairs for similarity computation
pairs = list(combinations(items, 2))
# Grid search parameter combinations
param_grid = {
‘lr’: [0.001, 0.01],
‘batch_size’: [16, 32]
}
configs = [dict(zip(param_grid.keys(), v))
for v in product(*param_grid.values())]
2.4 File I/O and Data Loading
import json
import pickle
# JSON for configuration files
with open(‘config.json’, ‘r’) as f:
config = json.load(f)
# Pickle for Python objects
with open(‘model.pkl’, ‘wb’) as f:
pickle.dump(trained_model, f)
# Reading large files efficiently
def read_large_file(filepath):
with open(filepath, ‘r’) as f:
for line in f: # Memory-efficient line-by-line
yield line.strip()
# Context managers for resource management
class DataLoader:
    def __enter__(self):
        self.data = load_data()   # load_data() is a placeholder for your own loading routine
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        self.cleanup()            # likewise, cleanup() releases whatever resources were acquired
3. Advanced Python Core Concepts
3.1 Decorators: Metaprogramming for AI
Decorators modify function behavior without changing their code, essential for logging, timing, and caching.
3.1.1 Basic Decorators
import time
from functools import wraps
def timer(func):
    """Measure execution time of a function."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"{func.__name__} took {end - start:.4f} seconds")
        return result
    return wrapper
@timer
def train_model(epochs):
    # Training logic
    time.sleep(2)  # Simulating training
    return "Model trained"
# Usage
train_model(100)
3.1.2 Parameterized Decorators
def repeat(times):
“””Repeat function execution.”””
def decorator(func):
@wraps(func)
def wrapper(*args, **kwargs):
results = []
for _ in range(times):
results.append(func(*args, **kwargs))
return results
return wrapper
return decorator
@repeat(times=3)
def train_with_different_seeds():
# Train with random initialization
return accuracy
# Caching for expensive computations
from functools import lru_cache
@lru_cache(maxsize=128)
def compute_similarity(vec1, vec2):
    """Cached similarity computation (arguments must be hashable, e.g. tuples rather than lists or arrays)."""
    return cosine_similarity(vec1, vec2)
AI Application: Performance monitoring, experiment tracking, memoization of expensive operations.
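As a sketch of the experiment-tracking use case, a decorator can append each run's arguments and result to an in-memory log (the log format and metric below are illustrative placeholders):
from functools import wraps
experiment_log = []
def track_experiment(func):
    """Record keyword arguments and the returned metrics of each run."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        experiment_log.append({'run': func.__name__, 'kwargs': kwargs, 'result': result})
        return result
    return wrapper
@track_experiment
def run_experiment(learning_rate=0.01):
    return {'accuracy': 0.90}   # placeholder metric
run_experiment(learning_rate=0.001)
print(experiment_log)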
3.2 Generators and Iterators
Generators provide memory-efficient iteration, crucial for processing large datasets.
3.2.1 Generator Functions
def data_generator(filepath, batch_size=32):
“””
Generate batches of data from a file.
Memory-efficient for large datasets.
“””
batch = []
with open(filepath, ‘r’) as f:
for line in f:
batch.append(process_line(line))
if len(batch) == batch_size:
yield batch
batch = []
if batch: # Yield remaining data
yield batch
# Usage in training loop
for batch in data_generator(‘train.txt’, batch_size=32):
loss = model.train_step(batch)
3.2.2 Generator Expressions
# Memory-efficient data processing
sum_of_squares = sum(x**2 for x in range(1000000))
# Instead of creating entire list
# sum_of_squares = sum([x**2 for x in range(1000000)])
# Custom iterator class
class DataIterator:
def __init__(self, data):
self.data = data
self.index = 0
def __iter__(self):
return self
def __next__(self):
if self.index >= len(self.data):
raise StopIteration
value = self.data[self.index]
self.index += 1
return value
3.3 Context Managers
Context managers ensure proper resource management, critical for GPU memory and file handles.
from contextlib import contextmanager
@contextmanager
def gpu_memory_manager():
“””Manage GPU memory allocation.”””
print(“Allocating GPU memory”)
try:
yield
finally:
print(“Clearing GPU cache”)
torch.cuda.empty_cache()
# Usage
with gpu_memory_manager():
output = model(input_data)
# Class-based context manager
class ModelCheckpoint:
def __init__(self, filepath):
self.filepath = filepath
def __enter__(self):
self.model_state = load_checkpoint(self.filepath)
return self.model_state
def __exit__(self, exc_type, exc_val, exc_tb):
if exc_type is None:
save_checkpoint(self.model_state, self.filepath)
3.4 List, Dict, and Set Comprehensions
Comprehensions provide concise, readable ways to create collections.
# List comprehension with conditionals
normalized_data = [
    (x - mean) / std
    for x in data
    if x is not None
]
# Dictionary comprehension for mapping
label_to_idx = {
label: idx
for idx, label in enumerate(unique_labels)
}
# Set comprehension for unique filtering
unique_tokens = {
token.lower()
for sentence in corpus
for token in sentence.split()
}
# Nested comprehensions for matrix operations
transposed = [
[row[i] for row in matrix]
for i in range(len(matrix[0]))
]
3.5 Exception Handling
Robust exception handling prevents training interruptions and data loss.
class ModelTrainingError(Exception):
“””Custom exception for training failures.”””
pass
def train_model_with_recovery(model, data, epochs):
“””Train with automatic recovery from failures.”””
checkpoint_path = ‘checkpoint.pth’
try:
# Attempt to load checkpoint
if os.path.exists(checkpoint_path):
model.load_state_dict(torch.load(checkpoint_path))
print(“Resumed from checkpoint”)
for epoch in range(epochs):
try:
loss = train_epoch(model, data)
# Save checkpoint every 10 epochs
if epoch % 10 == 0:
torch.save(model.state_dict(), checkpoint_path)
except RuntimeError as e:
if “out of memory” in str(e):
print(“GPU OOM, reducing batch size”)
torch.cuda.empty_cache()
# Retry with smaller batch
else:
raise
except KeyboardInterrupt:
print(“Training interrupted, saving checkpoint”)
torch.save(model.state_dict(), checkpoint_path)
except Exception as e:
print(f”Unexpected error: {e}”)
raise ModelTrainingError(“Training failed”) from e
finally:
# Cleanup resources
torch.cuda.empty_cache()
4. Object-Oriented Programming in Python
4.1 Classes and Objects
OOP organizes code into reusable, modular components, essential for building complex AI systems.
4.1.1 Basic Class Structure
class NeuralNetwork:
“””Base class for neural network models.”””
def __init__(self, input_dim, hidden_dim, output_dim):
“””
Initialize network architecture.
Args:
input_dim: Input feature dimension
hidden_dim: Hidden layer dimension
output_dim: Output dimension (number of classes)
“””
self.input_dim = input_dim
self.hidden_dim = hidden_dim
self.output_dim = output_dim
self.weights = self._initialize_weights()
self.training_history = []
def _initialize_weights(self):
“””Private method for weight initialization.”””
# Xavier initialization
import numpy as np
return {
‘W1’: np.random.randn(self.input_dim, self.hidden_dim) * np.sqrt(2.0 / self.input_dim),
‘W2’: np.random.randn(self.hidden_dim, self.output_dim) * np.sqrt(2.0 / self.hidden_dim)
}
def forward(self, x):
“””Forward pass through the network.”””
raise NotImplementedError(“Subclasses must implement forward()”)
def train(self, x, y, epochs=100):
“””Training loop.”””
for epoch in range(epochs):
predictions = self.forward(x)
loss = self._compute_loss(predictions, y)
self.training_history.append(loss)
self._backpropagate(loss)
def __repr__(self):
“””String representation for debugging.”””
return f”NeuralNetwork(input={self.input_dim}, hidden={self.hidden_dim}, output={self.output_dim})”
4.1.2 Properties and Methods
class DataProcessor:
def __init__(self, data):
self._data = data
self._is_normalized = False
@property
def data(self):
“””Getter for data.”””
return self._data
@data.setter
def data(self, value):
“””Setter with validation.”””
if not isinstance(value, (list, np.ndarray)):
raise TypeError(“Data must be list or array”)
self._data = value
self._is_normalized = False
@property
def is_normalized(self):
“””Check if data is normalized.”””
return self._is_normalized
def normalize(self):
“””Normalize data to [0, 1] range.”””
min_val = min(self._data)
max_val = max(self._data)
self._data = [(x - min_val) / (max_val - min_val) for x in self._data]
self._is_normalized = True
return self
@staticmethod
def compute_mean(data):
“””Static method for mean calculation.”””
return sum(data) / len(data)
@classmethod
def from_file(cls, filepath):
“””Alternative constructor.”””
with open(filepath, ‘r’) as f:
data = [float(line.strip()) for line in f]
return cls(data)
4.2 Inheritance and Polymorphism
Inheritance enables code reuse and creates hierarchical relationships between classes.
class Model:
“””Base class for all models.”””
def __init__(self, name):
self.name = name
self.is_trained = False
def train(self, data):
raise NotImplementedError
def predict(self, x):
raise NotImplementedError
def save(self, filepath):
“””Common save functionality.”””
import pickle
with open(filepath, ‘wb’) as f:
pickle.dump(self, f)
class LinearRegression(Model):
“””Linear regression implementation.”””
def __init__(self, name=”LinearRegression”):
super().__init__(name)
self.weights = None
self.bias = None
def train(self, X, y):
“””Train using closed-form solution.”””
X_with_bias = np.c_[np.ones(len(X)), X]
theta = np.linalg.inv(X_with_bias.T @ X_with_bias) @ X_with_bias.T @ y
self.bias = theta[0]
self.weights = theta[1:]
self.is_trained = True
def predict(self, X):
“””Make predictions.”””
if not self.is_trained:
raise ValueError(“Model must be trained first”)
return X @ self.weights + self.bias
class LogisticRegression(Model):
“””Logistic regression implementation.”””
def __init__(self, name=”LogisticRegression”, learning_rate=0.01):
super().__init__(name)
self.learning_rate = learning_rate
self.weights = None
def _sigmoid(self, z):
“””Sigmoid activation.”””
return 1 / (1 + np.exp(-z))
def train(self, X, y, epochs=1000):
“””Train using gradient descent.”””
n_samples, n_features = X.shape
self.weights = np.zeros(n_features)
for _ in range(epochs):
linear_pred = X @ self.weights
predictions = self._sigmoid(linear_pred)
gradient = (1/n_samples) * X.T @ (predictions - y)
self.weights -= self.learning_rate * gradient
self.is_trained = True
def predict(self, X):
“””Make binary predictions.”””
if not self.is_trained:
raise ValueError(“Model must be trained first”)
return (self._sigmoid(X @ self.weights) >= 0.5).astype(int)
# Polymorphism in action
models = [LinearRegression(), LogisticRegression()]
for model in models:
model.train(X_train, y_train)
predictions = model.predict(X_test)
print(f”{model.name}: Accuracy = {compute_accuracy(predictions, y_test)}”)
4.3 Multiple Inheritance and Mixins
Mixins provide reusable functionality across different class hierarchies.
class LoggingMixin:
“””Mixin for adding logging capabilities.”””
def log(self, message):
print(f”[{self.__class__.__name__}] {message}”)
def log_training_step(self, epoch, loss):
self.log(f”Epoch {epoch}: Loss = {loss:.4f}”)
class VisualizationMixin:
“””Mixin for visualization capabilities.”””
def plot_training_history(self):
import matplotlib.pyplot as plt
plt.plot(self.training_history)
plt.xlabel(‘Epoch’)
plt.ylabel(‘Loss’)
plt.title(f’Training History – {self.name}’)
plt.show()
class AdvancedNeuralNetwork(NeuralNetwork, LoggingMixin, VisualizationMixin):
“””Neural network with logging and visualization.”””
def train(self, x, y, epochs=100):
“””Enhanced training with logging.”””
self.log(f”Starting training for {epochs} epochs”)
for epoch in range(epochs):
predictions = self.forward(x)
loss = self._compute_loss(predictions, y)
self.training_history.append(loss)
self._backpropagate(loss)
if epoch % 10 == 0:
self.log_training_step(epoch, loss)
self.log(“Training completed”)
return self
# Usage
model = AdvancedNeuralNetwork(784, 128, 10)
model.train(X_train, y_train, epochs=100)
model.plot_training_history()
4.4 Abstract Base Classes
ABCs define interfaces that subclasses must implement.
from abc import ABC, abstractmethod
class Optimizer(ABC):
“””Abstract base class for optimizers.”””
def __init__(self, learning_rate):
self.learning_rate = learning_rate
@abstractmethod
def step(self, gradients):
“””Update parameters using gradients.”””
pass
@abstractmethod
def zero_grad(self):
“””Reset gradients.”””
pass
class SGD(Optimizer):
“””Stochastic Gradient Descent.”””
def __init__(self, learning_rate, momentum=0.0):
super().__init__(learning_rate)
self.momentum = momentum
self.velocity = None
def step(self, gradients):
if self.velocity is None:
self.velocity = gradients
else:
self.velocity = self.momentum * self.velocity + gradients
return self.learning_rate * self.velocity
def zero_grad(self):
self.velocity = None
class Adam(Optimizer):
“””Adam optimizer.”””
def __init__(self, learning_rate, beta1=0.9, beta2=0.999, epsilon=1e-8):
super().__init__(learning_rate)
self.beta1 = beta1
self.beta2 = beta2
self.epsilon = epsilon
self.m = None # First moment
self.v = None # Second moment
self.t = 0 # Timestep
def step(self, gradients):
self.t += 1
if self.m is None:
self.m = gradients
self.v = gradients ** 2
else:
self.m = self.beta1 * self.m + (1 - self.beta1) * gradients
self.v = self.beta2 * self.v + (1 - self.beta2) * (gradients ** 2)
m_hat = self.m / (1 - self.beta1 ** self.t)
v_hat = self.v / (1 - self.beta2 ** self.t)
return self.learning_rate * m_hat / (np.sqrt(v_hat) + self.epsilon)
def zero_grad(self):
self.m = None
self.v = None
self.t = 0
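A brief usage sketch on a toy quadratic objective shows the shared Optimizer interface in action (starting point, learning rate, and iteration count are illustrative):
import numpy as np
params = np.array([5.0, -3.0])        # start away from the minimum of f(w) = sum(w**2)
optimizer = SGD(learning_rate=0.1)    # Adam(learning_rate=0.1) is a drop-in replacement
for _ in range(100):
    gradients = 2 * params            # gradient of sum(w**2)
    params = params - optimizer.step(gradients)
print(params)                         # close to [0, 0]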
4.5 Special Methods (Magic Methods)
Special methods enable custom behavior for built-in operations.
class Dataset:
“””Custom dataset class with special methods.”””
def __init__(self, features, labels):
self.features = features
self.labels = labels
def __len__(self):
“””Enable len(dataset).”””
return len(self.features)
def __getitem__(self, idx):
“””Enable dataset[idx] and slicing.”””
if isinstance(idx, slice):
return Dataset(self.features[idx], self.labels[idx])
return self.features[idx], self.labels[idx]
def __iter__(self):
“””Enable iteration.”””
for i in range(len(self)):
yield self[i]
def __add__(self, other):
“””Enable dataset concatenation with +.”””
return Dataset(
self.features + other.features,
self.labels + other.labels
)
def __repr__(self):
“””String representation.”””
return f”Dataset(n_samples={len(self)})”
def __eq__(self, other):
“””Enable equality comparison.”””
return (self.features == other.features and
self.labels == other.labels)
# Usage
dataset = Dataset([1, 2, 3], [‘a’, ‘b’, ‘c’])
print(len(dataset)) # 3
print(dataset[0]) # (1, ‘a’)
subset = dataset[0:2]
for x, y in dataset:
print(x, y)
5. NumPy: Numerical Computing Foundation
NumPy is the cornerstone of numerical computing in Python, providing efficient array operations essential for AI.
5.1 Array Creation and Manipulation
import numpy as np
# Creating arrays
a = np.array([1, 2, 3, 4, 5])
matrix = np.array([[1, 2, 3], [4, 5, 6]])
# Special array creation functions
zeros = np.zeros((3, 4)) # 3×4 array of zeros
ones = np.ones((2, 3, 4)) # 2x3x4 array of ones
identity = np.eye(5) # 5×5 identity matrix
random_array = np.random.randn(3, 4) # Gaussian random values
uniform = np.random.uniform(0, 1, (3, 4)) # Uniform [0,1)
arange = np.arange(0, 10, 0.5) # Array from 0 to 10, step 0.5
linspace = np.linspace(0, 1, 100) # 100 points between 0 and 1
# Array attributes
print(matrix.shape) # (2, 3)
print(matrix.dtype) # int64 or int32
print(matrix.ndim) # 2 (dimensions)
print(matrix.size) # 6 (total elements)
5.2 Array Indexing and Slicing
# Basic indexing
arr = np.arange(10)
print(arr[0]) # 0
print(arr[-1]) # 9
print(arr[2:7]) # [2, 3, 4, 5, 6]
print(arr[::2]) # [0, 2, 4, 6, 8] (every 2nd element)
# Multi-dimensional indexing
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(matrix[1, 2]) # 6 (row 1, column 2)
print(matrix[:, 1]) # [2, 5, 8] (all rows, column 1)
print(matrix[1:, :2]) # [[4, 5], [7, 8]]
# Boolean indexing (crucial for data filtering)
data = np.array([1, 2, 3, 4, 5])
mask = data > 2
print(data[mask]) # [3, 4, 5]
print(data[data % 2 == 0]) # [2, 4] (even numbers)
# Fancy indexing
indices = np.array([0, 2, 4])
print(data[indices]) # [1, 3, 5]
5.3 Broadcasting
Broadcasting allows operations between arrays of different shapes, eliminating explicit loops.
# Scalar broadcasting
arr = np.array([1, 2, 3, 4])
print(arr * 2) # [2, 4, 6, 8]
# Vector-matrix broadcasting
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
vector = np.array([1, 0, 1])
result = matrix + vector # Adds vector to each row
# [[2, 2, 4], [5, 5, 7], [8, 8, 10]]
# Broadcasting rules visualization
# Shape (3, 4) and (4,) -> compatible, broadcasts to (3, 4)
# Shape (3, 4) and (3, 1) -> compatible, broadcasts to (3, 4)
# Shape (3, 4) and (3,) -> NOT compatible (trailing dimensions)
# Practical example: Normalizing data
data = np.random.randn(1000, 10) # 1000 samples, 10 features
mean = data.mean(axis=0, keepdims=True) # Shape (1, 10)
std = data.std(axis=0, keepdims=True) # Shape (1, 10)
normalized = (data - mean) / std # Broadcasting
5.4 Mathematical Operations
# Element-wise operations
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
print(a + b) # [6, 8, 10, 12]
print(a * b) # [5, 12, 21, 32]
print(a ** 2) # [1, 4, 9, 16]
print(np.sqrt(a)) # [1.0, 1.414, 1.732, 2.0]
# Matrix operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Element-wise multiplication
print(A * B) # [[5, 12], [21, 32]]
# Matrix multiplication (dot product)
print(A @ B) # [[19, 22], [43, 50]]
print(np.dot(A, B)) # Same as above
# Transpose
print(A.T) # [[1, 3], [2, 4]]
# Universal functions (ufuncs) – vectorized operations
x = np.array([0, np.pi/2, np.pi])
print(np.sin(x)) # [0.0, 1.0, 0.0]
print(np.exp(x)) # Exponential
print(np.log(x + 1)) # Natural logarithm
# Aggregation functions
data = np.random.randn(100, 10)
print(data.sum()) # Sum of all elements
print(data.mean(axis=0)) # Mean along columns
print(data.std(axis=1)) # Standard deviation along rows
print(data.min(), data.max())
print(np.median(data))
print(np.percentile(data, 95))
5.5 Linear Algebra Operations
NumPy’s linear algebra capabilities are fundamental for machine learning.
# Matrix decompositions
A = np.random.randn(5, 5)
# Eigenvalue decomposition
eigenvalues, eigenvectors = np.linalg.eig(A)
# Singular Value Decomposition (SVD) – crucial for PCA
U, S, Vt = np.linalg.svd(A)
# Solving linear systems: Ax = b
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = np.linalg.solve(A, b) # x = [2, 3]
# Matrix inverse
A_inv = np.linalg.inv(A)
# Matrix norm
frobenius_norm = np.linalg.norm(A, 'fro')
l2_norm = np.linalg.norm(A, 2)
# Determinant
det = np.linalg.det(A)
# Pseudo-inverse (for non-square matrices)
A = np.random.randn(10, 5)
A_pinv = np.linalg.pinv(A)
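As a worked example of why SVD underpins PCA, this sketch projects centred synthetic data onto its top two principal components (the data shape and component count are illustrative):
import numpy as np
data = np.random.randn(200, 10)                    # 200 samples, 10 features (synthetic)
centered = data - data.mean(axis=0)                # PCA operates on centred data
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
components = Vt[:2]                                # top-2 principal directions
projected = centered @ components.T                # reduced representation, shape (200, 2)
explained_ratio = (S ** 2) / np.sum(S ** 2)        # variance explained per component
print(projected.shape, explained_ratio[:2])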
5.6 NumPy in Neural Networks
def sigmoid(x):
“””Sigmoid activation function.”””
return 1 / (1 + np.exp(-x))
def relu(x):
“””ReLU activation function.”””
return np.maximum(0, x)
def softmax(x):
“””Softmax for multi-class classification.”””
exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
return exp_x / np.sum(exp_x, axis=-1, keepdims=True)
class SimpleNeuralNetwork:
“””Neural network implemented with NumPy.”””
def __init__(self, input_size, hidden_size, output_size):
# Xavier initialization
self.W1 = np.random.randn(input_size, hidden_size) * np.sqrt(2.0 / input_size)
self.b1 = np.zeros((1, hidden_size))
self.W2 = np.random.randn(hidden_size, output_size) * np.sqrt(2.0 / hidden_size)
self.b2 = np.zeros((1, output_size))
def forward(self, X):
“””Forward propagation.”””
self.z1 = X @ self.W1 + self.b1
self.a1 = relu(self.z1)
self.z2 = self.a1 @ self.W2 + self.b2
self.a2 = softmax(self.z2)
return self.a2
def backward(self, X, y, learning_rate=0.01):
“””Backward propagation with gradient descent.”””
m = X.shape[0]
# Output layer gradients
dz2 = self.a2 - y
dW2 = (1/m) * self.a1.T @ dz2
db2 = (1/m) * np.sum(dz2, axis=0, keepdims=True)
# Hidden layer gradients
da1 = dz2 @ self.W2.T
dz1 = da1 * (self.z1 > 0) # ReLU derivative
dW1 = (1/m) * X.T @ dz1
db1 = (1/m) * np.sum(dz1, axis=0, keepdims=True)
# Update weights
self.W2 -= learning_rate * dW2
self.b2 -= learning_rate * db2
self.W1 -= learning_rate * dW1
self.b1 -= learning_rate * db1
def train(self, X, y, epochs=1000):
“””Training loop.”””
losses = []
for epoch in range(epochs):
predictions = self.forward(X)
loss = -np.mean(np.sum(y * np.log(predictions + 1e-8), axis=1))  # cross-entropy per sample, averaged
losses.append(loss)
self.backward(X, y)
if epoch % 100 == 0:
print(f”Epoch {epoch}, Loss: {loss:.4f}”)
return losses
# Usage example
X = np.random.randn(100, 20) # 100 samples, 20 features
y = np.eye(3)[np.random.randint(0, 3, 100)] # One-hot encoded labels
model = SimpleNeuralNetwork(20, 50, 3)
losses = model.train(X, y, epochs=1000)
5.7 Advanced NumPy Techniques
# Vectorization vs loops – performance comparison
import time
# Slow loop-based approach: pairwise squared Euclidean distances
def slow_distance(x1, x2):
    result = []
    for i in range(len(x1)):
        row = [np.sum((x1[i] - x2[j]) ** 2) for j in range(len(x2))]
        result.append(row)
    return result
# Fast vectorized approach (same result, no Python loops)
def fast_distance(x1, x2):
    return np.sum((x1[:, np.newaxis] - x2) ** 2, axis=2)
# Memory-efficient operations with out parameter
large_array = np.random.randn(10000, 1000)
result = np.empty_like(large_array)
np.exp(large_array, out=result) # In-place operation
# Advanced indexing for batching
data = np.random.randn(1000, 784) # 1000 images, 784 pixels
batch_size = 32
num_batches = len(data) // batch_size
for i in range(num_batches):
batch = data[i*batch_size:(i+1)*batch_size]
# Process batch
# np.einsum for complex tensor operations
# Matrix multiplication: C_ij = A_ik * B_kj
A = np.random.randn(3, 4)
B = np.random.randn(4, 5)
C = np.einsum(‘ik,kj->ij’, A, B)
# Batch matrix multiplication
batch_A = np.random.randn(10, 3, 4)
batch_B = np.random.randn(10, 4, 5)
batch_C = np.einsum(‘bik,bkj->bij’, batch_A, batch_B)
6. Pandas: Data Manipulation and Analysis
Pandas is the premier library for data manipulation, essential for data preprocessing in AI pipelines.
6.1 DataFrame Basics
import pandas as pd
import numpy as np
# Creating DataFrames
data = {
‘name’: [‘Alice’, ‘Bob’, ‘Charlie’, ‘David’],
‘age’: [25, 30, 35, 28],
‘salary’: [50000, 60000, 75000, 55000],
‘department’: [‘Engineering’, ‘Sales’, ‘Engineering’, ‘Sales’]
}
df = pd.DataFrame(data)
# From NumPy array
arr = np.random.randn(100, 4)
df_array = pd.DataFrame(arr, columns=[‘A’, ‘B’, ‘C’, ‘D’])
# From CSV
df_csv = pd.read_csv(‘data.csv’)
# Basic information
print(df.head()) # First 5 rows
print(df.tail(3)) # Last 3 rows
print(df.info()) # Data types and non-null counts
print(df.describe()) # Statistical summary
print(df.shape) # (rows, columns)
print(df.columns) # Column names
print(df.dtypes) # Data types
6.2 Data Selection and Filtering
# Column selection
ages = df[‘age’] # Single column (Series)
subset = df[[‘name’, ‘salary’]] # Multiple columns (DataFrame)
# Row selection by position
first_row = df.iloc[0] # First row
first_three = df.iloc[0:3] # First 3 rows
specific = df.iloc[[0, 2, 4]] # Specific rows
# Row selection by label
df_indexed = df.set_index(‘name’)
alice = df_indexed.loc[‘Alice’]
# Boolean indexing
high_salary = df[df[‘salary’] > 55000]
engineers = df[df[‘department’] == ‘Engineering’]
complex_filter = df[(df[‘age’] > 27) & (df[‘salary’] < 70000)]
# Query method (more readable)
result = df.query('age > 27 and salary < 70000')
result = df.query('department == "Engineering"')
# isin for multiple values
selected = df[df[‘department’].isin([‘Engineering’, ‘Sales’])]
# Advanced selection with loc
df.loc[df[‘age’] > 30, ‘salary’] *= 1.1 # Give raise to older employees
6.3 Data Cleaning and Preprocessing
# Handling missing data
df_missing = pd.DataFrame({
‘A’: [1, 2, np.nan, 4],
‘B’: [5, np.nan, np.nan, 8],
‘C’: [9, 10, 11, 12]
})
# Detect missing values
print(df_missing.isnull().sum()) # Count nulls per column
print(df_missing.notnull()) # Boolean mask of non-null values
# Drop missing values
df_dropped = df_missing.dropna() # Drop rows with any null
df_dropped_cols = df_missing.dropna(axis=1) # Drop columns with any null
df_thresh = df_missing.dropna(thresh=2) # Keep rows with at least 2 non-null
# Fill missing values
df_filled = df_missing.fillna(0) # Fill with constant
df_filled = df_missing.ffill() # Forward fill (fillna(method='ffill') is deprecated)
df_filled = df_missing.bfill() # Backward fill
df_filled = df_missing.fillna(df_missing.mean()) # Fill with mean
# Interpolation for time series
df_interp = df_missing.interpolate(method=’linear’)
# Handling duplicates
df_with_dupes = pd.DataFrame({
‘A’: [1, 1, 2, 2],
‘B’: [3, 3, 4, 5]
})
print(df_with_dupes.duplicated()) # Boolean mask
df_unique = df_with_dupes.drop_duplicates() # Remove duplicates
# Data type conversion
df[‘age’] = df[‘age’].astype(‘int32’)
df[‘salary’] = pd.to_numeric(df[‘salary’], errors=’coerce’)
# String operations
df[‘name_upper’] = df[‘name’].str.upper()
df[‘name_length’] = df[‘name’].str.len()
df[‘first_letter’] = df[‘name’].str[0]
# Replacing values
df[‘department’] = df[‘department’].replace({
‘Engineering’: ‘Tech’,
‘Sales’: ‘Business’
})
6.4 Data Transformation
# Adding new columns
df[‘salary_in_k’] = df[‘salary’] / 1000
df[‘age_group’] = pd.cut(df[‘age’], bins=[0, 30, 40, 100],
labels=[‘Young’, ‘Middle’, ‘Senior’])
# Apply functions
df[‘bonus’] = df[‘salary’].apply(lambda x: x * 0.1)
df[‘full_info’] = df.apply(
lambda row: f”{row[‘name’]} ({row[‘age’]})”, axis=1
)
# Map for Series
salary_map = {50000: ‘Low’, 60000: ‘Medium’, 75000: ‘High’, 55000: ‘Medium’}
df[‘salary_category’] = df[‘salary’].map(salary_map)
# Sorting
df_sorted = df.sort_values(‘salary’, ascending=False)
df_multi_sort = df.sort_values([‘department’, ‘salary’], ascending=[True, False])
# Ranking
df[‘salary_rank’] = df[‘salary’].rank(ascending=False)
df[‘percentile’] = df[‘salary’].rank(pct=True)
# Binning continuous variables
df[‘age_bin’] = pd.qcut(df[‘age’], q=4, labels=[‘Q1’, ‘Q2’, ‘Q3’, ‘Q4’])
6.5 Grouping and Aggregation
# GroupBy operations
dept_stats = df.groupby(‘department’)[‘salary’].mean()
multi_agg = df.groupby(‘department’).agg({
‘salary’: [‘mean’, ‘min’, ‘max’, ‘std’],
‘age’: [‘mean’, ‘count’]
})
# Multiple grouping levels
df[‘experience’] = pd.cut(df[‘age’], bins=[0, 30, 100], labels=[‘Junior’, ‘Senior’])
grouped = df.groupby([‘department’, ‘experience’])[‘salary’].mean()
# Custom aggregation functions
def salary_range(x):
return x.max() - x.min()
df.groupby(‘department’)[‘salary’].agg([
(‘average’, ‘mean’),
(‘range’, salary_range),
(‘count’, ‘size’)
])
# Transform (keep same shape)
df[‘salary_normalized’] = df.groupby(‘department’)[‘salary’].transform(
lambda x: (x - x.mean()) / x.std()
)
# Filter groups
high_avg_depts = df.groupby(‘department’).filter(
lambda x: x[‘salary’].mean() > 60000
)
# Pivot tables
pivot = df.pivot_table(
values=’salary’,
index=’department’,
columns=’experience’,
aggfunc=’mean’,
fill_value=0
)
# Crosstab
ct = pd.crosstab(df[‘department’], df[‘experience’], margins=True)
6.6 Merging and Joining
# Sample DataFrames
employees = pd.DataFrame({
’emp_id’: [1, 2, 3, 4],
‘name’: [‘Alice’, ‘Bob’, ‘Charlie’, ‘David’],
‘dept_id’: [10, 20, 10, 30]
})
departments = pd.DataFrame({
‘dept_id’: [10, 20, 30],
‘dept_name’: [‘Engineering’, ‘Sales’, ‘Marketing’]
})
# Inner join
inner = pd.merge(employees, departments, on=’dept_id’, how=’inner’)
# Left join (keep all employees)
left = pd.merge(employees, departments, on=’dept_id’, how=’left’)
# Right join (keep all departments)
right = pd.merge(employees, departments, on=’dept_id’, how=’right’)
# Outer join (keep all records)
outer = pd.merge(employees, departments, on=’dept_id’, how=’outer’)
# Join on different column names
df1 = pd.DataFrame({’emp_id’: [1, 2], ‘name’: [‘Alice’, ‘Bob’]})
df2 = pd.DataFrame({‘id’: [1, 2], ‘salary’: [50000, 60000]})
merged = pd.merge(df1, df2, left_on=’emp_id’, right_on=’id’)
# Concatenation
df1 = pd.DataFrame({‘A’: [1, 2], ‘B’: [3, 4]})
df2 = pd.DataFrame({‘A’: [5, 6], ‘B’: [7, 8]})
concatenated = pd.concat([df1, df2], ignore_index=True)
# Horizontal concatenation
horizontal = pd.concat([df1, df2], axis=1)
6.7 Time Series Operations
# Creating datetime index
dates = pd.date_range(‘2024-01-01′, periods=100, freq=’D’)
ts = pd.Series(np.random.randn(100), index=dates)
# Date parsing
df_time = pd.DataFrame({
‘date’: [‘2024-01-01’, ‘2024-01-02’, ‘2024-01-03’],
‘value’: [100, 105, 103]
})
df_time[‘date’] = pd.to_datetime(df_time[‘date’])
df_time = df_time.set_index(‘date’)
# Resampling
monthly_mean = ts.resample(‘M’).mean()
weekly_sum = ts.resample(‘W’).sum()
# Rolling windows
rolling_mean = ts.rolling(window=7).mean()
rolling_std = ts.rolling(window=7).std()
# Expanding windows
cumulative_mean = ts.expanding().mean()
# Shifting (for lag features)
df_time[‘lag_1’] = df_time[‘value’].shift(1)
df_time[‘lead_1’] = df_time[‘value’].shift(-1)
df_time[‘pct_change’] = df_time[‘value’].pct_change()
# Date components
df_time[‘year’] = df_time.index.year
df_time[‘month’] = df_time.index.month
df_time[‘day_of_week’] = df_time.index.dayofweek
df_time[‘quarter’] = df_time.index.quarter
6.8 Pandas for Machine Learning Pipelines
# Complete preprocessing pipeline
class DataPreprocessor:
“””Data preprocessing pipeline for ML.”””
def __init__(self):
self.numeric_features = None
self.categorical_features = None
self.scaler_params = {}
def fit(self, df, target_col):
“””Fit preprocessing parameters.”””
# Identify feature types
self.numeric_features = df.select_dtypes(
include=[‘int64’, ‘float64’]
).columns.tolist()
self.numeric_features.remove(target_col)
self.categorical_features = df.select_dtypes(
include=[‘object’]
).columns.tolist()
# Calculate scaling parameters
for col in self.numeric_features:
self.scaler_params[col] = {
‘mean’: df[col].mean(),
‘std’: df[col].std()
}
return self
def transform(self, df):
“””Transform dataframe.”””
df_transformed = df.copy()
# Handle missing values
for col in self.numeric_features:
df_transformed[col].fillna(
self.scaler_params[col][‘mean’],
inplace=True
)
# Normalize numeric features
for col in self.numeric_features:
mean = self.scaler_params[col][‘mean’]
std = self.scaler_params[col][‘std’]
df_transformed[col] = (df_transformed[col] - mean) / std
# One-hot encode categorical features
df_transformed = pd.get_dummies(
df_transformed,
columns=self.categorical_features,
drop_first=True
)
return df_transformed
def fit_transform(self, df, target_col):
“””Fit and transform in one step.”””
return self.fit(df, target_col).transform(df)
# Usage
df_train = pd.read_csv(‘train.csv’)
preprocessor = DataPreprocessor()
X_train = preprocessor.fit_transform(df_train, target_col=’label’)
# Feature engineering helpers
def create_interaction_features(df, col1, col2):
“””Create interaction features.”””
df[f'{col1}_x_{col2}’] = df[col1] * df[col2]
return df
def create_polynomial_features(df, columns, degree=2):
“””Create polynomial features.”””
for col in columns:
for d in range(2, degree + 1):
df[f'{col}^{d}’] = df[col] ** d
return df
def create_binned_features(df, column, bins=5):
“””Create binned versions of continuous features.”””
df[f'{column}_binned’] = pd.qcut(
df[column],
q=bins,
labels=False,
duplicates=’drop’
)
return df
7. Matplotlib: Data Visualization
Visualization is crucial for understanding data and communicating results. Matplotlib is the foundational plotting library in Python.
7.1 Basic Plotting
import matplotlib.pyplot as plt
import numpy as np
# Simple line plot
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.figure(figsize=(10, 6))
plt.plot(x, y, label=’sin(x)’, color=’blue’, linewidth=2)
plt.xlabel(‘X axis’, fontsize=12)
plt.ylabel(‘Y axis’, fontsize=12)
plt.title(‘Sine Wave’, fontsize=14, fontweight=’bold’)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Multiple lines
plt.figure(figsize=(10, 6))
plt.plot(x, np.sin(x), label=’sin(x)’, linewidth=2)
plt.plot(x, np.cos(x), label=’cos(x)’, linewidth=2)
plt.plot(x, np.sin(x) * np.cos(x), label='sin(x)·cos(x)', linewidth=2, linestyle='--')
plt.legend(loc=’upper right’)
plt.show()
# Scatter plot
x = np.random.randn(100)
y = 2 * x + np.random.randn(100) * 0.5
plt.figure(figsize=(8, 6))
plt.scatter(x, y, alpha=0.6, c=y, cmap=’viridis’, s=50)
plt.colorbar(label=’Y value’)
plt.xlabel(‘Feature’)
plt.ylabel(‘Target’)
plt.title(‘Scatter Plot with Color Mapping’)
plt.show()
7.2 Statistical Visualizations
# Histogram
data = np.random.randn(1000)
plt.figure(figsize=(10, 6))
plt.hist(data, bins=30, alpha=0.7, color=’blue’, edgecolor=’black’)
plt.axvline(data.mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {data.mean():.2f}')
plt.xlabel(‘Value’)
plt.ylabel(‘Frequency’)
plt.title(‘Distribution of Data’)
plt.legend()
plt.show()
# Box plot
data_groups = [np.random.randn(100) for _ in range(4)]
plt.figure(figsize=(10, 6))
plt.boxplot(data_groups, labels=[‘Group A’, ‘Group B’, ‘Group C’, ‘Group D’])
plt.ylabel(‘Value’)
plt.title(‘Box Plot Comparison’)
plt.grid(True, alpha=0.3)
plt.show()
# Violin plot (requires seaborn)
import seaborn as sns
import pandas as pd
df = pd.DataFrame({
‘value’: np.concatenate(data_groups),
‘group’: [‘A’]*100 + [‘B’]*100 + [‘C’]*100 + [‘D’]*100
})
plt.figure(figsize=(10, 6))
sns.violinplot(data=df, x=’group’, y=’value’)
plt.title(‘Violin Plot’)
plt.show()
7.3 Subplots and Complex Layouts
# Creating subplots
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Plot 1: Line plot
axes[0, 0].plot(x, np.sin(x))
axes[0, 0].set_title(‘Sine Wave’)
axes[0, 0].grid(True)
# Plot 2: Scatter
axes[0, 1].scatter(np.random.randn(50), np.random.randn(50))
axes[0, 1].set_title(‘Scatter Plot’)
# Plot 3: Histogram
axes[1, 0].hist(np.random.randn(1000), bins=30)
axes[1, 0].set_title(‘Histogram’)
# Plot 4: Bar chart
categories = [‘A’, ‘B’, ‘C’, ‘D’]
values = [23, 45, 56, 78]
axes[1, 1].bar(categories, values, color=[‘red’, ‘blue’, ‘green’, ‘orange’])
axes[1, 1].set_title(‘Bar Chart’)
plt.tight_layout()
plt.show()
# GridSpec for custom layouts
from matplotlib.gridspec import GridSpec
fig = plt.figure(figsize=(12, 8))
gs = GridSpec(3, 3, figure=fig)
ax1 = fig.add_subplot(gs[0, :]) # Top row, all columns
ax2 = fig.add_subplot(gs[1, :-1]) # Middle row, first two columns
ax3 = fig.add_subplot(gs[1:, -1]) # Last two rows, last column
ax4 = fig.add_subplot(gs[-1, 0]) # Bottom left
ax5 = fig.add_subplot(gs[-1, 1]) # Bottom middle
ax1.plot(x, np.sin(x))
ax1.set_title(‘Main Plot’)
plt.tight_layout()
plt.show()
7.4 Visualizing Machine Learning Results
# Training history visualization
def plot_training_history(history):
“””Plot training and validation metrics.”””
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))
# Loss
ax1.plot(history[‘train_loss’], label=’Train Loss’, linewidth=2)
ax1.plot(history[‘val_loss’], label=’Validation Loss’, linewidth=2)
ax1.set_xlabel(‘Epoch’)
ax1.set_ylabel(‘Loss’)
ax1.set_title(‘Training and Validation Loss’)
ax1.legend()
ax1.grid(True, alpha=0.3)
# Accuracy
ax2.plot(history[‘train_acc’], label=’Train Accuracy’, linewidth=2)
ax2.plot(history[‘val_acc’], label=’Validation Accuracy’, linewidth=2)
ax2.set_xlabel(‘Epoch’)
ax2.set_ylabel(‘Accuracy’)
ax2.set_title(‘Training and Validation Accuracy’)
ax2.legend()
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Confusion matrix visualization
from sklearn.metrics import confusion_matrix
import seaborn as sns
def plot_confusion_matrix(y_true, y_pred, classes):
“””Plot confusion matrix.”””
cm = confusion_matrix(y_true, y_pred)
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt=’d’, cmap=’Blues’,
xticklabels=classes, yticklabels=classes)
plt.ylabel(‘True Label’)
plt.xlabel(‘Predicted Label’)
plt.title(‘Confusion Matrix’)
plt.tight_layout()
plt.show()
# ROC Curve
from sklearn.metrics import roc_curve, auc
def plot_roc_curve(y_true, y_scores, n_classes):
“””Plot ROC curve for multi-class classification.”””
plt.figure(figsize=(10, 8))
for i in range(n_classes):
fpr, tpr, _ = roc_curve(y_true == i, y_scores[:, i])
roc_auc = auc(fpr, tpr)
plt.plot(fpr, tpr, linewidth=2,
label=f’Class {i} (AUC = {roc_auc:.2f})’)
plt.plot([0, 1], [0, 1], 'k--', linewidth=2, label='Random')
plt.xlabel(‘False Positive Rate’)
plt.ylabel(‘True Positive Rate’)
plt.title(‘ROC Curves’)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Feature importance visualization
def plot_feature_importance(feature_names, importances):
“””Plot feature importance.”””
indices = np.argsort(importances)[::-1][:20] # Top 20
plt.figure(figsize=(12, 8))
plt.barh(range(len(indices)), importances[indices])
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel(‘Importance’)
plt.title(‘Top 20 Feature Importances’)
plt.tight_layout()
plt.show()
7.5 3D Plotting
from mpl_toolkits.mplot3d import Axes3D
# 3D surface plot
fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(111, projection=’3d’)
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))
surf = ax.plot_surface(X, Y, Z, cmap=’viridis’, alpha=0.8)
ax.set_xlabel(‘X’)
ax.set_ylabel(‘Y’)
ax.set_zlabel(‘Z’)
ax.set_title('3D Surface Plot')
fig.colorbar(surf)
plt.show()
# 3D scatter plot
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
n = 500
xs = np.random.randn(n)
ys = np.random.randn(n)
zs = np.random.randn(n)
colors = np.random.randn(n)
scatter = ax.scatter(xs, ys, zs, c=colors, cmap='viridis', s=50, alpha=0.6)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
ax.set_title('3D Scatter Plot')
fig.colorbar(scatter)
plt.show()
# Decision boundary visualization
def plot_decision_boundary(model, X, y):
“””Plot 2D decision boundary.”””
h = 0.02
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
np.arange(y_min, y_max, h))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.figure(figsize=(10, 8))
plt.contourf(xx, yy, Z, alpha=0.4, cmap=’viridis’)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=’viridis’, edgecolors=’black’)
plt.xlabel(‘Feature 1’)
plt.ylabel(‘Feature 2’)
plt.title(‘Decision Boundary’)
plt.colorbar()
plt.show()
7.6 Advanced Styling and Customization
# Custom style
plt.style.use(‘seaborn-v0_8-darkgrid’) # Use built-in style
# Create custom style
custom_style = {
‘figure.figsize’: (12, 8),
‘font.size’: 12,
‘axes.labelsize’: 14,
‘axes.titlesize’: 16,
‘xtick.labelsize’: 12,
‘ytick.labelsize’: 12,
‘legend.fontsize’: 12,
‘lines.linewidth’: 2,
‘lines.markersize’: 8
}
plt.rcParams.update(custom_style)
# Annotation and text
fig, ax = plt.subplots(figsize=(10, 6))
x = np.linspace(0, 10, 100)
y = np.sin(x)
ax.plot(x, y)
ax.annotate(‘Local Maximum’,
xy=(np.pi/2, 1),
xytext=(np.pi/2 + 1, 1.2),
arrowprops=dict(arrowstyle=’->’, color=’red’, lw=2),
fontsize=12, color=’red’)
ax.text(8, -0.5, ‘Sin Wave Plot’,
fontsize=14, bbox=dict(boxstyle=’round’, facecolor=’wheat’, alpha=0.5))
plt.show()
# Saving figures in high quality
plt.figure(figsize=(12, 8))
plt.plot(x, y)
plt.savefig(‘high_quality_plot.png’, dpi=300, bbox_inches=’tight’)
plt.savefig(‘vector_plot.pdf’, bbox_inches=’tight’) # Vector format
plt.savefig(‘transparent_bg.png’, transparent=True, dpi=300)
8. PyTorch: Deep Learning Framework
PyTorch is the leading deep learning framework, known for its dynamic computation graphs and Pythonic design.
8.1 Tensor Basics
import torch
import torch.nn as nn
import torch.optim as optim
# Creating tensors
x = torch.tensor([1, 2, 3, 4, 5])
y = torch.tensor([[1, 2], [3, 4], [5, 6]])
zeros = torch.zeros(3, 4)
ones = torch.ones(2, 3, 4)
random = torch.randn(3, 4) # Normal distribution
uniform = torch.rand(3, 4) # Uniform [0, 1)
# Tensor from NumPy
import numpy as np
np_array = np.array([1, 2, 3])
torch_tensor = torch.from_numpy(np_array)
# Tensor to NumPy
numpy_array = torch_tensor.numpy()
# Tensor attributes
print(random.shape) # torch.Size([3, 4])
print(random.dtype) # torch.float32
print(random.device) # cpu or cuda
print(random.requires_grad) # False by default
# Device management
device = torch.device(‘cuda’ if torch.cuda.is_available() else ‘cpu’)
tensor_gpu = random.to(device)
tensor_cpu = tensor_gpu.cpu()
# Data types
float_tensor = torch.tensor([1.0, 2.0], dtype=torch.float32)
int_tensor = torch.tensor([1, 2], dtype=torch.int64)
bool_tensor = torch.tensor([True, False], dtype=torch.bool)
8.2 Tensor Operations
# Basic operations
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
b = torch.tensor([[5, 6], [7, 8]], dtype=torch.float32)
# Element-wise operations
c = a + b
c = torch.add(a, b)
c = a * b
c = a / b
c = a ** 2
# Matrix operations
c = torch.mm(a, b) # Matrix multiplication
c = a @ b # Same as above
c = a.T # Transpose
# Reduction operations
sum_all = a.sum()
mean_val = a.mean()
max_val = a.max()
sum_cols = a.sum(dim=0) # Sum along dimension 0
sum_rows = a.sum(dim=1) # Sum along dimension 1
# Reshaping
x = torch.randn(2, 3, 4)
y = x.view(2, 12) # Reshape to (2, 12)
z = x.view(-1) # Flatten to 1D
w = x.permute(2, 0, 1) # Permute dimensions
# Broadcasting
a = torch.randn(3, 1)
b = torch.randn(1, 4)
c = a + b # Result shape: (3, 4)
# Indexing and slicing
x = torch.randn(4, 5)
print(x[0]) # First row
print(x[:, 0]) # First column
print(x[1:3, :]) # Rows 1-2
# Advanced indexing
indices = torch.tensor([0, 2])
selected = x[indices] # Select rows 0 and 2
# Boolean masking
mask = x > 0
positive = x[mask]
8.3 Autograd: Automatic Differentiation
# Basic gradient computation
x = torch.tensor([2.0], requires_grad=True)
y = x ** 2 + 3 * x + 1
y.backward() # Compute gradients
print(x.grad) # dy/dx = 2x + 3 = 7.0
# Multiple variables
x = torch.tensor([1.0, 2.0], requires_grad=True)
y = torch.tensor([3.0, 4.0], requires_grad=True)
z = (x ** 2).sum() + (y ** 3).sum()
z.backward()
print(x.grad) # dz/dx
print(y.grad) # dz/dy
# Gradient accumulation
x = torch.tensor([1.0], requires_grad=True)
for i in range(3):
y = x ** 2
y.backward()
print(f”Iteration {i}: gradient = {x.grad}”)
# Zero gradients
x.grad.zero_()
# Detaching from computation graph
x = torch.randn(3, requires_grad=True)
y = x ** 2
z = y.detach() # z doesn’t track gradients
# Context managers for gradient control
x = torch.randn(3, requires_grad=True)
with torch.no_grad():
y = x ** 2 # No gradients computed
# Gradient checkpointing for memory efficiency
from torch.utils.checkpoint import checkpoint
def custom_function(x):
return x ** 2 + torch.sin(x)
x = torch.randn(1000, requires_grad=True)
y = checkpoint(custom_function, x)
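To connect autograd to an actual training loop, here is a minimal sketch that fits a tiny linear model by manual gradient descent (the synthetic data, learning rate, and step count are illustrative):
import torch
# Fit y = 2x + 1 (plus noise) on synthetic data using only autograd
x = torch.linspace(0, 1, 100).unsqueeze(1)
y = 2 * x + 1 + 0.01 * torch.randn_like(x)
w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
for _ in range(500):
    loss = ((x * w + b - y) ** 2).mean()
    loss.backward()              # populates w.grad and b.grad
    with torch.no_grad():        # parameter updates must not be tracked
        w -= 0.5 * w.grad
        b -= 0.5 * b.grad
        w.grad.zero_()
        b.grad.zero_()
print(w.item(), b.item())        # approximately 2.0 and 1.0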
8.4 Building Neural Networks
# Simple neural network using nn.Module
class SimpleNN(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(SimpleNN, self).__init__()
self.fc1 = nn.Linear(input_size, hidden_size)
self.relu = nn.ReLU()
self.fc2 = nn.Linear(hidden_size, output_size)
def forward(self, x):
x = self.fc1(x)
x = self.relu(x)
x = self.fc2(x)
return x
# Instantiate and use
model = SimpleNN(784, 128, 10)
x = torch.randn(32, 784) # Batch of 32 samples
output = model(x)
print(output.shape) # torch.Size([32, 10])
# More complex network with dropout and batch normalization
class AdvancedNN(nn.Module):
def __init__(self, input_size, hidden_sizes, output_size, dropout=0.5):
super(AdvancedNN, self).__init__()
layers = []
prev_size = input_size
for hidden_size in hidden_sizes:
layers.append(nn.Linear(prev_size, hidden_size))
layers.append(nn.BatchNorm1d(hidden_size))
layers.append(nn.ReLU())
layers.append(nn.Dropout(dropout))
prev_size = hidden_size
layers.append(nn.Linear(prev_size, output_size))
self.network = nn.Sequential(*layers)
def forward(self, x):
return self.network(x)
# Convolutional Neural Network
class CNN(nn.Module):
def __init__(self, num_classes=10):
super(CNN, self).__init__()
self.conv_layers = nn.Sequential(
nn.Conv2d(3, 32, kernel_size=3, padding=1),
nn.ReLU(),
nn.MaxPool2d(2, 2),
nn.Conv2d(32, 64, kernel_size=3, padding=1),
nn.ReLU(),
nn.MaxPool2d(2, 2),
nn.Conv2d(64, 128, kernel_size=3, padding=1),
nn.ReLU(),
nn.MaxPool2d(2, 2)
)
self.fc_layers = nn.Sequential(
nn.Flatten(),
nn.Linear(128 * 4 * 4, 512),  # assumes 32x32 input images (three 2x2 poolings leave 4x4 feature maps)
nn.ReLU(),
nn.Dropout(0.5),
nn.Linear(512, num_classes)
)
def forward(self, x):
x = self.conv_layers(x)
x = self.fc_layers(x)
return x
# Recurrent Neural Network (LSTM)
class LSTMClassifier(nn.Module):
def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim, n_layers=2, dropout=0.5):
super(LSTMClassifier, self).__init__()
self.embedding = nn.Embedding(vocab_size, embedding_dim)
self.lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers=n_layers,
dropout=dropout, batch_first=True)
self.fc = nn.Linear(hidden_dim, output_dim)
self.dropout = nn.Dropout(dropout)
def forward(self, text):
embedded = self.dropout(self.embedding(text))
output, (hidden, cell) = self.lstm(embedded)
hidden = self.dropout(hidden[-1])
return self.fc(hidden)
# Residual Network Block
class ResidualBlock(nn.Module):
def __init__(self, in_channels, out_channels, stride=1):
super(ResidualBlock, self).__init__()
self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
stride=stride, padding=1, bias=False)
self.bn1 = nn.BatchNorm2d(out_channels)
self.relu = nn.ReLU(inplace=True)
self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
stride=1, padding=1, bias=False)
self.bn2 = nn.BatchNorm2d(out_channels)
self.shortcut = nn.Sequential()
if stride != 1 or in_channels != out_channels:
self.shortcut = nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size=1,
stride=stride, bias=False),
nn.BatchNorm2d(out_channels)
)
def forward(self, x):
residual = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out += self.shortcut(residual)
out = self.relu(out)
return out
8.5 Training Loop
# Complete training pipeline
class Trainer:
def __init__(self, model, train_loader, val_loader, criterion, optimizer, device):
self.model = model.to(device)
self.train_loader = train_loader
self.val_loader = val_loader
self.criterion = criterion
self.optimizer = optimizer
self.device = device
self.history = {
‘train_loss’: [],
‘train_acc’: [],
‘val_loss’: [],
‘val_acc’: []
}
def train_epoch(self):
self.model.train()
total_loss = 0
correct = 0
total = 0
for batch_idx, (data, target) in enumerate(self.train_loader):
data, target = data.to(self.device), target.to(self.device)
# Forward pass
self.optimizer.zero_grad()
output = self.model(data)
loss = self.criterion(output, target)
# Backward pass
loss.backward()
self.optimizer.step()
# Statistics
total_loss += loss.item()
pred = output.argmax(dim=1, keepdim=True)
correct += pred.eq(target.view_as(pred)).sum().item()
total += target.size(0)
if batch_idx % 100 == 0:
print(f’Batch {batch_idx}/{len(self.train_loader)}, ‘
f’Loss: {loss.item():.4f}’)
avg_loss = total_loss / len(self.train_loader)
accuracy = 100. * correct / total
return avg_loss, accuracy
def validate(self):
self.model.eval()
total_loss = 0
correct = 0
total = 0
with torch.no_grad():
for data, target in self.val_loader:
data, target = data.to(self.device), target.to(self.device)
output = self.model(data)
loss = self.criterion(output, target)
total_loss += loss.item()
pred = output.argmax(dim=1, keepdim=True)
correct += pred.eq(target.view_as(pred)).sum().item()
total += target.size(0)
avg_loss = total_loss / len(self.val_loader)
accuracy = 100. * correct / total
return avg_loss, accuracy
def train(self, epochs, save_path=’best_model.pth’):
best_val_loss = float(‘inf’)
for epoch in range(epochs):
print(f’\nEpoch {epoch + 1}/{epochs}’)
print(‘-‘ * 50)
train_loss, train_acc = self.train_epoch()
val_loss, val_acc = self.validate()
self.history[‘train_loss’].append(train_loss)
self.history[‘train_acc’].append(train_acc)
self.history[‘val_loss’].append(val_loss)
self.history[‘val_acc’].append(val_acc)
print(f’Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}%’)
print(f’Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.2f}%’)
# Save best model
if val_loss < best_val_loss:
best_val_loss = val_loss
torch.save({
‘epoch’: epoch,
‘model_state_dict’: self.model.state_dict(),
‘optimizer_state_dict’: self.optimizer.state_dict(),
‘val_loss’: val_loss,
}, save_path)
print(f’Model saved with val_loss: {val_loss:.4f}’)
return self.history
# Usage example
from torch.utils.data import DataLoader, TensorDataset
# Create dummy data
X_train = torch.randn(1000, 784)
y_train = torch.randint(0, 10, (1000,))
X_val = torch.randn(200, 784)
y_val = torch.randint(0, 10, (200,))
train_dataset = TensorDataset(X_train, y_train)
val_dataset = TensorDataset(X_val, y_val)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)
# Initialize model, criterion, optimizer
model = SimpleNN(784, 128, 10)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
device = torch.device(‘cuda’ if torch.cuda.is_available() else ‘cpu’)
# Train
trainer = Trainer(model, train_loader, val_loader, criterion, optimizer, device)
history = trainer.train(epochs=10)
8.6 Data Loading and Augmentation
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image
# Custom dataset
class CustomDataset(Dataset):
def __init__(self, data, labels, transform=None):
self.data = data
self.labels = labels
self.transform = transform
def __len__(self):
return len(self.data)
def __getitem__(self, idx):
sample = self.data[idx]
label = self.labels[idx]
if self.transform:
sample = self.transform(sample)
return sample, label
# Image augmentation pipeline
train_transform = transforms.Compose([
transforms.RandomResizedCrop(224),
transforms.RandomHorizontalFlip(),
transforms.RandomRotation(15),
transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
])
val_transform = transforms.Compose([
transforms.Resize(256),
transforms.CenterCrop(224),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
])
# Advanced: Custom collate function
def custom_collate(batch):
“””Custom collate function for variable-length sequences.”””
data, labels = zip(*batch)
# Pad sequences to same length
max_len = max(len(seq) for seq in data)
padded_data = torch.zeros(len(data), max_len)
for i, seq in enumerate(data):
padded_data[i, :len(seq)] = seq
labels = torch.tensor(labels)
return padded_data, labels
# DataLoader with multiple workers
train_loader = DataLoader(
train_dataset,
batch_size=64,
shuffle=True,
num_workers=4,
pin_memory=True, # Faster data transfer to GPU
collate_fn=custom_collate
)
8.7 Transfer Learning
import torchvision.models as models
# Load pre-trained model (use weights=... on torchvision >= 0.13; pretrained=True on older versions)
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
# Freeze all layers
for param in resnet.parameters():
param.requires_grad = False
# Replace final layer
num_features = resnet.fc.in_features
resnet.fc = nn.Linear(num_features, 10) # 10 classes
# Only train final layer
optimizer = optim.Adam(resnet.fc.parameters(), lr=0.001)
# Fine-tuning: Unfreeze some layers
def unfreeze_layers(model, num_layers=2):
“””Unfreeze last num_layers for fine-tuning.”””
children = list(model.children())
for child in children[-num_layers:]:
for param in child.parameters():
param.requires_grad = True
unfreeze_layers(resnet, num_layers=2)
# Different learning rates for different layers
optimizer = optim.Adam([
{‘params’: resnet.layer4.parameters(), ‘lr’: 1e-4},
{‘params’: resnet.fc.parameters(), ‘lr’: 1e-3}
])
8.8 Model Saving and Loading
# Save entire model
torch.save(model, ‘complete_model.pth’)
loaded_model = torch.load(‘complete_model.pth’)
# Save only state dict (recommended)
torch.save(model.state_dict(), ‘model_weights.pth’)
model = SimpleNN(784, 128, 10)
model.load_state_dict(torch.load(‘model_weights.pth’))
# Save checkpoint with optimizer state
checkpoint = {
‘epoch’: epoch,
‘model_state_dict’: model.state_dict(),
‘optimizer_state_dict’: optimizer.state_dict(),
‘loss’: loss,
‘accuracy’: accuracy
}
torch.save(checkpoint, ‘checkpoint.pth’)
# Load checkpoint
checkpoint = torch.load(‘checkpoint.pth’)
model.load_state_dict(checkpoint[‘model_state_dict’])
optimizer.load_state_dict(checkpoint[‘optimizer_state_dict’])
epoch = checkpoint[‘epoch’]
loss = checkpoint[‘loss’]
# Save for production deployment
model.eval()
example_input = torch.randn(1, 784)
traced_model = torch.jit.trace(model, example_input)
traced_model.save(‘model_traced.pt’)
8.9 GPU Optimization
# Multi-GPU training
if torch.cuda.device_count() > 1:
print(f”Using {torch.cuda.device_count()} GPUs”)
model = nn.DataParallel(model)
model = model.to(device)
# Mixed precision training for faster computation
from torch.cuda.amp import autocast, GradScaler
scaler = GradScaler()
for data, target in train_loader:
optimizer.zero_grad()
with autocast():
output = model(data)
loss = criterion(output, target)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
# Gradient clipping (call after backward() and before optimizer.step())
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
# Memory management
torch.cuda.empty_cache() # Clear unused cache
# Efficient tensor operations
# Use in-place operations when possible
tensor.add_(1) # In-place addition
tensor.mul_(2) # In-place multiplication
9. Python Design Patterns in AI
Design patterns provide reusable solutions to common problems in software development.
9.1 Factory Pattern
from abc import ABC, abstractmethod
class Model(ABC):
@abstractmethod
def train(self, data):
pass
@abstractmethod
def predict(self, data):
pass
class LogisticRegression(Model):
def train(self, data):
print(“Training Logistic Regression”)
def predict(self, data):
return “LR predictions”
class RandomForest(Model):
def train(self, data):
print(“Training Random Forest”)
def predict(self, data):
return “RF predictions”
class NeuralNet(Model):
def train(self, data):
print(“Training Neural Network”)
def predict(self, data):
return “NN predictions”
class ModelFactory:
“””Factory for creating models.”””
@staticmethod
def create_model(model_type):
models = {
‘logistic’: LogisticRegression,
‘random_forest’: RandomForest,
‘neural_net’: NeuralNet
}
model_class = models.get(model_type)
if model_class is None:
raise ValueError(f”Unknown model type: {model_type}”)
return model_class()
# Usage
model = ModelFactory.create_model(‘neural_net’)
model.train(data)
predictions = model.predict(test_data)
9.2 Strategy Pattern
class OptimizationStrategy(ABC):
@abstractmethod
def optimize(self, gradients, parameters):
pass
class SGDStrategy(OptimizationStrategy):
def __init__(self, learning_rate=0.01):
self.learning_rate = learning_rate
def optimize(self, gradients, parameters):
return parameters - self.learning_rate * gradients
class AdamStrategy(OptimizationStrategy):
def __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999):
self.learning_rate = learning_rate
self.beta1 = beta1
self.beta2 = beta2
self.m = None
self.v = None
self.t = 0
def optimize(self, gradients, parameters):
if self.m is None:
self.m = np.zeros_like(gradients)
self.v = np.zeros_like(gradients)
self.t += 1
self.m = self.beta1 * self.m + (1 - self.beta1) * gradients
self.v = self.beta2 * self.v + (1 - self.beta2) * (gradients ** 2)
m_hat = self.m / (1 - self.beta1 ** self.t)
v_hat = self.v / (1 - self.beta2 ** self.t)
return parameters - self.learning_rate * m_hat / (np.sqrt(v_hat) + 1e-8)
class Trainer:
def __init__(self, strategy: OptimizationStrategy):
self.strategy = strategy
def set_strategy(self, strategy: OptimizationStrategy):
self.strategy = strategy
def update_parameters(self, gradients, parameters):
return self.strategy.optimize(gradients, parameters)
# Usage
trainer = Trainer(SGDStrategy(learning_rate=0.01))
# … training …
# Switch strategy
trainer.set_strategy(AdamStrategy())
# … continue training with Adam …
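To make the switch concrete, here is a short sketch on a toy quadratic objective (starting point, learning rates, and step counts are illustrative):
import numpy as np
parameters = np.array([4.0, -2.0])
trainer = Trainer(SGDStrategy(learning_rate=0.1))
for step in range(50):
    gradients = 2 * parameters                 # gradient of sum(p**2)
    parameters = trainer.update_parameters(gradients, parameters)
    if step == 25:
        trainer.set_strategy(AdamStrategy())   # swap optimizers mid-run
print(parameters)                              # near the minimum at [0, 0]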
9.3 Observer Pattern
class Observable:
def __init__(self):
self._observers = []
def attach(self, observer):
self._observers.append(observer)
def detach(self, observer):
self._observers.remove(observer)
def notify(self, event, data):
for observer in self._observers:
observer.update(event, data)
class TrainingLogger:
def update(self, event, data):
if event == ‘epoch_end’:
print(f”Epoch {data[‘epoch’]}: Loss = {data[‘loss’]:.4f}”)
class CheckpointSaver:
def __init__(self, save_path):
self.save_path = save_path
self.best_loss = float(‘inf’)
def update(self, event, data):
if event == ‘epoch_end’:
if data[‘loss’] < self.best_loss:
self.best_loss = data[‘loss’]
# Save model
print(f”Saving checkpoint at epoch {data[‘epoch’]}”)
class EarlyStopping:
def __init__(self, patience=5):
self.patience = patience
self.counter = 0
self.best_loss = float(‘inf’)
def update(self, event, data):
if event == ‘epoch_end’:
if data[‘loss’] < self.best_loss:
self.best_loss = data[‘loss’]
self.counter = 0
else:
self.counter += 1
if self.counter >= self.patience:
print(“Early stopping triggered!”)
data[‘stop_training’] = True
class ModelTrainer(Observable):
    def _train_epoch(self):
        # Placeholder for real training logic; returns a dummy loss
        import random
        return random.random()
    def train(self, epochs):
        for epoch in range(epochs):
            loss = self._train_epoch()
            # Notify observers
            event_data = {'epoch': epoch, 'loss': loss}
            self.notify('epoch_end', event_data)
            if event_data.get('stop_training', False):
                break
# Usage
trainer = ModelTrainer()
trainer.attach(TrainingLogger())
trainer.attach(CheckpointSaver(‘checkpoints/’))
trainer.attach(EarlyStopping(patience=10))
trainer.train(epochs=100)
9.4 Singleton Pattern
class ConfigManager:
_instance = None
def __new__(cls):
if cls._instance is None:
cls._instance = super().__new__(cls)
cls._instance._initialized = False
return cls._instance
def __init__(self):
if self._initialized:
return
self._initialized = True
self.config = {}
def set(self, key, value):
self.config[key] = value
def get(self, key, default=None):
return self.config.get(key, default)
# Usage – same instance everywhere
config1 = ConfigManager()
config1.set(‘learning_rate’, 0.001)
config2 = ConfigManager()
print(config2.get(‘learning_rate’)) # 0.001
print(config1 is config2)

