emotion_clf_pipeline.model
Emotion classification model components.
Provides multi-task DEBERTA-based emotion classification with sub-emotion mapping and intensity prediction. Supports both local and Azure ML model synchronization.
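A minimal quick-start sketch (assuming the package is installed and model weights and encoders are available at their default locations; the exact contents of the returned prediction dict are an assumption based on the class descriptions below):

```python
from emotion_clf_pipeline.model import EmotionPredictor

# EmotionPredictor handles model loading, Azure ML sync, and feature
# configuration internally; predict() accepts a single string or a list.
predictor = EmotionPredictor()
result = predictor.predict("I am thrilled about the new release!")
print(result)  # structured prediction with emotion, sub-emotion, and intensity
```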
Classes
- CustomPredictor: Multi-task emotion prediction engine.
- DEBERTAClassifier: Multi-task DEBERTA-based emotion classifier.
- EmotionPredictor: High-level interface for emotion classification.
- ModelLoader: Handles DEBERTA model and tokenizer loading with device management.
- class emotion_clf_pipeline.model.CustomPredictor(model, tokenizer, device, encoders_dir='models/encoders', feature_config=None)[source]
Bases:
object
Multi-task emotion prediction engine.
Handles emotion classification inference by combining the trained model with feature engineering and post-processing. Maps sub-emotions to main emotions for consistent predictions.
- __init__(model, tokenizer, device, encoders_dir='models/encoders', feature_config=None)[source]
Initialize emotion predictor with model and supporting components.
- post_process(df)[source]
Refine predictions by aligning sub-emotions with main emotions.
Uses probability distributions to select sub-emotions that are consistent with predicted main emotions, improving classification coherence.
- Parameters:
df (pd.DataFrame) – Predictions with a sub_emotion_logits column
- Returns:
Refined predictions with an emotion_pred_post_processed column
- Return type:
pd.DataFrame
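The alignment idea behind post_process can be sketched as follows. This is an illustrative standalone example, not the package's actual implementation, and the sub-to-main emotion mapping shown is hypothetical:

```python
# Illustrative sketch of sub-emotion / main-emotion alignment, not the
# package's actual code. SUB_TO_MAIN below is a hypothetical mapping.
SUB_TO_MAIN = {"joy": "happiness", "amusement": "happiness", "rage": "anger"}

def align_sub_emotion(sub_probs: dict, main_pred: str) -> str:
    """Pick the most probable sub-emotion consistent with the main prediction."""
    candidates = {s: p for s, p in sub_probs.items() if SUB_TO_MAIN.get(s) == main_pred}
    pool = candidates or sub_probs  # fall back to the overall argmax if nothing maps
    return max(pool, key=pool.get)

print(align_sub_emotion({"joy": 0.40, "rage": 0.45, "amusement": 0.15}, "happiness"))  # joy
```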
- class emotion_clf_pipeline.model.DEBERTAClassifier(*args, **kwargs)[source]
Bases:
Module
Multi-task DEBERTA-based emotion classifier.
Performs simultaneous classification for:
- Main emotions (7 categories)
- Sub-emotions (28 categories)
- Emotion intensity (3 levels)
Combines DEBERTA embeddings with engineered features through projection layers.
- __init__(model_name, feature_dim, num_classes, hidden_dim=256, dropout=0.1)[source]
Initialize the multi-task emotion classifier.
- Parameters:
model_name (str) – Pretrained DEBERTA model identifier
feature_dim (int) – Dimension of engineered features
num_classes (dict) – Class counts for each task (emotion, sub_emotion, intensity)
hidden_dim (int) – Hidden layer dimension. Defaults to 256.
dropout (float) – Dropout probability. Defaults to 0.1.
- forward(input_ids, attention_mask, features)[source]
Compute multi-task emotion predictions.
- Parameters:
input_ids (torch.Tensor) – Tokenized input text
attention_mask (torch.Tensor) – Attention mask for input
features (torch.Tensor) – Engineered features
- Returns:
Logits for each classification task
- Return type:
dict
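A minimal construction and forward-pass sketch. The num_classes keys, the assumed feature_dim of 10, and the placeholder zero features are assumptions for illustration only:

```python
import torch
from transformers import AutoTokenizer
from emotion_clf_pipeline.model import DEBERTAClassifier

# Sketch only: the num_classes keys and the structure of the returned logits
# are assumptions based on the task list above (7 / 28 / 3 classes).
model = DEBERTAClassifier(
    model_name="microsoft/deberta-v3-xsmall",
    feature_dim=10,  # assumed size of the engineered-feature vector
    num_classes={"emotion": 7, "sub_emotion": 28, "intensity": 3},
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-xsmall")
enc = tokenizer("What a wonderful surprise!", return_tensors="pt")
features = torch.zeros(1, 10)  # placeholder engineered features
with torch.no_grad():
    logits = model(enc["input_ids"], enc["attention_mask"], features)
```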
- class emotion_clf_pipeline.model.EmotionPredictor[source]
Bases:
object
High-level interface for emotion classification.
Provides a simple API for predicting emotions from text with automatic model loading, Azure ML synchronization, and feature configuration. Handles single texts or batches transparently.
- ensure_best_baseline()[source]
Ensure we have the best available baseline model from Azure ML.
This is an alias for ensure_best_baseline_model() for backward compatibility. Checks Azure ML for models with better F1 scores than the current local baseline and downloads them if found.
- Returns:
True if a better model was downloaded and loaded, False otherwise
- Return type:
bool
- ensure_best_baseline_model()[source]
Ensure we have the best available baseline model from Azure ML.
This method checks Azure ML for models with better F1 scores than the current local baseline and downloads them if found. It forces a reload of the prediction model to use the updated baseline.
- Returns:
True if a better model was downloaded and loaded, False otherwise
- Return type:
bool
- predict(texts, feature_config=None, reload_model=False)[source]
Predict emotions for single text or batch of texts.
Automatically handles model loading, feature extraction, and result formatting. Returns structured predictions with emotion, sub-emotion, and intensity classifications.
- Parameters:
texts (str or list) – Single text or a batch of texts to classify
feature_config (optional) – Feature configuration for prediction. Defaults to None.
reload_model (bool) – Whether to force a model reload before predicting. Defaults to False.
- Returns:
Prediction dict for single text, list for batch
- Return type:
dict or list
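A batch usage sketch combining the Azure ML baseline check with prediction (assuming Azure ML access is configured; otherwise the local baseline is used):

```python
from emotion_clf_pipeline.model import EmotionPredictor

predictor = EmotionPredictor()
# Optionally pull a better baseline from Azure ML before predicting.
predictor.ensure_best_baseline_model()

texts = [
    "I can't believe we finally won the match!",
    "This delay is really starting to annoy me.",
]
# A list input returns a list of prediction dicts, one per text.
results = predictor.predict(texts)
```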
- class emotion_clf_pipeline.model.ModelLoader(model_name='microsoft/deberta-v3-xsmall', device=None)[source]
Bases:
object
Handles DEBERTA model and tokenizer loading with device management.
Supports loading pretrained models, applying custom weights, and creating predictor instances. Provides automatic device selection (GPU/CPU).
- __init__(model_name='microsoft/deberta-v3-xsmall', device=None)[source]
Initialize model loader with tokenizer.
- Parameters:
model_name (str) – Pretrained model identifier. Defaults to ‘microsoft/deberta-v3-xsmall’.
device (torch.device, optional) – Target device. Auto-detects if None.
- create_predictor(model, encoders_dir='models/encoders', feature_config=None)[source]
Create predictor instance for emotion classification.
- Parameters:
model (DEBERTAClassifier) – Trained classifier to wrap
encoders_dir (str) – Directory containing fitted encoders. Defaults to 'models/encoders'.
feature_config (optional) – Feature configuration for the predictor. Defaults to None.
- Returns:
Ready-to-use predictor instance
- Return type:
CustomPredictor
- ensure_best_baseline_model()[source]
Ensure we have the best available baseline model from Azure ML.
This method checks Azure ML for models with better F1 scores than the current local baseline and downloads them if found. It forces a reload of the prediction model to use the updated baseline.
- Returns:
True if a better model was downloaded and loaded, False otherwise
- Return type:
bool
- load_baseline_model(weights_dir='models/weights', sync_azure=True)[source]
Load stable production model with optional Azure ML sync.
- load_dynamic_model(weights_dir='models/weights', sync_azure=True)[source]
Load latest trained model with optional Azure ML sync.
- load_model(feature_dim, num_classes, weights_path=None, hidden_dim=256, dropout=0.1)[source]
Create and optionally load pretrained model weights.
- Parameters:
feature_dim (int) – Dimension of engineered features
num_classes (dict) – Class counts for each task (emotion, sub_emotion, intensity)
weights_path (str, optional) – Path to saved model weights. Defaults to None (no weights loaded).
hidden_dim (int) – Hidden layer dimension. Defaults to 256.
dropout (float) – Dropout probability. Defaults to 0.1.
- Returns:
Loaded model ready for inference or training
- Return type:
DEBERTAClassifier
- Raises:
FileNotFoundError – If weights_path doesn’t exist
RuntimeError – If weight loading fails
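A lower-level loading sketch using ModelLoader directly. It assumes weights and encoders exist at the default models/weights and models/encoders paths, and that load_baseline_model returns the loaded model instance (not stated explicitly above):

```python
from emotion_clf_pipeline.model import ModelLoader

# Lower-level route: load the stable baseline weights and wrap them in a
# predictor. Set sync_azure=True to check Azure ML for a better baseline.
loader = ModelLoader(model_name="microsoft/deberta-v3-xsmall")
model = loader.load_baseline_model(weights_dir="models/weights", sync_azure=False)
predictor = loader.create_predictor(model, encoders_dir="models/encoders")
```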