Data preparation for the Inference Pipeline
===========================================

This module contains functions for data ingestion and preprocessing during the inference phase.

Data Ingestion
--------------

.. automodule:: emotion_detective.data.inference.data_ingestion
   :members:
   :undoc-members:
   :show-inheritance:

- **mov_to_mp3_audio**: Extracts the audio track from a video file and saves it as an MP3 file.

  .. code-block:: python

     from emotion_detective.data.inference.data_ingestion import mov_to_mp3_audio

     # Example usage
     mov_to_mp3_audio('input_video.mp4')

Troubleshooting
~~~~~~~~~~~~~~~

- **Problem**: ``FileNotFoundError`` is raised when the input video file cannot be found.

  **Solution**: Ensure the input file path is correct and the file exists.

  .. code-block:: python

     try:
         mov_to_mp3_audio('input_video.mp4')
     except FileNotFoundError:
         print("The specified video file was not found. Please check the file path.")

Data Preprocessing
------------------

.. automodule:: emotion_detective.data.inference.data_preprocessing
   :members:
   :undoc-members:
   :show-inheritance:

- **transcribe_translate**: Transcribes and translates audio files.

  .. code-block:: python

     from emotion_detective.data.inference.data_preprocessing import transcribe_translate

     # Example usage
     transcribe_translate('audio_file.mp3')

Troubleshooting
~~~~~~~~~~~~~~~

- **Problem**: ``FileNotFoundError`` is raised when the audio file cannot be found.

  **Solution**: Ensure the input file path is correct and the file exists.

  .. code-block:: python

     try:
         transcribe_translate('audio_file.mp3')
     except FileNotFoundError:
         print("The specified audio file was not found. Please check the file path.")

- **Problem**: ``UnsupportedFormatError`` is raised when the audio file format is not supported.

  **Solution**: Convert the audio file to a supported format such as MP3.

  .. code-block:: python

     # UnsupportedFormatError must be in scope; import it from the module that defines it.
     try:
         transcribe_translate('audio_file.wav')
     except UnsupportedFormatError:
         print("The audio file format is not supported. Please convert it to MP3 and try again.")

- **dataset_loader**: Creates a PyTorch DataLoader for a given DataFrame.

  .. code-block:: python

     from emotion_detective.data.inference.data_preprocessing import dataset_loader

     # Example usage
     loader = dataset_loader(df, 'text_column', batch_size=32)

Troubleshooting
~~~~~~~~~~~~~~~

- **Problem**: ``KeyError`` is raised when the specified text column is not found in the DataFrame.

  **Solution**: Verify that the DataFrame contains the specified text column.

  .. code-block:: python

     try:
         loader = dataset_loader(df, 'text_column', batch_size=32)
     except KeyError:
         print("The specified text column was not found in the DataFrame. Please check the column name.")

- **Problem**: ``ValueError`` is raised when the batch size is invalid.

  **Solution**: Ensure that the batch size is a positive integer.

  .. code-block:: python

     try:
         loader = dataset_loader(df, 'text_column', batch_size=32)
     except ValueError:
         print("Invalid batch size. Please ensure the batch size is a positive integer.")
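
The conditions listed in the troubleshooting entries above can also be checked before the pipeline functions are called. The following is a minimal sketch and is not part of ``emotion_detective``: the helper names ``check_input_file`` and ``check_loader_args`` are hypothetical, and the default set of allowed suffixes (``.mp3`` only) is an assumption based on the MP3 recommendation above.

```python
from pathlib import Path
from typing import Iterable


def check_input_file(path: str, allowed_suffixes: Iterable[str] = (".mp3",)) -> Path:
    """Hypothetical helper: fail early if the file is missing or has an unexpected suffix."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p}")
    # Assumption: the supported formats can be recognised by file suffix.
    if p.suffix.lower() not in set(allowed_suffixes):
        raise ValueError(f"Unexpected file suffix {p.suffix!r} for {p}")
    return p


def check_loader_args(columns: Iterable[str], text_column: str, batch_size: int) -> None:
    """Hypothetical helper: validate the column name and batch size before building a loader."""
    if text_column not in list(columns):
        raise KeyError(f"Column {text_column!r} not found")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(batch_size, bool) or not isinstance(batch_size, int) or batch_size <= 0:
        raise ValueError(f"batch_size must be a positive integer, got {batch_size!r}")
```

With a pandas DataFrame, the column check could be called as ``check_loader_args(df.columns, 'text_column', 32)`` before the same arguments are passed to ``dataset_loader``.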