label_processing Package
The label_processing package contains the core image processing functionality for the Entomological Label Information Extraction system.
Package Contents
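- label_processing.config: Configuration module for entomological label information extraction.
- label_processing.detect_empty_labels: Empty Label Detection Module
- label_processing.label_detection: Label Detection Module (Detectron2 / Detecto)
- label_processing.label_rotation: Label Rotation Module (TensorFlow)
- label_processing.utils: Utility functions for the entomological label processing pipeline.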
Modules
Configuration
Configuration module for entomological label information extraction. Handles platform-specific paths and environment variables.
- class label_processing.config.PathConfig[source]
Bases: object
Centralized path configuration for cross-platform compatibility.
- get_model_path(model_type)[source]
Get path for a specific model type.
- Parameters:
model_type (str) – Type of model (‘detection’, ‘identifier’, ‘handwritten_printed’, ‘multi_single’)
- Returns:
Path to the model file
- Return type:
Path
- Raises:
ValueError – If model type is not recognized
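The lookup-or-raise behaviour described for get_model_path can be sketched as follows. This is a minimal illustration, not the package's implementation: the function name get_model_path_sketch, the models_dir parameter, and the file names in MODEL_FILES are hypothetical; only the four model type keys and the ValueError come from the documentation above.

```python
from pathlib import Path

# Hypothetical file names; the real package may store its models differently.
MODEL_FILES = {
    "detection": "detection_model.pth",
    "identifier": "identifier_model",
    "handwritten_printed": "handwritten_printed_model",
    "multi_single": "multi_single_model",
}


def get_model_path_sketch(model_type: str, models_dir: Path = Path("models")) -> Path:
    """Return the path for a known model type, or raise ValueError."""
    try:
        filename = MODEL_FILES[model_type]
    except KeyError:
        known = ", ".join(sorted(MODEL_FILES))
        raise ValueError(
            f"Unknown model type {model_type!r}; expected one of: {known}"
        ) from None
    return models_dir / filename
```

Keeping the mapping in one dictionary means an unrecognized model type fails early with a message that lists the accepted values.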
- get_class_names(model_type)[source]
Get class names for a specific model type.
- ensure_directories()[source]
Create necessary directories if they don’t exist.
- validate_paths()[source]
Validate that all required paths exist.
- get_temp_dir()[source]
Get a temporary directory for the current platform.
- Returns:
Platform-appropriate temporary directory
- Return type:
Path
- label_processing.config.get_project_root()[source]
Get the project root directory.
- Return type:
- label_processing.config.get_model_path(model_type)[source]
Get path for a specific model.
Empty Label Detection
Empty Label Detection Module
Classifies label images as empty or non-empty based on the proportion of dark pixels within a cropped region. Used as the first filtering step in the traditional pipeline.
- label_processing.detect_empty_labels.detect_dark_pixels(image, crop_box, threshold=100)[source]
Detect the proportion of dark pixels in an image.
- label_processing.detect_empty_labels.is_empty(image, crop_margin, threshold)[source]
Determines if an image is empty based on a given threshold and crop margin.
- Parameters:
image (PIL.Image.Image) – PIL Image object
crop_margin (float) – Proportion of the image size to crop from the borders
threshold (float) – Proportion of dark pixels below which the image is considered empty
- Returns:
Whether the image is empty
- Return type:
bool
- label_processing.detect_empty_labels.find_empty_labels(input_folder, output_folder, threshold=0.01, crop_margin=0.1)[source]
Find and copy empty and non-empty labels to respective folders (keeps originals in input).
- Parameters:
- Returns:
None
- Return type:
None
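The dark-pixel test behind this module can be sketched in plain Python. This is an illustration under stated assumptions: the image is represented as a list of rows of 0-255 grayscale values instead of a PIL image, and the helper names dark_pixel_proportion and is_empty_sketch are hypothetical. The default values (100, 0.01, 0.1) are taken from the signatures above.

```python
def dark_pixel_proportion(gray, crop_margin=0.1, dark_below=100):
    """Fraction of pixels darker than `dark_below` inside the cropped region.

    `gray` is a list of rows of 0-255 grayscale values (a stand-in for an image).
    """
    height, width = len(gray), len(gray[0])
    dy, dx = int(height * crop_margin), int(width * crop_margin)
    # Drop the border so that frame edges and shadows are not counted.
    cropped = [row[dx:width - dx] for row in gray[dy:height - dy]]
    total = sum(len(row) for row in cropped)
    if total == 0:
        return 0.0
    dark = sum(1 for row in cropped for value in row if value < dark_below)
    return dark / total


def is_empty_sketch(gray, crop_margin=0.1, threshold=0.01):
    """Treat the label as empty when fewer than `threshold` of its pixels are dark."""
    return dark_pixel_proportion(gray, crop_margin) < threshold
```

Cropping before counting is what keeps a dark border from turning a blank label into a false "non-empty" result.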
Label Detection
Label Detection Module (Detectron2 / Detecto)
Detects and crops individual labels from full specimen photographs using a trained Faster R-CNN object-detection model. Used by the traditional MLI pipeline; the Gemini pipeline uses gemini_processor.detect_and_classify instead.
- class label_processing.label_detection.PredictLabel(path_to_model, classes, jpg_path=None, threshold=0.8)[source]
Bases: object
Class for predicting labels using a trained object detection model.
- path_to_model
Path to the trained model file.
- Type:
- classes
List of classes used in the model.
- Type:
- jpg_path
Path to a specific JPG file for prediction.
- Type:
str|Path|None
- threshold
Threshold value for scores. Defaults to 0.8.
- Type:
- model
Trained object detection model.
- Type:
detecto.core.Model
- property jpg_path
Property for JPG path.
- Type:
str|Path|None
- retrieve_model()[source]
Retrieve the trained object detection model using Detecto’s Model.load. Includes cross-platform compatibility fixes and integrity verification.
- Return type:
detecto.core.Model
- class_prediction(jpg_path=None)[source]
Predict labels for a given JPG file.
- Parameters:
jpg_path (Path) – Path to the JPG file.
- Returns:
Pandas DataFrame with prediction results.
- Return type:
pd.DataFrame
- label_processing.label_detection.prediction_parallel(jpg_dir, predictor, n_processes)[source]
Perform predictions for all JPG files in a directory with parallel processing.
- label_processing.label_detection.clean_predictions(jpg_dir, dataframe, threshold, out_dir=None)[source]
Filter predictions based on a threshold and save the results to a CSV file.
- Parameters:
- Returns:
Pandas DataFrame with filtered results.
- Return type:
pd.DataFrame
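The filter-then-save step can be illustrated without pandas. The sketch below is an assumption-laden stand-in: it uses plain dictionaries instead of a DataFrame, the column names (filename, class, score) are hypothetical, and whether the real function keeps scores equal to the threshold is not stated in the documentation.

```python
import csv
from pathlib import Path


def filter_predictions(rows, threshold):
    """Keep only predictions whose score is at least `threshold` (assumed inclusive)."""
    return [row for row in rows if float(row["score"]) >= threshold]


def write_predictions_csv(rows, csv_path):
    """Write the filtered rows to a CSV file and return its path."""
    csv_path = Path(csv_path)
    # Hypothetical default header, used only when there are no rows to infer it from.
    fieldnames = list(rows[0]) if rows else ["filename", "class", "score"]
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return csv_path
```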
- label_processing.label_detection.crop_picture(img_raw, path, filename, **coordinates)[source]
Crop the picture using the given coordinates.
- Parameters:
img_raw (numpy.ndarray) – Input JPG converted to a numpy matrix by cv2.
path (str) – Path where the picture should be saved.
filename (str) – Name of the picture.
coordinates – Coordinates for cropping.
- Return type:
None
- label_processing.label_detection.create_crops(jpg_dir, dataframe, out_dir=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/entomological-label-information-extraction/checkouts/latest/docs'))[source]
Creates crops from the original pictures in a directory, using the CSV of model predictions.
Label Rotation
Label Rotation Module (TensorFlow)
Predicts and corrects the orientation of label images using a trained TensorFlow classification model that outputs one of four angle classes (0°, 90°, 180°, 270°). Used by the traditional pipeline; the Gemini pipeline determines rotation angles via the Gemini API instead.
- label_processing.label_rotation.load_image(image_path)[source]
Load an image from a file path.
- Parameters:
image_path (str) – Path to the image file.
- Returns:
Loaded image.
- Return type:
np.ndarray
- label_processing.label_rotation.rotate_image(image, angle)[source]
Rotate an image based on a given angle.
- Parameters:
image (np.ndarray) – Input image.
angle (int) – Angle of rotation in multiples of 90 degrees.
- Returns:
Rotated image.
- Return type:
np.ndarray
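Rotation in multiples of 90 degrees needs no interpolation, only a rearrangement of pixels. The sketch below shows the idea on a plain list of rows instead of a NumPy array. The function name rotate_grid_90 is hypothetical, and the counter-clockwise direction is an assumption; the documentation does not state which direction rotate_image uses.

```python
def rotate_grid_90(grid, quarter_turns):
    """Rotate a row-major grid counter-clockwise by `quarter_turns` * 90 degrees."""
    rotated = [list(row) for row in grid]
    for _ in range(quarter_turns % 4):
        # One counter-clockwise quarter turn: transpose, then reverse the row order.
        rotated = [list(row) for row in zip(*rotated)][::-1]
    return rotated
```

For example, one quarter turn maps [[1, 2], [3, 4]] to [[2, 4], [1, 3]], and four quarter turns return the original grid.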
- label_processing.label_rotation.save_image(image, output_path)[source]
Save an image to a file path.
- label_processing.label_rotation.rotate_single_image(image_path, angle, output_dir)[source]
Rotate a single image based on a given angle and save the rotated image.
- label_processing.label_rotation.get_image_paths(input_image_dir)[source]
Get a list of image paths in the input directory.
- label_processing.label_rotation.load_images(image_paths)[source]
Load images from a list of image paths.
- Parameters:
image_paths (list) – List of image paths.
- Returns:
Loaded images.
- Return type:
np.ndarray
- label_processing.label_rotation.get_predicted_angles(model, images)[source]
Predict angles for a list of images using a trained model.
- Parameters:
model (tf.keras.Model) – Trained model.
images (np.ndarray) – List of images.
- Returns:
List of predicted angles.
- Return type:
- label_processing.label_rotation.rotate_images(image_paths, predicted_angles, output_image_dir)[source]
Rotate images based on their predicted angles and save them to the output directory.
- label_processing.label_rotation.debug_save_by_angle(image_paths, predicted_angles, output_base_dir)[source]
Copy images into angle-named subdirectories for visual debugging.
- label_processing.label_rotation.predict_angles(input_image_dir, output_image_dir, model_path, debug=False)[source]
Load a trained model, predict angles for input images, and rotate images accordingly.
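A four-class orientation model typically returns one probability per class for each image, and the predicted angle is the class with the highest probability. The sketch below shows that mapping on plain lists. The class order in ANGLE_CLASSES and the function name are assumptions; the trained model's actual label order is not documented here.

```python
# Assumed class order; the trained model may order its outputs differently.
ANGLE_CLASSES = (0, 90, 180, 270)


def angles_from_probabilities(probabilities):
    """Map each row of four class probabilities to its most likely angle."""
    angles = []
    for row in probabilities:
        # Index of the largest probability (argmax) without requiring NumPy.
        best_index = max(range(len(row)), key=row.__getitem__)
        angles.append(ANGLE_CLASSES[best_index])
    return angles
```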
- label_processing.label_rotation.rotate_image_pil(image_path, angle_deg, output_path)[source]
Rotate an image using PIL and save the result.
OCR Vision
- class label_processing.ocr_vision.VisionApi(path, image, credentials, encoding)[source]
Bases: object
Class for interacting with the Google Cloud Vision API for OCR tasks on images.
- static read_image(path, credentials, encoding='utf8')[source]
Read an image file and return an instance of the VisionApi class.
- process_string(result_raw)[source]
Process the Google Vision OCR output, replacing newlines with spaces and encoding as specified.
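The post-processing described for process_string can be approximated as below. This is a sketch under assumptions: the function name flatten_ocr_text is hypothetical, and dropping characters that the target encoding cannot represent is one possible reading of "encoding as specified", not confirmed behaviour.

```python
def flatten_ocr_text(raw_text, encoding="utf8"):
    """Replace newlines with spaces and round-trip the text through `encoding`."""
    single_line = raw_text.replace("\r\n", " ").replace("\n", " ").strip()
    # Assumption: characters that cannot be encoded are silently dropped.
    return single_line.encode(encoding, errors="ignore").decode(encoding)
```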
TensorFlow Classifier
- label_processing.tensorflow_classifier.get_model(path_to_model)[source]
Load a trained Keras Sequential image classifier model with cross-platform compatibility.
- Parameters:
path_to_model (str) – Path to the model file.
- Returns:
Trained Keras Sequential image classifier model.
- Return type:
model (tf.keras.Sequential)
- label_processing.tensorflow_classifier.class_prediction(model, class_names, jpg_dir, out_dir=None, batch_size=32, max_images=10000)[source]
Create a dataframe with predicted classes for each picture with memory-safe batch processing.
- Parameters:
model (tf.keras.Sequential) – Trained Keras Sequential image classifier model.
class_names (list) – Model’s predicted classes.
jpg_dir (str) – Path to the directory containing the original jpgs.
out_dir (str) – Path where the CSV file will be stored.
batch_size (int) – Number of images to process in each batch (default: 32)
max_images (int) – Maximum number of images to process (default: 10000)
- Returns:
Pandas DataFrame with the predicted results.
- Return type:
DataFrame (pd.DataFrame)
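Memory-safe batch processing usually means never holding more than batch_size images at once and stopping after max_images. A minimal sketch of that iteration pattern follows. The generator name iter_batches is hypothetical, and the real function may enforce its limits differently.

```python
from itertools import islice


def iter_batches(paths, batch_size=32, max_images=10000):
    """Yield lists of at most `batch_size` items, stopping after `max_images` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    limited = islice(iter(paths), max_images)
    while True:
        batch = list(islice(limited, batch_size))
        if not batch:
            return
        yield batch
```

Each batch can then be loaded, passed to the model, and released before the next one is read, so peak memory depends on batch_size and not on the size of the directory.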
- label_processing.tensorflow_classifier.create_dirs(dataframe, path)[source]
Create separate directories for every class.
- Parameters:
dataframe (pd.DataFrame) – DataFrame containing the classes as a column.
path (str) – Path of the chosen directory.
- Return type:
None
- label_processing.tensorflow_classifier.make_file_name(label_id, pic_class)[source]
Create a fitting filename.
- label_processing.tensorflow_classifier.rename_picture(img_raw, path, filename, pic_class)[source]
Rename the pictures using the predicted class.
- Parameters:
img_raw (numpy.ndarray) – Input jpg converted to a numpy matrix by cv2.
path (str) – Path where the picture should be saved.
filename (str) – Name of the picture.
pic_class (str) – Class of the label.
- Return type:
None
- label_processing.tensorflow_classifier.filter_pictures(jpg_dir, dataframe, out_dir=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/entomological-label-information-extraction/checkouts/latest/docs'))[source]
Create a new folder for each class and save the renamed, classified pictures there.
- Parameters:
jpg_dir (str) – Path to directory with jpgs.
dataframe (pd.DataFrame) – Pandas DataFrame with class predictions.
out_dir (Path) – Path to the target directory to save the cropped jpgs.
- Return type:
None
Text Recognition
- label_processing.text_recognition.find_tesseract()[source]
Searches for the tesseract executable and raises an error if it is not found.
- Return type:
None
- class label_processing.text_recognition.ImageProcessor(image, path, blocksize=None, c_value=None)[source]
Bases: object
A class for image preprocessing and other image actions.
- property blocksize: int
- property c_value: int
- property image: ndarray
- property path: str
- copy_this()[source]
Creates a copy of the current ImageProcessor instance.
- Returns:
A copy of the current ImageProcessor instance.
- Return type:
ImageProcessor
- static read_image(path)[source]
Read an image from the specified path and return an ImageProcessor instance.
- Parameters:
path (str) – The path to a JPG file.
- Returns:
An instance of the ImageProcessor class.
- Return type:
ImageProcessor
- get_grayscale()[source]
Convert the image to grayscale.
- Returns:
An instance representing the grayscale image.
- Return type:
ImageProcessor
- blur(ksize=(5, 5))[source]
Apply Gaussian blur to the image.
- remove_noise()[source]
Remove noise from the image using median blur.
- Returns:
An instance representing the noise-reduced image.
- Return type:
ImageProcessor
- apply_clahe(clip_limit=2.0, tile_grid_size=(8, 8))[source]
Apply Contrast Limited Adaptive Histogram Equalization (CLAHE).
CLAHE improves contrast in images with uneven illumination or low contrast, which is common in aged specimen labels or images with inconsistent lighting.
- Parameters:
- Returns:
An ImageProcessor instance with CLAHE applied.
- Return type:
ImageProcessor
- normalize_illumination()[source]
Normalize image illumination using morphological operations.
This method corrects uneven lighting by estimating and removing the background illumination, useful for images with shadows or uneven flash lighting.
- Returns:
An ImageProcessor instance with normalized illumination.
- Return type:
ImageProcessor
- thresholding(thresh_mode)[source]
Perform thresholding on the image.
- Parameters:
thresh_mode (Threshmode) – The thresholding mode to use (OTSU, ADAPTIVE_MEAN, or ADAPTIVE_GAUSSIAN).
- Returns:
An instance representing the thresholded image.
- Return type:
ImageProcessor
- dilate()[source]
Dilate the image using a 5x5 kernel.
- Returns:
An instance representing the dilated image.
- Return type:
ImageProcessor
- erode()[source]
Erode the image using a 5x5 kernel.
- Returns:
An instance representing the eroded image.
- Return type:
ImageProcessor
- get_skew_angle()[source]
Calculate and return the skew angle of the image.
- Returns:
The skew angle in degrees or None if it couldn’t be determined.
- Return type:
Optional[np.float64]
- deskew(angle)[source]
Rotate the image to deskew it.
- Parameters:
angle (Optional[np.float64]) – The skew angle to use for deskewing.
- Returns:
An instance representing the deskewed image.
- Return type:
ImageProcessor
- preprocessing(thresh_mode, use_clahe=False, normalize_illum=False, clahe_clip_limit=2.0, clahe_tile_grid_size=(8, 8))[source]
Perform a series of preprocessing steps on the image.
- Parameters:
thresh_mode (Threshmode) – The thresholding mode to use (OTSU, ADAPTIVE_MEAN, or ADAPTIVE_GAUSSIAN).
use_clahe (bool, optional) – Apply CLAHE for contrast enhancement. Useful for low-contrast or faded labels. Defaults to False.
normalize_illum (bool, optional) – Apply illumination normalization to correct uneven lighting. Useful for images with shadows or hotspots. Defaults to False.
clahe_clip_limit (float, optional) – CLAHE contrast limiting threshold. Defaults to 2.0.
clahe_tile_grid_size (tuple[int, int], optional) – CLAHE grid size. Defaults to (8, 8).
- Returns:
An ImageProcessor instance representing the preprocessed image.
- Return type:
ImageProcessor
- read_qr_code()[source]
Tries to identify if a picture has a QR-code and then reads and returns it.
- Returns:
Decoded QR-code text as a str or None if there is no QR-code found.
- Return type:
Optional[str]
- save_image(dir_path, appendix=None)[source]
Save the image to a specified directory with an optional appendix.
- class label_processing.text_recognition.Threshmode(value)[source]
Bases: Enum
Different possibilities for thresholding.
- Parameters:
Enum (int)
- OTSU = 1
- ADAPTIVE_MEAN = 2
- ADAPTIVE_GAUSSIAN = 3
OCR preprocessing summary
The text_recognition.ImageProcessor applies, prior to Tesseract OCR:
- grayscale conversion
- Gaussian/median denoising
- binarization via Otsu or adaptive mean/Gaussian (block size/C configurable)
- skew estimation within ±10° and deskewing
- optional morphological cleaning (dilation/erosion)
Google Vision OCR is invoked on the rotated ROI without thresholding; word-level bounding boxes are captured via ocr_vision.
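To make the binarization step listed above concrete, here is a small pure-Python version of Otsu's method, which picks the threshold that maximizes the variance between the dark and bright pixel groups. It is an illustration only: the documentation does not show how the package performs thresholding internally, and the function names otsu_threshold and binarize are hypothetical. The input is a flat list of 0-255 grayscale values.

```python
def otsu_threshold(pixels):
    """Return the 0-255 threshold that maximizes between-class variance."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += histogram[level]
        if weight_bg == 0:
            continue  # no pixels at or below this level yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += level * histogram[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(pixels, threshold):
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [255 if value > threshold else 0 for value in pixels]
```

Otsu's method uses one global threshold, which suits evenly lit labels; the adaptive modes compute a local threshold per neighbourhood and tend to cope better with shadows.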
Utilities
Utility functions for the entomological label processing pipeline.
Provides image validation, filename generation, JSON/CSV I/O, NURI format checking, and model integrity verification helpers used across all pipeline variants.
- label_processing.utils.validate_image_integrity(filepath, max_size_mb=25, max_dimensions=(8000, 8000))[source]
Validate image file integrity with strict memory safety limits.
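A cheap first pass for this kind of validation is to check the file size and the JPEG signature before decoding anything. The sketch below covers only those two checks and is not the package's implementation: the function name looks_like_valid_jpeg is hypothetical, and the dimension limit from the signature above is omitted because reading dimensions requires an image library.

```python
import os

# Every JPEG file starts with the start-of-image marker FF D8, followed by FF.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_valid_jpeg(filepath, max_size_mb=25):
    """Return True if the file is non-empty, within the size limit, and starts like a JPEG."""
    size_bytes = os.path.getsize(filepath)
    if size_bytes == 0 or size_bytes > max_size_mb * 1024 * 1024:
        return False
    with open(filepath, "rb") as handle:
        return handle.read(3) == JPEG_MAGIC
```

Rejecting oversized files before opening them with an image decoder is what provides the memory safety: the expensive decode never runs on an input that is already out of bounds.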
- label_processing.utils.check_dir(directory)[source]
Checks if the directory contains valid jpg files with integrity validation.
- Parameters:
directory (str) – path to directory
- Raises:
FileNotFoundError – raised if no valid jpg files are found in the directory
ValueError – raised if corrupted image files are detected
- Return type:
None
- label_processing.utils.generate_filename(original_path, appendix, extension=None)[source]
Takes the path to a file or directory and returns it with the appendix added to the end.
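One plausible reading of this behaviour is sketched below with pathlib. It rests on assumptions: the helper name with_appendix is hypothetical, and the sketch returns only the new file name, keeping the original extension unless another one is given. The real function may return a full path or treat directories differently.

```python
from pathlib import Path


def with_appendix(original_path, appendix, extension=None):
    """Build a file name from the path's stem, the appendix, and an extension."""
    path = Path(original_path)
    suffix = extension if extension is not None else path.suffix
    # Accept the extension with or without its leading dot.
    if suffix and not suffix.startswith("."):
        suffix = "." + suffix
    return f"{path.stem}{appendix}{suffix}"
```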
- label_processing.utils.save_json(data, filename, path)[source]
Saves a JSON file in human-readable format.
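Human-readable JSON usually means indented output with non-ASCII characters left intact. A minimal sketch with the standard library follows. The function name save_json_sketch, the indent width, and creating the target directory when it is missing are assumptions, not documented behaviour.

```python
import json
from pathlib import Path


def save_json_sketch(data, filename, path):
    """Write `data` as indented UTF-8 JSON to `path`/`filename` and return the file path."""
    target = Path(path) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps characters such as umlauts readable in the file.
        json.dump(data, handle, indent=4, ensure_ascii=False)
    return target
```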
- label_processing.utils.check_nuri_format(transcript)[source]
Check NURI’s format in OCR transcription “text”.
- label_processing.utils.replace_nuri(transcript)[source]
Correct NURI format in OCR transcription JSON output.
- label_processing.utils.load_dataframe(filepath_csv)[source]
Loads the CSV file using Pandas.
- Parameters:
filepath_csv (str) – path to the CSV file
- Returns:
The CSV as a Pandas DataFrame
- Return type:
pd.DataFrame
- label_processing.utils.load_jpg(filepath)[source]
Loads the jpg files using the OpenCV module.
- Parameters:
filepath (str) – path to jpg files
- Returns:
OpenCV image object
- Return type:
np.ndarray
- label_processing.utils.load_json(file)[source]
Load JSON data from a file and deserialize it.
- label_processing.utils.read_vocabulary(file)[source]
Read a CSV file containing vocabulary and convert it to a dictionary.
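Reading a vocabulary CSV into a dictionary can be sketched as follows. The file layout is an assumption: two columns per row, with the key in the first column and the value in the second, and no header row. The function name read_vocabulary_sketch is hypothetical.

```python
import csv


def read_vocabulary_sketch(file):
    """Read a two-column CSV into a dict, skipping rows with fewer than two cells."""
    with open(file, newline="", encoding="utf-8") as handle:
        return {
            row[0].strip(): row[1].strip()
            for row in csv.reader(handle)
            if len(row) >= 2
        }
```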
- label_processing.utils.verify_model_integrity(model_path, checksums_file=None, require_checksum=True)[source]
SECURITY: Mandatory model file integrity verification using SHA256 checksums.
- Parameters:
- Returns:
True if model integrity is verified, False otherwise
- Return type:
- Raises:
SecurityError – If model integrity cannot be verified and require_checksum=True
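Checksum verification of this kind can be sketched with hashlib. The sketch is not the package's implementation: it takes the expected checksums as a dictionary keyed by file name instead of a checksums file, it defines its own SecurityError class for illustration, and the function names are hypothetical. The return and raise behaviour follows the description above.

```python
import hashlib
from pathlib import Path


class SecurityError(Exception):
    """Raised when a model file cannot be verified (illustrative stand-in)."""


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(model_path, checksums, require_checksum=True):
    """Return True if the file matches its expected checksum.

    `checksums` maps file names to expected SHA256 hex digests (assumed format).
    """
    expected = checksums.get(Path(model_path).name)
    verified = expected is not None and sha256_of_file(model_path) == expected.lower()
    if not verified and require_checksum:
        raise SecurityError(f"Integrity check failed for {model_path}")
    return verified
```

Verifying before loading matters because serialized model files can execute code when they are deserialized, so a tampered file should be rejected before any loader touches it.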