scripts.processing.gemini_ocr
Gemini OCR / HTR Script
Performs text recognition on label images using the Gemini API. Unlike Tesseract and Google Vision, which handle only printed text, Gemini can process printed, handwritten, and mixed labels.
The output format matches the existing pipeline: a JSON list of {ID, text, confidence} objects.
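As a rough illustration of that output shape, the sketch below builds one record and serializes a list of records to JSON. Only the three keys (ID, text, confidence) come from the description above; the helper names `make_record` and `dump_records`, the use of a file name as the ID, and the sample values are assumptions made for this example, not the script's actual code.

```python
import json


def make_record(image_id, text, confidence):
    """Build one result entry with the three keys the pipeline expects."""
    # Hypothetical helper: the real script may construct records differently.
    return {"ID": image_id, "text": text, "confidence": float(confidence)}


def dump_records(records):
    """Serialize a list of records as a JSON array."""
    # ensure_ascii=False keeps non-ASCII characters from labels readable.
    return json.dumps(records, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    records = [make_record("label_001.jpg", "Example label text", 1.0)]
    print(dump_records(records))
```

How the confidence value is derived is not stated here, so the example passes in a placeholder number.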
- Usage:
python gemini_ocr.py -d <image_dir> -o <output_dir>
python gemini_ocr.py -d <output_dir> -o <output_dir> --categories printed handwritten mixed
Functions
Parse command-line arguments.
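A minimal sketch of how such argument parsing might look with `argparse`, assuming the `-d`, `-o`, and `--categories` flags shown in the usage lines. The function name `build_parser`, the destination names `image_dir` and `output_dir`, the help strings, and the choice to restrict `--categories` to the three listed values are assumptions for illustration; the script's own parser may differ.

```python
import argparse


def build_parser():
    """Return a parser for the flags shown in the usage lines (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Run Gemini OCR / HTR on a directory of label images."
    )
    # -d and -o appear in the usage text; dest names are assumed.
    parser.add_argument("-d", dest="image_dir", required=True,
                        help="directory containing the input images")
    parser.add_argument("-o", dest="output_dir", required=True,
                        help="directory where the JSON output is written")
    # --categories takes one or more values; limiting them to these three is an assumption.
    parser.add_argument("--categories", nargs="+",
                        choices=["printed", "handwritten", "mixed"],
                        default=None,
                        help="optional label categories to process")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.image_dir, args.output_dir, args.categories)
```

Leaving `--categories` unset yields `None` in this sketch, which a caller could treat as "no category filtering".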