Overview

MyOCR provides a flexible framework for building and deploying your own OCR pipelines. Designed with production readiness and developer experience in mind, it offers a high-level component architecture for easy integration and extension.

Core Components

MyOCR is built around several key concepts:

[Figure: MyOCR Components]

  • Model: Represents a neural network model. MyOCR supports loading ONNX models (OrtModel), standard PyTorch models (PyTorchModel), and custom PyTorch models defined by YAML configurations (CustomModel). Models handle the core computation.
  • Processor (CompositeProcessor): Prepares input data for a model and processes the model's raw output into a more usable format. Each predictor uses a specific processor.
  • Predictor: Combines a Model and a Processor to perform a specific inference task (e.g., text detection). It provides a user-friendly interface, accepting standard inputs (like PIL Images) and returning processed results (like bounding boxes).
  • Pipeline: Orchestrates multiple Predictors to perform complex, multi-step tasks like end-to-end OCR. Pipelines offer the highest-level interface for most common use cases.
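
The layering described above can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not MyOCR's actual API: the method names (`forward`, `preprocess`, `postprocess`, `predict`) and the stand-in classes (`SplitModel`, `UpperModel`, `IdentityProcessor`, `JoinProcessor`) are hypothetical, and a string stands in for an image so the example runs without any model files.

```python
from dataclasses import dataclass
from typing import Any, List, Protocol


class Model(Protocol):
    """Hypothetical model interface: performs the core computation."""
    def forward(self, inputs: Any) -> Any: ...


class Processor(Protocol):
    """Hypothetical processor interface: prepares inputs, cleans up raw outputs."""
    def preprocess(self, raw: Any) -> Any: ...
    def postprocess(self, output: Any) -> Any: ...


@dataclass
class Predictor:
    """Combines one model with one processor for a single inference task."""
    model: Model
    processor: Processor

    def predict(self, raw: Any) -> Any:
        inputs = self.processor.preprocess(raw)
        output = self.model.forward(inputs)
        return self.processor.postprocess(output)


@dataclass
class Pipeline:
    """Chains predictors: each predictor consumes the previous one's result."""
    predictors: List[Predictor]

    def __call__(self, raw: Any) -> Any:
        result = raw
        for predictor in self.predictors:
            result = predictor.predict(result)
        return result


# --- Toy stand-ins (assumptions for illustration only) ---
class SplitModel:
    """Stand-in 'detector': splits a string into whitespace-separated regions."""
    def forward(self, inputs: str) -> List[str]:
        return inputs.split()


class UpperModel:
    """Stand-in 'recognizer': upper-cases each region."""
    def forward(self, inputs: List[str]) -> List[str]:
        return [region.upper() for region in inputs]


class IdentityProcessor:
    """Passes data through unchanged in both directions."""
    def preprocess(self, raw: Any) -> Any:
        return raw

    def postprocess(self, output: Any) -> Any:
        return output


class JoinProcessor:
    """Copies the regions into a list, then joins the recognized text."""
    def preprocess(self, raw: Any) -> List[str]:
        return list(raw)

    def postprocess(self, output: List[str]) -> str:
        return " ".join(output)


detector = Predictor(SplitModel(), IdentityProcessor())
recognizer = Predictor(UpperModel(), JoinProcessor())
ocr = Pipeline([detector, recognizer])

result = ocr("hello  world")
print(result)
```

The point of the sketch is the separation of concerns: a model or a processor can be swapped independently, and the pipeline only depends on each predictor's `predict` method.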

Class Diagram

[Figure: MyOCR class diagram]

Customization and Extension

MyOCR's modular design allows for easy customization:
