Train a High-Quality Model

This section introduces the factors that most affect model quality and explains how to train a high-quality text detection model.

Ensure Image Quality

Avoid overexposure, underexposure, blur, and occlusion. These conditions can cause the loss of features that the deep learning model relies on, which degrades model training.

Poor images: The text areas are partially occluded, which may affect the detection result and the subsequent recognition result.

Suggestion: Make sure that the text areas are clear and complete.

Poor images: overexposure or underexposure.

Suggestion: Avoid overexposure by shading the scene, and avoid underexposure by adding supplemental lighting.
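
The exposure problems described above can be screened for before training. The following is a minimal sketch, not part of the Text Detection module: it assumes the image is already available as a flat sequence of 8-bit grayscale values, and the function name and all thresholds (dark_level, bright_level, max_fraction) are illustrative assumptions that would need tuning for a real setup.

```python
from typing import Sequence


def exposure_report(pixels: Sequence[int],
                    dark_level: int = 30,
                    bright_level: int = 225,
                    max_fraction: float = 0.5) -> str:
    """Classify a grayscale image as 'overexposed', 'underexposed', or 'ok'.

    pixels: flat sequence of 8-bit grayscale values (0-255).
    All thresholds are illustrative assumptions, not product defaults.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    total = len(pixels)
    # Fraction of pixels that are nearly black or nearly white.
    dark = sum(1 for p in pixels if p <= dark_level) / total
    bright = sum(1 for p in pixels if p >= bright_level) / total
    if bright > max_fraction:
        return "overexposed"
    if dark > max_fraction:
        return "underexposed"
    return "ok"
```

An image flagged by such a check is a candidate for re-capture with shading or supplemental lighting rather than for inclusion in the training set.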

Ensure Data Quality

The Text Detection module detects text areas in an image. In practical applications, the text to be detected usually occupies only a small part of the image, so keep the camera and lighting conditions stable once they are adjusted during data collection. This avoids data deviations and helps ensure model performance.

Data Selection

Select Images with Different Text Orientations

Select diverse images. In practical applications, the text in images may be horizontal (0° and 180°), vertical (90°), or tilted (other angles). If only images with horizontal or vertical text are used for training, the trained model can hardly detect tilted text.

If a situation is not represented in the dataset, the model cannot learn it adequately, and the trained model will not detect text areas effectively under those conditions. In this case, collect data that covers such conditions and add it to the training set to reduce errors.
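
One way to check whether a dataset covers the orientations described above is to count labeled text areas per orientation. The sketch below is an illustration under stated assumptions: each text area is represented by two hypothetical points (the start and end of its reading-direction edge), 270° is treated as vertical alongside 90°, and the 10° tolerance is arbitrary.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

Point = Tuple[float, float]


def orientation_bucket(p0: Point, p1: Point, tolerance_deg: float = 10.0) -> str:
    """Return 'horizontal', 'vertical', or 'tilted' for the edge p0 -> p1."""
    # Angle of the edge in degrees, normalized to [0, 360).
    angle = math.degrees(math.atan2(p1[1] - p0[1], p1[0] - p0[0])) % 360.0

    def near(target: float) -> bool:
        # Circular distance between the edge angle and the target angle.
        diff = abs(angle - target) % 360.0
        return min(diff, 360.0 - diff) <= tolerance_deg

    if near(0.0) or near(180.0):
        return "horizontal"
    if near(90.0) or near(270.0):  # assumption: 270 degrees counts as vertical
        return "vertical"
    return "tilted"


def orientation_counts(edges: Iterable[Tuple[Point, Point]]) -> Counter:
    """Count how many labeled text areas fall into each orientation bucket."""
    return Counter(orientation_bucket(p0, p1) for p0, p1 in edges)
```

A bucket with a count of zero, or one far smaller than the others, points to the kind of missing situation that should be collected and added.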

Select Appropriate Data

Control the number of images in the training set

When you first use the Text Detection module, it is recommended to use 30 to 50 images as the training set. If the validation results are not up to standard, add more data.

Balance data proportion

In the training set, images with different text orientations should be balanced. Using diverse data helps improve model performance.

Reduce inadequate data

High-quality images lead to better model training. Poor or irrelevant images can harm model training.

More images are not always better. Adding a large number of inadequate images in the early stage does not help improve the model later and slows down the training process.
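
The three points above can be combined when assembling a first training set. The following sketch assumes the images have already been screened for quality and grouped by text orientation; the function name, the group names, and the round-robin strategy are hypothetical choices for illustration, and the default of 40 is simply a value inside the recommended range of 30 to 50.

```python
import random
from typing import Dict, List


def pick_balanced(groups: Dict[str, List[str]],
                  target: int = 40,
                  seed: int = 0) -> List[str]:
    """Pick up to `target` images, cycling through the groups in turn.

    groups: maps an orientation name to a list of image file names.
    """
    rng = random.Random(seed)
    # Shuffled copy of every group so the input lists stay untouched.
    pools = {name: rng.sample(items, len(items)) for name, items in groups.items()}
    selected: List[str] = []
    # Take one image per group per round until the target is reached
    # or every group is exhausted.
    while len(selected) < target and any(pools.values()):
        for name in sorted(pools):
            if pools[name] and len(selected) < target:
                selected.append(pools[name].pop())
    return selected
```

If one group runs out early, the remaining slots are filled from the other groups, which signals that more images of the scarce orientation should be collected before the set is enlarged.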

Ensure Labeling Quality

Ensure accuracy. During labeling, place the selection frame as close to the edges of the text area as possible to minimize interference. Avoid incomplete selections and overly large selection frames.

Bad labeling

Good labeling
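
The difference between an incomplete selection and an overly large one can be expressed numerically. The sketch below is a simplified illustration, not the module's own check: it assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max), a hypothetical tight reference box for the same text area (for example, from a careful re-annotation), and arbitrary thresholds.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; zero if the box is empty or inverted."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two axis-aligned boxes."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return box_area(overlap)


def label_fit(label: Box, reference: Box,
              min_coverage: float = 0.95,
              max_excess: float = 0.2) -> str:
    """Return 'incomplete', 'too large', or 'good' for a labeled box.

    Thresholds are illustrative assumptions.
    """
    label_area = box_area(label)
    reference_area = box_area(reference)
    if label_area == 0 or reference_area == 0:
        raise ValueError("boxes must have a positive area")
    inter = intersection_area(label, reference)
    coverage = inter / reference_area          # share of the text that is selected
    excess = (label_area - inter) / label_area  # share of the label that is background
    if coverage < min_coverage:
        return "incomplete"
    if excess > max_excess:
        return "too large"
    return "good"
```

Low coverage corresponds to an incomplete selection, while high excess corresponds to a frame that includes too much background.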
