Collect the Training Data

Attention

Collecting data is one of the most critical parts of a deep learning project. The final performance of the model largely depends on the quality of the training data. A high-quality dataset is a prerequisite for effective model training and accurate prediction.

Check the Data Collection Environment

  1. Please avoid overexposure, underexposure, color distortion, blurriness, and occlusion during image capture. These conditions destroy the features that the deep learning model relies on and therefore degrade the model’s performance.

    ../../../_images/1_overexposed1.png
    ../../../_images/2_darker_lighting.png
    ../../../_images/3_color_distortion.png
    ../../../_images/4_obscure.png
    ../../../_images/5_occluded.png

    Figure 1. Examples of data collection environment conditions

  2. Please ensure that the backgrounds, perspectives, and camera-to-object distances used for data collection are consistent with those of the actual application scenario. Any inconsistency will reduce the performance of the model in the actual application. In severe cases, the data must be recollected and the model retrained. Therefore, please confirm the detailed conditions of the actual application scenario before collecting data.

    ../../../_images/6_background_inconsistent.png
    ../../../_images/7_field_mismatch.png
    ../../../_images/8_height_mismatch.png

    Figure 2. Inconsistencies between data collection environment and application scenario
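Some of the conditions in item 1 can be screened for automatically before images are added to the dataset. The following is a minimal sketch, not part of Mech-Vision: it assumes an 8-bit grayscale image given as a list of pixel rows (at least 3 x 3), and the function names and all thresholds (`low`, `high`, `max_fraction`) are hypothetical values that would need tuning for a real camera setup.

```python
from statistics import pvariance


def exposure_report(gray, low=10, high=245, max_fraction=0.25):
    """Flag an image whose pixels pile up at the dark or bright end of 0-255.

    The thresholds are illustrative assumptions, not recommended values.
    """
    pixels = [p for row in gray for p in row]
    dark_fraction = sum(p <= low for p in pixels) / len(pixels)
    bright_fraction = sum(p >= high for p in pixels) / len(pixels)
    return {
        "dark_fraction": dark_fraction,
        "bright_fraction": bright_fraction,
        "underexposed": dark_fraction > max_fraction,
        "overexposed": bright_fraction > max_fraction,
    }


def sharpness_score(gray):
    """Variance of a 4-neighbour Laplacian over the interior pixels.

    Lower values suggest a blurrier image; scores are only comparable
    between images of the same scene and resolution.
    """
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return float(pvariance(responses))


# Usage: a uniformly white image is reported as overexposed and has no edges.
white = [[255] * 8 for _ in range(8)]
print(exposure_report(white)["overexposed"])
print(sharpness_score(white))
```

Such checks only catch gross exposure and focus problems. Color distortion and occlusion still need to be checked visually against the examples in Figure 1.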

Quantity of Data to Collect

  • If there is only one object class, please collect around 50 images.

  • If there are multiple object classes, please collect around 30 images for each class. Total number of images to collect = 30 * number of classes.

  • The above is a general guideline for the quantity of data to collect. Typical industrial applications have more specific requirements. Please see Data Collection Examples from Past Projects for examples.

Attention

If the training dataset is too small, the model will not have enough samples to be trained effectively, and the test error rate will be high. If the training dataset is too large, the training time will increase significantly. Please make sure the size of the dataset is appropriate for actual needs.
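The general guideline above can be written as a small helper. This is only a sketch of the rule of thumb stated in this section; the function name and parameter names are hypothetical, and the defaults are the approximate figures given above.

```python
def recommended_image_count(num_classes, single_class_images=50, per_class_images=30):
    """Approximate number of images to collect under the general guideline."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    if num_classes == 1:
        # One object class: around 50 images in total.
        return single_class_images
    # Multiple object classes: around 30 images for each class.
    return per_class_images * num_classes


print(recommended_image_count(1))  # 50
print(recommended_image_count(2))  # 60
print(recommended_image_count(7))  # 210
```

Treat the result as a starting point. Project-specific conditions, such as those in the examples from past projects, can call for different numbers.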

Object Placing for Data Collection

All different placing conditions should be included in the dataset, and the number of images for each placing condition should be reasonably allocated based on the actual project conditions.

For example, if objects arrive in both horizontal and vertical poses in the actual application, but only images of horizontally placed objects are collected and used for training, then the resulting model’s performance on vertically placed objects cannot be guaranteed.

Similarly, if objects overlap in the actual application, but only images of separately placed objects are collected and used for training, then the resulting model’s performance on overlapping objects cannot be guaranteed.

Therefore, when collecting data, please take all circumstances in the actual application into consideration as much as possible. Factors include the following:

  • All object orientations that may appear in the actual application;

  • All object positions that may appear in the actual application;

  • All spatial relationships between objects that may appear in the actual application.

Attention

If some circumstances are omitted during data collection, the deep learning model will not be trained properly for them and will fail to output satisfactory results when they occur. In this case, please add data covering the omitted circumstances to the training dataset to avoid errors.

  1. Object orientation

    ../../../_images/9_different_towards.png

    Figure 3. Objects with different sides facing up

  2. Object position

    ../../../_images/10_different_situations.png

    Figure 4. Objects are in the center, along the edges, or in the corners of the bin

    ../../../_images/11_different_layers.png

    Figure 5. Objects are on different layers

  3. Spatial relationship between objects

    ../../../_images/12_different_positions.png

    Figure 6. Objects are separately placed or overlapping

    ../../../_images/13_different_positions.png

    Figure 7. Objects are closely fitted
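One way to make sure no circumstance is omitted is to list every combination of the three factors before collecting data. The sketch below enumerates the combinations and spreads an image budget evenly across them. The function name and the example factor values are hypothetical, and the even split is an assumption made for illustration: in a real project, drop combinations that cannot physically occur and weight the rest according to the actual project conditions.

```python
from itertools import product


def placement_plan(orientations, positions, relationships, total_images):
    """Spread total_images as evenly as possible over every placing combination."""
    combos = list(product(orientations, positions, relationships))
    if not combos:
        raise ValueError("each factor needs at least one value")
    base, extra = divmod(total_images, len(combos))
    # The first `extra` combinations receive one additional image each,
    # so the per-combination counts always add up to total_images.
    return {
        combo: base + (1 if index < extra else 0)
        for index, combo in enumerate(combos)
    }


# Hypothetical factor values for a single-class project with 50 images.
plan = placement_plan(
    orientations=["lying flat", "standing on side"],
    positions=["center", "edge", "corner"],
    relationships=["separate", "overlapping"],
    total_images=50,
)
print(len(plan))           # 12
print(sum(plan.values()))  # 50
for combo, count in plan.items():
    print(combo, count)
```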

Use Mech-Vision to Collect Data

After checking the data collection environment, determining the quantity of data to collect, and listing all possible object placements, please use the following Steps in Mech-Vision to collect the image data. See Capture Images From Camera for detailed instructions.

../../../_images/step_combination.png

Figure 8. Data collection Steps in Mech-Vision
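After collection, it can help to verify that each placing condition or class has received enough images. The sketch below is independent of Mech-Vision and assumes a hypothetical folder layout in which the collected images have been sorted into one subfolder per condition or class. The function names, the folder layout, and the list of image suffixes are all assumptions.

```python
from pathlib import Path

# Assumed set of image file extensions; adjust to the formats actually saved.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images(dataset_dir):
    """Return {subfolder name: number of image files directly inside it}."""
    counts = {}
    for sub in sorted(Path(dataset_dir).iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(
                1
                for f in sub.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def shortfalls(counts, target_per_folder):
    """Return {subfolder name: images still missing} for folders below target."""
    return {
        name: target_per_folder - n
        for name, n in counts.items()
        if n < target_per_folder
    }
```

For example, `shortfalls(count_images("dataset"), 30)` would list the subfolders of a hypothetical `dataset` directory that still contain fewer than 30 images.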

Data Collection Examples from Past Projects

Metal workpiece, single class

  • Data quantity: 50 images for single-class objects.

  • Orientation: The objects may lie flat or stand on their sides, both of which need to be considered.

  • Position: The objects may be in the center, along the edges, or in the corners of the bin, or placed on different layers.

  • Spatial relationship: The objects may be overlapping, and occasionally placed in parallel.

The following are some examples of the images collected:

../../../_images/14_metal_part_placement_status.png

Figure 9. Sparsely scattered (top left), densely scattered (top right), overlapping (bottom left), and very densely scattered (bottom right)

../../../_images/15_metal_part_poses.png

Figure 10. Lying flat, standing on sides, overlapping, and placed in parallel

Beauty and personal care products, seven classes

  • Classification is required as there is more than one class of objects.

  • Both the case where products of the same class are placed in many orientations and the case where products of multiple classes are placed together need to be considered to fully capture the object features.

  • For cases where only a single class of objects is placed in the bin, five images should be collected for each class. For cases where objects of multiple classes are mixed in the bin, the number of images to collect is 20 * number of classes.

  • Orientation: The objects may lie flat, stand on their sides, or lean at an angle. All sides of the objects need to be captured.

  • Position: The objects may be in the center, along the edges, or in the corners of the bin.

  • Spatial relationship: The objects may be overlapping, tightly fitted, and occasionally placed in parallel.
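The image counts for this project can be worked out as follows. This is a sketch with hypothetical function and parameter names. Adding the single-class and mixed-class counts into one total is an assumption about how the two figures combine; the source states them separately.

```python
def mixed_bin_image_count(num_classes, single_class_per_class=5, mixed_per_class=20):
    """Image counts for bins holding one class and bins holding mixed classes."""
    single = single_class_per_class * num_classes  # five images per class
    mixed = mixed_per_class * num_classes          # 20 * number of classes
    # Assumption: the overall total is the sum of both kinds of images.
    return {"single_class_bins": single, "mixed_bins": mixed, "total": single + mixed}


print(mixed_bin_image_count(7))
# {'single_class_bins': 35, 'mixed_bins': 140, 'total': 175}
```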

The following are some examples of the images collected:

One class

../../../_images/16_singel_class_subject_positions.png

Figure 11. In the corners (top left), tightly fitted (top right), closely placed (bottom left), and sparsely scattered (bottom right)

Multiple classes

../../../_images/17_mix_classes_subject_positions.png

Figure 12. Closely placed, jammed in bin corners, and scattered and overlapping

Track shoe unit, multiple classes (models)

  • Number of images to collect: 30 * number of models.

  • Orientation: Only the case where the front face is up needs to be considered.

  • Position: Relatively simple. Only objects in the top, middle, and bottom layers need to be considered.

  • Spatial relationship: The objects are neatly arranged and tightly fitted.

The following are some examples of the images collected:

../../../_images/18_different_layers.png

Figure 13. Objects in the top, middle, and bottom layers

Metal workpiece, single class

  • Data quantity: The objects are of a single class and are placed in a single layer, so 50 images need to be collected.

  • Orientation: Only the case where the front face is up needs to be considered.

  • Position: The objects may be in the center, along the edges, or in the corners of the bin.

  • Spatial relationship: The case where the objects are tightly fitted needs to be considered.

The following are some examples of the images collected:

../../../_images/19_different_situations.png

Figure 14. One full layer, along the edges, and in the corners of the bin

Metal workpiece, single class

  • Data quantity: The objects are stacked in multiple layers, and 30 images need to be collected.

  • Orientation: Only the case where the front face is up needs to be considered.

  • Position: The objects may be in the center, along the edges, or in the corners of the bin, as well as in the top, middle, and bottom layers.

  • Spatial relationship: The case where the objects are tightly fitted needs to be considered.

The following are some examples of the images collected:

../../../_images/20_different_layers_positions.png

Figure 15. In the top layer (top left), a few objects in the top layer (top right), a full bottom layer (bottom left), and in the bottom layer along the bin edges (bottom right)