Collect the Training Data¶
Attention
Collecting data is one of the most critical parts of a deep learning project. The final performance of the model largely depends on the quality of the training data. A high-quality dataset is a prerequisite for effective model training and accurate prediction.
Check the Data Collection Environment¶
Please avoid conditions such as overexposure, underexposure, color distortion, blur, and occlusion. These conditions cause the loss of the features on which the deep learning model relies and thereby degrade the model’s performance.
Please ensure that the backgrounds, perspectives, and camera distances from the objects for data collection are consistent with those of the actual application scenarios. Any inconsistencies will reduce the performance of the model in the actual application. In severe cases, data need to be recollected and the model needs to be re-trained. Therefore, please confirm the detailed conditions of the actual application scenario before data collection.
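As a rough illustration of how overexposure and underexposure could be screened before training, the sketch below measures the fraction of very dark and very bright pixels in an 8-bit grayscale image. The function name `exposure_report` and all thresholds are hypothetical choices for this example; they are not Mech-Vision settings and should be tuned against images from the actual application scenario.

```python
def exposure_report(pixels, dark_level=10, bright_level=245, max_fraction=0.25):
    """Flag an 8-bit grayscale image as under- or overexposed.

    pixels: flat iterable of intensity values in the range 0..255.
    All thresholds are illustrative assumptions, not Mech-Vision settings.
    """
    values = list(pixels)
    if not values:
        raise ValueError("image has no pixels")
    total = len(values)
    # Fraction of pixels that are nearly black or nearly saturated.
    dark = sum(1 for v in values if v <= dark_level) / total
    bright = sum(1 for v in values if v >= bright_level) / total
    return {
        "dark_fraction": dark,
        "bright_fraction": bright,
        "underexposed": dark > max_fraction,
        "overexposed": bright > max_fraction,
    }


# Hypothetical 4x4 image in which 12 of 16 pixels are saturated.
sample = [255] * 12 + [128] * 4
report = exposure_report(sample)
print(report["overexposed"], report["underexposed"])
```

A check like this only covers exposure. Color distortion, blur, and occlusion still need to be inspected separately.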
Quantity of Data to Collect¶
If there is only one object class, please collect around 50 images.
If there are multiple object classes, please collect around 30 images for each class. Total number of images to collect = 30 * number of classes.
The above is a general guideline for the quantity of data to collect, and typical industrial applications have more specific requirements. Please see Data Collection Examples from Past Projects for an example.
Attention
If the training dataset is too small, the model will not have enough samples and cannot be trained effectively; the test error rate will also be high. If the training dataset is too large, the training time will increase significantly. Please make sure the size of the dataset is appropriate for actual needs.
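The general guideline above can be written as a small helper. This is a minimal sketch: the function name `recommended_image_count` is hypothetical, and the defaults of 50 and 30 are simply the figures quoted in this section, which a specific project may need to adjust.

```python
def recommended_image_count(num_classes, single_class_images=50, per_class_images=30):
    """Return a starting point for dataset size based on the general guideline."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    if num_classes == 1:
        # One object class: around 50 images.
        return single_class_images
    # Multiple object classes: around 30 images for each class.
    return per_class_images * num_classes


for n in (1, 3, 7):
    print(n, recommended_image_count(n))
```

For one, three, and seven classes this gives 50, 90, and 210 images respectively.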
Object Placing for Data Collection¶
All different placing conditions should be included in the dataset, and the number of images for each placing condition should be reasonably allocated based on the actual project conditions.
For example, if the objects come in horizontal and vertical poses in the actual application, but only images of horizontal incoming objects are collected and used for training, then the resulting model’s performance on the vertical objects cannot be guaranteed.
As another example, if the objects overlap in the actual application, but only images of separately placed objects are collected and used for training, then the resulting model’s performance on the overlapping objects cannot be guaranteed.
Therefore, when collecting data, please take all circumstances in the actual application into consideration as much as possible. Factors include the following:
All object orientations that may appear in the actual application;
All object positions that may appear in the actual application;
All spatial relationships between objects that may appear in the actual application.
Attention
If some circumstances are omitted from data collection, the deep learning model will not be trained properly for those circumstances and will fail to output satisfactory results in them. In this case, please collect additional data covering the omitted circumstances to avoid errors.
Figures: object orientation; object position; spatial relationship between objects.
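One way to make sure no placing condition is omitted is to enumerate every combination of orientation, position, and spatial relationship, and then assign an image count to each. The sketch below does this with an even split. The function name `plan_collection` and the example condition labels are hypothetical, and the even split is only a starting assumption: the allocation should be adjusted to how often each condition occurs in the actual project.

```python
from itertools import product


def plan_collection(total_images, orientations, positions, relationships):
    """Spread total_images as evenly as possible over every placing condition."""
    conditions = list(product(orientations, positions, relationships))
    if not conditions:
        raise ValueError("each factor needs at least one value")
    base, extra = divmod(total_images, len(conditions))
    # The first `extra` conditions receive one additional image each.
    return {cond: base + (1 if i < extra else 0) for i, cond in enumerate(conditions)}


# Hypothetical condition labels for a single-class project with 50 images.
plan = plan_collection(
    total_images=50,
    orientations=["flat", "on side"],
    positions=["center", "edge", "corner"],
    relationships=["separate", "overlapping"],
)
for condition, count in plan.items():
    print(condition, count)
```

If some combinations cannot occur in practice, remove them from the plan and redistribute their images. If the base count per condition drops to zero, the total is too small to cover every condition.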
Use Mech-Vision to Collect Data¶
After checking the data collection environment, determining the data quantity to collect, and listing all the possible ways of object placing, please use the following Steps in Mech-Vision to collect the image data. See Capture Images From Camera for detailed instructions.
Data Collection Examples from Past Projects¶
Metal workpiece, single class¶
Data quantity: 50 images for single-class objects.
Orientation: The objects may lie flat or stand on sides, both of which need to be considered.
Position: The objects may be in the center, along the edges, or in the corners of the bin, or placed on different layers.
Spatial relationship: The objects may overlap and are occasionally placed parallel to each other.
The following are some examples of the images collected:
Beauty and personal care products, seven classes¶
Classification is required as there is more than one class of objects.
Two cases need to be considered to fully capture the object features: products of the same class placed in many orientations, and products of multiple classes placed together.
For cases where only a single class of objects is placed in the bin, five images should be collected for each class. For cases where objects of multiple classes are mixed in the bin, the total number of images to collect should be (20 * number of classes).
Orientation: The objects may lie flat, stand on sides, or lean at an angle. All sides of the objects need to be captured.
Position: The objects may be in the center, along the edges, or in the corners of the bin.
Spatial relationship: The objects may overlap, may occasionally be placed parallel to each other, and may be tightly fitted.
The following are some examples of the images collected:
One class
Multiple classes
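The image counts for this project can be worked out as follows. This is a sketch under one assumption not stated explicitly above: that the single-class images and the mixed-class images are collected in addition to each other. The function name `beauty_product_plan` is hypothetical.

```python
def beauty_product_plan(num_classes, single_per_class=5, mixed_per_class=20):
    """Image counts for the single-class and mixed-class cases described above."""
    single = single_per_class * num_classes  # one class in the bin at a time
    mixed = mixed_per_class * num_classes    # multiple classes mixed in the bin
    return {
        "single_class_images": single,
        "mixed_class_images": mixed,
        "total": single + mixed,  # assumes the two cases are additive
    }


plan = beauty_product_plan(7)
print(plan)
```

For the seven classes in this project, this gives 35 single-class images and 140 mixed-class images, 175 in total under the stated assumption.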
Track shoe unit, multiple classes (models)¶
Number of images to collect: 30 * number of models.
Orientation: Only the case where the front face is up needs to be considered.
Position: Relatively simple. Only objects in the top, middle, and bottom layers need to be considered.
Spatial relationship: The objects are orderly placed and tightly fitted.
The following are some examples of the images collected:
Metal workpiece, single class¶
Data quantity: The objects are of a single class and are placed in a single layer, so 50 images need to be collected.
Orientation: Only the case where the front face is up needs to be considered.
Position: The objects may be in the center, along the edges, or in the corners of the bin.
Spatial relationship: The case where the objects are tightly fitted needs to be considered.
The following are some examples of the images collected:
Metal workpiece, single class¶
Data quantity: The objects are stacked in multiple layers, and 30 images need to be collected.
Orientation: Only the case where the front face is up needs to be considered.
Position: The objects may be in the center, along the edges, or in the corners of the bin, as well as in the top, middle, and bottom layers.
Spatial relationship: The case where the objects are tightly fitted needs to be considered.
The following are some examples of the images collected: