TY - GEN
T1 - Human-readable fiducial marker classification using convolutional neural networks
AU - Liu, Yanfeng
AU - Psota, Eric T.
AU - Pérez, Lance C.
PY - 2017/9/27
Y1 - 2017/9/27
N2 - Many applications require both the location and identity of objects in images and video. Most existing solutions, such as QR codes, AprilTags, and ARTags, use complex machine-readable fiducial markers with heuristically derived methods for detection and classification. However, in applications where humans are integral to the system and need to be capable of locating objects in the environment, fiducial markers must be human readable. An obvious and convenient choice for human-readable fiducial markers is alphanumeric characters (Arabic numerals and English letters). Here, a method for classifying characters using a convolutional neural network (CNN) is presented. The network is trained on a large set of computer-generated images of characters, each subjected to a carefully designed set of augmentations that simulate the conditions inherent in video capture, including rotation, scaling, shearing, and blur. Results demonstrate that training on large numbers of synthetic images produces a system that works on real images captured by a video camera. The results also reveal that certain characters are generally more reliable and easier to recognize than others; thus, the results can be used to intelligently design a human-readable fiducial marker system that avoids easily confused characters.
KW - computer vision
KW - convolutional neural network
KW - fiducial marker
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85033700518&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85033700518&partnerID=8YFLogxK
U2 - 10.1109/EIT.2017.8053435
DO - 10.1109/EIT.2017.8053435
M3 - Conference contribution
AN - SCOPUS:85033700518
T3 - IEEE International Conference on Electro Information Technology
SP - 606
EP - 610
BT - 2017 IEEE International Conference on Electro Information Technology, EIT 2017
PB - IEEE Computer Society
T2 - 2017 IEEE International Conference on Electro Information Technology, EIT 2017
Y2 - 14 May 2017 through 17 May 2017
ER -