Converting YouTube Video to American Sign Language Translation Using Convolutional Neural Network and Video Processing

Authors

  • Reetu Jain

Keywords:

YouTube video, American Sign Language, Image processing, Video processing, Convolutional neural network, Subtitles

Abstract

Sign language is a form of communication used by the deaf population of the world, both amongst themselves and with others. An estimated 5% of the world's population is deaf or suffers from hearing loss. Although YouTube videos often provide subtitles in English or other native languages, they rarely offer any sign-language-based subtitles. The overall aim of the present study is therefore to develop American Sign Language (ASL) based subtitles for YouTube videos. The proposed method is a three-phase framework that not only automates the downloading of a YouTube video and its transcript but also converts the transcript text into ASL-based subtitles and mounts them on the video. The method integrates a deep-learning-based Convolutional Neural Network (CNN) with image and video processing techniques. A Torch-based CNN model is developed and coded in Python 3.8.5; it achieved training and testing accuracies of 99.982% and 98%, respectively. The strength of a model lies in its ability to be applied to a practical problem, so the proposed integrated method is demonstrated on a randomly selected YouTube video.
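
To make the three-phase pipeline concrete, a minimal sketch of phases 1 and 3 (downloading and subtitle overlay) follows; phase 2, the CNN, is sketched after it. The pytube and youtube_transcript_api libraries, the OpenCV overlay strategy, the video id, and the asl/ directory of pre-rendered letter images are all assumptions for illustration; the paper does not name its exact tooling.

    from pytube import YouTube
    from youtube_transcript_api import YouTubeTranscriptApi
    import cv2

    VIDEO_ID = "abc123XYZ"  # hypothetical video id

    # Phase 1: download the video and its transcript.
    yt = YouTube(f"https://www.youtube.com/watch?v={VIDEO_ID}")
    yt.streams.get_highest_resolution().download(filename="video.mp4")
    segments = YouTubeTranscriptApi.get_transcript(VIDEO_ID)
    # segments: [{'text': ..., 'start': ..., 'duration': ...}, ...]

    # Phase 3: stamp ASL letter images onto each frame as a subtitle strip.
    # The asl/ directory of letter images is an assumed local asset.
    signs = {c: cv2.imread(f"asl/{c}.png") for c in "abcdefghijklmnopqrstuvwxyz"}

    cap = cv2.VideoCapture("video.mp4")
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter("video_asl.mp4",
                          cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        t = frame_idx / fps
        # Transcript segment active at time t (linear scan, for clarity only).
        text = next((s["text"] for s in segments
                     if s["start"] <= t < s["start"] + s["duration"]), "")
        x = 10
        for ch in text.lower():
            img = signs.get(ch)  # spaces and punctuation are skipped
            if img is not None and x + 64 <= w:
                frame[h - 74:h - 10, x:x + 64] = cv2.resize(img, (64, 64))
                x += 68
        out.write(frame)
        frame_idx += 1

    cap.release()
    out.release()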
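
For phase 2, the abstract reports a Torch-based CNN but gives no architecture. The sketch below is one plausible PyTorch classifier for static ASL letters, assuming 28x28 grayscale inputs and 24 classes (J and Z involve motion and are commonly excluded), as in the public Sign Language MNIST dataset; the paper's actual model may differ.

    import torch
    import torch.nn as nn

    class ASLNet(nn.Module):
        """Small CNN for static ASL letter classification (a sketch)."""
        def __init__(self, num_classes=24):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 14
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14 -> 7
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(64 * 7 * 7, 128), nn.ReLU(), nn.Dropout(0.3),
                nn.Linear(128, num_classes),
            )

        def forward(self, x):  # x: (batch, 1, 28, 28)
            return self.classifier(self.features(x))

    # Smoke test on a random batch.
    logits = ASLNet()(torch.randn(4, 1, 28, 28))
    print(logits.shape)  # torch.Size([4, 24])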

References

Starner, T., Weaver, J., & Pentland, A. (1998). Real-time American Sign Language recognition using desk and wearable computer based video. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12), 1371-1375.

Mitchell, R. E., Young, T. A., Bachleda, B., & Karchmer, M. A. (2006). How many people use ASL in the United States? Why estimates need updating. Sign Language Studies, 6(3), 306-335.

Starner, T. E. (1995). Visual recognition of American Sign Language using hidden Markov models (Master's thesis). Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences.

Oz, C., & Leu, M. C. (2011). American Sign Language word recognition with a sensory glove using artificial neural networks. Engineering Applications of Artificial Intelligence, 24(7), 1204-1213.

Oz, C., & Leu, M. C. (2007). Linguistic properties based on American Sign Language isolated word recognition with artificial neural networks using a sensory glove and motion tracker. Neurocomputing, 70(16-18), 2891-2901. https://doi.org/10.1016/j.neucom.2006.04.016

Huenerfauth, M., & Lu, P. (2010). Accurate and accessible motion-capture glove calibration for sign language data collection. ACM Transactions on Accessible Computing (TACCESS), 3(1), 1-32.

Luzanin, O., & Plancak, M. (2014). Hand gesture recognition using low-budget data glove and cluster-trained probabilistic neural network. Assembly Automation.

Sun, C., Zhang, T., Bao, B. K., Xu, C., & Mei, T. (2013). Discriminative exemplar coding for sign language recognition with Kinect. IEEE Transactions on Cybernetics, 43(5), 1418-1428.

Tao, W., Leu, M. C., & Yin, Z. (2018). American Sign Language alphabet recognition using Convolutional Neural Networks with multiview augmentation and inference fusion. Engineering Applications of Artificial Intelligence, 76, 202-213.

Fujiwara, E., dos Santos, M. F. M., & Suzuki, C. K. (2014). Flexible optical fibre bending transducer for application in glove-based sensors. IEEE Sensors Journal, 14(10), 3631-3636.

Tubaiz, N., Shanableh, T., & Assaleh, K. (2015). Glove-based continuous Arabic sign language recognition in user-dependent mode. IEEE Transactions on Human-Machine Systems, 45(4), 526-533.

Aly, W., Aly, S., & Almotairi, S. (2019). User-independent American Sign Language alphabet recognition based on depth image and PCANet features. IEEE Access, 7, 123138-123150.

Lee, B. G., & Lee, S. M. (2017). Smart wearable hand device for sign language interpretation system with sensors fusion. IEEE Sensors Journal, 18(3), 1224-1232.

Paudyal, P., Lee, J., Banerjee, A., & Gupta, S. K. (2019). A comparison of techniques for sign language alphabet recognition using armband wearables. ACM Transactions on Interactive Intelligent Systems (TiiS), 9(2-3), 1-26.

Wu, J., Sun, L., & Jafari, R. (2016). A wearable system for recognizing American Sign Language in real-time using IMU and surface EMG sensors. IEEE Journal of Biomedical and Health Informatics, 20(5), 1281-1290.

Wu, J., & Jafari, R. (2017). Wearable Computers for Sign Language Recognition. In Handbook of Large-Scale Distributed Computing in Smart Healthcare (pp. 379-401). Springer, Cham.

Wu, J., Tian, Z., Sun, L., Estevez, L., & Jafari, R. (2015, June). Real-time American Sign Language recognition using wrist-worn motion and surface EMG sensors. In 2015 IEEE 12th International Conference on Wearable and Implantable Body Sensor Networks (BSN) (pp. 1-6). IEEE.

Gu, S., Pednekar, M., & Slater, R. (2019). Improve image classification using data augmentation and neural networks. SMU Data Science Review, 2(2), 1.

World Health Organization. Deafness and hearing loss. Retrieved April 30, 2022, from https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss

Published

2022-05-20

How to Cite

Jain, R. (2022). Converting YouTube Video to American Sign Language Translation Using Convolutional Neural Network and Video Processing. iJournals: International Journal of Software & Hardware Research in Engineering, ISSN: 2347-4890, 10(5). Retrieved from https://ijournals.in/journal/index.php/ijshre/article/view/122
