Addressing the selection bias in voice assistance: training voice assistance model in python with equal data selection

Authors:
Kashav Piya, Srijal Shrestha, Cameran Frank, Estephanos Jebessa, Tauheed Khan Mohd
Author Affiliation:
Department of Math & Computer Science, Augustana College, Rock Island, IL, USA
Received: Apr. 5, 2023
Accepted: Apr. 20, 2023
Published: May 5, 2023

Abstract

In recent years, voice assistants have become part of day-to-day life, enabling information retrieval through speech synthesis, voice recognition, and natural language processing. They are built into many modern devices from companies such as Apple, Amazon, Google, and Samsung. This project focuses on virtual assistance within natural language processing (NLP), a branch of AI that helps machines understand people and create feedback loops. The project uses deep learning to build a voice recognizer, training the model in Google Colaboratory on the Common Voice dataset together with data collected from the local community. After recognizing a command, the AI assistant performs the most suitable action and then gives a response. The motivation for this project is the race and gender bias present in many virtual assistants. The computer industry is dominated by men, and as a result many of its products are designed without regard for women. This bias carries over into natural language processing. The project draws on several open-source projects to implement machine learning algorithms and to train the assistant to recognize different types of voices, accents, and dialects. The goal is to use voice data from underrepresented groups to build a voice assistant that can recognize voices regardless of gender, race, or accent. Increasing the representation of women in the computer industry is important for the industry's future. Representing women in the initial study of voice assistants shows that women play a vital role in the development of this technology. In line with related work, this project uses first-hand data from the college population and middle-aged adults to train a voice assistant that combats gender bias.
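The idea of equal data selection named in the title can be illustrated with a minimal sketch. This is not the authors' code: it assumes each voice clip is described by a metadata record with a `gender` field, and it balances the training pool by downsampling every group to the size of the smallest one. The function name `select_equal`, the field names, and the file names are all hypothetical.

```python
import random
from collections import defaultdict


def select_equal(clips, key="gender", seed=0):
    """Downsample so every group under `key` contributes the same number of clips."""
    groups = defaultdict(list)
    for clip in clips:
        groups[clip[key]].append(clip)
    if not groups:
        return []
    # The smallest group sets the per-group quota.
    quota = min(len(members) for members in groups.values())
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    balanced = []
    for label in sorted(groups):
        balanced.extend(rng.sample(groups[label], quota))
    rng.shuffle(balanced)  # avoid feeding the model one group at a time
    return balanced


# Hypothetical, deliberately imbalanced metadata: 6 male clips, 2 female clips.
clips = (
    [{"path": f"m_{i}.wav", "gender": "male"} for i in range(6)]
    + [{"path": f"f_{i}.wav", "gender": "female"} for i in range(2)]
)
balanced = select_equal(clips)

counts = defaultdict(int)
for clip in balanced:
    counts[clip["gender"]] += 1
print(dict(counts))
```

Downsampling discards data from the larger group; weighting or oversampling the smaller group is an alternative when the dataset is small, at the cost of repeating the same speakers.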

Keywords

Voice Assistance, Virtual Assistance, Python 3.10, Pyttsx3, PyTorch, JSON.

DOI

https://doi.org/10.58396/cvs020103
