Umar notes that code-switching, which involves mixing multiple languages in the same text, is common in low-resource language communities and can make propaganda detection more challenging
Mohamed bin Zayed University of Artificial Intelligence commencement portraits at Masdar City in Abu Dhabi, United Arab Emirates on May 8, 2023. Christopher Pike - www.cpike.com
Muhammad Umar, a graduate of Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), has made a significant contribution through his research to detecting propaganda on social media platforms, especially where low-resource and high-resource languages are mixed.
While vast amounts of research and time are being spent globally on languages other than English for preservation, education, and language models, Umar, who hails from Pakistan, is contributing in his native language of Roman Urdu, among others.
Umar, who holds a Master of Science in natural language processing (NLP), recognises the power of language in shaping opinions and influencing public discourse.
“Propaganda is a pervasive tool used to manipulate public opinion, and it is a growing concern in the digital age, especially in bilingual communities where little to no work has been done to detect it. Most propaganda detection work has been done on high-resource languages, such as English, leaving low-resource languages largely unexplored,” said Umar, who is part of the university’s first cohort of NLP graduates.
Umar noted that code-switching, which involves mixing multiple languages in the same text, is common in low-resource language communities and can make propaganda detection more challenging.
“In linguistics, code-switching refers to the practice of alternating between two or more languages or language varieties in a single conversation or text. In the context of my thesis, code-switched social media text specifically refers to social media text that uses a mixture of different languages, including English and Roman Urdu.”
Despite graduating, Umar is continuing his research and hopes to submit a paper on detecting propaganda techniques in code-switched text to the 2023 Empirical Methods in Natural Language Processing (EMNLP) conference, one of the leading high-impact conferences for NLP research.
His model can be extended to other underrepresented or low-resource languages.
“This is one of the advantages of using deep learning models for NLP tasks as they can be trained on large amounts of text data and then fine-tuned for specific tasks. I ran experiments using several pre-trained monolingual, multilingual, and cross-lingual language models, fine-tuning them on the annotated code-switched dataset I prepared, and evaluated their performances. I found XLM-RoBERTa, fine-tuned on Roman Urdu, outperformed all other baseline NLP models on our task and dataset.”
Umar said he feels privileged to be part of the groundbreaking programme, which is leading the way in the field of NLP.
“Being a part of the first batch has given me the opportunity to work closely with world-class faculty members, cutting-edge technologies, and a diverse group of fellow students who share a passion for NLP.”
Positive, negative impacts of NLP
Umar noted that NLP is a rapidly growing field with significant potential for positive and negative impacts.
“On one hand, it has led to the development of sophisticated tools like ChatGPT that can help us with several research-based tasks. On the other hand, there are concerns about how NLP can be used to manipulate or deceive people, especially with the rise of deepfake technology. However, despite these challenges, I firmly believe that NLP has the potential to shape our future in a positive way.”
Umar relocated to Abu Dhabi in 2020 after receiving his bachelor’s in computer science from Lahore University of Management Sciences, Pakistan. It was the first time he had left his home country for higher studies.
“My journey at MBZUAI was challenging at first, but in the end, it was incredibly rewarding. I struggled initially to balance the demands of coursework with my ongoing research and personal life, but eventually learned to manage my time more effectively and prioritise my responsibilities. My experience at MBZUAI has been nothing short of transformative,” said Umar, who is the first in his family to receive a master’s degree.
“I have had access to world-class resources and a supportive learning environment that has allowed me to develop my skills. The faculty and staff at MBZUAI are among the best in their fields, and their expertise and guidance have been invaluable to my growth as a researcher.”
Umar undertook a voluntary internship as a data scientist at G42 Healthcare and is currently working there as a data engineer.
Ashwani Kumar is a versatile journalist who explores every beat in Abu Dhabi with an insatiable curiosity. He loves uncovering stories that are informative and help readers form their own opinions.