GAN-AVI: facial expression translator in Twitter avatar analogous to tweet sentiments.
Kunada Dhana Sree Devi, Kireet Muppavaram, Choudapur Atheeq, Asadullah Shaikh, Ali Alqazzaz, Mana Saleh Al Reshan, Samar M Alqhtani, Mousa Alalhareth
Abstract
As a widely used social media platform, Twitter (now named X) is used by many people to share their thoughts and opinions. The Twitter avatar (AVI), which is the user's profile image, is uploaded when the user creates the account and stays the same until the user changes it. From sorrow to agony and from happiness to sadness, a wide range of feelings is communicated through tweets. Air crashes, rail and road accidents, and ethnic violence in a country lead to a sharp rise in tweets. Whatever the tweet sentiment, the user's AVI was observed to remain the same. When the leader of a major nation tweeted about a deadly air crash, his tweet conveyed deep sorrow, but his AVI was smiling. Instances like these may affect our ethical conscience. From a psychological perspective, if the sentiment of a tweet is not reflected in the facial expression, readers may well feel disrespect towards the tweeter and change their opinion of the tweeter. This work proposes GAN-AVI, a deep learning GAN model trained to translate the AVI's facial expression dynamically based on the tweet sentiment before the tweet reaches its audience. The proposed architecture includes two frameworks: the tweet sentiment label extraction framework and the target face synthesis framework. The tweet sentiment label extraction framework is trained on 10,000 tweets with various sentiment categories; the face expression synthesis framework is trained on 35,000 images to extract a custom set of 32 landmarks, from which the target face matching the tweet sentiment can be synthesised. Results of GAN-AVI are evaluated using landmark-based AED, SSIM, LMK, ID and NCC metrics. The effectiveness of the proposed GAN-AVI is shown in comparison with several baseline models: GANimation, X2Face, AttentionGAN and C2GAN.
Unlike GANimation and X2Face, our approach explicitly incorporates temporal word dependencies from the sentiment text via masked embeddings, improving sentiment-expression alignment. The experimental evaluations showed that the proposed approach reaches the lowest error of 0.2%, demonstrating its enhanced accuracy in landmark localization for micro-expression generation tasks.
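
The abstract names its evaluation metrics without defining them. As a rough illustration only, the sketch below shows one common reading of two of them: a landmark error computed as the mean Euclidean distance between predicted and target landmark coordinates, and NCC computed as zero-mean normalized cross-correlation between two flattened grayscale images. The function names `mean_landmark_distance` and `ncc` are hypothetical, and the definitions are assumptions that may differ from those used in the paper.

```python
import math


def mean_landmark_distance(pred, target):
    """Mean Euclidean distance between corresponding (x, y) landmarks.

    Assumed definition of a landmark-based error; not taken from the paper.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("landmark lists must be non-empty and of equal length")
    total = sum(
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, target)
    )
    return total / len(pred)


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two flattened images.

    Returns a value in [-1, 1]; returns 0.0 if either input is constant.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


if __name__ == "__main__":
    # Toy inputs: two landmarks and two tiny flattened "images".
    print(mean_landmark_distance([(0, 0), (3, 4)], [(0, 0), (0, 0)]))
    print(ncc([1, 2, 3], [2, 4, 6]))
```

Under these assumed definitions, a lower landmark distance means the synthesised face's landmarks lie closer to the target's, while an NCC closer to 1 means the two images vary together more strongly.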