Architecture graduate student wins Best in Tools at the MIT XR Hackathon
Yifeng Wang (MArch 2021) won the Best in Tools Prize at the MIT Reality Hack as part of a team that included students from MIT Sloan and Tufts University and a working professional from Boston. The event was held at the Massachusetts Institute of Technology from January 16-20, 2020.
Wang's team developed a project called "TalkAI." TalkAI aims to help public speakers and communicators understand and predict the effectiveness of their speech, using AI-driven analysis and AR-enabled tools that help people overcome the fear of public speaking and develop the ability to be heard.
Wang shared that the experience was not only challenging and exciting but also inspirational: the event helped him find the direction he wants to pursue more deeply in his research and thesis. He would like to use his time at Berkeley to study the mechanics of using sensors and extended reality (XR) to augment and mediate human experience with the built environment.
The MIT Reality Hack was a five-day event held at MIT and co-hosted by the student organization VR/AR@MIT. It brings together thought leaders, brand mentors, creators, students, and technology enthusiasts for tech workshops, talks, discussions, fireside chats, collaboration, hacking, and more. Participants of all backgrounds and skill levels attend from around the world.
Project description:
We are very passionate about giving everyone the ability to have their voices heard! Public speaking is a fear many of us face, and because of this fear and a general lack of resources, many talented people never get the chance to receive credit for their work or to make their opinions known.
We began by brainstorming the key pain points facing speakers today, which surfaced a central issue: the lack of opportunities to present in front of a real audience while still feeling safe. Our solution is a virtual environment in which a virtual audience reacts and gives feedback to the speaker, so the speaker can practice and improve without feeling unsafe or worried. This environment was built on the NReal platform.
To generate reactions in real time, we used natural language processing to extract sentiment from the user's spoken content (with the Valence Aware Dictionary and sEntiment Reasoner, VADER) and trained a neural network on speech audio (using 1,600+ clips from the RAVDESS dataset). Combining these two dimensions of analysis, we built a model that listens to the user's spoken content, infers a reaction, and feeds it into the behavior of the virtual audience.
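For readers curious about how the two signals could fit together, below is a minimal illustrative sketch in Python. The vaderSentiment call is the library's actual API; the audio-emotion network, its label set, and the helper names are placeholders standing in for the team's RAVDESS-trained model, which is not reproduced here.

```python
# Minimal sketch of the two-signal analysis described above. The VADER call is the
# real vaderSentiment API; the audio model below is a generic stand-in, not the
# team's actual RAVDESS-trained network.
import librosa
import torch
import torch.nn as nn
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

EMOTIONS = ["engaged", "happy", "sad", "surprised"]  # assumed label set

class AudioEmotionNet(nn.Module):
    """Toy classifier over mean MFCC features; a placeholder for the trained model."""
    def __init__(self, n_mfcc: int = 40, n_classes: int = len(EMOTIONS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_mfcc, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def text_sentiment(transcript: str) -> float:
    """Compound sentiment score in [-1, 1] from VADER."""
    return SentimentIntensityAnalyzer().polarity_scores(transcript)["compound"]

def audio_emotion(wav_path: str, model: AudioEmotionNet) -> str:
    """Predict an emotion label from mean MFCC features of the clip."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40).mean(axis=1)
    with torch.no_grad():
        logits = model(torch.tensor(mfcc, dtype=torch.float32))
    return EMOTIONS[int(logits.argmax())]

def combined_reaction(transcript: str, wav_path: str, model: AudioEmotionNet) -> dict:
    """Fuse both signals into one record the virtual audience can react to."""
    return {
        "sentiment": text_sentiment(transcript),
        "emotion": audio_emotion(wav_path, model),
    }
```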
To make the simulation feel more genuine, we created audience reaction animations that emulate common states such as engaged, happy, sad, and surprised, along with verbal feedback corresponding to each emotion. While wearing the device and speaking, the user sees the virtual audience react in real time.
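As an illustration only (the actual animation clip names and feedback phrasing used in TalkAI are not documented here), the mapping from a detected emotion to an audience animation and a spoken feedback line can be as simple as a lookup table:

```python
# Illustrative emotion-to-reaction mapping; clip names and phrases are placeholders.
REACTIONS = {
    "engaged":   ("audience_lean_in", "Great pacing, the audience is with you."),
    "happy":     ("audience_smile",   "That line landed well."),
    "sad":       ("audience_slump",   "Energy is dropping; try lifting your tone."),
    "surprised": ("audience_gasp",    "Strong moment, consider pausing here."),
}

def react(emotion: str) -> tuple[str, str]:
    """Return (animation clip, verbal feedback) for the current detected emotion."""
    return REACTIONS.get(emotion, ("audience_idle", "Keep going."))
```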
We also wanted to include improvement feedback on areas such as speaking speed, tonal variety, and volume adjustment; however, due to time constraints we were not able to engineer the front end for these features. These are areas the team is looking to work on going forward!
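To make these planned features concrete, here is a rough, hypothetical sketch of how such delivery metrics could be computed from a recorded clip. It is not part of the hackathon build, and the feature choices are assumptions.

```python
# Hypothetical delivery metrics (not part of the hackathon build): words per minute,
# pitch variation as a proxy for tonal variety, and mean RMS energy for volume.
import numpy as np
import librosa

def delivery_metrics(wav_path: str, transcript: str) -> dict:
    y, sr = librosa.load(wav_path, sr=16000)
    duration_min = len(y) / sr / 60
    # Speaking speed: words per minute, taken from the transcript length.
    wpm = len(transcript.split()) / max(duration_min, 1e-6)
    # Tonal variety: standard deviation of the estimated pitch contour.
    f0, voiced_flag, voiced_prob = librosa.pyin(y, fmin=80, fmax=400, sr=sr)
    tonal_variety = float(np.nanstd(f0))
    # Volume: mean RMS energy across frames.
    volume = float(librosa.feature.rms(y=y).mean())
    return {"words_per_minute": wpm, "pitch_std_hz": tonal_variety, "rms_volume": volume}
```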