Lighting the Fireworks

Hand Interaction Design

Product Manager, Tencent, Shanghai

January 2020

image.gif

With the 2D hand skeleton algorithm released by our machine learning engineering team, I needed to design a framework to apply it to a short video filming experience. The application not only needed to be versatile across multiple use cases but also had to stand out creatively from our competitors.

skeleton.png

I found that when people see themselves through the front camera with AR content, they tend to interact with it using their hands, by poking, pinching, or grabbing. Thus, I made this video demo with a designer.

To define this interaction, I abstracted it into two components: the trigger point and the hotspot. The trigger point can be any key point defined by the 2D finger skeleton algorithm. A hotspot is a rectangular area that either stays fixed relative to the screen or follows a key point, such as the left eye.
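The trigger-point/hotspot abstraction can be sketched as a simple per-frame hit test. This is a minimal illustration, not the production implementation; the class, field, and key-point names (e.g. `index_tip`, `left_eye`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Hotspot:
    """Rectangular trigger area in normalized screen coordinates.

    If `anchor` is None, the rectangle stays fixed relative to the
    screen; otherwise its origin follows the named key point
    (e.g. "left_eye") every frame, offset by (x, y).
    """
    x: float
    y: float
    width: float
    height: float
    anchor: Optional[str] = None  # key-point name to follow, or None

    def contains(self, point, keypoints):
        """Check whether a trigger point falls inside this hotspot."""
        ox, oy = self.x, self.y
        if self.anchor is not None:
            ax, ay = keypoints[self.anchor]
            ox, oy = ax + self.x, ay + self.y  # offset from the anchor point
        px, py = point
        return ox <= px <= ox + self.width and oy <= py <= oy + self.height

# Per frame: test the chosen trigger point against the hotspot.
keypoints = {"index_tip": (0.42, 0.55), "left_eye": (0.40, 0.30)}
firework_fuse = Hotspot(x=0.35, y=0.50, width=0.20, height=0.20)
triggered = firework_fuse.contains(keypoints["index_tip"], keypoints)
```

Keeping the trigger point and the hotspot as independent parameters is what makes the framework versatile: any key point can drive any screen-fixed or face-following region.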

Since Chinese New Year was approaching at the time, I proposed the Lighting the Fireworks AR experience. Using a tongue of flame as the indicator of the trigger point, the experience lets users light the fireworks while filming a short video with their Happy New Year wishes.

The filter was ranked as the most popular AR filter for Chinese New Year 2020 in the Wesee mobile app.

timeline.png
Design Challenges Along the Way

1) AI Accuracy

During user testing, I found that a large percentage of attempts ended in trigger failure due to model uncertainty. After talking with the machine learning engineers on the team, I learned that the training dataset consisted largely of hand gesture images, such as the number-one gesture. As a result, the model performs poorly at recognizing the fingertip in two scenarios:

  1. when the whole hand doesn’t look like any gesture

  2. when the hand is only partially in view

We added more training data covering these scenarios to solve the problem.

handgesture.png

2) Interaction Clue

When prompted to point at the fireworks at the beginning of the AR experience, users behave differently, mainly in two ways:

  1. looking at the camera-captured content, and pointing with their virtual fingertips (illustrated on the left)

  2. touching the screen with real fingertips (illustrated on the right)

Because of the front camera's position, touching the screen is never captured in the video and thus always confuses the viewing side. An interaction clue was therefore necessary to encourage the first way of interacting.

I used the sparking of the fireworks as the clue. When no fingertip is detected, the sparking disappears and a reminder appears. This way, users who intended to touch the screen will see the sparks go out, understand that touching won't trigger the fireworks, and most likely switch to the interaction we designed.
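The clue logic above can be sketched as a small per-frame decision function. This is an illustrative sketch under assumed state names (`sparks`, `reminder`, `launch`), not the actual effect pipeline.

```python
def update_frame(fingertip_detected: bool, fingertip_in_hotspot: bool) -> dict:
    """Decide what the AR layer shows for the current frame.

    State names are hypothetical placeholders for the real render flags.
    """
    if not fingertip_detected:
        # No virtual fingertip in view: sparks go out and a reminder
        # nudges the user to point at the fireworks, not touch the screen.
        return {"sparks": False, "reminder": True, "launch": False}
    if fingertip_in_hotspot:
        # Fingertip over the fuse: light the fireworks.
        return {"sparks": True, "reminder": False, "launch": True}
    # Fingertip visible but away from the fuse: keep the sparks alive
    # as ongoing feedback that pointing is being tracked.
    return {"sparks": True, "reminder": False, "launch": False}
```

Tying the sparks directly to fingertip detection makes the system's state visible: the clue disappears exactly when the interaction would fail.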

Other demonstrations, such as a sample video, are also shown to help users understand the interaction.

IMG_0313.jpg