
Computer Vision for Hand Tracking

Computer vision is a subfield of AI/ML that enables computers to interpret visual data and extract meaningful information from it. In this project, I harnessed computer vision libraries in Python, including OpenCV and MediaPipe, alongside PyAutoGUI for mouse control, to enable intuitive, gesture-based control of both the cursor and a game. Leveraging these frameworks, I developed a game with a focus on rehabilitation through interactive exercises driven by hand gestures.


OpenCV serves as the backbone of real-time video processing and precise hand tracking, providing essential capabilities for detecting and tracking hand landmarks. Complementing this, MediaPipe, developed by Google, offers pre-trained models and algorithms tailored for hand landmark detection. Coupled with PyAutoGUI, Python gains the capability to simulate mouse movements and clicks, bridging the gap between hand gestures and computer interaction.
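To make that pipeline concrete, here is a minimal sketch, assuming a single webcam and one tracked hand; the choice of the wrist landmark, the confidence threshold, and the quit key are my own simplifications rather than the project's exact code:

```python
import cv2
import mediapipe as mp
import pyautogui

# OpenCV captures frames, MediaPipe finds hand landmarks,
# and PyAutoGUI turns a landmark position into cursor movement.
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7)

screen_w, screen_h = pyautogui.size()
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV captures BGR
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        wrist = lm[mp_hands.HandLandmark.WRIST]
        # Landmark coordinates are normalised to [0, 1]; scale to screen pixels
        pyautogui.moveTo(wrist.x * screen_w, wrist.y * screen_h)
    cv2.imshow("Hand tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```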


Next, using the PyGame library, I developed a computer game with an emphasis on rehabilitation, where hand gestures drive interactive exercises aimed at enhancing motor skills and hand-eye coordination. By integrating computer vision with Python, this project showcases how natural gestures can transform human-computer interaction.

Additionally, I experimented with a range of hand positions to find which gestures were easiest to track, and settled on the following as the most effective and intuitive: a pinch gesture translates into a left-click, touching the middle finger to the thumb triggers a right-click, and crossing the index finger in front of the middle finger triggers a double left-click. A stand-alone double-click gesture was necessary because, in my testing, pinching twice in quick succession was detected as two 'hold-downs'.
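As a rough illustration of how such gestures can be detected, the sketch below compares distances between MediaPipe fingertip landmarks against a threshold; the `PINCH_THRESHOLD` value and the finger-crossing test (comparing X coordinates, which assumes a right hand) are illustrative assumptions rather than the tuned values from my script:

```python
import math
import pyautogui
import mediapipe as mp

mp_hands = mp.solutions.hands

PINCH_THRESHOLD = 0.05  # normalised distance; tune per camera and hand size

def dist(a, b):
    """Euclidean distance between two normalised landmarks."""
    return math.hypot(a.x - b.x, a.y - b.y)

def handle_gestures(lm):
    """Map fingertip configurations to mouse clicks.
    In practice you would debounce so a held gesture fires only once."""
    thumb = lm[mp_hands.HandLandmark.THUMB_TIP]
    index = lm[mp_hands.HandLandmark.INDEX_FINGER_TIP]
    middle = lm[mp_hands.HandLandmark.MIDDLE_FINGER_TIP]

    if dist(thumb, index) < PINCH_THRESHOLD:
        pyautogui.click()        # pinch -> left-click
    elif dist(thumb, middle) < PINCH_THRESHOLD:
        pyautogui.rightClick()   # middle finger to thumb -> right-click
    elif index.x < middle.x:     # index crossed over middle (right hand assumed)
        pyautogui.doubleClick()  # stand-alone double left-click
```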

Find the full script here to try it yourself:

In my game implementation, I combine art, physics, and hand tracking to create a dynamic experience in which players must skillfully maneuver a box to collect randomly appearing coins, sharpening hand-eye coordination while demanding steady control.
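A stripped-down version of that game loop might look like the following; `get_palm_position()` is a hypothetical stand-in for the hand-tracking call, and the sizes, colours, and physics are placeholders for the real game's art and movement code:

```python
import random
import pygame

pygame.init()
screen = pygame.display.set_mode((800, 600))
clock = pygame.time.Clock()

player = pygame.Rect(400, 300, 50, 50)  # the box the player steers
coin = pygame.Rect(random.randint(0, 750), random.randint(0, 550), 20, 20)
score = 0

def get_palm_position():
    """Hypothetical stand-in: the real game returns the tracked
    palm position here; this stub keeps the box where it is."""
    return player.center

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False

    player.center = get_palm_position()  # box follows the hand

    if player.colliderect(coin):  # coin collected on contact
        score += 1
        coin.topleft = (random.randint(0, 750), random.randint(0, 550))

    screen.fill((30, 30, 30))
    pygame.draw.rect(screen, (80, 160, 255), player)
    pygame.draw.ellipse(screen, (255, 215, 0), coin)
    pygame.display.flip()
    clock.tick(60)

pygame.quit()
```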

One challenge was determining which point to track for controlling the cursor. Initially, the code tracked the fingertips; however, this posed difficulties, particularly when users pinched or closed their fists, causing erratic cursor behavior. To mitigate this, I adjusted the code to track the base of the palm instead. Because the palm serves as a more stable reference point, cursor control became more consistent and responsive, even during complex hand gestures. Additionally, the camera mirrored my controls so that left became right; to fix this, I inverted the X axis in my script so that my movements mapped accurately.
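A minimal sketch of both fixes, assuming the MediaPipe setup shown earlier, uses the wrist landmark as the palm base and mirrors the X coordinate before mapping to screen pixels:

```python
import pyautogui
import mediapipe as mp

mp_hands = mp.solutions.hands
screen_w, screen_h = pyautogui.size()

def move_cursor_from_palm(hand_landmarks):
    """Map the palm base to screen coordinates, mirroring X so that
    on-screen movement matches the user's real movement."""
    # The wrist landmark sits at the base of the palm and moves far
    # less than the fingertips during pinches and fist closures.
    wrist = hand_landmarks.landmark[mp_hands.HandLandmark.WRIST]
    screen_x = (1.0 - wrist.x) * screen_w  # inverted X: the camera mirrors us
    screen_y = wrist.y * screen_h
    pyautogui.moveTo(screen_x, screen_y)
```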


In future iterations, I would prioritise refining the gesture recognition algorithm to enhance its robustness and versatility. Additionally, implementing user customisation options for gesture mapping and sensitivity settings could further enhance accessibility and user satisfaction, catering to personal preferences and use cases.
