
AI Form Corrector: TensorFlow.js Pose Estimation (On-Device, Private)

Real-time squat and deadlift form analysis on mobile. TensorFlow.js MoveNet, 30fps pose detection, and on-device processing (no cloud). React Native + Expo—full implementation.

2025-12-12
Verified 2025-12-20
10 min read


Who This Guide Is For

This guide is for mobile ML developers building computer vision applications for fitness and health. You should have a solid understanding of React Native, TensorFlow.js, and mobile performance optimization. If you're creating AI-powered fitness apps, pose estimation tools, or any application requiring real-time body tracking, this guide is for you.


Key Takeaways

  • On-Device AI Preserves Privacy: Running pose estimation directly on the device means no video or workout data ever leaves the user's phone—a critical feature for health and fitness applications.
  • MoveNet Balances Speed and Accuracy: The SINGLEPOSE_LIGHTNING model provides real-time performance on mobile devices while maintaining sufficient accuracy for exercise form analysis.
  • Tensor Disposal Prevents Memory Leaks: Calling tf.dispose() on tensors after processing is crucial—failing to do so will cause memory leaks and eventually crash the app.
  • Geometric Calculations Enable Form Analysis: Simple vector math (calculating angles between three keypoints) allows you to derive meaningful metrics like squat depth and joint alignment.
  • GPU Acceleration is Essential: Using expo-gl and WebGL enables TensorFlow.js to leverage the phone's GPU, providing the performance needed for real-time pose estimation.

We've all been there: following a workout video at home, wondering if our form is right. Is my back straight during this squat? Are my knees caving in? Incorrect exercise form not only reduces the effectiveness of a workout but is a leading cause of injury. Professional trainers are expensive, and most fitness apps are just glorified video players.

What if your phone could be your personal trainer?

In this tutorial, we'll build exactly that: a React Native application that uses your phone's camera and on-device machine learning to provide real-time feedback on your exercise form. We'll leverage the power of TensorFlow.js to run a state-of-the-art pose estimation model called MoveNet directly on the device. This approach is fast, works offline, and, most importantly, is completely private—the camera feed is never sent to a server.

What we'll build: We will create a simple workout assistant that:

  1. Accesses the front-facing camera using Expo Camera.
  2. Analyzes the video stream in real-time with TensorFlow.js and the MoveNet pose detection model.
  3. Draws the detected body keypoints and skeleton over the camera feed.
  4. Calculates joint angles to determine if a squat is being performed correctly.
  5. Displays simple, actionable feedback like "Go Lower!" or "Good Squat!".

This project is a perfect entry point into the exciting world of AI-powered health tech, combining computer vision and mobile development in a practical, impactful way.

Understanding the Problem

The core challenge is translating a visual stream of a person exercising into actionable feedback. This breaks down into a few technical hurdles:

  • Real-time Analysis: We need to process video frames quickly enough to give immediate feedback without lagging.
  • Precise Pose Estimation: The system must accurately identify the positions of key body joints (shoulders, elbows, hips, knees, etc.).
  • Biomechanical Logic: Once we have the joint positions, we need to apply rules based on biomechanics to evaluate the form. For example, in a squat, we need to measure the angle of the knee to determine depth.
  • On-Device Performance: Running a deep learning model on a phone can be resource-intensive. We need an efficient model and optimized code to ensure a smooth user experience.

Our solution uses MoveNet, a model optimized for speed and accuracy on devices like smartphones, making it ideal for our real-time application.

Prerequisites

To follow this tutorial, you'll need:

  • Node.js (LTS version recommended).
  • Yarn or npm for package management.
  • The Expo Go app on your iOS or Android phone for testing.
  • A physical device is required to test camera functionality. The iOS/Android simulators will not work.
  • Basic knowledge of React and React Hooks (useState, useEffect).

Step 1: Setting Up the Project

First, let's create a new Expo project and install the necessary dependencies.

What we're doing

We'll initialize a blank Expo app and add libraries for the camera, TensorFlow.js, WebGL for GPU acceleration, and a 2D graphics library for drawing the feedback.

Implementation

Open your terminal and run the following commands:

```bash
# Create a new Expo project
npx create-expo-app ai-workout-trainer
cd ai-workout-trainer

# Install TensorFlow.js, its React Native adapter, and the pose detection models
npm install @tensorflow/tfjs @tensorflow/tfjs-react-native @tensorflow-models/pose-detection @react-native-async-storage/async-storage

# Install Expo modules for camera, WebGL (for GPU support), and file system
npm install expo-camera expo-gl expo-gl-cpp expo-file-system

# Install a canvas library for drawing. We'll use React Native Skia.
npm install @shopify/react-native-skia
```

How it works

  • @tensorflow/tfjs: The core TensorFlow.js library.
  • @tensorflow/tfjs-react-native: The React Native platform adapter; it provides the cameraWithTensors helper we'll use later.
  • @tensorflow-models/pose-detection: Provides pre-trained models, including MoveNet.
  • expo-camera: Allows us to access and display the device camera feed.
  • expo-gl & expo-gl-cpp: These enable WebGL, allowing TensorFlow.js to use the phone's GPU for significantly faster model inference.
  • @shopify/react-native-skia: A powerful 2D graphics library we'll use to draw the skeleton and feedback on a canvas overlay.

Step 2: Integrating the Camera and TensorFlow.js

Now, let's get the camera running and load our AI model. We'll start by creating a custom hook to manage the TensorFlow.js setup.

What we're doing

We will set up the camera component, request user permissions, and create a reusable hook to initialize TensorFlow.js and load the MoveNet model. This keeps our main component clean.

Implementation

Create a new file src/useTensorFlow.js:

```javascript
// src/useTensorFlow.js
import { useEffect, useState } from 'react';
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-react-native';
import * as poseDetection from '@tensorflow-models/pose-detection';

export const useTensorFlow = () => {
  const [isReady, setIsReady] = useState(false);
  const [model, setModel] = useState(null);

  useEffect(() => {
    const setup = async () => {
      // 1. Wait for TensorFlow.js to be ready
      await tf.ready();

      // 2. Create a MoveNet detector
      const detectorConfig = {
        modelType: poseDetection.movenet.modelType.SINGLEPOSE_LIGHTNING,
      };
      const detector = await poseDetection.createDetector(
        poseDetection.SupportedModels.MoveNet,
        detectorConfig
      );
      setModel(detector);

      // 3. Set the ready flag
      setIsReady(true);
      console.log('TF and model ready!');
    };

    setup();
  }, []);

  return { model, isReady };
};
```

Now, let's update App.js to use this hook and display the camera.

```javascript
// App.js
import React, { useState, useEffect } from 'react';
import { StyleSheet, Text, View, Dimensions } from 'react-native';
import { Camera, CameraType } from 'expo-camera';
import { useTensorFlow } from './src/useTensorFlow';

const { width, height } = Dimensions.get('window');

export default function App() {
  const [cameraPermission, setCameraPermission] = useState(null);
  const { model, isReady } = useTensorFlow();

  useEffect(() => {
    (async () => {
      const { status } = await Camera.requestCameraPermissionsAsync();
      setCameraPermission(status === 'granted');
    })();
  }, []);

  if (cameraPermission === null) {
    return <View />;
  }
  if (cameraPermission === false) {
    return <Text>No access to camera</Text>;
  }

  return (
    <View style={styles.container}>
      <Text style={styles.loadingText}>
        {isReady ? 'Ready!' : 'Loading Model...'}
      </Text>
      <Camera
        style={styles.camera}
        type={CameraType.front}
        // We'll add the frame processing logic here later
      />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    backgroundColor: '#fff',
    alignItems: 'center',
    justifyContent: 'center',
  },
  camera: {
    width: width,
    height: height,
  },
  loadingText: {
    position: 'absolute',
    top: 50,
    color: 'white',
    fontSize: 18,
    zIndex: 10,
  },
});
```

How it works

  • useTensorFlow hook: This encapsulates the model loading logic. We call tf.ready() to initialize the TF.js backend and then poseDetection.createDetector to load the pre-trained MoveNet model. We chose SINGLEPOSE_LIGHTNING because it's the fastest version, perfect for real-time mobile use.
  • App.js: We request camera permission using Camera.requestCameraPermissionsAsync(). We then display a loading message until our custom hook reports that the model is ready. Finally, we render the full-screen Camera component.

Run the app now with npx expo start and open it on your phone. You should see the loading message, followed by "Ready!", and your front camera feed.

Step 3: Real-time Pose Estimation

This is where the magic happens! We need to capture frames from the camera, send them to our model, and get the pose keypoints back. expo-camera doesn't expose a frame-by-frame stream on its own, so we'll use the cameraWithTensors helper from @tensorflow/tfjs-react-native, which wraps the camera component and hands us a stream of image tensors.

Note: For a production app, react-native-vision-camera with its Frame Processor API is a more performant choice as it avoids the overhead of saving images.

What we're doing

We'll create a function that continuously captures images from the camera, converts them to the tensor format TensorFlow.js expects, runs the model, and stores the detected poses in our component's state.

Implementation

Update App.js with the pose detection logic:

```javascript
// App.js (additions)
import * as tf from '@tensorflow/tfjs';
import { cameraWithTensors } from '@tensorflow/tfjs-react-native';

// Wrap Camera with the cameraWithTensors HOC
const TensorCamera = cameraWithTensors(Camera);

// ... inside the App component
const [poses, setPoses] = useState([]);
const cameraRef = React.useRef(null);

// Called once the camera is ready, with an iterator of image tensors
const handleCameraStream = (images) => {
    const loop = async () => {
      const nextImageTensor = images.next().value;

      if (!nextImageTensor) {
        // No new image yet — try again on the next frame
        requestAnimationFrame(loop);
        return;
      }

      // 1. Estimate poses
      const estimatedPoses = await model.estimatePoses(nextImageTensor);
      setPoses(estimatedPoses);

      // 2. Dispose the tensor to free up memory
      tf.dispose(nextImageTensor);

      // 3. Loop to the next frame
      requestAnimationFrame(loop);
    };
    loop();
};

// ... inside the return statement, replace <Camera> with <TensorCamera>
<TensorCamera
    ref={cameraRef}
    style={styles.camera}
    type={CameraType.front}
    cameraTextureHeight={1920}
    cameraTextureWidth={1080}
    resizeHeight={200}
    resizeWidth={152}
    resizeDepth={3}
    onReady={handleCameraStream}
    autorender={true}
/>
```

How it works

  • cameraWithTensors: This is a Higher-Order Component from @tensorflow/tfjs-react-native. It wraps the standard Camera component and provides a new prop, onReady, which gives us a stream of image tensors directly from the camera feed. This is far more efficient than taking pictures and converting them manually.
  • handleCameraStream: This function receives the image stream. We create an async loop that gets the nextImageTensor from the stream.
  • model.estimatePoses(): This is the core TensorFlow.js function. We pass it the image tensor, and it returns an array of detected poses. Since we're using a single-pose model, this array will have at most one pose object.
  • tf.dispose(): Crucial for performance! Tensors consume GPU memory. We must explicitly dispose of them when we're done to prevent memory leaks that would crash the app.
  • requestAnimationFrame(loop): This creates an efficient loop that processes frames as fast as the device can handle without blocking the UI thread.
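The acquire-process-dispose pattern in handleCameraStream can be sketched independently of the camera. In the snippet below, FakeTensor and processStream are our own stand-ins (not part of TensorFlow.js): a generator simulates the tensor iterator that cameraWithTensors provides, and the loop verifies every frame is released even if inference throws.

```javascript
// Minimal sketch of the acquire-process-dispose loop.
// FakeTensor stands in for a tf.Tensor: real tensors hold GPU memory
// until dispose() is called, which is why this bookkeeping matters.
class FakeTensor {
  constructor(id) {
    this.id = id;
    this.disposed = false;
  }
  dispose() {
    this.disposed = true;
  }
}

// Mimics the iterator passed to onReady: each next() yields a frame tensor.
function* fakeTensorStream(frameCount) {
  for (let i = 0; i < frameCount; i++) {
    yield new FakeTensor(i);
  }
}

// Process every frame, always disposing — even if inference fails.
function processStream(images, estimatePoses) {
  const results = [];
  let tensor;
  while ((tensor = images.next().value)) {
    try {
      results.push(estimatePoses(tensor));
    } finally {
      tensor.dispose(); // the step that prevents the memory leak
    }
  }
  return results;
}

const stream = fakeTensorStream(3);
const frames = [];
processStream(stream, (t) => frames.push(t));
console.log(frames.every((t) => t.disposed)); // true — nothing leaked
```

The try/finally wrapper is the part worth carrying into real code: if model.estimatePoses rejects, the tensor still gets disposed.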

Step 4: Drawing the Skeleton (Visual Feedback)

Now that we have the pose data, let's visualize it! We'll overlay a Skia Canvas on top of the camera and draw the keypoints and the lines connecting them.

What we're doing

We'll create a new component, PoseSkeleton, that takes the pose data as a prop and uses React Native Skia to draw circles for the keypoints and lines for the bones.

Implementation

First, create src/PoseSkeleton.js:

```javascript
// src/PoseSkeleton.js
import React from 'react';
import { Canvas, Path, Skia, Circle } from '@shopify/react-native-skia';
import { StyleSheet } from 'react-native';

// Must match the resizeWidth/resizeHeight props on <TensorCamera>:
// keypoints come back in the resized tensor's coordinate space.
const TENSOR_WIDTH = 152;
const TENSOR_HEIGHT = 200;

const PoseSkeleton = ({ poses, cameraWidth, cameraHeight }) => {
  if (!poses || poses.length === 0) {
    return null;
  }

  // Scale factors from tensor space to screen space
  const scaleX = cameraWidth / TENSOR_WIDTH;
  const scaleY = cameraHeight / TENSOR_HEIGHT;

  // Define connections between keypoints to form the skeleton
  const skeletonConnections = [
    ['nose', 'left_eye'], ['left_eye', 'left_ear'],
    ['nose', 'right_eye'], ['right_eye', 'right_ear'],
    ['nose', 'left_shoulder'], ['nose', 'right_shoulder'],
    ['left_shoulder', 'right_shoulder'],
    ['left_shoulder', 'left_elbow'], ['left_elbow', 'left_wrist'],
    ['right_shoulder', 'right_elbow'], ['right_elbow', 'right_wrist'],
    ['left_shoulder', 'left_hip'], ['right_shoulder', 'right_hip'],
    ['left_hip', 'right_hip'],
    ['left_hip', 'left_knee'], ['left_knee', 'left_ankle'],
    ['right_hip', 'right_knee'], ['right_knee', 'right_ankle'],
  ];

  const pose = poses[0]; // Single-pose model: at most one pose in the array
  const keypoints = pose.keypoints.reduce((acc, keypoint) => {
    acc[keypoint.name] = {
      ...keypoint,
      x: keypoint.x * scaleX,
      y: keypoint.y * scaleY,
    };
    return acc;
  }, {});

  const path = Skia.Path.Make();
  skeletonConnections.forEach(([startName, endName]) => {
    const start = keypoints[startName];
    const end = keypoints[endName];
    if (start && end && start.score > 0.5 && end.score > 0.5) {
      path.moveTo(start.x, start.y);
      path.lineTo(end.x, end.y);
    }
  });

  return (
    <Canvas style={StyleSheet.absoluteFill}>
      {Object.values(keypoints).map((p) => {
        if (p.score > 0.5) { // Only draw confident keypoints
          return <Circle key={p.name} cx={p.x} cy={p.y} r={5} color="cyan" />;
        }
        return null;
      })}
      <Path path={path} color="aqua" style="stroke" strokeWidth={3} />
    </Canvas>
  );
};

export default PoseSkeleton;
```

Now integrate it into App.js:

```javascript
// App.js (additions)
import PoseSkeleton from './src/PoseSkeleton';

// ... inside the App component's return statement, after TensorCamera
{poses && poses.length > 0 && (
    <PoseSkeleton poses={poses} cameraWidth={width} cameraHeight={height} />
)}
```

How it works

  • The PoseSkeleton component receives the poses array.
  • We check if a keypoint's score (confidence level) is above a threshold (0.5) before drawing it.
  • Drawing Keypoints: We map over the keypoints and render a Skia <Circle> for each one.
  • Drawing the Skeleton: We create a skeletonConnections map that defines which joints to connect. We then create a Skia <Path> and use moveTo and lineTo to draw the lines, effectively connecting the dots.
  • We use StyleSheet.absoluteFill on the <Canvas> to make it a transparent overlay that perfectly matches the camera's dimensions.

If you run the app now, you should see a cyan skeleton overlaid on your body in real-time! ✨
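The reduce step that turns MoveNet's keypoint array into a name-keyed lookup is worth isolating, since Step 5 reuses it. Here is a standalone variation that also filters low-confidence points during the reduce (the sample keypoints are made-up values in the model's output shape):

```javascript
// MoveNet returns keypoints as an array of { name, x, y, score }.
// Indexing them by name makes the skeleton and angle code readable.
function keypointsByName(keypoints, minScore = 0.5) {
  return keypoints.reduce((acc, kp) => {
    if (kp.score >= minScore) acc[kp.name] = kp; // drop low-confidence points
    return acc;
  }, {});
}

// Made-up sample in MoveNet's output format:
const sample = [
  { name: 'left_hip', x: 80, y: 120, score: 0.9 },
  { name: 'left_knee', x: 82, y: 160, score: 0.8 },
  { name: 'left_ankle', x: 85, y: 198, score: 0.2 }, // occluded — filtered out
];

const byName = keypointsByName(sample);
console.log(Object.keys(byName)); // ['left_hip', 'left_knee']
```

Filtering in the reduce (rather than at draw time) means downstream code can treat a missing key as "joint not visible".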

Step 5: Implementing Squat Form Correction Logic

This is the "trainer" part of our AI trainer. We'll calculate the angle of the knees and hips to determine if a squat is deep enough and provide feedback.

What we're doing

We'll write a helper function to calculate the angle between three points. Then, in our main component, we'll use this function on the hip, knee, and ankle keypoints to measure the squat depth and display feedback.

Implementation

First, create a utility file src/geometry.js:

```javascript
// src/geometry.js
// Calculate the angle (in degrees, 0–180) at vertex b, formed by
// the rays b→a and b→c
export function calculateAngle(a, b, c) {
  const radians = Math.atan2(c.y - b.y, c.x - b.x) - Math.atan2(a.y - b.y, a.x - b.x);
  let angle = Math.abs(radians * 180.0 / Math.PI);
  if (angle > 180.0) {
    angle = 360 - angle;
  }
  return angle;
}
```

Now, let's use this in App.js to provide feedback.

```javascript
// App.js (additions)
import { calculateAngle } from './src/geometry';

// ... inside the App component
const [feedback, setFeedback] = useState('');

// This useEffect will run whenever the poses change
useEffect(() => {
    if (poses && poses.length > 0) {
      const keypoints = poses[0].keypoints.reduce((acc, keypoint) => {
          acc[keypoint.name] = keypoint;
          return acc;
      }, {});

      const leftHip = keypoints['left_hip'];
      const leftKnee = keypoints['left_knee'];
      const leftAnkle = keypoints['left_ankle'];

      if (leftHip && leftKnee && leftAnkle &&
          leftHip.score > 0.5 && leftKnee.score > 0.5 && leftAnkle.score > 0.5) {

        // Calculate the angle of the left knee
        const kneeAngle = calculateAngle(leftHip, leftKnee, leftAnkle);

        // Simple logic for squat depth
        if (kneeAngle > 160) {
          setFeedback('Start Squat');
        } else if (kneeAngle < 100) {
          setFeedback('Good Squat! 👍');
        } else {
          setFeedback('Go Lower!');
        }
      }
    }
}, [poses]);

// ... in the return statement, add a Text component for feedback
<Text style={styles.feedbackText}>{feedback}</Text>

// ... in StyleSheet, add styles for the feedback text
feedbackText: {
    position: 'absolute',
    bottom: 100,
    left: 20,
    backgroundColor: 'rgba(0, 0, 0, 0.5)',
    color: 'white',
    fontSize: 24,
    padding: 10,
    borderRadius: 5,
    zIndex: 10,
}
```

How it works

  • calculateAngle: This function uses vector math (atan2) to find the angle formed by three keypoints. We use this to measure the bend in our joints.
  • useEffect hook: We run our analysis logic inside a useEffect that depends on poses. This ensures the logic runs every time a new pose is detected.
  • Form Logic: We get the hip, knee, and ankle keypoints. We calculate the kneeAngle and apply simple rules:
    • If the angle is > 160°, the user is likely standing.
    • If the angle is < 100°, they've reached a good squat depth.
    • Otherwise, they need to go lower.
  • The feedback is stored in state and displayed in a <Text> component overlaid on the screen.
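To sanity-check the thresholds, you can run calculateAngle on hand-picked coordinates. The function body is repeated here so the snippet is self-contained; the coordinates are invented for illustration:

```javascript
// Same angle calculation as src/geometry.js: angle at vertex b,
// formed by the rays b→a and b→c, in degrees (0–180).
function calculateAngle(a, b, c) {
  const radians =
    Math.atan2(c.y - b.y, c.x - b.x) - Math.atan2(a.y - b.y, a.x - b.x);
  let angle = Math.abs((radians * 180.0) / Math.PI);
  if (angle > 180.0) angle = 360 - angle;
  return angle;
}

// Standing: hip, knee, ankle roughly in a vertical line → ≈ 180°
console.log(calculateAngle({ x: 0, y: 0 }, { x: 0, y: 1 }, { x: 0, y: 2 }));

// Deep squat: hip level with the knee → ≈ 90° at the knee
console.log(calculateAngle({ x: 1, y: 1 }, { x: 0, y: 1 }, { x: 0, y: 2 }));
```

Note that image coordinates have y increasing downward, but the angle at the middle vertex is unaffected by that flip.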

Performance Considerations

  • Model Choice: We used SINGLEPOSE_LIGHTNING for a reason—it's fast. For higher accuracy, you could use SINGLEPOSE_THUNDER, but expect a performance hit.
  • Tensor Management: Forgetting to call tf.dispose() on tensors will lead to memory leaks and app crashes. cameraWithTensors helps manage this, but be mindful if you create tensors manually.
  • Frame Rate: The cameraWithTensors approach is good but not perfect. For production-grade apps that need consistent high FPS, explore react-native-vision-camera and its JSI-based Frame Processors, which allow you to run JavaScript code synchronously on the camera thread for maximum performance.
  • Model Quantization: For even better performance, you can use post-training quantization to reduce the size of your model, leading to faster load times and inference.
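Another cheap optimization is to run inference only on every Nth frame while still rendering the camera at full rate. The helper below is a hypothetical sketch of such a throttle (makeFrameThrottle is our own name, independent of TensorFlow.js):

```javascript
// Run an expensive callback only every Nth invocation —
// e.g. estimate poses on every 3rd camera frame to save battery.
function makeFrameThrottle(n, onFrame) {
  let counter = 0;
  return () => {
    counter++;
    if (counter % n === 0) {
      onFrame();
      return true; // inference ran this frame
    }
    return false; // frame skipped
  };
}

let inferences = 0;
const tick = makeFrameThrottle(3, () => inferences++);
const ran = [];
for (let i = 0; i < 9; i++) ran.push(tick());
console.log(inferences); // 3 — one inference per 3 frames
```

In the app, you would call the throttled function inside the requestAnimationFrame loop and reuse the previous pose for skipped frames.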

Security Best Practices

One of the biggest advantages of this architecture is its privacy.

  • On-Device Processing: All image processing happens on the user's device. No video or personal data is sent to the cloud. This is a critical feature for any health and fitness app and a major selling point.
  • Permissions: We correctly request camera permissions and handle the case where the user denies them. Always explain why you need a permission before asking for it.

Conclusion

Congratulations! You've just built a sophisticated AI-powered fitness application with React Native. We've taken a real-time camera feed, used a powerful on-device machine learning model to understand human poses, and provided instant, actionable feedback to help a user improve their squat form.

This is just the beginning. You can now expand on this foundation:

  • Add logic for other exercises (push-ups, lunges).
  • Implement a rep counter.
  • Create full workout routines.
  • Track progress over time and save it locally.

The combination of mobile development with on-device AI opens up a world of possibilities for creating intelligent, private, and genuinely helpful applications.


Frequently Asked Questions

Q: Can this detect multiple people in the frame at once?

A: The SINGLEPOSE_LIGHTNING model we used detects only one person. For multi-person detection, you would use the MULTIPOSE_LIGHTNING variant, though it comes with increased computational cost. For a workout app focused on personal training, single-pose detection is usually sufficient.

Q: How accurate is the angle calculation for squat depth?

A: The accuracy depends on the quality of the pose detection and the camera angle. For best results, users should position their phone so their full body is visible and the camera is positioned at roughly hip height. You may want to add calibration instructions in your app to guide users.

Q: Can I add rep counting to this application?

A: Yes! Rep counting can be implemented by tracking the knee angle over time and counting each complete cycle of going below the target depth and returning to standing. You'd need to add state management to track rep count and debounce the counting logic to avoid counting partial reps.
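The debounced rep counter described above can be sketched as a small state machine over knee angles. The thresholds are borrowed from Step 5; the RepCounter class itself is our own illustrative design:

```javascript
// Counts one rep per full cycle: standing (angle > 160°) → deep (angle < 100°)
// → standing again. Two separate thresholds debounce jitter around a single
// cutoff, so partial dips don't count.
class RepCounter {
  constructor({ standingAngle = 160, depthAngle = 100 } = {}) {
    this.standingAngle = standingAngle;
    this.depthAngle = depthAngle;
    this.phase = 'standing';
    this.reps = 0;
  }
  update(kneeAngle) {
    if (this.phase === 'standing' && kneeAngle < this.depthAngle) {
      this.phase = 'down'; // reached depth
    } else if (this.phase === 'down' && kneeAngle > this.standingAngle) {
      this.phase = 'standing';
      this.reps++; // completed a full cycle
    }
    return this.reps;
  }
}

const counter = new RepCounter();
// Simulated knee angles over two squats, with a partial dip in the middle:
[170, 120, 95, 130, 170, 150, 120, 170, 90, 175].forEach((a) => counter.update(a));
console.log(counter.reps); // 2 — the partial dip (120 → 170) is not counted
```

In the app, counter.update(kneeAngle) would be called from the same useEffect that sets the feedback text.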

Q: Will this work on older phones?

A: TensorFlow.js and MoveNet are optimized to run on a range of devices, but performance will vary. Older phones may experience lower frame rates. You can add performance settings to adjust the resolution or frame rate based on device capabilities.

Q: How do I add support for other exercises like push-ups or lunges?

A: Each exercise requires different keypoint analysis. For push-ups, you'd track elbow angles. For lunges, you'd monitor both front and back knee angles. The pattern is the same: identify the relevant keypoints, calculate the appropriate angles, and define thresholds for good form.
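This pattern generalizes naturally into a table of rules: each exercise names the three keypoints that form the joint and the angle band that counts as good form. The rule set below is a hypothetical sketch with illustrative (not clinically validated) thresholds:

```javascript
// Each rule: the three keypoint names forming the joint, and the angle
// below which the bottom of the movement counts as good depth.
const EXERCISE_RULES = {
  squat: { joint: ['left_hip', 'left_knee', 'left_ankle'], goodBelow: 100 },
  pushup: { joint: ['left_shoulder', 'left_elbow', 'left_wrist'], goodBelow: 90 },
};

// Classify a measured joint angle against the rule for a given exercise.
function evaluate(exercise, angle) {
  const rule = EXERCISE_RULES[exercise];
  if (!rule) return 'unknown exercise';
  return angle < rule.goodBelow ? 'good depth' : 'go lower';
}

console.log(evaluate('squat', 95)); // 'good depth'
console.log(evaluate('pushup', 120)); // 'go lower'
```

Adding a new exercise then means adding one entry to the table rather than a new branch of if/else logic.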


Article Tags

reactnative
ai
tensorflow
healthtech


WellAlly's core development team, comprised of healthcare professionals, software engineers, and UX designers committed to revolutionizing digital health management.

