We've all been there: following a workout video at home, wondering if our form is right. Is my back straight during this squat? Are my knees caving in? Incorrect exercise form not only reduces the effectiveness of a workout but can easily lead to injury. Professional trainers are expensive, and most fitness apps are just glorified video players.
What if your phone could be your personal trainer?
In this tutorial, we'll build exactly that: a React Native application that uses your phone's camera and on-device machine learning to provide real-time feedback on your exercise form. We'll leverage the power of TensorFlow.js to run a state-of-the-art pose estimation model called MoveNet directly on the device. This approach is fast, works offline, and, most importantly, is completely private—the camera feed is never sent to a server.
What we'll build: We will create a simple workout assistant that:
- Accesses the front-facing camera using Expo Camera.
- Analyzes the video stream in real-time with TensorFlow.js and the MoveNet pose detection model.
- Draws the detected body keypoints and skeleton over the camera feed.
- Calculates joint angles to determine if a squat is being performed correctly.
- Displays simple, actionable feedback like "Go Lower!" or "Good Squat!".
This project is a perfect entry point into the exciting world of AI-powered health tech, combining computer vision and mobile development in a practical, impactful way.
Understanding the Problem
The core challenge is translating a visual stream of a person exercising into actionable feedback. This breaks down into a few technical hurdles:
- Real-time Analysis: We need to process video frames quickly enough to give immediate feedback without lagging.
- Precise Pose Estimation: The system must accurately identify the positions of key body joints (shoulders, elbows, hips, knees, etc.).
- Biomechanical Logic: Once we have the joint positions, we need to apply rules based on biomechanics to evaluate the form. For example, in a squat, we need to measure the angle of the knee to determine depth.
- On-Device Performance: Running a deep learning model on a phone can be resource-intensive. We need an efficient model and optimized code to ensure a smooth user experience.
Our solution uses MoveNet, a model optimized for speed and accuracy on devices like smartphones, making it ideal for our real-time application.
Prerequisites
To follow this tutorial, you'll need:
- Node.js (LTS version recommended).
- Yarn or npm for package management.
- The Expo Go app on your iOS or Android phone for testing.
- A physical device is required to test camera functionality. The iOS/Android simulators will not work.
- Basic knowledge of React and React Hooks (useState, useEffect).
Step 1: Setting Up the Project
First, let's create a new Expo project and install the necessary dependencies.
What we're doing
We'll initialize a blank Expo app and add libraries for the camera, TensorFlow.js, WebGL for GPU acceleration, and a 2D graphics library for drawing the feedback.
Implementation
Open your terminal and run the following commands:
# Create a new Expo project
npx create-expo-app ai-workout-trainer
cd ai-workout-trainer
# Install TensorFlow.js and its dependencies
npm install @tensorflow/tfjs @tensorflow/tfjs-react-native @tensorflow-models/pose-detection @react-native-async-storage/async-storage
# Install Expo modules for camera, WebGL (for GPU support), and file system
npm install expo-camera expo-gl expo-gl-cpp expo-file-system
# Install a canvas library for drawing. We'll use React Native Skia.
npm install @shopify/react-native-skia
How it works
- @tensorflow/tfjs: The core TensorFlow.js library.
- @tensorflow/tfjs-react-native: The React Native platform adapter for TensorFlow.js; it also provides the cameraWithTensors helper we'll use in Step 3.
- @tensorflow-models/pose-detection: Provides pre-trained models, including MoveNet.
- expo-camera: Allows us to access and display the device camera feed.
- expo-gl & expo-gl-cpp: These enable WebGL, allowing TensorFlow.js to use the phone's GPU for significantly faster model inference (a quick backend check follows this list).
- @shopify/react-native-skia: A powerful 2D graphics library we'll use to draw the skeleton and feedback on a canvas overlay.
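Before moving on, it can help to sanity-check that TensorFlow.js initializes and picks up a GPU-backed backend on your device. Here is a minimal, throwaway sketch; the exact backend name can vary by platform and library version, so treat the 'rn-webgl' value as an expectation, not a guarantee:
// tf-check.js (throwaway sanity check, assumes the packages above installed cleanly)
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-react-native'; // registers the React Native platform adapter
export async function checkTf() {
  // Wait for the backend to finish initializing
  await tf.ready();
  console.log('TF.js version:', tf.version.tfjs);
  // On a physical device this typically reports 'rn-webgl' (GPU via expo-gl);
  // a CPU backend here usually points to a misconfigured expo-gl install.
  console.log('Active backend:', tf.getBackend());
}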
Step 2: Integrating the Camera and TensorFlow.js
Now, let's get the camera running and load our AI model. We'll start by creating a custom hook to manage the TensorFlow.js setup.
What we're doing
We will set up the camera component, request user permissions, and create a reusable hook to initialize TensorFlow.js and load the MoveNet model. This keeps our main component clean.
Implementation
Create a new file src/useTensorFlow.js:
// src/useTensorFlow.js
import { useEffect, useState } from 'react';
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-react-native';
import * as poseDetection from '@tensorflow-models/pose-detection';
export const useTensorFlow = () => {
const [isReady, setIsReady] = useState(false);
const [model, setModel] = useState(null);
useEffect(() => {
const setup = async () => {
// 1. Wait for TensorFlow.js to be ready
await tf.ready();
// 2. Create a MoveNet detector
const detectorConfig = {
modelType: poseDetection.movenet.modelType.SINGLEPOSE_LIGHTNING,
};
const detector = await poseDetection.createDetector(
poseDetection.SupportedModels.MoveNet,
detectorConfig
);
setModel(detector);
// 3. Set the ready flag
setIsReady(true);
console.log('TF and model ready!');
};
setup();
}, []);
return { model, isReady };
};
Now, let's update App.js to use this hook and display the camera.
// App.js
import React, { useState, useEffect } from 'react';
import { StyleSheet, Text, View, Dimensions } from 'react-native';
import { Camera, CameraType } from 'expo-camera';
import { useTensorFlow } from './src/useTensorFlow';
const { width, height } = Dimensions.get('window');
export default function App() {
const [cameraPermission, setCameraPermission] = useState(null);
const { model, isReady } = useTensorFlow();
useEffect(() => {
(async () => {
const { status } = await Camera.requestCameraPermissionsAsync();
setCameraPermission(status === 'granted');
})();
}, []);
if (cameraPermission === null) {
return <View />;
}
if (cameraPermission === false) {
return <Text>No access to camera</Text>;
}
return (
<View style={styles.container}>
<Text style={styles.loadingText}>
{isReady ? 'Ready!' : 'Loading Model...'}
</Text>
<Camera
style={styles.camera}
type={CameraType.front}
// We'll add the frame processing logic here later
/>
</View>
);
}
const styles = StyleSheet.create({
container: {
flex: 1,
backgroundColor: '#fff',
alignItems: 'center',
justifyContent: 'center',
},
camera: {
width: width,
height: height,
},
loadingText: {
position: 'absolute',
top: 50,
color: 'white',
fontSize: 18,
zIndex: 10,
},
});
How it works
- useTensorFlow hook: This encapsulates the model loading logic. We call tf.ready() to initialize the TF.js backend and then poseDetection.createDetector to load the pre-trained MoveNet model. We chose SINGLEPOSE_LIGHTNING because it's the fastest version, perfect for real-time mobile use (a cleanup variation follows this list).
- App.js: We request camera permission using Camera.requestCameraPermissionsAsync(). We then display a loading message until our custom hook reports that the model is ready. Finally, we render the full-screen Camera component.
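One refinement worth considering: the detector holds on to model resources, so if the component using this hook can unmount, you may want to release them in the effect's cleanup. A minimal variation of the hook's effect, assuming the dispose() method exposed by the pose-detection detector:
// src/useTensorFlow.js (variation): release the detector on unmount
useEffect(() => {
  let detector = null;
  let cancelled = false;
  const setup = async () => {
    await tf.ready();
    detector = await poseDetection.createDetector(
      poseDetection.SupportedModels.MoveNet,
      { modelType: poseDetection.movenet.modelType.SINGLEPOSE_LIGHTNING }
    );
    if (cancelled) return;
    setModel(detector);
    setIsReady(true);
  };
  setup();
  return () => {
    // Cleanup: avoid setting state after unmount, and free the model's resources
    cancelled = true;
    if (detector) detector.dispose();
  };
}, []);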
Run the app now with npx expo start and open it on your phone. You should see the loading message, followed by "Ready!", and your front camera feed.
Step 3: Real-time Pose Estimation
This is where the magic happens! We need to capture frames from the camera, send them to our model, and get the pose keypoints back. expo-camera doesn't expose a frame-by-frame stream on its own, so we'll use the cameraWithTensors helper from @tensorflow/tfjs-react-native, which wraps the camera and hands us a stream of image tensors.
Note: For a production app, react-native-vision-camera with its Frame Processor API is a more performant choice, as it processes frames synchronously on the camera thread.
What we're doing
We'll wire up a handler that receives a stream of image tensors from the camera, runs the model on each frame, and stores the detected poses in our component's state.
Implementation
Update App.js with the pose detection logic:
// App.js (additions)
import * as tf from '@tensorflow/tfjs';
import { cameraWithTensors } from '@tensorflow/tfjs-react-native';
// Wrap Camera with cameraWithTensors HOC
const TensorCamera = cameraWithTensors(Camera);
// ... inside the App component
const [poses, setPoses] = useState([]);
const cameraRef = React.useRef(null);
// This function will be called for each frame from the camera
const handleCameraStream = (images) => {
const loop = async () => {
const nextImageTensor = images.next().value;
if (!nextImageTensor) {
// No new image, do nothing
requestAnimationFrame(loop);
return;
}
// 1. Estimate poses
const estimatedPoses = await model.estimatePoses(nextImageTensor);
setPoses(estimatedPoses);
// 2. Dispose the tensor to free up memory
tf.dispose(nextImageTensor);
// 3. Loop to the next frame
requestAnimationFrame(loop);
};
loop();
};
// ... inside the return statement, replace <Camera> with <TensorCamera>.
// Render it only once `isReady` is true (e.g. {isReady && <TensorCamera ... />}) so `model` is loaded before frames arrive.
<TensorCamera
ref={cameraRef}
style={styles.camera}
type={CameraType.front}
cameraTextureHeight={1920}
cameraTextureWidth={1080}
resizeHeight={200}
resizeWidth={152}
resizeDepth={3}
onReady={handleCameraStream}
autorender={true}
/>
How it works
- cameraWithTensors: This is a Higher-Order Component from @tensorflow/tfjs-react-native. It wraps the standard Camera component and provides a new prop, onReady, which gives us a stream of image tensors directly from the camera feed. This is far more efficient than taking pictures and converting them manually.
- handleCameraStream: This function receives the image stream. We create an async loop that gets the nextImageTensor from the stream.
- model.estimatePoses(): This is the core TensorFlow.js function. We pass it the image tensor, and it returns an array of detected poses. Since we're using a single-pose model, this array will have at most one pose object.
- tf.dispose(): Crucial for performance! Tensors consume GPU memory. We must explicitly dispose of them when we're done to prevent memory leaks that would crash the app (see the more defensive loop sketched after this list).
- requestAnimationFrame(loop): This creates an efficient loop that processes frames as fast as the device can handle without blocking the UI thread.
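Because estimatePoses is asynchronous, it can't be wrapped in tf.tidy, so an error thrown mid-loop could leak the current frame tensor. The following is a slightly more defensive version of the same loop, a sketch under the same assumptions as the code above; it guards the dispose call with try/finally and skips inference while the model is still loading:
// App.js (variation): a more defensive frame loop
const handleCameraStream = (images) => {
  const loop = async () => {
    const nextImageTensor = images.next().value;
    if (!nextImageTensor || !model) {
      // No frame yet, or the model hasn't finished loading
      if (nextImageTensor) tf.dispose(nextImageTensor);
      requestAnimationFrame(loop);
      return;
    }
    try {
      const estimatedPoses = await model.estimatePoses(nextImageTensor);
      setPoses(estimatedPoses);
    } catch (err) {
      console.warn('Pose estimation failed for this frame:', err);
    } finally {
      // Always free the frame tensor, even if estimatePoses throws
      tf.dispose(nextImageTensor);
    }
    requestAnimationFrame(loop);
  };
  loop();
};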
Step 4: Drawing the Skeleton (Visual Feedback)
Now that we have the pose data, let's visualize it! We'll overlay a Skia Canvas on top of the camera and draw the keypoints and the lines connecting them.
What we're doing
We'll create a new component, PoseSkeleton, that takes the pose data as a prop and uses React Native Skia to draw circles for the keypoints and lines for the bones.
Implementation
First, create src/PoseSkeleton.js:
// src/PoseSkeleton.js
import React from 'react';
import { Canvas, Path, Skia, Circle } from '@shopify/react-native-skia';
import { StyleSheet } from 'react-native';
const PoseSkeleton = ({ poses, cameraWidth, cameraHeight }) => {
if (!poses || poses.length === 0) {
return null;
}
// Define connections between keypoints to form the skeleton
const skeletonConnections = [
['nose', 'left_eye'], ['left_eye', 'left_ear'],
['nose', 'right_eye'], ['right_eye', 'right_ear'],
['nose', 'left_shoulder'], ['nose', 'right_shoulder'],
['left_shoulder', 'right_shoulder'],
['left_shoulder', 'left_elbow'], ['left_elbow', 'left_wrist'],
['right_shoulder', 'right_elbow'], ['right_elbow', 'right_wrist'],
['left_shoulder', 'left_hip'], ['right_shoulder', 'right_hip'],
['left_hip', 'right_hip'],
['left_hip', 'left_knee'], ['left_knee', 'left_ankle'],
['right_hip', 'right_knee'], ['right_knee', 'right_ankle'],
];
const pose = poses[0]; // Single-pose model, so take the first (and only) detected pose
const keypoints = pose.keypoints.reduce((acc, keypoint) => {
acc[keypoint.name] = keypoint;
return acc;
}, {});
const path = Skia.Path.Make();
skeletonConnections.forEach(([startName, endName]) => {
const start = keypoints[startName];
const end = keypoints[endName];
if (start && end && start.score > 0.5 && end.score > 0.5) {
path.moveTo(start.x, start.y);
path.lineTo(end.x, end.y);
}
});
return (
<Canvas style={StyleSheet.absoluteFill}>
{Object.values(keypoints).map((p) => {
if (p.score > 0.5) { // Only draw confident keypoints
return <Circle key={p.name} cx={p.x} cy={p.y} r={5} color="cyan" />;
}
return null;
})}
<Path
path={path}
color="aqua"
style="stroke"
strokeWidth={3}
/>
</Canvas>
);
};
export default PoseSkeleton;
Now integrate it into App.js:
// App.js (additions)
import PoseSkeleton from './src/PoseSkeleton';
// ... inside the App component's return statement, after TensorCamera
{poses && poses.length > 0 && (
<PoseSkeleton poses={poses} cameraWidth={width} cameraHeight={height} />
)}
How it works
- The PoseSkeleton component receives the poses array.
- We check if a keypoint's score (confidence level) is above a threshold (0.5) before drawing it.
- Drawing Keypoints: We map over the keypoints and render a Skia <Circle> for each one.
- Drawing the Skeleton: We create a skeletonConnections map that defines which joints to connect. We then create a Skia <Path> and use moveTo and lineTo to draw the lines, effectively connecting the dots.
- We use StyleSheet.absoluteFill on the <Canvas> to make it a transparent overlay that perfectly matches the camera's dimensions.
If you run the app now, you should see a cyan skeleton overlaid on your body in real-time! ✨
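One thing to watch for: MoveNet returns keypoint coordinates in the pixel space of the tensor it was given, which here is the small resized image from Step 3 (resizeWidth 152 × resizeHeight 200), not the full-screen canvas. If the skeleton looks tiny, squashed, or offset, scale the coordinates before drawing. A minimal sketch using the cameraWidth and cameraHeight props PoseSkeleton already receives; the 152 and 200 constants are assumptions that must match whatever you pass to TensorCamera:
// src/PoseSkeleton.js (variation): map keypoints from tensor space to screen space
const TENSOR_WIDTH = 152;  // must match resizeWidth on <TensorCamera>
const TENSOR_HEIGHT = 200; // must match resizeHeight on <TensorCamera>
const toScreen = (keypoint, cameraWidth, cameraHeight) => ({
  ...keypoint,
  x: (keypoint.x / TENSOR_WIDTH) * cameraWidth,
  y: (keypoint.y / TENSOR_HEIGHT) * cameraHeight,
});
// Inside the component, scale each keypoint before building the lookup:
// const keypoints = pose.keypoints.reduce((acc, kp) => {
//   acc[kp.name] = toScreen(kp, cameraWidth, cameraHeight);
//   return acc;
// }, {});
With the front camera you may also need to mirror the x coordinate (cameraWidth - x) so the overlay matches the mirrored preview the user sees.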
Step 5: Implementing Squat Form Correction Logic
This is the "trainer" part of our AI trainer. We'll calculate the angle of the knees and hips to determine if a squat is deep enough and provide feedback.
What we're doing
We'll write a helper function to calculate the angle between three points. Then, in our main component, we'll use this function on the hip, knee, and ankle keypoints to measure the squat depth and display feedback.
Implementation
First, create a utility file src/geometry.js:
// src/geometry.js
// Function to calculate the angle between three points
export function calculateAngle(a, b, c) {
const radians = Math.atan2(c.y - b.y, c.x - b.x) - Math.atan2(a.y - b.y, a.x - b.x);
let angle = Math.abs(radians * 180.0 / Math.PI);
if (angle > 180.0) {
angle = 360 - angle;
}
return angle;
}
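As a quick sanity check on the math: two perpendicular segments meeting at the middle point should give roughly 90°, and three collinear points roughly 180°. A tiny, hypothetical usage example:
// Quick checks for calculateAngle (points use the same {x, y} shape as keypoints)
import { calculateAngle } from './src/geometry';
console.log(calculateAngle({ x: 0, y: 0 }, { x: 0, y: 1 }, { x: 1, y: 1 })); // ≈ 90
console.log(calculateAngle({ x: 0, y: 0 }, { x: 0, y: 1 }, { x: 0, y: 2 })); // ≈ 180 (fully extended)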
Now, let's use this in App.js to provide feedback.
// App.js (additions)
import { calculateAngle } from './src/geometry';
// ... inside the App component
const [feedback, setFeedback] = useState('');
// This useEffect will run whenever the poses change
useEffect(() => {
if (poses && poses.length > 0) {
const keypoints = poses[0].keypoints.reduce((acc, keypoint) => {
acc[keypoint.name] = keypoint;
return acc;
}, {});
const leftHip = keypoints['left_hip'];
const leftKnee = keypoints['left_knee'];
const leftAnkle = keypoints['left_ankle'];
if (leftHip && leftKnee && leftAnkle &&
leftHip.score > 0.5 && leftKnee.score > 0.5 && leftAnkle.score > 0.5) {
// Calculate the angle of the left knee
const kneeAngle = calculateAngle(leftHip, leftKnee, leftAnkle);
// Simple logic for squat depth
if (kneeAngle > 160) {
setFeedback('Start Squat');
} else if (kneeAngle < 100) {
setFeedback('Good Squat! 👍');
} else {
setFeedback('Go Lower!');
}
}
}
}, [poses]);
// ... in the return statement, add a Text component for feedback
<Text style={styles.feedbackText}>{feedback}</Text>
// ... in StyleSheet, add styles for the feedback text
feedbackText: {
position: 'absolute',
bottom: 100,
left: 20,
backgroundColor: 'rgba(0, 0, 0, 0.5)',
color: 'white',
fontSize: 24,
padding: 10,
borderRadius: 5,
zIndex: 10,
}
How it works
- calculateAngle: This function uses vector math (atan2) to find the angle formed by three keypoints. We use this to measure the bend in our joints.
- useEffect hook: We run our analysis logic inside a useEffect that depends on poses. This ensures the logic runs every time a new pose is detected.
- Form Logic: We get the hip, knee, and ankle keypoints, calculate the kneeAngle, and apply simple rules:
  - If the angle is > 160°, the user is likely standing.
  - If the angle is < 100°, they've reached a good squat depth.
  - Otherwise, they need to go lower.
- The feedback is stored in state and displayed in a <Text> component overlaid on the screen. (The sketch after this list builds a simple rep counter on the same angle thresholds.)
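Because this analysis runs on every new pose, the same thresholds can also drive a rep counter, one of the extensions mentioned in the conclusion. A minimal sketch, assuming the kneeAngle value computed above; the tiny up/down state machine and its thresholds are illustrative, not tuned:
// App.js (sketch): counting reps with a small state machine on the knee angle
const [repCount, setRepCount] = useState(0);
const squatPhase = React.useRef('up'); // 'up' (standing) or 'down' (at depth)
const updateRepCount = (kneeAngle) => {
  if (squatPhase.current === 'up' && kneeAngle < 100) {
    // The user has descended to a good depth
    squatPhase.current = 'down';
  } else if (squatPhase.current === 'down' && kneeAngle > 160) {
    // The user has returned to standing: count one full rep
    squatPhase.current = 'up';
    setRepCount((count) => count + 1);
  }
};
// Call updateRepCount(kneeAngle) inside the useEffect above,
// right after calculateAngle(leftHip, leftKnee, leftAnkle).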
Performance Considerations
- Model Choice: We used SINGLEPOSE_LIGHTNING for a reason: it's fast. For higher accuracy, you could use SINGLEPOSE_THUNDER, but expect a performance hit (see the sketch after this list).
- Tensor Management: Forgetting to call tf.dispose() on tensors will lead to memory leaks and app crashes. cameraWithTensors helps manage this, but be mindful if you create tensors manually.
- Frame Rate: The cameraWithTensors approach is good but not perfect. For production-grade apps that need consistent high FPS, explore react-native-vision-camera and its JSI-based Frame Processors, which allow you to run JavaScript code synchronously on the camera thread for maximum performance.
- Model Quantization: For even better performance, you can use post-training quantization to reduce the size of your model, leading to faster load times and inference.
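If you decide to trade frame rate for accuracy, the only change needed is the model type passed to createDetector. A minimal sketch, reusing the detectorConfig shape from useTensorFlow.js:
// src/useTensorFlow.js (variation): the slower, more accurate MoveNet variant
const detectorConfig = {
  modelType: poseDetection.movenet.modelType.SINGLEPOSE_THUNDER,
};
const detector = await poseDetection.createDetector(
  poseDetection.SupportedModels.MoveNet,
  detectorConfig
);
Benchmark both variants on your target devices before committing to Thunder.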
Security Best Practices
One of the biggest advantages of this architecture is its privacy.
- On-Device Processing: All image processing happens on the user's device. No video or personal data is sent to the cloud. This is a critical feature for any health and fitness app and a major selling point.
- Permissions: We correctly request camera permissions and handle the case where the user denies them. Always explain why you need a permission before asking for it; one way to do that is sketched below.
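A simple way to follow that advice is to show a short explanation before triggering the system permission dialog. A minimal sketch using React Native's Alert; the wording and flow are purely illustrative:
// App.js (sketch): explain why we need the camera before asking for it
import { Alert } from 'react-native';
const requestCameraWithRationale = () =>
  new Promise((resolve) => {
    Alert.alert(
      'Camera access',
      'We use the camera to analyze your exercise form on this device. No video ever leaves your phone.',
      [
        { text: 'Not now', style: 'cancel', onPress: () => resolve(false) },
        {
          text: 'Continue',
          onPress: async () => {
            const { status } = await Camera.requestCameraPermissionsAsync();
            resolve(status === 'granted');
          },
        },
      ]
    );
  });
You could then call requestCameraWithRationale() in place of the direct permission request in App.js.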
Conclusion
Congratulations! You've just built a sophisticated AI-powered fitness application with React Native. We've taken a real-time camera feed, used a powerful on-device machine learning model to understand human poses, and provided instant, actionable feedback to help a user improve their squat form.
This is just the beginning. You can now expand on this foundation:
- Add logic for other exercises (push-ups, lunges).
- Implement a rep counter.
- Create full workout routines.
- Track progress over time and save it locally (a small persistence sketch follows this list).
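Since @react-native-async-storage/async-storage is already installed, local progress tracking only needs a small persistence helper. A minimal sketch; the storage key and record shape are hypothetical:
// src/workoutStorage.js (sketch): persist workout history on the device
import AsyncStorage from '@react-native-async-storage/async-storage';
const STORAGE_KEY = 'workout-history'; // hypothetical key
export async function saveWorkout(entry) {
  const raw = await AsyncStorage.getItem(STORAGE_KEY);
  const history = raw ? JSON.parse(raw) : [];
  history.push({ ...entry, date: new Date().toISOString() });
  await AsyncStorage.setItem(STORAGE_KEY, JSON.stringify(history));
}
export async function loadWorkouts() {
  const raw = await AsyncStorage.getItem(STORAGE_KEY);
  return raw ? JSON.parse(raw) : [];
}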
The combination of mobile development with on-device AI opens up a world of possibilities for creating intelligent, private, and genuinely helpful applications.