MediaPipe Facemesh
This repository has been archived in favor of tfjs-models/face-landmarks-detection and will no longer be updated.
Please refer to face-landmarks-detection for future updates.
MediaPipe Facemesh is a lightweight machine learning pipeline predicting 468 3D facial landmarks to infer the approximate surface geometry of a human face (paper).
More background information about the model, as well as its performance characteristics on different datasets, can be found here: https://drive.google.com/file/d/1VFC_wIpw4O7xBOiTgUldl79d9LA-LsnA/view
The model is designed for front-facing cameras on mobile devices, where faces in view tend to occupy a relatively large fraction of the canvas. MediaPipe Facemesh may struggle to identify far-away faces.
Check out our demo, which uses the model to detect facial landmarks in a live video stream.
This model is also available as part of MediaPipe, a
framework for building multimodal applied ML pipelines.
Installation
Via script tags:
<script src="https://unpkg.com/@tensorflow/tfjs-core@2.1.0/dist/tf-core.js"></script>
<script src="https://unpkg.com/@tensorflow/tfjs-converter@2.1.0/dist/tf-converter.js"></script>
<script src="https://unpkg.com/@tensorflow/tfjs-backend-wasm@2.1.0/dist/tf-backend-wasm.js"></script>
Via npm:
Using yarn:

$ yarn add @tensorflow-models/facemesh
$ yarn add @tensorflow/tfjs-core @tensorflow/tfjs-converter
$ yarn add @tensorflow/tfjs-backend-wasm # or @tensorflow/tfjs-backend-webgl
Usage
If you are using the package via npm, first add:
const facemesh = require('@tensorflow-models/facemesh');
require('@tensorflow/tfjs-backend-wasm');
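Before loading the model, the WASM backend usually has to be selected explicitly via tfjs-core's standard `tf.setBackend()` API. A minimal sketch (run it inside an async context):

```js
const tf = require('@tensorflow/tfjs-core');

// Inside an async function, before facemesh.load():
await tf.setBackend('wasm'); // or 'webgl' if you installed the WebGL backend
```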
Then:
async function main() {
  // Load the MediaPipe Facemesh model.
  const model = await facemesh.load();

  // Pass in a video stream (or image, canvas, or tensor) to obtain an
  // array of detected faces.
  const predictions = await model.estimateFaces(document.querySelector("video"));

  if (predictions.length > 0) {
    for (let i = 0; i < predictions.length; i++) {
      // `scaledMesh` holds the 3D keypoints scaled to the input dimensions.
      const keypoints = predictions[i].scaledMesh;

      // Log each facial keypoint's [x, y, z] coordinates.
      for (let j = 0; j < keypoints.length; j++) {
        const [x, y, z] = keypoints[j];
        console.log(`Keypoint ${j}: [${x}, ${y}, ${z}]`);
      }
    }
  }
}

main();
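Each element of `predictions` describes one detected face. A sketch of its shape, with illustrative placeholder values:

```js
const prediction = {
  faceInViewConfidence: 0.99,  // probability that a face is present
  boundingBox: {               // face bounding box in the input coordinate space
    topLeft: [232.3, 145.3],
    bottomRight: [449.8, 308.4],
  },
  mesh: [[92.1, 119.5, -17.5]],        // 3D keypoints in the cropped face space
  scaledMesh: [[322.3, 297.6, -17.5]], // 3D keypoints scaled to the input size
  annotations: {                       // keypoints grouped by semantic region
    silhouette: [[326.2, 124.7, -3.8]],
  },
};
```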
Parameters for facemesh.load()
facemesh.load() takes a configuration object with the following properties (an example call follows the list):

- maxContinuousChecks - How many frames to go without running the bounding box detector. Only relevant if maxFaces > 1. Defaults to 5.
- detectionConfidence - Threshold for discarding a prediction. Defaults to 0.9.
- maxFaces - The maximum number of faces detected in the input. Should be set to the minimum number for performance. Defaults to 10.
- iouThreshold - A float representing the threshold for deciding whether boxes overlap too much in non-maximum suppression. Must be between [0, 1]. Defaults to 0.3.
- scoreThreshold - A threshold for deciding when to remove boxes based on score in non-maximum suppression. Defaults to 0.75.
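For example, to track a single face with a stricter detection threshold (a sketch; all properties are optional and fall back to the defaults above):

```js
// Inside an async function:
const model = await facemesh.load({
  maxFaces: 1,               // we expect at most one face in view
  detectionConfidence: 0.95, // discard lower-scoring predictions
});
```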
Parameters for model.estimateFaces()
model.estimateFaces() takes the following parameters (a usage sketch follows the list):
- input - The image to classify. Can be a tensor, DOM element image, video, or canvas.
- returnTensors - Whether to return tensors as opposed to values. Defaults to false.
- flipHorizontal - Whether to flip/mirror the facial keypoints horizontally. Should be true for videos that are flipped by default (e.g. webcams).
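A usage sketch, assuming the parameters are positional in the order listed above:

```js
// Inside an async function:
const predictions = await model.estimateFaces(
  document.querySelector("video"),
  false, // returnTensors: return plain values
  true   // flipHorizontal: mirror keypoints for a webcam feed
);
```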
Keypoints
Here is a map of the keypoints:
The UV coordinates for these keypoints are available via the getUVCoords() method on the FaceMesh model object. They can also be found in src/uv_coords.ts.
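A minimal sketch of reading them, assuming getUVCoords() returns one [u, v] pair per keypoint:

```js
const uvCoords = model.getUVCoords();

// Each entry maps keypoint i onto the canonical face texture.
const [u, v] = uvCoords[0];
console.log(`Keypoint 0 maps to texture coordinate (${u}, ${v})`);
```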