# llm

Package to connect to LLM providers and trace LLM calls.
## Usage
```ts
import { LLM } from "@empiricalrun/llm";

const llm = new LLM({
  provider: "openai",
  defaultModel: "gpt-4o",
});

const llmResponse = await llm.createChatCompletion({ ... });
```
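For reference, a typical call might look like the sketch below. The option shape is an assumption: it mirrors the OpenAI chat completions API (a `messages` array), so check the package's TypeScript types for the authoritative signature.

```ts
// Sketch only: assumes createChatCompletion accepts an OpenAI-style
// messages array; the model falls back to defaultModel set above.
const llmResponse = await llm.createChatCompletion({
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize this test failure in one line." },
  ],
});
```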
## Vision utilities

This package also contains utilities for vision.
### Query

Ask a question about an image (e.g. to extract some information or make a decision) and get the answer.
```ts
import { query } from "@empiricalrun/llm/vision";

const data = await driver.saveScreenshot("dummy.png");
const instruction =
  "Extract number of ATOM tokens from the image. Return only the number.";
const text = await query(data.toString("base64"), instruction);
```
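`query` returns plain text, so any downstream parsing is up to the caller. With the instruction above, for example, you might convert the answer to a number (assuming the model followed the "return only the number" instruction):

```ts
// The instruction asked for only the number, so parse the raw text.
const atomTokens = Number(text.trim());
```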
### Get bounding boxes
```ts
import { getBoundingBox } from "@empiricalrun/llm/vision/bbox";

const data = await driver.saveScreenshot("dummy.png");
const instruction =
  "This screenshot shows a screen to send crypto tokens. What is the bounding box for the dropdown to select the token?";
const bbox = await getBoundingBox(data.toString("base64"), instruction);
const centerToTap = bbox.center;
```
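The center point is typically used to drive a tap in a UI test. A rough sketch, assuming `centerToTap` exposes `x`/`y` screen coordinates and that `driver` is a WebdriverIO session (both assumptions; adapt to your automation framework):

```ts
// Assumes centerToTap is { x, y } in screen coordinates and `driver`
// is a WebdriverIO session; adjust for your automation framework.
const { x, y } = centerToTap;
await driver.action("pointer").move({ x, y }).down().up().perform();
```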
Getting an accurate bounding box can require a few prompt iterations, and the `debug` flag helps with that. When set, the call also returns a base64-encoded image with the bounding box drawn on top of the original image.
```ts
const bbox = await getBoundingBox(data.toString("base64"), instruction, {
  debug: true,
});
console.log(bbox.annotatedImage);
```
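One way to inspect the annotated image while iterating on the prompt is to write it to disk. A minimal sketch, assuming `annotatedImage` is a raw base64 string without a data-URI prefix:

```ts
import fs from "node:fs";

// Assumes annotatedImage is raw base64 (no "data:image/png;base64," prefix).
fs.writeFileSync("bbox-debug.png", Buffer.from(bbox.annotatedImage, "base64"));
```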