Embeddings
An embedding turns a piece of text into a vector of numbers that captures
its meaning, so you can compare texts by distance (semantic search, clustering,
retrieval-augmented generation). Effect’s EmbeddingModel
service is the provider-agnostic interface for producing those vectors: your
code calls embed / embedMany, and which provider answers (OpenAI, an
OpenAI-compatible endpoint, a local model, …) is a Layer you wire up once.
The service is intentionally small. A provider only has to supply one batch
function — embedMany — and EmbeddingModel.make derives everything else from
it, including the ability to coalesce many concurrent single-input embed calls
into a single provider request.
import { Effect } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const program = Effect.gen(function* () {
  const model = yield* EmbeddingModel.EmbeddingModel

  // One input -> one vector.
  const response = yield* model.embed("the quick brown fox")
  return response.vector // => readonly number[]
})

Embedding a single string
embed takes one string and resolves to an EmbedResponse whose vector is
the embedding. Internally each embed call goes through a
request resolver, so several embed calls running concurrently
are batched into a single provider request (see Batching).
import { Effect } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const query = Effect.gen(function* () {
  const model = yield* EmbeddingModel.EmbeddingModel

  const { vector } = yield* model.embed("how do I cancel my order?")
  return vector // => [0.0123, -0.0481, 0.0099, ...]
})

Embedding many strings
When you already hold a batch, call embedMany. It preserves input order and
returns an EmbedManyResponse with one EmbedResponse per input plus
provider-reported token usage.
import { Effect } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const ingest = Effect.gen(function* () {
  const model = yield* EmbeddingModel.EmbeddingModel

  const response = yield* model.embedMany([
    "Effect is a TypeScript library",
    "Embeddings are numeric vectors",
    "Layers wire up dependencies"
  ])

  response.embeddings.length // => 3
  response.embeddings[0].vector // => number[] for the first input
  response.usage.inputTokens // => number | undefined (when the provider reports it)

  return response.embeddings.map((e) => e.vector)
})

Providing a concrete model
EmbeddingModel.EmbeddingModel is a Context.Service, a requirement that must
be satisfied by a Layer at the edge of your app. Provider packages expose
helpers that build that Layer (and the matching Dimensions
service) from a client.
import { OpenAiClient, OpenAiEmbeddingModel } from "@effect/ai-openai-compat"
import { Config, Effect, Layer } from "effect"
import { FetchHttpClient } from "effect/unstable/http"
import { EmbeddingModel } from "effect/unstable/ai"

// 1. The client Layer holds your API key and an HttpClient.
const OpenAiClientLayer = OpenAiClient.layerConfig({
  apiKey: Config.redacted("OPENAI_API_KEY")
}).pipe(Layer.provide(FetchHttpClient.layer))

// 2. The model Layer selects a concrete embedding model + its dimensions.
// `model(...)` provides BOTH EmbeddingModel and Dimensions.
const EmbeddingLayer = OpenAiEmbeddingModel.model("text-embedding-3-small", {
  dimensions: 1536
})

const program = Effect.gen(function* () {
  const model = yield* EmbeddingModel.EmbeddingModel
  const dimensions = yield* EmbeddingModel.Dimensions // => 1536

  const { vector } = yield* model.embed("hello world")
  return { vector, dimensions }
}).pipe(
  Effect.provide(EmbeddingLayer),
  Effect.provide(OpenAiClientLayer)
)

OpenAiEmbeddingModel.layer(...) provides only the EmbeddingModel service
(no Dimensions), for cases where you manage the vector size yourself.
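
If you do manage the vector size yourself, it can help to check each vector's length before handing it to a store. The following is a plain TypeScript sketch of such a check; assertDimensions is a hypothetical helper written for illustration, not something Effect or the provider packages export.

```typescript
// Hypothetical helper (not part of Effect): fail fast when a vector's length
// does not match the size the rest of the system expects.
const assertDimensions = (
  vector: ReadonlyArray<number>,
  expected: number
): ReadonlyArray<number> => {
  if (vector.length !== expected) {
    throw new Error(
      `expected a ${expected}-dimensional vector, got ${vector.length}`
    )
  }
  return vector
}

// A matching vector is returned unchanged.
const checkedVector = assertDimensions([0.1, 0.2, 0.3], 3)
```

In an Effect program the expected size could plausibly come from the Dimensions service described above rather than a hard-coded number.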
A fake provider for tests
You rarely build the provider Layer by hand in application code, but it is the
clearest way to see the contract: implement embedMany, and make derives the
rest. This is exactly how the real provider packages are built, and it is ideal
for unit tests.
import { Effect, Layer } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"
import type { AiError } from "effect/unstable/ai"

// A deterministic stub: each vector is just [length of the input].
const TestEmbeddingLayer = Layer.effect(
  EmbeddingModel.EmbeddingModel,
  EmbeddingModel.make({
    embedMany: ({ inputs }: EmbeddingModel.ProviderOptions): Effect.Effect<
      EmbeddingModel.ProviderResponse,
      AiError.AiError
    > =>
      Effect.succeed({
        results: inputs.map((input) => [input.length]),
        usage: { inputTokens: inputs.join(" ").length }
      })
  })
)

const test = Effect.gen(function* () {
  const model = yield* EmbeddingModel.EmbeddingModel
  const { vector } = yield* model.embed("hello")
  return vector // => [5]
}).pipe(Effect.provide(TestEmbeddingLayer))

Batching many embeddings
Embedding APIs are far more efficient per input when you send many inputs in one HTTP
request. EmbeddingModel.make builds embed on top of a
RequestResolver, so concurrent embed calls collapse into a
single embedMany provider call automatically — you do not have to hand-build
batches.
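
To build intuition for what coalescing means, here is a minimal Promise-based sketch in plain TypeScript. It is not how Effect's RequestResolver is implemented; coalesce and the stub provider are hypothetical names, and the sketch assumes only that calls made in the same tick should share one batch.

```typescript
type Vector = ReadonlyArray<number>
type BatchFn = (inputs: ReadonlyArray<string>) => Promise<ReadonlyArray<Vector>>

interface Pending {
  readonly input: string
  readonly resolve: (vector: Vector) => void
  readonly reject: (error: unknown) => void
}

// Hypothetical helper: collects calls made in the same tick and flushes them
// as a single batch on the next microtask.
const coalesce = (embedMany: BatchFn) => {
  let pending: Array<Pending> = []

  const flush = (): void => {
    const batch = pending
    pending = []
    embedMany(batch.map((p) => p.input)).then(
      (vectors) =>
        batch.forEach((p, i) => {
          const vector = vectors[i]
          if (vector === undefined) {
            p.reject(new Error("provider returned too few vectors"))
          } else {
            p.resolve(vector)
          }
        }),
      (error) => batch.forEach((p) => p.reject(error))
    )
  }

  return (input: string): Promise<Vector> =>
    new Promise<Vector>((resolve, reject) => {
      // The first call in a tick schedules the flush; later calls just join.
      if (pending.length === 0) void Promise.resolve().then(flush)
      pending.push({ input, resolve, reject })
    })
}

// Stub provider that records every batch it receives.
const seenBatches: Array<ReadonlyArray<string>> = []
const embedOne = coalesce(async (inputs) => {
  seenBatches.push(inputs)
  return inputs.map((s) => [s.length])
})

// Three calls issued together end up in one provider batch.
const coalesced = Promise.all([
  embedOne("apple"),
  embedOne("banana"),
  embedOne("cherry")
])
```

Each caller still receives its own vector, matched by position in the batch, which is why a provider must return results in input order.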
import { Effect } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const search = Effect.gen(function* () {
  const model = yield* EmbeddingModel.EmbeddingModel

  // Three independent embed calls running concurrently...
  const [a, b, c] = yield* Effect.all(
    [model.embed("apple"), model.embed("banana"), model.embed("cherry")],
    { concurrency: "unbounded" }
  )

  // ...are coalesced into ONE provider `embedMany(["apple","banana","cherry"])`
  // call. Vectors come back in request order.
  return [a.vector, b.vector, c.vector]
})

Because each embed is a request, results can also be cached.
The service exposes its underlying resolver, so you can wrap it with
RequestResolver.withCache and issue requests through
Effect.request directly — embedding the same string twice then hits the cache
instead of the provider.
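
As a rough mental model of that caching behaviour, the sketch below memoizes an embed function by input string in plain TypeScript. It is an illustration only and makes no claim about how RequestResolver.withCache works internally; withMemo and its eviction rule (drop the oldest entry once capacity is exceeded) are assumptions made for the example.

```typescript
type EmbedFn = (input: string) => Promise<ReadonlyArray<number>>

// Hypothetical helper: memoize by input string, evicting the oldest entry
// once `capacity` is exceeded (a Map iterates in insertion order).
// Note: in this sketch a failed request stays cached until it is evicted.
const withMemo = (embedFn: EmbedFn, capacity: number): EmbedFn => {
  const memo = new Map<string, Promise<ReadonlyArray<number>>>()
  return (input) => {
    const hit = memo.get(input)
    if (hit !== undefined) return hit
    const result = embedFn(input)
    memo.set(input, result)
    if (memo.size > capacity) {
      const oldest = memo.keys().next().value
      if (oldest !== undefined) memo.delete(oldest)
    }
    return result
  }
}

// Stub provider that counts how often it is actually called.
let providerCalls = 0
const memoizedEmbed = withMemo(async (input) => {
  providerCalls++
  return [input.length]
}, 256)

// The second identical input is answered from the memo table.
const repeated = Promise.all([
  memoizedEmbed("repeat me"),
  memoizedEmbed("repeat me")
])
```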
import { Effect, RequestResolver } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const cached = Effect.gen(function* () {
  const model = yield* EmbeddingModel.EmbeddingModel

  // Wrap the resolver in an in-memory cache (keyed by the request value).
  const cachedResolver = yield* RequestResolver.withCache(model.resolver, { capacity: 256 })

  const embed = (input: string) =>
    Effect.request(new EmbeddingModel.EmbeddingRequest({ input }), cachedResolver)

  // The second identical request is served from the cache.
  const first = yield* embed("repeat me")
  const second = yield* embed("repeat me")
  return [first.vector, second.vector]
})

A small semantic search
Putting it together: embed a corpus once with embedMany, embed the query with
embed, and rank documents by cosine similarity.
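
To make the ranking step concrete without a provider, here is a standalone sketch that scores made-up two-dimensional vectors. The vectors and document names are invented for illustration; real embeddings have hundreds or thousands of dimensions. Cosine similarity is the dot product divided by the product of the two vector lengths, so it is 1 for vectors pointing the same way and 0 for orthogonal ones.

```typescript
const cosineSimilarity = (
  a: ReadonlyArray<number>,
  b: ReadonlyArray<number>
): number => {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  }
  // An all-zero vector yields NaN here; real code may want to guard against it.
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Toy "embeddings", made up for this example.
const corpus = [
  { doc: "refund policy", vector: [1, 0] },
  { doc: "shipping times", vector: [0, 1] },
  { doc: "cancel an order", vector: [3, 4] }
]
const queryVector = [1, 0]

const ranked = corpus
  .map(({ doc, vector }) => ({ doc, score: cosineSimilarity(queryVector, vector) }))
  .sort((x, y) => y.score - x.score)
// ranked order: "refund policy" (1), "cancel an order" (0.6), "shipping times" (0)
```

The middle score follows from the arithmetic: the dot product of [1, 0] and [3, 4] is 3, and the lengths are 1 and 5, giving 3 / 5 = 0.6.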
import { Array, Effect } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const cosine = (a: ReadonlyArray<number>, b: ReadonlyArray<number>) => {
  let dot = 0
  let na = 0
  let nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb))
}

const semanticSearch = (query: string, documents: ReadonlyArray<string>) =>
  Effect.gen(function* () {
    const model = yield* EmbeddingModel.EmbeddingModel

    const { embeddings } = yield* model.embedMany(documents)
    const { vector } = yield* model.embed(query)

    return Array.map(documents, (doc, i) => ({
      doc,
      score: cosine(vector, embeddings[i].vector)
    })).sort((x, y) => y.score - x.score) // => documents ranked most-similar first
  })

API reference
Everything below is exported from effect/unstable/ai/EmbeddingModel.
EmbeddingModel
The Context.Service tag for embedding operations. Yield it to obtain a
Service with embed, embedMany, and the underlying resolver.
import { Effect } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const program = Effect.gen(function* () {
  const model = yield* EmbeddingModel.EmbeddingModel
  return yield* model.embed("hello")
})
// program requires: EmbeddingModel.EmbeddingModel

Dimensions
A separate Context.Service (a number) carrying the configured embedding
vector size. Provider helpers like OpenAiEmbeddingModel.model(...) provide it
alongside the model; downstream code (e.g. a vector store schema) can read it.
import { Effect, Layer } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const program = Effect.gen(function* () {
  const size = yield* EmbeddingModel.Dimensions
  return size // => 1536
}).pipe(Effect.provide(Layer.succeed(EmbeddingModel.Dimensions, 1536)))

Service
The interface behind the EmbeddingModel tag. embed resolves one input,
embedMany resolves a batch, and resolver is the low-level
RequestResolver<EmbeddingRequest> that embed is built on.
import type { Effect, RequestResolver } from "effect"
import type { EmbeddingModel } from "effect/unstable/ai"
import type { AiError } from "effect/unstable/ai"

// Shape (for reference):
interface Service {
  readonly resolver: RequestResolver.RequestResolver<EmbeddingModel.EmbeddingRequest>
  readonly embed: (
    input: string
  ) => Effect.Effect<EmbeddingModel.EmbedResponse, AiError.AiError>
  readonly embedMany: (
    input: ReadonlyArray<string>
  ) => Effect.Effect<EmbeddingModel.EmbedManyResponse, AiError.AiError>
}

make
Builds a Service from a single provider embedMany function. It wires up a
request resolver so concurrent embed calls batch into one provider call, and
short-circuits embedMany([]) without invoking the provider. Returns
Effect<Service> (typically wrapped in Layer.effect).
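
The derivation described here can be pictured with a small Promise-based sketch in plain TypeScript. It is a simplified stand-in, not the library's code: makeSketch and ProviderResult are hypothetical names, it returns raw vectors rather than EmbedResponse values, and it leaves out the request resolver and batching entirely. It shows only two ideas: the single-input call is derived from the batch call, and an empty batch never reaches the provider.

```typescript
type Vec = ReadonlyArray<number>

interface ProviderResult {
  readonly results: ReadonlyArray<Vec>
  readonly usage: { readonly inputTokens: number | undefined }
}

type Provider = (options: {
  readonly inputs: ReadonlyArray<string>
}) => Promise<ProviderResult>

// Hypothetical stand-in for the derivation: one provider function in,
// both a batch call and a single-input call out.
const makeSketch = (provider: Provider) => {
  const embedMany = async (
    inputs: ReadonlyArray<string>
  ): Promise<ProviderResult> =>
    inputs.length === 0
      ? { results: [], usage: { inputTokens: undefined } } // provider is skipped
      : provider({ inputs })

  const embed = async (input: string): Promise<Vec> => {
    const { results } = await embedMany([input])
    const first = results[0]
    if (first === undefined) throw new Error("provider returned no vector")
    return first
  }

  return { embed, embedMany }
}

// Stub provider that counts its invocations.
let providerInvocations = 0
const sketchService = makeSketch(async ({ inputs }) => {
  providerInvocations++
  return {
    results: inputs.map((s) => [s.length]),
    usage: { inputTokens: undefined }
  }
})

const emptyBatch = sketchService.embedMany([]) // does not call the provider
const singleVector = sketchService.embed("hello") // calls it once with ["hello"]
```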
import { Effect, Layer } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const layer = Layer.effect(
  EmbeddingModel.EmbeddingModel,
  EmbeddingModel.make({
    embedMany: ({ inputs }) =>
      Effect.succeed({
        results: inputs.map((s) => [s.length]),
        usage: { inputTokens: undefined }
      })
  })
)
// layer: Layer<EmbeddingModel.EmbeddingModel>

EmbedResponse
A Schema.Class for a single embedding result. Its only field is vector, the
array of finite numbers.
import { EmbeddingModel } from "effect/unstable/ai"

const r = new EmbeddingModel.EmbedResponse({ vector: [0.1, 0.2, 0.3] })
r.vector // => [0.1, 0.2, 0.3]

EmbedManyResponse
A Schema.Class for a batch result. embeddings is an array of
EmbedResponse in input order, and usage is an
EmbeddingUsage.
import { EmbeddingModel } from "effect/unstable/ai"

const r = new EmbeddingModel.EmbedManyResponse({
  embeddings: [
    new EmbeddingModel.EmbedResponse({ vector: [1, 2] }),
    new EmbeddingModel.EmbedResponse({ vector: [3, 4] })
  ],
  usage: new EmbeddingModel.EmbeddingUsage({ inputTokens: 9 })
})

r.embeddings.length // => 2
r.embeddings[1].vector // => [3, 4]
r.usage.inputTokens // => 9

EmbeddingUsage
A Schema.Class holding token usage metadata. inputTokens is number | undefined:
it is undefined when the provider does not report usage (or when
embedMany([]) skips the provider).
import { EmbeddingModel } from "effect/unstable/ai"

new EmbeddingModel.EmbeddingUsage({ inputTokens: 42 }).inputTokens // => 42
new EmbeddingModel.EmbeddingUsage({ inputTokens: undefined }).inputTokens // => undefined

EmbeddingRequest
A Request.TaggedClass representing one input to be embedded. It resolves to an
EmbedResponse and can fail with AiError. You build these
only when working directly with the resolver; embed does it for
you.
import { Effect } from "effect"
import { EmbeddingModel } from "effect/unstable/ai"

const program = Effect.gen(function* () {
  const { resolver } = yield* EmbeddingModel.EmbeddingModel
  // Issue a request directly against the resolver (what `embed` does internally).
  const response = yield* Effect.request(
    new EmbeddingModel.EmbeddingRequest({ input: "hello" }),
    resolver
  )
  return response.vector // => number[]
})

ProviderOptions
The input a provider’s embedMany receives: { inputs: ReadonlyArray<string> }.
This is the only argument your provider implementation is handed.
import type { EmbeddingModel } from "effect/unstable/ai"

const options: EmbeddingModel.ProviderOptions = {
  inputs: ["a", "b", "c"]
}
options.inputs // => ["a", "b", "c"]

ProviderResponse
The value a provider’s embedMany must return: results, an array of raw
numeric vectors (one per input, in order), and usage.inputTokens
(number | undefined).
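
When writing a provider by hand, a quick self-check of that contract can catch mistakes early. The sketch below is plain TypeScript and uses a hypothetical validateResponse helper with a locally declared ResponseShape type that mirrors the fields described above; neither is exported by Effect.

```typescript
// Local mirror of the fields described above (an assumption for this sketch).
interface ResponseShape {
  readonly results: ReadonlyArray<ReadonlyArray<number>>
  readonly usage: { readonly inputTokens: number | undefined }
}

// Hypothetical check: returns a list of problems, empty when the response
// has one finite-valued vector per input.
const validateResponse = (
  inputs: ReadonlyArray<string>,
  response: ResponseShape
): Array<string> => {
  const problems: Array<string> = []
  if (response.results.length !== inputs.length) {
    problems.push(
      `expected ${inputs.length} vectors, got ${response.results.length}`
    )
  }
  response.results.forEach((vector, i) => {
    if (!vector.every((n) => Number.isFinite(n))) {
      problems.push(`vector ${i} contains a non-finite number`)
    }
  })
  return problems
}

const wellFormed = validateResponse(["a", "b"], {
  results: [[0.1, 0.2], [0.3, 0.4]],
  usage: { inputTokens: 8 }
})
// wellFormed is an empty array

const malformed = validateResponse(["a", "b"], {
  results: [[0.1, Number.NaN]],
  usage: { inputTokens: undefined }
})
// malformed lists two problems: a missing vector and a non-finite value
```

The check cannot confirm that vectors are in input order, since that depends on the provider; it only verifies count and numeric validity.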
import type { EmbeddingModel } from "effect/unstable/ai"

const response: EmbeddingModel.ProviderResponse = {
  results: [
    [0.1, 0.2],
    [0.3, 0.4]
  ],
  usage: { inputTokens: 8 }
}
response.results.length // => 2
response.usage.inputTokens // => 8

See also
- Language Model: text generation, structured output, and streaming against a provider-agnostic model.
- Batching: the request/resolver machinery that powers embed batching and caching.
- Services and Layers: how the EmbeddingModel and Dimensions services are provided.