Async + bulk batching

For bulk pipelines (thousands of records), asyncio-based services (FastAPI/Starlette), or anything that benefits from concurrent calls under a single client, use AsyncMinerva instead of the sync Minerva entry point. The API surface is identical — just await every call.

Basic usage

import asyncio
from minerva import AsyncMinerva

async def main():
    async with AsyncMinerva() as mc:
        resp = await mc.api.enrich(
            [{"record_id": "1", "linkedin_url": "https://www.linkedin.com/in/x"}]
        )
        print(resp.to_df())

asyncio.run(main())

The async with form closes the underlying HTTP pool on exit. Without it, call await mc.close() manually.

Auto-batching with `enrich_many` / `resolve_many`

The Minerva API caps /v2/enrich and /v2/resolve at 500 records per request. Rather than write your own chunking + fan-out, the SDK’s batch helpers do it for you and aggregate results:

async with AsyncMinerva() as mc:
    results = await mc.api.enrich_many(
        my_10000_records,                  # any size
        batch_size=500,                    # default; capped at 500 (API max)
        max_concurrency=20,                # optional ceiling on in-flight batches
    )
    # `results` is a flat list of EnrichResult, concatenated in input order.

resolve_many has the same shape for resolves.

How `max_concurrency` interacts with rate limits

max_concurrency is a hard ceiling on simultaneously-in-flight batches — useful when you also hold a separate scarce resource (e.g. a downstream DB). For most users it’s safe to leave unset: the SDK’s built-in rate limiter governs throughput so you can fan out hundreds of batches without exceeding your plan.

Failure semantics

A single batch that errors propagates out of enrich_many / resolve_many and cancels in-flight peers. There’s no partial-success aggregation in the helpers. For partial-failure handling, fan out manually:

batches = [records[i:i+500] for i in range(0, len(records), 500)]
results = await asyncio.gather(
    *[mc.api.enrich(b) for b in batches],
    return_exceptions=True,
)
# results contains EnrichResponse or Exception per batch

Records arriving from a stream?

enrich_many / resolve_many expect the whole list in memory. If records arrive incrementally (CSV cursor, Kafka, SQS, generator), use the streaming batcher instead — feed it one record at a time and it fires when the buffer hits size. Same rate-limiter, same transient-retry semantics; just a different ergonomic entry point.

Bulk batching for custom endpoints

enrich_many and resolve_many only cover the typed endpoints. For an arbitrary preview / client-specific path (mc.api.call(...)), the equivalent is AsyncBatcher wrapping the call — see the streaming-batcher guide → “Bulk fan-out: 10k records through a custom endpoint” for the worked example.

Getting started

Endpoints

Power features

Reference

Basic usage

Auto-batching with `enrich_many` / `resolve_many`

How `max_concurrency` interacts with rate limits

Failure semantics

Records arriving from a stream?

Bulk batching for custom endpoints

​Basic usage

​Auto-batching with enrich_many / resolve_many

​How max_concurrency interacts with rate limits

​Failure semantics

​Records arriving from a stream?

​Bulk batching for custom endpoints

Basic usage

Auto-batching with `enrich_many` / `resolve_many`

How `max_concurrency` interacts with rate limits

Failure semantics

Records arriving from a stream?

Bulk batching for custom endpoints