Retrieve specific chunks by their document ID and chunk number in a single batch operation. Useful for fetching exact chunks after retrieval or for building custom pipelines.
from morphik import Morphik

# Connect to Morphik (replace "your-uri" with your connection URI)
db = Morphik("your-uri")

# Fetch three specific chunks from two documents in a single call
chunks = db.batch_get_chunks(
    sources=[
        {"document_id": "doc_abc123", "chunk_number": 0},
        {"document_id": "doc_abc123", "chunk_number": 1},
        {"document_id": "doc_xyz789", "chunk_number": 5},
    ],
    folder_name="/reports",
    use_colpali=True,
    output_format="url",
)

# Print each chunk's identifiers and the first 200 characters of its content
for chunk in chunks:
    print(f"Doc {chunk.document_id}, Chunk {chunk.chunk_number}")
    print(f"Content: {chunk.content[:200]}...")

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sources | array | required | List of {document_id, chunk_number} objects |
| use_colpali | boolean | true | Use Morphik multimodal embeddings when available |
| output_format | string | "base64" | Image format: base64, url, or text |
| folder_name | string | null | Optional folder scope |
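
The `sources` list is often assembled from earlier retrieval hits rather than written by hand. The sketch below shows one possible way to do that. The helper `build_sources` and its `window` option are hypothetical and not part of the Morphik SDK; the code only prepares the list and does not call Morphik. It assumes each hit is a dict with `document_id` and `chunk_number` keys, and it does not check whether neighbouring chunk numbers actually exist in a document.

```python
from typing import Iterable


def build_sources(hits: Iterable[dict], window: int = 0) -> list[dict]:
    """Turn retrieval hits into a deduplicated `sources` list.

    Hypothetical helper, not a Morphik API.
    `window` adds that many neighbouring chunk numbers on each side of a hit.
    """
    seen = set()
    sources = []
    for hit in hits:
        doc_id = hit["document_id"]
        center = hit["chunk_number"]
        # Chunk numbers start at 0, so clamp the lower bound.
        for number in range(max(0, center - window), center + window + 1):
            key = (doc_id, number)
            if key in seen:
                continue  # skip chunks already requested
            seen.add(key)
            sources.append({"document_id": doc_id, "chunk_number": number})
    return sources


# Example hits, shaped like the entries in the request above.
hits = [
    {"document_id": "doc_abc123", "chunk_number": 0},
    {"document_id": "doc_abc123", "chunk_number": 1},
    {"document_id": "doc_xyz789", "chunk_number": 5},
]
sources = build_sources(hits, window=1)
```

The resulting list can then be passed as the `sources` argument. Deduplicating first keeps overlapping windows from requesting the same chunk twice.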

Response

[
  {
    "document_id": "doc_abc123",
    "chunk_number": 0,
    "content": "Introduction to the quarterly report...",
    "content_type": "text/plain",
    "score": 1.0,
    "metadata": { "department": "sales" }
  },
  {
    "document_id": "doc_abc123",
    "chunk_number": 1,
    "content": "Revenue highlights for Q4...",
    "content_type": "text/plain",
    "score": 1.0,
    "metadata": { "department": "sales" }
  }
]
Batch chunk retrieval is useful when you already know which chunks you need (e.g., from a previous retrieval result) and want to fetch their full content efficiently.
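
When building a custom pipeline, it can help to regroup the returned chunks by document and order them by chunk number before stitching passages together. The following is a minimal sketch that works on plain dicts shaped like the JSON response above. The function `group_by_document` is a hypothetical helper, not part of the SDK, and the sample data is copied from the response example (deliberately out of order).

```python
from collections import defaultdict


def group_by_document(chunks: list[dict]) -> dict[str, list[dict]]:
    """Group chunk dicts by document_id, sorted by chunk_number.

    Hypothetical helper operating on the JSON response shape.
    """
    grouped = defaultdict(list)
    for chunk in chunks:
        grouped[chunk["document_id"]].append(chunk)
    # Sort each document's chunks so they read in document order.
    return {
        doc_id: sorted(items, key=lambda c: c["chunk_number"])
        for doc_id, items in grouped.items()
    }


# Sample data taken from the response example, listed out of order.
response = [
    {"document_id": "doc_abc123", "chunk_number": 1,
     "content": "Revenue highlights for Q4..."},
    {"document_id": "doc_abc123", "chunk_number": 0,
     "content": "Introduction to the quarterly report..."},
]

by_doc = group_by_document(response)
ordered_text = [c["content"] for c in by_doc["doc_abc123"]]
```

With the Python SDK, where chunks are objects rather than dicts, the same idea would apply using attribute access such as `chunk.document_id`.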