Retrieve specific chunks by their document ID and chunk number in a single batch operation. This is useful for fetching exact chunks after a retrieval step or for building custom pipelines.
```python
from morphik import Morphik

db = Morphik("your-uri")

chunks = db.batch_get_chunks(
    sources=[
        {"document_id": "doc_abc123", "chunk_number": 0},
        {"document_id": "doc_abc123", "chunk_number": 1},
        {"document_id": "doc_xyz789", "chunk_number": 5},
    ],
    folder_name="/reports",
    use_colpali=True,
    output_format="url",
)

for chunk in chunks:
    print(f"Doc {chunk.document_id}, Chunk {chunk.chunk_number}")
    print(f"Content: {chunk.content[:200]}...")
```
```typescript
import Morphik from 'morphik';

// For Teams/Enterprise, use your dedicated host: https://companyname-api.morphik.ai
const client = new Morphik({
  apiKey: process.env.MORPHIK_API_KEY,
  baseURL: 'https://api.morphik.ai'
});

const chunks = await client.batch.retrieveChunks({
  sources: [
    { document_id: 'doc_abc123', chunk_number: 0 },
    { document_id: 'doc_abc123', chunk_number: 1 },
    { document_id: 'doc_xyz789', chunk_number: 5 }
  ],
  folder_name: '/reports',
  use_colpali: true,
  output_format: 'url'
});

chunks.forEach(chunk => {
  console.log(`Doc ${chunk.document_id}, Chunk ${chunk.chunk_number}`);
  console.log(`Content: ${chunk.content.slice(0, 200)}...`);
});
```
```bash
curl -X POST "https://api.morphik.ai/batch/chunks" \
  -H "Authorization: Bearer $MORPHIK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sources": [
      {"document_id": "doc_abc123", "chunk_number": 0},
      {"document_id": "doc_abc123", "chunk_number": 1},
      {"document_id": "doc_xyz789", "chunk_number": 5}
    ],
    "folder_name": "/reports",
    "use_colpali": true,
    "output_format": "url"
  }'
```
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `sources` | array | required | List of `{document_id, chunk_number}` objects |
| `use_colpali` | boolean | `true` | Use Morphik multimodal embeddings when available |
| `output_format` | string | `"base64"` | Image format: `base64`, `url`, or `text` |
| `folder_name` | string | `null` | Optional folder scope |
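As a rough illustration of how these parameters fit together, the sketch below assembles the JSON request body with the defaults from the table and rejects unknown `output_format` values before anything is sent. The helper name `build_batch_chunks_payload` is hypothetical and not part of the Morphik SDK, and the checks on empty `sources` and negative chunk numbers are assumptions rather than documented server behavior.

```python
from __future__ import annotations

from typing import Any

# Values listed for output_format in the parameters table.
ALLOWED_OUTPUT_FORMATS = {"base64", "url", "text"}


def build_batch_chunks_payload(
    sources: list[dict[str, Any]],
    folder_name: str | None = None,
    use_colpali: bool = True,
    output_format: str = "base64",
) -> dict[str, Any]:
    """Hypothetical helper: validate arguments and build the request body."""
    # Assumption: an empty list is treated as a client-side error here.
    if not sources:
        raise ValueError("sources is required and must not be empty")

    cleaned = []
    for source in sources:
        doc_id = source.get("document_id")
        number = source.get("chunk_number")
        if not isinstance(doc_id, str) or not doc_id:
            raise ValueError(f"invalid document_id in {source!r}")
        # Assumption: chunk numbers are non-negative integers (bool excluded).
        if isinstance(number, bool) or not isinstance(number, int) or number < 0:
            raise ValueError(f"invalid chunk_number in {source!r}")
        cleaned.append({"document_id": doc_id, "chunk_number": number})

    if output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(ALLOWED_OUTPUT_FORMATS)}")

    payload: dict[str, Any] = {
        "sources": cleaned,
        "use_colpali": use_colpali,
        "output_format": output_format,
    }
    # folder_name is optional, so it is only included when a scope is given.
    if folder_name is not None:
        payload["folder_name"] = folder_name
    return payload
```

The resulting dictionary has the same shape as the JSON body in the cURL example, so it could be serialized with `json.dumps` and posted to `/batch/chunks` by any HTTP client.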
Response
```json
[
  {
    "document_id": "doc_abc123",
    "chunk_number": 0,
    "content": "Introduction to the quarterly report...",
    "content_type": "text/plain",
    "score": 1.0,
    "metadata": { "department": "sales" }
  },
  {
    "document_id": "doc_abc123",
    "chunk_number": 1,
    "content": "Revenue highlights for Q4...",
    "content_type": "text/plain",
    "score": 1.0,
    "metadata": { "department": "sales" }
  }
]
```
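Because the response is a flat list, a common next step is to regroup it per document. The sketch below works on plain dictionaries shaped like the JSON above; the function names are illustrative only, and it assumes `content` holds text (for image chunks it may instead hold base64 data or a URL, depending on `output_format`).

```python
from collections import defaultdict
from typing import Any


def group_chunks_by_document(chunks: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group response items by document_id, ordered by chunk_number."""
    grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for chunk in chunks:
        grouped[chunk["document_id"]].append(chunk)
    return {
        doc_id: sorted(items, key=lambda c: c["chunk_number"])
        for doc_id, items in grouped.items()
    }


def join_document_text(chunks: list[dict[str, Any]], separator: str = "\n\n") -> dict[str, str]:
    """Concatenate the text content of each document's chunks in order."""
    return {
        doc_id: separator.join(c["content"] for c in items)
        for doc_id, items in group_chunks_by_document(chunks).items()
    }


# Example with the two items from the response above, deliberately out of order.
response = [
    {"document_id": "doc_abc123", "chunk_number": 1, "content": "Revenue highlights for Q4..."},
    {"document_id": "doc_abc123", "chunk_number": 0, "content": "Introduction to the quarterly report..."},
]
texts = join_document_text(response)
```

With the Python SDK, which returns objects rather than dictionaries, the same idea applies using attribute access (`chunk.document_id`, `chunk.chunk_number`, `chunk.content`).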
This is useful when you already know which chunks you need (e.g., from a previous retrieval result) and want to fetch their full content efficiently.
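One way to turn a previous retrieval result into a `sources` list is sketched below: it deduplicates hits and can optionally add neighboring chunks for extra context. The helper `sources_from_hits` and its `window` parameter are hypothetical, and the sketch assumes each hit exposes `document_id` and `chunk_number` as dictionary keys. How the API treats a requested chunk number that does not exist is not covered here, so neighbor expansion should be treated as best effort.

```python
from typing import Any


def sources_from_hits(hits: list[dict[str, Any]], window: int = 0) -> list[dict[str, Any]]:
    """Build a deduplicated sources list, optionally widened by `window` neighbors."""
    seen: set[tuple[str, int]] = set()
    sources: list[dict[str, Any]] = []
    for hit in hits:
        doc_id = hit["document_id"]
        center = hit["chunk_number"]
        # Clamp at 0 so no negative chunk numbers are ever requested.
        for number in range(max(0, center - window), center + window + 1):
            key = (doc_id, number)
            if key not in seen:
                seen.add(key)
                sources.append({"document_id": doc_id, "chunk_number": number})
    return sources


hits = [
    {"document_id": "doc_abc123", "chunk_number": 0},
    {"document_id": "doc_abc123", "chunk_number": 1},
    {"document_id": "doc_xyz789", "chunk_number": 5},
]
sources = sources_from_hits(hits, window=1)
# sources now covers doc_abc123 chunks 0-2 and doc_xyz789 chunks 4-6
```

The resulting list can be passed as the `sources` argument of `db.batch_get_chunks`, as in the Python example above.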