When you want to add "image search by text," "semantic search," or "similarity recommendations" to your AI app, what's the first tool that comes to mind? Chroma? Pinecone? Maybe Weaviate?
Today I'm introducing an open-source project that might make you rethink your vector database choices entirely. It's called Zvec, an embedded vector database open-sourced by Alibaba's Tongyi Lab — often described as "the SQLite of vector databases."
No server, no Docker, no config files. Just pip install zvec and it runs right inside your Python process. It really is that simple.
Project Overview
Zvec is a lightweight, in-process vector database open-sourced by Alibaba. Its design philosophy mirrors SQLite — no external dependencies, just embed it directly into your application.
| Feature | Details |
|---|---|
| GitHub Stars | 10,400+ (trending on GitHub Trending in June 2026) |
| License | Apache 2.0 |
| Language | C++ core with multi-language SDKs (Python / Node.js / Go / Rust / Dart) |
| Latest Version | v0.5.0 (2026-06-12) |
| Underlying Engine | Proxima (Alibaba's internal vector search engine) |
Zvec vs Traditional Vector Databases
| Feature | Zvec | Chroma | Pinecone | Milvus |
|---|---|---|---|---|
| Deployment | In-process (embedded) | Requires a service | Cloud-hosted | Requires deployment |
| Installation | pip install zvec |
Needs Docker | Sign up for an account | Complex setup |
| Memory Footprint | Low (DiskANN index supports disk storage) | Medium | N/A | High |
| Best For | Local apps, edge, small-to-medium RAG | Prototyping | Production cloud | Large-scale clusters |
Bottom line: If you need a vector database that just works out of the box — without fiddling with servers or configs — Zvec is the lightest and fastest option.
Installation
Zvec supports multiple operating systems and programming languages. Installation is straightforward.
Python (Recommended)
pip install zvec
Requires Python 3.10–3.14.
Node.js
npm install @zvec/zvec
Go
go get github.com/zvec-ai/zvec-go
Rust
# Cargo.toml
[dependencies]
zvec = "0.5"
System Requirements
| Platform | Architecture |
|---|---|
| Linux | x86_64, ARM64 |
| macOS | ARM64 (Apple Silicon) |
| Windows | x86_64 |
Quick Start
Your First Vector Search
Just a few lines of code to create a complete vector database and run a similarity search:
import zvec
# 1. Define the collection schema
schema = zvec.CollectionSchema(
name="my_collection",
vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 4),
)
# 2. Create and open the collection (auto-persisted to disk)
collection = zvec.create_and_open(path="./my_zvec_data", schema=schema)
# 3. Insert vector data
collection.insert([
zvec.Doc(id="doc_1", vectors={"embedding": [0.1, 0.2, 0.3, 0.4]}),
zvec.Doc(id="doc_2", vectors={"embedding": [0.2, 0.3, 0.4, 0.1]}),
zvec.Doc(id="doc_3", vectors={"embedding": [0.9, 0.1, 0.1, 0.2]}),
])
# 4. Run a vector similarity search
results = collection.query(
zvec.VectorQuery("embedding", vector=[0.2, 0.3, 0.3, 0.1]),
topk=2
)
# Print results
for r in results:
print(f"ID: {r['id']}, Similarity Score: {r['score']:.4f}")
Output:
ID: doc_2, Similarity Score: 0.9899
ID: doc_1, Similarity Score: 0.9798
Building a Simple RAG Application
Combine Zvec with OpenAI's embedding model to quickly set up a document Q&A system:
import zvec
import openai
# Initialize the embedding model
client = openai.OpenAI(api_key="your-api-key")
def get_embedding(text: str) -> list[float]:
response = client.embeddings.create(
model="text-embedding-3-small",
input=text
)
return response.data[0].embedding
# Create the collection
schema = zvec.CollectionSchema(
name="rag_docs",
vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 1536),
fields=[
zvec.FieldSchema("content", zvec.DataType.STRING),
zvec.FieldSchema("source", zvec.DataType.STRING),
],
)
collection = zvec.create_and_open(path="./rag_data", schema=schema)
# Index documents
documents = [
("Zvec is a lightweight vector database open-sourced by Alibaba", "doc1.md"),
("Vector databases are used to store and retrieve high-dimensional vector data", "doc2.md"),
("RAG technology combines retrieval and generation — it's the mainstream approach for LLM apps today", "doc3.md"),
]
for content, source in documents:
embedding = get_embedding(content)
collection.insert([
zvec.Doc(
vectors={"embedding": embedding},
fields={"content": content, "source": source},
)
])
# Search for the most relevant document
question = "What is a vector database?"
query_embedding = get_embedding(question)
results = collection.query(
zvec.VectorQuery("embedding", vector=query_embedding),
topk=2
)
print("Relevant documents:")
for r in results:
print(f"- [{r['fields']['source']}] {r['fields']['content']}")
print(f" Similarity: {r['score']:.4f}\n")
Advanced Features
Full-Text Search
Version 0.5.0 introduced native full-text search — no need to bring in Elasticsearch or any other external search engine:
import zvec
schema = zvec.CollectionSchema(
name="ft_search",
vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 384),
fields=[
zvec.FieldSchema("title", zvec.DataType.STRING),
zvec.FieldSchema("body", zvec.DataType.STRING),
],
)
collection = zvec.create_and_open(path="./fts_data", schema=schema)
# Create a full-text search index on the body field
collection.create_fts_index("body_fts", field="body")
# Insert data
collection.insert([
zvec.Doc(
id="1",
vectors={"embedding": [0.1] * 384},
fields={"title": "Python Tutorial", "body": "Python is a powerful programming language widely used in AI and data analysis."},
),
zvec.Doc(
id="2",
vectors={"embedding": [0.2] * 384},
fields={"title": "JavaScript Basics", "body": "JavaScript is the core language for frontend development, supporting Web, Node.js, and more."},
),
])
# Full-text search
results = collection.query(
zvec.FTSQuery("body_fts", "Python data analysis"),
topk=5
)
for r in results:
print(f"[{r['fields']['title']}] {r['fields']['body'][:30]}...")
Hybrid Search
One of Zvec's most powerful features: combining vector search, full-text search, and scalar filtering in a single query for maximum relevance:
# Hybrid search: vector + full-text + scalar filter
results = collection.query(
zvec.MultiQuery([
# Vector similarity
zvec.VectorQuery("embedding", vector=[0.15] * 384, weight=0.7),
# Full-text search
zvec.FTSQuery("body_fts", "programming language tutorial", weight=0.3),
]),
topk=3
)
for r in results:
print(f"[{r['id']}] Hybrid Score: {r['score']:.4f}")
DiskANN Disk Index
When your vector data grows too large for memory, Zvec's DiskANN index stores most of the index on disk, dramatically reducing memory usage:
schema = zvec.CollectionSchema(
name="large_collection",
vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 768),
)
# Create collection with DiskANN index
collection = zvec.create_and_open(
path="./diskann_data",
schema=schema,
index_options=zvec.IndexOptions(
index_type=zvec.IndexType.DISKANN, # disk-based index
)
)
# Insert large amounts of data (millions of vectors are no problem)
for i in range(100000):
collection.insert([zvec.Doc(
id=f"doc_{i}",
vectors={"embedding": [0.1 * (i % 768)] * 768},
)])
Scalar Filtering
Add structured field filters on top of vector search:
results = collection.query(
zvec.ScalarQuery("category = 'AI' AND published_year >= 2025"),
topk=10
)
Persistence & Concurrency
Zvec uses WAL (Write-Ahead Logging) to guarantee data persistence — even if the process crashes or power is lost, your data stays safe. It also supports multiple processes reading the same collection concurrently (writes are exclusive):
# Process A: writing
collection_a = zvec.open("./shared_data", mode="write")
collection_a.insert([zvec.Doc(id="new_1", vectors={"embedding": [0.5] * 4})])
collection_a.flush()
# Process B: reading (in parallel with Process A)
collection_b = zvec.open("./shared_data", mode="read")
results = collection_b.query(
zvec.VectorQuery("embedding", vector=[0.5] * 4),
topk=5
)
Performance
Based on official benchmarks, Zvec delivers strong results across multiple dimensions:
- Search latency: Millisecond-level response, even on billion-vector datasets
- Memory efficiency: DiskANN index uses 10x+ less memory than traditional in-memory indexes
- Throughput: Supports high-concurrency query scenarios
- Vs. competitors: On the same hardware, Zvec outperforms Chroma and Milvus in both QPS (queries per second) and latency
For detailed benchmark data, see the official Benchmarks documentation.
Visual Debugging Tool: Zvec Studio
Prefer a GUI over writing code to inspect vector data? Zvec officially provides Zvec Studio:
git clone https://github.com/zvec-ai/zvec-studio
cd zvec-studio
npm install && npm run dev
Open your browser and you can browse collection data, debug queries, and check index status — zero code required.
Use Cases
| Scenario | Why Zvec Fits |
|---|---|
| Personal RAG app | No external service needed — pip install and go |
| Desktop AI applications | Embedded design, no extra process |
| Mobile / Edge devices | Lightweight, low memory footprint |
| Rapid prototyping | Zero config, get a full pipeline running in 5 minutes |
| Small-to-medium production | WAL persistence + DiskANN for reliability |
| Multi-language projects | Python / Node.js / Go / Rust / Dart SDKs available |
Wrap-Up
Zvec fills an important gap — sitting between Docker-based vector databases like Chroma and cloud services like Pinecone, it offers a truly lightweight, truly embedded alternative.
Zvec is worth a 5-minute try if you're the kind of developer who:
- Wants to run a RAG app locally without dealing with Docker
- Needs a simple vector database for semantic search prototyping
- Is building a desktop or mobile AI app and needs a lightweight embedded solution
- Is tired of complex configs and just wants something that works with
pip install
Project: github.com/alibaba/zvec
Docs: zvec.org
Already using Zvec? Share your experience in the comments. Or are you using a different vector database? Let's chat about how Zvec compares.