Milvus Vector Database Tutorial

What is the Milvus Vector Database?

Milvus was created in 2019 with the main goal of storing, indexing, and managing large-scale embedding vectors generated by deep neural networks and other machine learning (ML) models.

As a database specifically designed to handle queries over input vectors, Milvus can index vectors on the scale of tens of billions. Unlike existing relational databases, which primarily deal with structured data that follows a predefined pattern, Milvus is designed from the ground up to handle embedding vectors derived from unstructured data.

With the continuous expansion of the internet, unstructured data such as email, papers, IoT sensor data, Facebook photos, and protein structures has become increasingly common. To enable computers to understand and process unstructured data, it must be transformed into vectors using embedding techniques. Milvus stores and indexes these vectors, and it can analyze the correlation between two vectors by computing their similarity distance. If two embedding vectors are very similar, the original data sources are similar as well.

Figure: Milvus workflow

Key Concepts

If you are not familiar with the world of vector databases and similarity search, you may find the following key concepts helpful.

Learn more about Milvus terms.

Unstructured Data

Unstructured data, including images, videos, audio, and natural language, refers to information that does not follow a predefined model or organizational structure. This data type accounts for approximately 80% of global data and can be transformed into vectors using various artificial intelligence (AI) and machine learning (ML) models.

Embedding Vectors

Embedding vectors are feature abstractions of unstructured data (such as email, IoT sensor data, Instagram photos, and protein structures). Mathematically, an embedding vector is an array of floating-point numbers or binary values. Modern embedding techniques are used to transform unstructured data into embedding vectors.

Vector Similarity Search

Vector similarity search compares a query vector against the vectors in a database to find those most similar to it. Approximate nearest neighbor (ANN) search algorithms accelerate the search process. If two embedding vectors are very similar, the original data sources are similar as well.
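To make the idea concrete, here is a minimal sketch of an exact (brute-force) similarity search in plain Python. It is not Milvus code: the function names, the toy three-dimensional vectors, and the choice of Euclidean distance are all assumptions made for illustration. ANN algorithms exist precisely to avoid comparing the query with every stored vector the way this sketch does.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(query, vectors, top_k=2):
    """Compare the query with every stored vector and return the top_k
    closest ones as (id, distance) pairs, nearest first."""
    scored = ((vec_id, l2_distance(query, vec)) for vec_id, vec in vectors.items())
    return heapq.nsmallest(top_k, scored, key=lambda pair: pair[1])


# Toy 3-dimensional "embeddings" (hypothetical); real ones come from an ML model.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.9, 0.8],
}

results = brute_force_search([0.88, 0.12, 0.02], vectors, top_k=2)
print([vec_id for vec_id, _ in results])  # ['cat', 'kitten']
```

The cost of this approach grows linearly with the number of stored vectors, which is why large collections normally rely on an index.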

Why Choose Milvus?

High performance for vector search on large-scale datasets.
A developer-centric community, with support for multiple languages and toolchains.
Scalability and high reliability in the cloud, with stable operation even when failures occur.
Hybrid search that combines scalar filtering with vector similarity search.
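The last point, combining scalar filtering with vector similarity search, can be sketched in plain Python as follows. This is only an illustration of the concept, not how Milvus implements it: the `hybrid_search` function, the `price` field, and the sample entities are hypothetical, and inner product is assumed as the similarity metric.

```python
def inner_product(a, b):
    """Inner product of two equal-length vectors (larger means more similar)."""
    return sum(x * y for x, y in zip(a, b))


def hybrid_search(query, entities, predicate, top_k=2):
    """Keep only entities whose scalar fields satisfy `predicate`,
    then rank the survivors by inner product with the query."""
    candidates = [e for e in entities if predicate(e)]
    ranked = sorted(
        candidates,
        key=lambda e: inner_product(query, e["vector"]),
        reverse=True,
    )
    return [e["id"] for e in ranked[:top_k]]


# Hypothetical entities: one scalar field ("price") plus a 2-dimensional vector.
entities = [
    {"id": 1, "price": 15, "vector": [0.9, 0.1]},
    {"id": 2, "price": 80, "vector": [1.0, 0.0]},
    {"id": 3, "price": 20, "vector": [0.2, 0.8]},
    {"id": 4, "price": 25, "vector": [0.7, 0.3]},
]

hits = hybrid_search([1.0, 0.0], entities, lambda e: e["price"] < 50, top_k=2)
print(hits)  # [1, 4]
```

Entity 2 has the highest similarity to the query but is excluded because it fails the scalar filter.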

What Indexes and Metrics Are Supported?

Indexes are the organizational units of data. Before you can search or query inserted entities, you must declare the index type and similarity metric. If you do not specify an index type, Milvus falls back to brute-force search.

Index Types

Milvus supports most vector index types for approximate nearest neighbor search (ANNS), including:

FLAT: Suitable for scenarios that require perfectly accurate, exact search results on small (million-scale) datasets.
IVF_FLAT: A quantization-based index, suitable for scenarios that need a good balance between accuracy and query speed. A GPU version, GPU_IVF_FLAT, is also available.
IVF_SQ8: A quantization-based index, suitable for scenarios where resources are very limited and disk, CPU, and GPU memory consumption must be reduced significantly.
IVF_PQ: A quantization-based index, suitable for scenarios that pursue high query speed even at the expense of accuracy. A GPU version, GPU_IVF_PQ, is also available.
HNSW: A graph-based index, suitable for scenarios with very high requirements for search efficiency.

For more detailed information, please refer to Vector Index.
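The trade-off between accuracy and speed in the IVF family can be illustrated with a small inverted-file sketch in plain Python. It rests on several assumptions: the centroids are fixed by hand (a real index would learn them, typically with k-means), the helper names are hypothetical, and `nprobe` is used here simply as the number of buckets scanned per query. It is a conceptual sketch, not the Milvus implementation.

```python
import math


def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Assign each vector id to the bucket of its nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
        buckets[nearest].append(vec_id)
    return buckets


def ivf_search(query, vectors, centroids, buckets, nprobe=1, top_k=1):
    """Scan only the nprobe buckets whose centroids are closest to the query.
    Returns (top_k ids, number of vectors actually compared)."""
    order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [vec_id for i in order[:nprobe] for vec_id in buckets[i]]
    candidates.sort(key=lambda vec_id: l2(query, vectors[vec_id]))
    return candidates[:top_k], len(candidates)


# Hand-picked centroids and toy vectors (hypothetical data).
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = {
    "a": [1.0, 1.0],
    "b": [0.0, 2.0],
    "c": [9.0, 9.0],
    "d": [11.0, 10.0],
    "e": [4.0, 4.0],
}
buckets = build_ivf(vectors, centroids)
query = [5.5, 5.5]

print(ivf_search(query, vectors, centroids, buckets, nprobe=1))  # (['c'], 2)
print(ivf_search(query, vectors, centroids, buckets, nprobe=2))  # (['e'], 5)
```

With `nprobe=1` only two vectors are compared and the true nearest neighbor `e` is missed; with `nprobe=2` every bucket is scanned and the exact answer is found, at the cost of more comparisons.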

Similarity Metrics

In Milvus, similarity metrics are used to measure the similarity between vectors. Choosing a good distance metric can significantly improve classification and clustering performance. Depending on the form of the input data, specific similarity metrics are chosen to achieve the best performance.

Common metrics used for floating-point embeddings are:

Euclidean distance (L2): This metric is commonly used in the field of computer vision.
Inner Product (IP): This metric is commonly used in the field of natural language processing.

Common metrics used for binary embeddings are:

Hamming distance: This metric is commonly used in the field of natural language processing.
Jaccard similarity: This metric is commonly used for molecular similarity search.
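For reference, the four metrics above can be written in a few lines of plain Python. This sketch follows the textbook definitions and is not taken from the Milvus source; binary embeddings are represented as lists of 0/1 values, which is an assumption made for readability.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Inner product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def jaccard_similarity(a, b):
    """|A intersect B| / |A union B|, where each bit marks set membership."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


print(l2_distance([0.0, 3.0], [4.0, 0.0]))             # 5.0
print(inner_product([1.0, 2.0], [3.0, 4.0]))           # 11.0
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))    # 2
print(jaccard_similarity([1, 0, 1, 1], [1, 1, 0, 1]))  # 0.5
```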

Sample Applications

Milvus makes it easy to add similarity search to your applications. Sample applications of Milvus include:

Image similarity search: Makes images searchable and returns the most similar images from a large database almost instantly.
Video similarity search: Converts keyframes into vectors and feeds the results into Milvus, so that billions of videos can be searched and recommended in near real time.
Audio similarity search: Quickly query large amounts of audio data, such as speech, music, sound effects, and similar sounds.
Recommendation systems: Recommends information or products based on user behavior and needs.
Question-answering systems: Interactive digital question-answering chatbots that can automatically answer user questions.
DNA sequence classification: Accurately classify genes in milliseconds by comparing similar DNA sequences.
Text search engines: Helps users find the information they are looking for by comparing keywords with a text database.

Design Concept of Milvus

As a cloud-native vector database, Milvus separates storage from computation by design. To enhance elasticity and flexibility, all components in Milvus are stateless.

The system is divided into four layers:

Access Layer: Comprises a group of stateless proxies that serve as the front layer of the system and the endpoint for users.
Coordinator Service: Assigns tasks to the worker nodes and acts as the central hub of the system.
Worker Nodes: The arms and legs of the system. They simply follow instructions from the coordinator service and execute the DML/DDL commands triggered by users.
Storage: The backbone of the system, responsible for data persistence. It includes metadata storage, a log broker, and object storage.

Figure: Milvus architecture

Developer Tools

Milvus provides a rich set of APIs and tools for development and operation.

API Access

Milvus provides client libraries wrapped around the Milvus API, which can be used to programmatically insert, delete, and query data from application code:

PyMilvus
Node.js SDK
Go SDK
Java SDK

Milvus Ecosystem Tools

The Milvus ecosystem offers several useful tools, including:

Milvus CLI
Attu: A graphical management system for Milvus.
MilvusDM (Milvus Data Migration): An open-source tool specifically for importing and exporting data with Milvus.
Milvus Capacity Planning Tool: Helps estimate the required raw file size, memory size, and stable disk size for various index types.

Milvus Limitations

Milvus is committed to providing the best vector database for powering AI applications and vector similarity search, and the team is continuously working to add features and tools that improve the user experience. This section lists the known limitations that users may encounter when using Milvus.

Length of Resource Names

Resource	Limit
Collection	255 characters
Field	255 characters
Index	255 characters
Partition	255 characters

Naming Rules

Resource names may consist of numbers, letters, and underscores (_). The names must start with a letter or underscore.
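A client-side check for these rules might look like the sketch below. The function name is hypothetical, and the pattern assumes that "letters" means ASCII letters, which the text does not state explicitly. The 255-character limit comes from the table above.

```python
import re

MAX_NAME_LENGTH = 255  # limit for collection, field, index, and partition names
# Assumption: "letters" means ASCII letters only.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_resource_name(name):
    """Return True if the name starts with a letter or underscore, contains
    only letters, digits, and underscores, and fits the length limit."""
    return len(name) <= MAX_NAME_LENGTH and NAME_PATTERN.fullmatch(name) is not None


print(is_valid_resource_name("book_vectors"))    # True
print(is_valid_resource_name("_tmp1"))           # True
print(is_valid_resource_name("1st_collection"))  # False: starts with a digit
print(is_valid_resource_name("my-collection"))   # False: hyphen not allowed
print(is_valid_resource_name("a" * 256))         # False: too long
```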

Number of Resources

Resource	Limit
Collection	65,536
Connection/Proxy	65,536

Number of Resources in Collections

Resource	Limit
Partition	4,096
Shard	64
Field	64
Index	1
Entity	Unlimited

String Length

Data Type	Limit
VARCHAR	65,535

Vector Dimension

Attribute	Limit
Dimension	32,768

Input and Output for Each RPC

Operation	Limit
Insert Operation	512 MB
Search Operation	512 MB
Query Operation	512 MB
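One practical consequence is that large inserts need to be split into batches. The following sketch estimates how many rows fit into a single insert. It rests on assumptions that are not in the table: vectors are stored as 32-bit floats, "512 MB" is read as 512 MiB, and serialization overhead is ignored unless passed in explicitly. The function names are hypothetical, and the result should be treated as an upper bound rather than an exact figure.

```python
MAX_RPC_BYTES = 512 * 1024 * 1024  # assumption: "512 MB" means 512 MiB
BYTES_PER_FLOAT32 = 4              # assumption: float vectors use 32-bit floats


def max_rows_per_insert(dim, overhead_bytes_per_row=0):
    """Rough upper bound on rows per insert for a single float-vector field."""
    row_bytes = dim * BYTES_PER_FLOAT32 + overhead_bytes_per_row
    return MAX_RPC_BYTES // row_bytes


def split_into_batches(rows, batch_size):
    """Yield consecutive slices of `rows` with at most `batch_size` items each."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


print(max_rows_per_insert(128))  # 1048576
print(max_rows_per_insert(768))  # 174762
print(list(split_into_batches(list(range(10)), 4)))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

In practice it is sensible to pick a batch size well below this bound, since scalar fields and message encoding add to the payload.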

Load Limitation

In the current version, the data to be loaded must not exceed 90% of the total memory of all query nodes, so that memory remains reserved for the execution engine.
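The rule can be expressed as a simple capacity check. This is a sketch with a hypothetical function name; how Milvus measures the size of the data to be loaded is not described here, so `data_bytes` is assumed to be an estimate supplied by the caller.

```python
GIB = 1024 ** 3


def can_load(data_bytes, query_node_memory_bytes):
    """Return True if the data fits within 90% of the combined memory
    of all query nodes. Integer arithmetic avoids rounding surprises."""
    total = sum(query_node_memory_bytes)
    return data_bytes * 10 <= total * 9


# Two hypothetical query nodes with 16 GiB each: the budget is 28.8 GiB.
nodes = [16 * GIB, 16 * GIB]
print(can_load(28 * GIB, nodes))  # True
print(can_load(30 * GIB, nodes))  # False
```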

Search Limitation

Vector	Limit
`topk` (number of most similar results to return)	16,384
`nq` (number of search requests)	16,384
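These limits can be checked on the client before a request is sent. The sketch below uses a hypothetical helper, treats `nq` as the number of query vectors in one search call, and assumes a lower bound of 1 for both values; neither assumption is stated in the table. The dimension limit is taken from the Vector Dimension table above.

```python
MAX_TOPK = 16_384
MAX_NQ = 16_384
MAX_DIM = 32_768


def check_search_request(query_vectors, top_k):
    """Return a list of limit violations (empty if the request is within limits)."""
    problems = []
    if not 1 <= top_k <= MAX_TOPK:
        problems.append(f"topk must be between 1 and {MAX_TOPK}, got {top_k}")
    nq = len(query_vectors)
    if not 1 <= nq <= MAX_NQ:
        problems.append(f"nq must be between 1 and {MAX_NQ}, got {nq}")
    if any(len(vec) > MAX_DIM for vec in query_vectors):
        problems.append(f"vector dimension must not exceed {MAX_DIM}")
    return problems


print(check_search_request([[0.1, 0.2]], top_k=10))      # []
print(check_search_request([[0.1, 0.2]], top_k=20_000))
# ['topk must be between 1 and 16384, got 20000']
```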