The data in a collection is divided into segments. Each segment has independent vector storage, payload storage, and indexes.

The data in different segments usually does not overlap. However, storing the same point in more than one segment does not cause problems, because the search process includes a deduplication mechanism.

A segment consists of vector storage, payload storage, a vector index, a payload index, and an ID mapper, which stores the relationship between internal and external IDs.

Segments can be "appendable" or "non-appendable", depending on the type of storage and index used. You can freely add, delete, and query data in "appendable" segments, while "non-appendable" segments only allow reading and deleting data.

Segments within a collection can be configured differently and independently of one another, but each collection must contain at least one "appendable" segment.

Vector Storage

Depending on the application's needs, Qdrant can use one of the following vector storage options. The choice is a trade-off between search speed and RAM usage.

Memory Storage - Stores all vectors in RAM, providing the highest speed because disk access is only required during persistence.

Memmap Storage - Creates a virtual address space mapped to a file on disk. Mapped files are not loaded into RAM directly; instead, they are accessed through the page cache. This approach allows flexible use of the available memory. With enough RAM, it is almost as fast as in-memory storage.

Configuring Memmap Storage

There are two ways to configure the use of memmap (also known as on-disk storage):

  • Set the on_disk option for vectors in the collection creation API:

Only applicable to v1.2.0 and higher

PUT /collections/{collection_name}

{
    "vectors": {
      "size": 768,
      "distance": "Cosine",
      "on_disk": true
    }
}

This will create a collection that immediately stores all vectors in memmap storage. This is the recommended approach when the Qdrant instance uses fast disks and needs to handle large collections.

  • Set the memmap_threshold option. This option sets the threshold, in KB, above which a segment is converted to memmap storage.

There are two ways to implement this:

  1. Set the threshold globally in the configuration file, where the parameter is named memmap_threshold_kb (see the configuration-file sketch after the example below).
  2. Set the threshold individually for each collection during creation or update:
PUT /collections/{collection_name}

{
    "vectors": {
      "size": 768,
      "distance": "Cosine"
    },
    "optimizers_config": {
        "memmap_threshold": 20000
    }
}
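
For the first option, the corresponding entry in the configuration file might look like the sketch below (the nesting under storage.optimizers is an assumption based on the standard config.yaml layout and may differ between versions):

storage:
  optimizers:
    # Threshold, in KB, above which a segment is converted to memmap storage
    memmap_threshold_kb: 20000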

The rule of thumb for setting the memmap threshold parameter is simple:

  • If the usage scenario is balanced – set memmap_threshold to the same value as indexing_threshold (default is 20000). In this case, the optimizer will not perform any additional runs and will optimize all thresholds at once.
  • If the write load is high and RAM is low – set memmap_threshold lower than indexing_threshold, for example, 10000. In this case, the optimizer will first convert segments to memmap storage and only then apply indexing (see the sketch below).
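
For the second scenario, the thresholds can also be changed on an existing collection through the collection update API. A minimal sketch, with indexing_threshold shown at its default value:

PATCH /collections/{collection_name}

{
    "optimizers_config": {
        "memmap_threshold": 10000,
        "indexing_threshold": 20000
    }
}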

Additionally, you can use memmap storage not only for vectors but also for HNSW indexes. To enable this feature, set the hnsw_config.on_disk parameter to true when creating the collection.

PUT /collections/{collection_name}

{
    "vectors": {
      "size": 768,
      "distance": "Cosine"
    },
    "optimizers_config": {
        "memmap_threshold": 20000
    },
    "hnsw_config": {
        "on_disk": true
    }
}

Payload Storage

Qdrant supports two types of payload storage: InMemory and OnDisk.

InMemory payload storage organizes payload data in the same way as in-memory vector storage. The payload is loaded into memory when the service starts, while the disk and RocksDB are used only for persistence. This type of storage is very fast, but it may require a large amount of memory to hold all the data, especially if the payload contains large values (such as text summaries or even images).

For large payload values, OnDisk payload storage is preferable. This type of storage reads and writes payloads directly to and from RocksDB, so it does not require a large amount of memory. The downside is access latency: if you need to filter vectors by payload conditions, checking values stored on disk may take too long. In this case, we recommend creating a payload index for each field used in filter conditions, to avoid disk access. Once a field index is created, Qdrant always keeps all values of the indexed field in memory, regardless of the payload storage type.
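
For example, a keyword index for a payload field can be created through the field index API. The field name "category" below is a hypothetical placeholder:

PUT /collections/{collection_name}/index

{
    "field_name": "category",
    "field_schema": "keyword"
}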

You can choose the payload storage type in the configuration file, or per collection with the on_disk_payload parameter when creating the collection.
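
For example, the following request creates a collection that keeps payloads on disk from the start – a minimal sketch in the style of the earlier examples:

PUT /collections/{collection_name}

{
    "vectors": {
      "size": 768,
      "distance": "Cosine"
    },
    "on_disk_payload": true
}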

Versioning

To ensure data integrity, Qdrant performs all data changes in two stages. First, the change is written to the Write-Ahead Log (WAL), which orders all operations and assigns each a sequential number.

Once a change has been added to the WAL, it will not be lost even in case of a power outage. The changes are then applied to the segments. Each segment stores the latest version of the changes applied to it, as well as the version of each individual point. If the sequential number of a new change is lower than the current version of a point, the updater will ignore that change. This mechanism allows Qdrant to efficiently recover its storage from the WAL after an abnormal shutdown.
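
The version check itself can be illustrated with a short Rust sketch (not Qdrant's actual source; all names are illustrative):

struct Point {
    // Sequential WAL number of the last change applied to this point.
    version: u64,
}

// Apply a change only if it is newer than the point's current version.
// During WAL replay after a crash, stale changes are simply skipped.
fn apply_change(point: &mut Point, change_seq: u64) -> bool {
    if change_seq < point.version {
        return false; // already applied before the shutdown; ignore
    }
    point.version = change_seq;
    // ...apply the actual change to vector and payload storage here...
    true
}

fn main() {
    let mut point = Point { version: 42 };
    assert!(!apply_change(&mut point, 41)); // stale change is ignored
    assert!(apply_change(&mut point, 43)); // newer change is applied
}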