Filter
With Qdrant, you can set conditions when searching for or retrieving points. This means you can filter by attribute in addition to running a vector similarity search, much like adding a WHERE clause in SQL. For example, you can set conditions on both the payload and the id of a point.
Additional conditions are important when not all features of an object can be expressed in an embedding. Typical examples are business requirements such as inventory availability, user location, or expected price range.
Filter Conditions
Qdrant allows you to combine conditions in clauses. Clauses are different logical operations, such as OR, AND, and NOT. Clauses can be recursively nested within each other, so you can recreate any boolean expression.
Let's take a look at the clauses implemented in Qdrant.
Suppose we have a set of points with payloads:
[
{ "id": 1, "city": "London", "color": "green" },
{ "id": 2, "city": "London", "color": "red" },
{ "id": 3, "city": "London", "color": "blue" },
{ "id": 4, "city": "Berlin", "color": "red" },
{ "id": 5, "city": "Moscow", "color": "green" },
{ "id": 6, "city": "Moscow", "color": "blue" }
]
Must
Example:
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must": [
{ "key": "city", "match": { "value": "London" } },
{ "key": "color", "match": { "value": "red" } }
]
}
...
}
The filtered point will be:
[{ "id": 2, "city": "London", "color": "red" }]
When using must, the clause is true only if every condition listed in must is satisfied. In this sense, must is equivalent to the AND operator.
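The request above could be prepared from Python with nothing but the standard library. This is a minimal sketch: the server address http://localhost:6333, the collection name my_collection, and the helper name build_scroll_request are assumptions made for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_scroll_request(base_url: str, collection: str, flt: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the scroll endpoint."""
    url = f"{base_url}/collections/{collection}/points/scroll"
    body = json.dumps({"filter": flt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Every condition in "must" has to hold, so this asks for red points in London.
must_filter = {
    "must": [
        {"key": "city", "match": {"value": "London"}},
        {"key": "color", "match": {"value": "red"}},
    ]
}

# Hypothetical server address and collection name.
request = build_scroll_request("http://localhost:6333", "my_collection", must_filter)
# Sending it would be: urllib.request.urlopen(request)
```

Keeping the filter as a plain dictionary makes it easy to swap in the should and must_not clauses described in the following sections.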
Should
Should is similar to the OR operator in SQL.
Example:
POST /collections/{collection_name}/points/scroll
{
"filter": {
"should": [
{ "key": "city", "match": { "value": "London" } },
{ "key": "color", "match": { "value": "red" } }
]
}
...
}
The filtered points will be:
[
{ "id": 1, "city": "London", "color": "green" },
{ "id": 2, "city": "London", "color": "red" },
{ "id": 3, "city": "London", "color": "blue" },
{ "id": 4, "city": "Berlin", "color": "red" }
]
When using should, the clause is true as long as at least one condition listed in should is satisfied. In this sense, should is equivalent to the OR operator.
Must Not
Example:
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must_not": [
{ "key": "city", "match": { "value": "London" } },
{ "key": "color", "match": { "value": "red" } }
]
}
...
}
The filtered points will be:
[
{ "id": 5, "city": "Moscow", "color": "green" },
{ "id": 6, "city": "Moscow", "color": "blue" }
]
When using must_not, the clause is true only if none of the conditions listed in must_not is satisfied. In this sense, must_not is equivalent to the expression (NOT A) AND (NOT B) AND (NOT C).
Combination of Conditions
Using multiple conditions simultaneously is possible:
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must": [
{ "key": "city", "match": { "value": "London" } }
],
"must_not": [
{ "key": "color", "match": { "value": "red" } }
]
}
...
}
The filtered points will be:
[
{ "id": 1, "city": "London", "color": "green" },
{ "id": 3, "city": "London", "color": "blue" }
]
In this case, the clauses are combined using AND: a point must be in London and must not be red.
Additionally, conditions can be recursively nested. For example:
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must_not": [
{
"must": [
{ "key": "city", "match": { "value": "London" } },
{ "key": "color", "match": { "value": "red" } }
]
}
]
}
...
}
The filtered points will be:
[
{ "id": 1, "city": "London", "color": "green" },
{ "id": 3, "city": "London", "color": "blue" },
{ "id": 4, "city": "Berlin", "color": "red" },
{ "id": 5, "city": "Moscow", "color": "green" },
{ "id": 6, "city": "Moscow", "color": "blue" }
]
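To make the clause semantics concrete, the following sketch evaluates must, should, and must_not locally on the six example points. It is an illustrative model of the rules described above, not Qdrant's implementation; the function name matches is hypothetical, and only the simple match/value condition is supported.

```python
POINTS = [
    {"id": 1, "city": "London", "color": "green"},
    {"id": 2, "city": "London", "color": "red"},
    {"id": 3, "city": "London", "color": "blue"},
    {"id": 4, "city": "Berlin", "color": "red"},
    {"id": 5, "city": "Moscow", "color": "green"},
    {"id": 6, "city": "Moscow", "color": "blue"},
]


def matches(flt: dict, point: dict) -> bool:
    """Evaluate a (possibly nested) filter on a single point."""
    if "key" in flt:  # leaf condition: {"key": ..., "match": {"value": ...}}
        return point.get(flt["key"]) == flt["match"]["value"]
    should = flt.get("should", [])
    return (
        all(matches(c, point) for c in flt.get("must", []))            # AND
        and (not should or any(matches(c, point) for c in should))     # OR
        and not any(matches(c, point) for c in flt.get("must_not", []))  # NOT
    )


london = {"key": "city", "match": {"value": "London"}}
red = {"key": "color", "match": {"value": "red"}}

combined = {"must": [london], "must_not": [red]}
nested = {"must_not": [{"must": [london, red]}]}

combined_ids = [p["id"] for p in POINTS if matches(combined, p)]  # [1, 3]
nested_ids = [p["id"] for p in POINTS if matches(nested, p)]      # [1, 3, 4, 5, 6]
```

The nested filter reads as NOT (London AND red), so only the point with id 2 is excluded.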
Condition Filtering
In the payload, different types of values correspond to different types of queries that can be applied to them. Let's take a look at the existing condition variants and the data types they apply to.
Matching
{
"key": "color",
"match": {
"value": "red"
}
}
For other types, the matching conditions look exactly the same, just with different types used:
{
"key": "count",
"match": {
"value": 0
}
}
The simplest condition checks whether the stored value equals the given value. If multiple values are stored, at least one of them must satisfy the condition. It can be applied to keyword, integer, and boolean payloads.
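The rule for single and multiple stored values can be sketched as a small Python function. The name match_value is hypothetical, and the strict type comparison is an assumption of this sketch: it is there only because Python itself treats True and 1 as equal.

```python
def match_value(stored, expected) -> bool:
    """True if stored equals expected, or if stored is a list containing it."""
    values = stored if isinstance(stored, list) else [stored]
    # Compare types as well, so that a boolean never matches an integer.
    return any(type(v) is type(expected) and v == expected for v in values)


match_value("red", "red")            # True
match_value(["red", "blue"], "blue")  # True: one of the stored values matches
match_value(0, 0)                    # True
match_value(False, 0)                # False in this sketch
```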
Any Match
Available since v1.1.0
If you want to check whether the stored value is one of multiple values, use the any match condition. Any match treats the given values as a logical OR operation. It can also be described as the IN operator.
You can apply it to keyword and integer payloads.
Example:
{
"key": "color",
"match": {
"any": ["black", "yellow"]
}
}
In this example, if the stored value is black or yellow, then the condition will be satisfied.
If the stored value is an array, it should have at least one value that matches any of the given values. For example, if the stored value is ["black", "green"], then the condition will be satisfied because "black" is in ["black", "yellow"].
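The any match rule, including the array case, can be written as a short sketch. The function name match_any is hypothetical and the code only models the behaviour described above.

```python
def match_any(stored, candidates: list) -> bool:
    """True if at least one stored value is among the candidates (IN)."""
    values = stored if isinstance(stored, list) else [stored]
    return any(v in candidates for v in values)


match_any("black", ["black", "yellow"])             # True
match_any(["black", "green"], ["black", "yellow"])  # True, because of "black"
match_any("green", ["black", "yellow"])             # False
```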
Exclude Match
Available since v1.2.0
If you want to check whether the stored value is none of multiple values, use the exclude match condition. Exclude match treats the given values as a logical NOR operation. It can also be described as the NOT IN operator.
You can apply it to keyword and integer payloads.
Example:
{
"key": "color",
"match": {
"except": ["black", "yellow"]
}
}
In this example, if the stored value is neither black nor yellow, then the condition will be satisfied.
If the stored value is an array, it should have at least one value that does not match any of the given values. For example, if the stored value is ["black", "green"], then the condition will be satisfied because "green" does not match "black" or "yellow".
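A short sketch of the exclude match rule follows; match_except is a hypothetical name and the code only models the description above. Note one consequence of the array rule: for array values, exclude match is not simply the negation of any match, since ["black", "green"] satisfies both conditions for the list ["black", "yellow"].

```python
def match_except(stored, excluded: list) -> bool:
    """True if at least one stored value is outside the excluded list (NOT IN)."""
    values = stored if isinstance(stored, list) else [stored]
    return any(v not in excluded for v in values)


match_except("green", ["black", "yellow"])             # True
match_except("black", ["black", "yellow"])             # False
match_except(["black", "green"], ["black", "yellow"])  # True, because of "green"
match_except(["black", "yellow"], ["black", "yellow"])  # False
```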
Nested Keys
Available from v1.1.0 onwards
As the payload is an arbitrary JSON object, you may need to filter nested fields.
For convenience, Qdrant uses a syntax similar to that of the jq project.
Suppose we have a set of points with the following payload:
[
{
"id": 1,
"country": {
"name": "Germany",
"cities": [
{
"name": "Berlin",
"population": 3.7,
"sightseeing": ["Brandenburg Gate", "Reichstag"]
},
{
"name": "Munich",
"population": 1.5,
"sightseeing": ["Marienplatz", "Olympiapark"]
}
]
}
},
{
"id": 2,
"country": {
"name": "Japan",
"cities": [
{
"name": "Tokyo",
"population": 9.3,
"sightseeing": ["Tokyo Tower", "Tokyo Skytree"]
},
{
"name": "Osaka",
"population": 2.7,
"sightseeing": ["Osaka Castle", "Universal Studios Japan"]
}
]
}
}
]
You can use dot notation to search for nested fields.
POST /collections/{collection_name}/points/scroll
{
"filter": {
"should": [
{
"key": "country.name",
"match": {
"value": "Germany"
}
}
]
}
}
You can also use the [] syntax to search through an array by projecting its inner values.
POST /collections/{collection_name}/points/scroll
{
"filter": {
"should": [
{
"key": "country.cities[].population",
"range": {
"gte": 9.0
}
}
]
}
}
This query only outputs the point with id 2, as only Japan has a city with a population greater than or equal to 9.0.
Nested fields can also be an array.
POST /collections/{collection_name}/points/scroll
{
"filter": {
"should": [
{
"key": "country.cities[].sightseeing",
"match": {
"value": "Osaka Castle"
}
}
]
}
}
This query only outputs the point with id 2, as only Japan has a city whose sightseeing list includes "Osaka Castle".
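The way a dotted key with [] segments selects values can be sketched as follows. The function resolve is hypothetical and only approximates the path rules described above on abbreviated versions of the two example payloads; edge cases in Qdrant itself may behave differently.

```python
def resolve(path: str, payload) -> list:
    """Collect every value reachable through a dotted path with optional [] segments."""
    current = [payload]
    for segment in path.split("."):
        project = segment.endswith("[]")
        name = segment[:-2] if project else segment
        next_values = []
        for item in current:
            if not isinstance(item, dict) or name not in item:
                continue
            value = item[name]
            if project and isinstance(value, list):
                next_values.extend(value)  # step into each array element
            else:
                next_values.append(value)
        current = next_values
    # Flatten a final array so that each stored value is checked on its own.
    flat = []
    for value in current:
        flat.extend(value if isinstance(value, list) else [value])
    return flat


germany = {"country": {"name": "Germany", "cities": [
    {"name": "Berlin", "population": 3.7, "sightseeing": ["Brandenburg Gate", "Reichstag"]},
    {"name": "Munich", "population": 1.5, "sightseeing": ["Marienplatz", "Olympiapark"]},
]}}
japan = {"country": {"name": "Japan", "cities": [
    {"name": "Tokyo", "population": 9.3, "sightseeing": ["Tokyo Tower", "Tokyo Skytree"]},
    {"name": "Osaka", "population": 2.7, "sightseeing": ["Osaka Castle", "Universal Studios Japan"]},
]}}

resolve("country.name", germany)               # ["Germany"]
resolve("country.cities[].population", japan)  # [9.3, 2.7]
"Osaka Castle" in resolve("country.cities[].sightseeing", japan)  # True
```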
Nested Object Filtering
Available since version 1.2.0
By default, the conditions take the entire payload of a point into consideration.
For example, given two points with the following payloads:
[
{
"id": 1,
"dinosaur": "t-rex",
"diet": [
{ "food": "leaves", "likes": false},
{ "food": "meat", "likes": true}
]
},
{
"id": 2,
"dinosaur": "diplodocus",
"diet": [
{ "food": "leaves", "likes": true},
{ "food": "meat", "likes": false}
]
}
]
The following query would match both points:
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must": [
{
"key": "diet[].food",
"match": {
"value": "meat"
}
},
{
"key": "diet[].likes",
"match": {
"value": true
}
}
]
}
}
Both points are matched because each of them satisfies both conditions, possibly through different array elements:
- "t-rex" satisfies food = meat on diet[1].food and likes = true on diet[1].likes
- "diplodocus" satisfies food = meat on diet[1].food and likes = true on diet[0].likes
To retrieve only the points in which a single array element satisfies all conditions (the point with id 1 in this example), you need to use nested object filters.
Nested object filters allow querying arrays of objects independently of each other.
This is achieved with the nested condition type, which consists of the payload key of interest and the filter to apply.
The key should point to an array of objects and can optionally use bracket notation ("data" or "data[]").
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must": [
{
"nested": {
"key": "diet",
"filter": {
"must": [
{
"key": "food",
"match": {
"value": "meat"
}
},
{
"key": "likes",
"match": {
"value": true
}
}
]
}
}
}
]
}
}
With a nested filter, the matching logic is applied at the level of individual array elements rather than the whole payload.
The filter is evaluated against one array element at a time. As long as at least one element of the array matches the nested filter, the parent document is considered to match the condition.
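The difference between the default matching and nested matching can be reproduced locally on the two dinosaur points. This is a simplified model of the behaviour described above; flat_match and nested_match are hypothetical names, and the two conditions are hard-coded.

```python
POINTS = [
    {"id": 1, "dinosaur": "t-rex",
     "diet": [{"food": "leaves", "likes": False}, {"food": "meat", "likes": True}]},
    {"id": 2, "dinosaur": "diplodocus",
     "diet": [{"food": "leaves", "likes": True}, {"food": "meat", "likes": False}]},
]


def flat_match(point: dict) -> bool:
    """Default behaviour: each condition may be satisfied by a different element."""
    diet = point["diet"]
    return any(e["food"] == "meat" for e in diet) and any(e["likes"] is True for e in diet)


def nested_match(point: dict) -> bool:
    """Nested behaviour: both conditions must hold for the same element."""
    return any(e["food"] == "meat" and e["likes"] is True for e in point["diet"])


flat_ids = [p["id"] for p in POINTS if flat_match(p)]      # [1, 2]
nested_ids = [p["id"] for p in POINTS if nested_match(p)]  # [1]
```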
Limitation
Nested object filters do not support the has_id condition. If you need to use it, place it in an adjacent must clause.
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must": [
{
"nested": {
"key": "diet",
"filter": {
"must": [
{
"key": "food",
"match": {
"value": "meat"
}
},
{
"key": "likes",
"match": {
"value": true
}
}
]
}
}
},
{ "has_id": [1] }
]
}
}
Exact Text Matching
Available since version 0.10.0
A special case of the match condition is the text match condition. It allows you to search for specific substrings, tokens, or phrases within a text field.
The exact texts that satisfy this condition depend on the configuration of the full-text index. The configuration is defined when the index is created and is described under full-text index.
If the field does not have a full-text index, this condition falls back to exact substring matching.
{
"key": "description",
"match": {
"text": "good and cheap"
}
}
If the query has several words, this condition will only be satisfied when all words appear in the text.
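The "all words must appear" rule can be illustrated with a small sketch. The tokenizer below (lowercased runs of letters and digits) is purely an assumption for illustration; as noted above, the real behaviour depends on how the full-text index is configured. Both function names are hypothetical.

```python
import re


def tokenize(text: str) -> set:
    """Hypothetical tokenizer: lowercased runs of ASCII letters and digits."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def text_match(stored: str, query: str) -> bool:
    """True when every token of the query occurs among the stored text's tokens."""
    return tokenize(query) <= tokenize(stored)


text_match("A good camera, and cheap too", "good and cheap")  # True
text_match("Good but expensive", "good and cheap")            # False: "cheap" is missing
```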
Range
{
"key": "price",
"range": {
"gt": null,
"gte": 100.0,
"lt": null,
"lte": 450.0
}
}
The range condition sets the possible range of values for the stored payload. If multiple values are stored, at least one value should match the condition.
Available comparison operations include:
- gt - greater than
- gte - greater than or equal to
- lt - less than
- lte - less than or equal to
It can be applied to floating-point numbers and integer payloads.
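The four bounds can be modelled with a short sketch in which a bound set to None (null in JSON) is simply ignored. The names in_range and range_match are hypothetical; the code illustrates the rule above and is not Qdrant's implementation.

```python
def in_range(value, gt=None, gte=None, lt=None, lte=None) -> bool:
    """Check one number against the bounds; None means the bound is not set."""
    return (
        (gt is None or value > gt)
        and (gte is None or value >= gte)
        and (lt is None or value < lt)
        and (lte is None or value <= lte)
    )


def range_match(stored, **bounds) -> bool:
    """If several values are stored, one value inside the range is enough."""
    values = stored if isinstance(stored, list) else [stored]
    return any(in_range(v, **bounds) for v in values)


range_match(100.0, gt=None, gte=100.0, lt=None, lte=450.0)  # True: gte is inclusive
range_match(450.5, gte=100.0, lte=450.0)                    # False
range_match([50, 200], gte=100.0, lte=450.0)                # True, because of 200
```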
Geographical
Geographical Bounding Box
{
"key": "location",
"geo_bounding_box": {
"bottom_right": {
"lat": 52.495862,
"lon": 13.455868
},
"top_left": {
"lat": 52.520711,
"lon": 13.403683
}
}
}
It matches a location inside the rectangle whose bottom-right corner is bottom_right and whose top-left corner is top_left.
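The rectangle test amounts to two interval checks, as in the sketch below. The function name in_bounding_box and the test points are hypothetical, and the sketch assumes the box does not cross the antimeridian (that is, top_left.lon is not greater than bottom_right.lon).

```python
def in_bounding_box(point: dict, top_left: dict, bottom_right: dict) -> bool:
    """Check whether a lat/lon point lies inside the box (no antimeridian crossing)."""
    return (
        bottom_right["lat"] <= point["lat"] <= top_left["lat"]
        and top_left["lon"] <= point["lon"] <= bottom_right["lon"]
    )


top_left = {"lat": 52.520711, "lon": 13.403683}
bottom_right = {"lat": 52.495862, "lon": 13.455868}

in_bounding_box({"lat": 52.51, "lon": 13.42}, top_left, bottom_right)  # True
in_bounding_box({"lat": 52.53, "lon": 13.42}, top_left, bottom_right)  # False: too far north
```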
Geographical Radius
{
"key": "location",
"geo_radius": {
"center": {
"lat": 52.520711,
"lon": 13.403683
},
"radius": 1000.0
}
}
It matches a location inside the circle centered at center with a radius of radius meters.
If multiple values are stored, at least one value should match the condition. These conditions can only be applied to payloads matching the geographical data format.
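A radius check can be sketched with the haversine great-circle distance. Both the haversine formula and the mean Earth radius of 6,371,000 m are assumptions of this sketch; the text above does not state how Qdrant computes distances. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an assumption of this sketch


def haversine_m(a: dict, b: dict) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    lat1, lon1 = math.radians(a["lat"]), math.radians(a["lon"])
    lat2, lon2 = math.radians(b["lat"]), math.radians(b["lon"])
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def in_radius(point: dict, center: dict, radius_m: float) -> bool:
    return haversine_m(point, center) <= radius_m


center = {"lat": 52.520711, "lon": 13.403683}

in_radius({"lat": 52.525, "lon": 13.403683}, center, 1000.0)     # True: under 500 m away
in_radius({"lat": 52.495862, "lon": 13.455868}, center, 1000.0)  # False: several km away
```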
Value Count
In addition to direct value comparison, filtering can also be based on the number of values.
For example, given the following data:
[
{ "id": 1, "name": "Product A", "comments": ["Very good!", "Excellent"] },
{ "id": 2, "name": "Product B", "comments": ["Fair", "Expecting more", "Good"] }
]
We can search only for items with more than two comments:
{
"key": "comments",
"values_count": {
"gt": 2
}
}
The result will be:
[{ "id": 2, "name": "Product B", "comments": ["Fair", "Expecting more", "Good"] }]
If the stored value is not an array, it is assumed that the value count is equal to 1.
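The counting rule can be sketched as follows, using the two example products. The function name values_count is hypothetical, and the sketch only covers values that are present in the payload.

```python
def values_count(stored) -> int:
    """Number of stored values; a non-array value counts as 1."""
    return len(stored) if isinstance(stored, list) else 1


products = [
    {"id": 1, "name": "Product A", "comments": ["Very good!", "Excellent"]},
    {"id": 2, "name": "Product B", "comments": ["Fair", "Expecting more", "Good"]},
]

# Equivalent of {"key": "comments", "values_count": {"gt": 2}}
more_than_two = [p["id"] for p in products if values_count(p["comments"]) > 2]  # [2]
```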
Is Empty
Sometimes it is useful to filter out records that lack certain values. The IsEmpty condition can help you achieve this:
{
"is_empty": {
"key": "reports"
}
}
This condition will match all records where the reports field does not exist or has a value of null or [].
IsEmpty is often very useful when used in conjunction with the logical negation must_not. In this case, it will select all non-empty values.
Is Null
The match condition cannot test for NULL values. Use the IsNull condition instead:
{
"is_null": {
"key": "reports"
}
}
This condition will match all records where the reports field exists and has a value of NULL.
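The difference between IsEmpty and IsNull is easiest to see side by side. The sketch below models the two rules as described in the text; the function names and the sample records are hypothetical.

```python
def is_empty(payload: dict, key: str) -> bool:
    """Field is missing, null, or an empty array."""
    return key not in payload or payload[key] is None or payload[key] == []


def is_null(payload: dict, key: str) -> bool:
    """Field exists and its value is null."""
    return key in payload and payload[key] is None


records = [
    {},                    # field missing
    {"reports": None},     # null
    {"reports": []},       # empty array
    {"reports": ["q1"]},   # non-empty
]

empty_flags = [is_empty(r, "reports") for r in records]  # [True, True, True, False]
null_flags = [is_null(r, "reports") for r in records]    # [False, True, False, False]
```

Only the explicit null value satisfies both conditions.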
Has ID
This type of query is unrelated to payloads, but it is very useful in certain situations. For example, users may want to tag certain search results as irrelevant, or we may want to search only among specific points.
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must": [
{ "has_id": [1,3,5,7,9,11] }
]
}
...
}
The filtered points will be:
[
{ "id": 1, "city": "London", "color": "green" },
{ "id": 3, "city": "London", "color": "blue" },
{ "id": 5, "city": "Moscow", "color": "green" }
]