Conditional Query

This topic introduces how to perform conditional queries.

Unlike a vector similarity search, a conditional query retrieves entities by filtering with a boolean expression. Milvus supports a range of boolean expressions, which can be applied to scalar fields or the primary key field. The query returns all entities that match the filtering conditions.

The following example demonstrates how to query a dataset of 2000 rows of books, including book ID (primary key), word count (scalar field), and book introduction (vector field), simulating the scenario of querying a specific book based on its ID.

Load Collection

Before conducting a query, the collection needs to be loaded into memory.

from pymilvus import Collection

# Assumes a connection to Milvus has already been established.
collection = Collection("book")      # Obtain an existing collection.
collection.load()                    # Load the collection into memory.

Perform Query

The following example filters vectors based on specific book_id values and returns the book_id field and book_intro field in the results.

Milvus supports setting consistency levels for queries. The example in this topic sets the consistency level to Strong. You can also set the consistency level to Bounded, Session, or Eventually. For more information on the four consistency levels in Milvus, please refer to Consistency.

You can also use dynamic fields in the filter expression and in the output fields of a query request. For an example, refer to Dynamic Schema.

res = collection.query(
  expr = "book_id in [2,4,6,8]",
  offset = 0,
  limit = 10,
  output_fields = ["book_id", "book_intro"],
  consistency_level = "Strong",
)
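When the IDs come from application code rather than a hard-coded string, the filter expression can be assembled programmatically. The following is a minimal sketch, not part of the Milvus API: build_in_expr and query_books are hypothetical helper names, and query_books assumes that the object passed in is a loaded pymilvus Collection with the book_id and book_intro fields used in this topic.

```python
from typing import Any, Iterable, List, Sequence


def build_in_expr(field: str, values: Iterable[int]) -> str:
    """Build a filter such as 'book_id in [2, 4, 6, 8]' from integer values."""
    items = [int(v) for v in values]
    if not items:
        raise ValueError("at least one value is required")
    return f"{field} in [{', '.join(str(i) for i in items)}]"


def query_books(collection: Any, book_ids: Sequence[int], limit: int = 10) -> List[dict]:
    """Query the given IDs and return the book_id and book_intro fields.

    `collection` is assumed to be a loaded pymilvus Collection (hypothetical wrapper).
    """
    return collection.query(
        expr=build_in_expr("book_id", book_ids),
        offset=0,
        limit=limit,
        output_fields=["book_id", "book_intro"],
        consistency_level="Strong",
    )
```

With a loaded collection, query_books(collection, [2, 4, 6, 8]) issues a request with the same filter as the example above. Converting every value with int() keeps arbitrary strings out of the expression.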

Parameter Description
expr Boolean expression used to filter properties. For more details on boolean expression rules, please refer to Boolean Expression Rules.
limit Maximum number of entities to return. The sum of this value and offset should be less than 16384.
offset Number of matching entities to skip. Only available when limit is specified, and the sum of this value and limit should be less than 16384. For example, to skip the first 8 matching entities and return the next 2, set limit to 2 and offset to 8.
output_fields (optional) List of field names to return.
partition_names (optional) List of partition names to query.
consistency_level (optional) Consistency level for the query.
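The limit and offset constraint can be checked on the client before a request is sent. The sketch below is an illustration only: check_window and page_windows are hypothetical helpers, and the constant simply restates the 16384 bound from the table above. It does not contact Milvus; it only computes the (offset, limit) pairs that a paginated query would use.

```python
from typing import Iterator, Tuple

# limit + offset must stay below this value, as stated in the parameter table.
MAX_WINDOW = 16384


def check_window(limit: int, offset: int = 0) -> None:
    """Raise ValueError if limit and offset violate the documented bound."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if offset < 0:
        raise ValueError("offset must not be negative")
    if limit + offset >= MAX_WINDOW:
        raise ValueError(f"limit + offset must be less than {MAX_WINDOW}")


def page_windows(total: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs that cover the first `total` matches."""
    for offset in range(0, total, page_size):
        limit = min(page_size, total - offset)
        check_window(limit, offset)
        yield offset, limit


print(list(page_windows(25, 10)))  # [(0, 10), (10, 10), (20, 5)]
```

Each pair can then be passed as the offset and limit arguments of a query. Because of the bound, this style of pagination cannot reach past the first 16383 matching entities.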

Check the returned results.

# Sort the returned entities by primary key for easier reading.
sorted_res = sorted(res, key=lambda k: k['book_id'])
print(sorted_res)

Count Entities

When performing a query, you can add count(*) in output_fields so that Milvus can return the number of entities in the collection. If you want to count the number of entities that meet specific conditions, use expr to define a boolean expression.

Count all entities in the collection:

res = collection.query(
  expr="", 
  output_fields = ["count(*)"],
)

print(res)
print(res[0])

Count the number of entities that meet specific filtering conditions:

res = collection.query(
  expr="book_id in [2,4,6,8]", 
  output_fields = ["count(*)"],
)

print(res)
print(res[0])
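To use the count as a number rather than print it, the value can be read from the first element of the result. The sketch below assumes the result is a one-element list holding a dictionary keyed by "count(*)"; that shape is an assumption about the client's return value, and extract_count is a hypothetical helper. The sample list is hand-written to match the 2000-row dataset described in this topic and is not captured output.

```python
from typing import List, Mapping


def extract_count(res: List[Mapping[str, int]]) -> int:
    """Return the entity count from a count(*) query result."""
    if not res or "count(*)" not in res[0]:
        raise ValueError("result does not contain a count(*) entry")
    return int(res[0]["count(*)"])


# Hand-written sample in the assumed result shape, not real server output.
sample = [{"count(*)": 2000}]
print(extract_count(sample))  # 2000
```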

Limitations

When using count(*) in output_fields, the use of the limit parameter is prohibited.
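One way to avoid sending an invalid request is to validate the arguments before calling query. The following sketch is a client-side convention, not Milvus behavior: build_query_kwargs is a hypothetical helper that rejects the combination of count(*) and limit, and only includes offset when limit is given.

```python
from typing import Any, Dict, List, Optional


def build_query_kwargs(
    expr: str,
    output_fields: List[str],
    limit: Optional[int] = None,
    offset: int = 0,
) -> Dict[str, Any]:
    """Assemble keyword arguments for collection.query()."""
    if "count(*)" in output_fields and limit is not None:
        raise ValueError("limit cannot be combined with count(*) in output_fields")
    kwargs: Dict[str, Any] = {"expr": expr, "output_fields": output_fields}
    if limit is not None:
        # offset only applies when limit is specified.
        kwargs["limit"] = limit
        kwargs["offset"] = offset
    return kwargs
```

The returned dictionary can be unpacked into the call, for example collection.query(**build_query_kwargs("book_id in [2,4,6,8]", ["count(*)"])).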