Inserting Entities

This topic describes how to insert data into Milvus from the client side.

You can also use MilvusDM to migrate data to Milvus. MilvusDM is an open-source tool specifically designed for importing and exporting data with Milvus.

Milvus 2.1 supports the VARCHAR data type for scalar fields. When building an index for a scalar field of type VARCHAR, the default index type is a trie.

The following example inserts 2,000 rows of randomly generated sample data (the Milvus CLI example uses a pre-built remote CSV file containing similar data). The example uses 2-dimensional vectors for brevity; real applications may use higher-dimensional vectors. You can substitute your own data for the sample data.

Prepare the Data

First, prepare the data to be inserted. The data type of each field must match the schema of the collection; otherwise, Milvus throws an exception.

Milvus supports default values for scalar fields, excluding primary key fields. This means that fields with a default value can be left empty during data insertion or update. For more information, refer to creating a collection.

After enabling dynamic schema, you can add dynamic fields to the data. For more information, refer to dynamic schema.

bookIDs := make([]int64, 0, 2000)
wordCounts := make([]int64, 0, 2000)
bookIntros := make([][]float32, 0, 2000)
for i := 0; i < 2000; i++ {
	bookIDs = append(bookIDs, int64(i))
	wordCounts = append(wordCounts, int64(i+10000))
	v := make([]float32, 0, 2)
	for j := 0; j < 2; j++ {
		v = append(v, rand.Float32())
	}
	bookIntros = append(bookIntros, v)
}
idColumn := entity.NewColumnInt64("book_id", bookIDs)
wordColumn := entity.NewColumnInt64("word_count", wordCounts)
introColumn := entity.NewColumnFloatVector("book_intro", 2, bookIntros)
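
Because mismatched data is rejected at insert time, it can help to check the prepared slices before building the columns. The following is a minimal sketch using only the Go standard library. The helper name validateColumns is hypothetical and is not part of the Milvus SDK; it only checks that every column has the same number of rows and that every vector has the expected dimension.

```go
package main

import (
	"errors"
	"fmt"
)

// validateColumns is a hypothetical pre-flight helper (not a Milvus SDK call).
// It verifies that all columns describe the same number of rows and that
// every vector has exactly dim elements.
func validateColumns(ids, counts []int64, vectors [][]float32, dim int) error {
	if len(ids) == 0 {
		return errors.New("no rows to insert")
	}
	if len(counts) != len(ids) || len(vectors) != len(ids) {
		return fmt.Errorf("row count mismatch: ids=%d counts=%d vectors=%d",
			len(ids), len(counts), len(vectors))
	}
	for i, v := range vectors {
		if len(v) != dim {
			return fmt.Errorf("row %d: vector has %d elements, want %d", i, len(v), dim)
		}
	}
	return nil
}

func main() {
	ids := []int64{0, 1}
	counts := []int64{10000, 10001}
	vectors := [][]float32{{0.1, 0.2}, {0.3, 0.4}}
	if err := validateColumns(ids, counts, vectors, 2); err != nil {
		fmt.Println("invalid data:", err)
		return
	}
	fmt.Println("data is consistent")
}
```

In the example above, such a check would run on bookIDs, wordCounts, and bookIntros with a dimension of 2 before the entity.NewColumn calls.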

Insert Data into Milvus

Insert the data into a collection.

By specifying partitionName, you can choose which partition the data is inserted into.

_, err = milvusClient.Insert(
	context.Background(), // ctx
	"book",               // CollectionName
	"",                   // partitionName
	idColumn,             // columnarData
	wordColumn,           // columnarData
	introColumn,          // columnarData
)
if err != nil {
	log.Fatalf("failed to insert data: %v", err)
}

Parameter        Description
ctx              Context used to control the API call process.
CollectionName   Name of the collection to insert data into.
partitionName    Name of the partition to insert the data into. If left blank, the data will be inserted into the default partition.
columnarData     Data to be inserted into each field.

After inserting entities into a collection that has already been indexed, you do not need to reindex the collection because Milvus will automatically create an index for the newly inserted data.

Flushing Data in Milvus

When data is inserted into Milvus, it is written to segments. A segment must reach a certain size before it is sealed and indexed; unsealed segments are searched by brute force. To avoid this, call flush(), which seals any remaining segments and sends them to be indexed. Call this method only when the insert session ends: calling it too frequently leads to fragmented data, which will need to be cleaned up later.
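
The pattern described above (insert in batches, flush once at the end) can be sketched as follows. This is an illustration, not the SDK's API: the store interface, the insertAll and demo helpers, and the countingStore fake are all hypothetical names introduced here so the example runs with only the standard library. With the real client, the two interface methods would be backed by the client's insert and flush calls.

```go
package main

import (
	"context"
	"fmt"
)

// store is a hypothetical stand-in for a Milvus client, reduced to the two
// operations this sketch needs. The real SDK signatures differ.
type store interface {
	InsertBatch(ctx context.Context, ids []int64) error
	Flush(ctx context.Context) error
}

// insertAll writes ids in batches of batchSize and flushes exactly once,
// after the last batch has been inserted.
func insertAll(ctx context.Context, s store, ids []int64, batchSize int) error {
	if batchSize <= 0 {
		return fmt.Errorf("batchSize must be positive, got %d", batchSize)
	}
	for start := 0; start < len(ids); start += batchSize {
		end := start + batchSize
		if end > len(ids) {
			end = len(ids)
		}
		if err := s.InsertBatch(ctx, ids[start:end]); err != nil {
			return fmt.Errorf("insert rows %d-%d: %w", start, end-1, err)
		}
	}
	return s.Flush(ctx)
}

// countingStore is an in-memory fake that records how often each method runs.
type countingStore struct {
	rows, inserts, flushes int
}

func (c *countingStore) InsertBatch(_ context.Context, ids []int64) error {
	c.rows += len(ids)
	c.inserts++
	return nil
}

func (c *countingStore) Flush(_ context.Context) error {
	c.flushes++
	return nil
}

// demo inserts n sequential IDs into a fresh countingStore.
func demo(n, batchSize int) (*countingStore, error) {
	ids := make([]int64, n)
	for i := range ids {
		ids[i] = int64(i)
	}
	s := &countingStore{}
	err := insertAll(context.Background(), s, ids, batchSize)
	return s, err
}

func main() {
	s, err := demo(2000, 500)
	if err != nil {
		fmt.Println("insert failed:", err)
		return
	}
	// Prints: rows=2000 inserts=4 flushes=1
	fmt.Printf("rows=%d inserts=%d flushes=%d\n", s.rows, s.inserts, s.flushes)
}
```

Keeping the flush outside the loop is the point of the sketch: the number of flushes stays at one regardless of how many batches are inserted.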

Limitations

Feature            Maximum Limit
Vector Dimension   32,768
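
A client can reject out-of-range dimensions before contacting the server. The sketch below is a minimal guard built on the limit listed above; the names checkDim and maxVectorDim are hypothetical, and the lower bound of 1 is an assumption, not something stated in this topic.

```go
package main

import "fmt"

// maxVectorDim is the maximum vector dimension listed in the Limitations table.
const maxVectorDim = 32768

// checkDim is a hypothetical client-side guard (not a Milvus SDK call).
// The lower bound of 1 is an assumption.
func checkDim(dim int) error {
	if dim < 1 || dim > maxVectorDim {
		return fmt.Errorf("vector dimension %d is outside the range 1-%d", dim, maxVectorDim)
	}
	return nil
}

func main() {
	for _, dim := range []int{2, 32768, 32769} {
		if err := checkDim(dim); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", dim)
	}
}
```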

Upserting Entities

Prepare the Data

First, prepare the data to be updated. The data type of each field must match the schema of the collection; otherwise, Milvus throws an exception.

Milvus supports default values for scalar fields, except for primary key fields. This means that fields with a default value can be left empty during data insertion or update. For more information, refer to creating a collection.

nEntities := 3000
dim := 8
idList := make([]int64, 0, nEntities)
randomList := make([]float64, 0, nEntities)
embeddingList := make([][]float32, 0, nEntities)

for i := 0; i < nEntities; i++ {
    idList = append(idList, int64(i))
}

for i := 0; i < nEntities; i++ {
    randomList = append(randomList, rand.Float64())
}

for i := 0; i < nEntities; i++ {
    vec := make([]float32, 0, dim)
    for j := 0; j < dim; j++ {
        vec = append(vec, rand.Float32())
    }
    embeddingList = append(embeddingList, vec)
}
idColData := entity.NewColumnInt64("ID", idList)
randomColData := entity.NewColumnDouble("random", randomList)
embeddingColData := entity.NewColumnFloatVector("embeddings", dim, embeddingList)

Upsert Data into Milvus

Upsert the data into the collection.

// Pass every prepared column; an unused randomColData would not compile in Go.
if _, err := c.Upsert(ctx, collectionName, "", idColData, randomColData, embeddingColData); err != nil {
    log.Fatalf("failed to upsert data, err: %v", err)
}
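
To make the effect of an upsert concrete, the sketch below models it on an in-memory map keyed by primary key: rows whose key already exists are overwritten, and rows with a new key are added. This is a conceptual model only. The row type and upsert function are hypothetical names, and the sketch says nothing about how Milvus implements the operation internally.

```go
package main

import "fmt"

// row is a simplified entity: a primary key plus one vector field.
type row struct {
	ID        int64
	Embedding []float32
}

// upsert models upsert semantics on a map keyed by primary key.
// Existing keys are overwritten; unknown keys are added.
// It returns how many rows were inserted and how many were replaced.
func upsert(table map[int64]row, rows []row) (inserted, replaced int) {
	for _, r := range rows {
		if _, ok := table[r.ID]; ok {
			replaced++
		} else {
			inserted++
		}
		table[r.ID] = r
	}
	return inserted, replaced
}

func main() {
	table := map[int64]row{
		0: {ID: 0, Embedding: []float32{0.1, 0.2}},
		1: {ID: 1, Embedding: []float32{0.3, 0.4}},
	}
	ins, rep := upsert(table, []row{
		{ID: 1, Embedding: []float32{0.9, 0.9}}, // existing key: replaced
		{ID: 2, Embedding: []float32{0.5, 0.6}}, // new key: inserted
	})
	// Prints: inserted=1 replaced=1 total=3
	fmt.Printf("inserted=%d replaced=%d total=%d\n", ins, rep, len(table))
}
```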