It’s a troublesome task to update mass data in production. One of the most used methods to do this task is:
collect($csvContacts)->each(function (array $row) {
$contact = Contact::updateOrCreate(
['email' => $row['email']],
['name' => $row['name'], 'address' => $row['address']]
);
});
updateorCreate()
method has two queries within it. At first, the query checks if any rows match the key/value in the first array. If the row exists, it will update it otherwise a new row is inserted in the table.
This method is suitable for databases having small-scale data. Imagine if we had more than 10000 rows which means 20000 queries will have to be executed. This will take a lot of time and make the system slow which is not efficient at all.
– This is where upsert()
comes in action which has higher efficiency than updateorCreate()
.
upsert()
method allows us to perform multiple “upserts” in a single query. It takes value to insert or update in the first argument and the second argument lists the column(s) that uniquely identify records within the associated table. The third argument consists of a column that should be updated if a matching record already exists.
Single queries containing all the rows are made while using Upsert which includes duplicate key updates MySQL and on conflict … do update set in Postgres. This command instructs the database to update the record if it already exists.
Another important thing to notice while using the upsert
method is to have a primary or unique key index. If no primary or unique index is found in the database then we will get an error.
For higher efficiency, you can also chunk
queries into blocks. We might hit the query size limit if it is not done.
Here is an example of proper upsert method usage:
collect($csvContacts)
->map(function (array $row) {
return Arr::only($row, ['email', 'name', 'address']);
})
->chunk(1000)
->each(function (Collection $chunk) {
Contact::upsert($chunk, 'email');
});
All Comments