ProductPromotion
Logo

Ruby

made by https://0x3d.site

GitHub - jamis/bulk_insert: Efficient bulk inserts with ActiveRecord
Efficient bulk inserts with ActiveRecord. Contribute to jamis/bulk_insert development by creating an account on GitHub.
Visit Site

GitHub - jamis/bulk_insert: Efficient bulk inserts with ActiveRecord

GitHub - jamis/bulk_insert: Efficient bulk inserts with ActiveRecord

BulkInsert

A little ActiveRecord extension for helping to insert lots of rows in a single insert statement.

Installation

Add it to your Gemfile:

gem 'bulk_insert'

Usage

BulkInsert adds a new class method to your ActiveRecord models:

class Book < ActiveRecord::Base
end

book_attrs = ... # some array of hashes, for instance
Book.bulk_insert do |worker|
  book_attrs.each do |attrs|
    worker.add(attrs)
  end
end

All of those #add calls will be accumulated into a single SQL insert statement, vastly improving the performance of multiple sequential inserts (think data imports and the like).

If you don't like using a block API, you can also simply pass an array of rows to be inserted:

book_attrs = ... # some array of hashes, for instance
Book.bulk_insert values: book_attrs

By default, the columns to be inserted will be all columns in the table, minus the id column, but if you want, you can explicitly enumerate the columns:

Book.bulk_insert(:title, :author) do |worker|
  # specify a row as an array of values...
  worker.add ["Eye of the World", "Robert Jordan"]

  # or as a hash
  worker.add title: "Lord of Light", author: "Roger Zelazny"
end

It will automatically set created_at/updated_at columns to the current date, as well.

Book.bulk_insert(:title, :author, :created_at, :updated_at) do |worker|
  # specify created_at/updated_at explicitly...
  worker.add ["The Chosen", "Chaim Potok", Time.now, Time.now]

  # or let BulkInsert set them by default...
  worker.add ["Hello Ruby", "Linda Liukas"]
end

Similarly, if a value is omitted, BulkInsert will use whatever default value is defined for that column in the database:

# create_table :books do |t|
#   ...
#   t.string "medium", default: "paper"
#   ...
# end

Book.bulk_insert(:title, :author, :medium) do |worker|
  worker.add title: "Ender's Game", author: "Orson Scott Card"
end

Book.first.medium #-> "paper"

By default, the batch is always saved when the block finishes, but you can explicitly save inside the block whenever you want, by calling #save! on the worker:

Book.bulk_insert do |worker|
  worker.add(...)
  worker.add(...)

  worker.save!

  worker.add(...)
  #...
end

That will save the batch as it has been defined to that point, and then empty the batch so that you can add more rows to it if you want. Note that all records saved together will have the same created_at/updated_at timestamp (unless one was explicitly set).

Batch Set Size

By default, the size of the insert is limited to 500 rows at a time. This is called the set size. If you add another row that causes the set to exceed the set size, the insert statement is automatically built and executed, and the batch is reset.

If you want a larger (or smaller) set size, you can specify it in two ways:

# specify set_size when initializing the bulk insert...
Book.bulk_insert(set_size: 100) do |worker|
  # ...
end

# specify it on the worker directly...
Book.bulk_insert do |worker|
  worker.set_size = 100
  # ...
end

Insert Ignore

By default, when an insert fails the whole batch of inserts fail. The ignore option ignores the inserts that would have failed (because of duplicate keys or a null in column with a not null constraint) and inserts the rest of the batch.

This is not the default because no errors are raised for the bad inserts in the batch.

destination_columns = [:title, :author]

# Ignore bad inserts in the batch
Book.bulk_insert(*destination_columns, ignore: true) do |worker|
  worker.add(...)
  worker.add(...)
  # ...
end

Update Duplicates (MySQL, PostgreSQL)

If you don't want to ignore duplicate rows but instead want to update them then you can use the update_duplicates option. Set this option to true (MySQL) or list unique column names (PostgreSQL) and when a duplicate row is found the row will be updated with your new values. Default value for this option is false.

destination_columns = [:title, :author]

# Update duplicate rows (MySQL)
Book.bulk_insert(*destination_columns, update_duplicates: true) do |worker|
  worker.add(...)
  worker.add(...)
  # ...
end

# Update duplicate rows (PostgreSQL)
Book.bulk_insert(*destination_columns, update_duplicates: %w[title]) do |worker|
  worker.add(...)
  # ...
end

Return Primary Keys (PostgreSQL, PostGIS)

If you want the worker to store primary keys of inserted records, then you can use the return_primary_keys option. The worker will store a result_sets array of ActiveRecord::Result objects. Each ActiveRecord::Result object will contain the primary keys of a batch of inserted records.

worker = Book.bulk_insert(*destination_columns, return_primary_keys: true) do
|worker|
  worker.add(...)
  worker.add(...)
  # ...
end

worker.result_sets

Ruby and Rails Versions Supported

:warning: The scope of this gem may be somehow covered natively by the .insert_all API introduced by Rails 6. This gem represents the state of art for rails version < 6 and it is still open to further developments for more recent versions.

The current CI prevents regressions on the following versions:

ruby / rails ~>3 ~>4 ~>5 ~>6
2.2 yes yes no no
2.3 yes yes yes no
2.4 no yes yes no
2.5 no no yes yes
2.6 no no yes yes
2.7 no no yes yes

The adapters covered in the CI are:

  • sqlite
  • mysql
  • postgresql

License

BulkInsert is released under the MIT license (see MIT-LICENSE) by Jamis Buck ([email protected]).

More Resources
to explore the angular.

mail [email protected] to add your project or resources here 🔥.

Related Articles
to learn about angular.

FAQ's
to learn more about Angular JS.

mail [email protected] to add more queries here 🔍.

More Sites
to check out once you're finished browsing here.

0x3d
https://www.0x3d.site/
0x3d is designed for aggregating information.
NodeJS
https://nodejs.0x3d.site/
NodeJS Online Directory
Cross Platform
https://cross-platform.0x3d.site/
Cross Platform Online Directory
Open Source
https://open-source.0x3d.site/
Open Source Online Directory
Analytics
https://analytics.0x3d.site/
Analytics Online Directory
JavaScript
https://javascript.0x3d.site/
JavaScript Online Directory
GoLang
https://golang.0x3d.site/
GoLang Online Directory
Python
https://python.0x3d.site/
Python Online Directory
Swift
https://swift.0x3d.site/
Swift Online Directory
Rust
https://rust.0x3d.site/
Rust Online Directory
Scala
https://scala.0x3d.site/
Scala Online Directory
Ruby
https://ruby.0x3d.site/
Ruby Online Directory
Clojure
https://clojure.0x3d.site/
Clojure Online Directory
Elixir
https://elixir.0x3d.site/
Elixir Online Directory
Elm
https://elm.0x3d.site/
Elm Online Directory
Lua
https://lua.0x3d.site/
Lua Online Directory
C Programming
https://c-programming.0x3d.site/
C Programming Online Directory
C++ Programming
https://cpp-programming.0x3d.site/
C++ Programming Online Directory
R Programming
https://r-programming.0x3d.site/
R Programming Online Directory
Perl
https://perl.0x3d.site/
Perl Online Directory
Java
https://java.0x3d.site/
Java Online Directory
Kotlin
https://kotlin.0x3d.site/
Kotlin Online Directory
PHP
https://php.0x3d.site/
PHP Online Directory
React JS
https://react.0x3d.site/
React JS Online Directory
Angular
https://angular.0x3d.site/
Angular JS Online Directory