=head1 Efficiently Create Multiple Related Rows in DBIx::Class This article will talk about something seemingly simple and boring - inserting new rows into your database. If you frequent C you probably have heard of the mysterious MultiCreate, yet you are not entirely sure if it can bring you fame and money. Are you watching closely? =head2 The Pledge Let's examine the dreaded Artist/CDs example: package MySchema::Artist; use base 'DBIx::Class::Core'; __PACKAGE__->table ('artist'); __PACKAGE__->add_columns ( id => { data_type => 'int', is_auto_increment => 1, }, name => { data_type => 'varchar', }, ); __PACKAGE__->set_primary_key ('id'); __PACKAGE__->has_many (cds => 'MySchema::CD', 'artist_id'); and package MySchema::CD; use base 'DBIx::Class::Core'; __PACKAGE__->table ('cd'); __PACKAGE__->add_columns ( id => { data_type => 'int', is_auto_increment => 1, }, title => { data_type => 'varchar', }, artist_id => { data_type => 'int', }, ); __PACKAGE__->set_primary_key ('id'); __PACKAGE__->belongs_to (artist => 'MySchema::Artist', 'artist_id'); =head2 The Turn Say some XML describing a modest CD collection needs to be parsed and stuffed in the above tables, while preserving relational integrity. A seasoned programmer will sigh, drink some coffee, and write the following code: for my $chunk (@data) { my $artist = $schema->resultset ('Artist')->create ({ name => $chunk->artist_name }); for my $cd_title (@{$chunk->{cds}}) { $schema->resultset ('CD')->create ({ title => $cd_title, artist_id => $artist->id, }); } } After the coffee kicks in, perhaps the programmer will remember that he already has declared relationships between Artist and CD. So he will reduce the second inner loop to: ... for my $cd_title (@{$chunk->{cds}}) { $artist->create_related ('cds', { title => $cd_title }); } ... This is better, but it suffers from a number of deficiencies: =over =item If the XML contained Tracks of individual CDs we would need a second nested loop, which may result in another loop, and so on. =item The entire operation probably needs to be wrapped in a transaction - that's yet more boilerplate to cargo cult from code you've already written elsewhere. =item B =back So the depressed developer sets out to write even more boring dull code. =head2 The Prestige An enlightened coworker arrives, looks at the code and changes it to: for my $chunk (@data) { my $artist_with_cds = $schema->resultset ('Artist')->create ($chunk); } Then looks closer at @data, makes sure it doesn't contain anything L wouldn't know how to handle and changes the entire loop to just: $schema->resultset ('Artist')->populate (\@data); =head1 The Trick Revealed Just like being able to L, L can turn such structures back to related rows. All you have to do is augment the hashref you normally pass to L with a structure representing the related row(s), keyed off the B. The structure will be either another hashref (for single-type relationships, i.e. L, L, L) or an arrayref of hashrefs for L (relationship of type multi). When the enlightened coworker did C<< ->create ($chunk) >> he effectively supplied: { name => 'somename', cds => [ { title => 'cd1', }, { title => 'cd2', }, ... ], } which was properly taken apart and inserted in the appropriate tables using the appropriate foreign keys as one would expect. In addition, since L is a single call, L takes care of making it atomic by dutifully wrapping the entire operation in a L. The same logic applies to L which happily accepts an arrayref of hashrefs as the creation argument. In fact, you can bootstrap a poor man's replication by simply dumping a L via L, and feeding the result straight back to L, using a L derived from the target L. =head2 But does it scale? The example above is rather simple, but what if we have a complex chain of relationships? For example: Artist->has_many ('cds', ...) CD->has_many ('cd2producer', ...) CD2Producer->belongs_to ('producer', ...) ...and so on Won't L attempt to create the same producer several times? Fortunately, no: any creation attempt over a L relationship will be executed via a L, thus allowing everything to Just Work. For the brave there is an L. =head2 But there must be *some* caveats...? =over =item Currently this works for L only. While L will be executed on L relationships, the final goal is nothing more than new row creation. This means, for example, that passing a nested structure to L will not do anything close to what you would expect. =item The returned object does not always have the proper related objects inserted at the expected slots (similar to the effect of L). We need a good amount of tests before we attempt to make this work as expected - patches welcome! =item The L use mentioned above can result in some strange behavior. Consider: my $cd_data = { artist_id => $some_artist->id, title => 'my cd', genre => { name => 'vague genre', cds => [ { title => 'oddball cd', artist_id => $some_other_artist->id, }, ], } }; If the I genre already exists, L will B descend to check if the I is in fact created. This has not yet caused anyone grief, thus there is no motivation for the (non-trivial) fix. =back =head1 What The Future Holds While the new row creation is mostly done and over with (in fact it delayed the release of the 0.081xx series by several months), there are more ways to expand the existing codebase. =over =item Port L to leverage the (substantial) infrastructure behind MultiCreate. While some conceptual differences exist between both modules, they can certainly be resolved during the test-writing phase. =item Merge create and update into the logical multi_update_or_create ! =back If you have tuits or simply ideas - you are always welcome to hang out on IRC, and start a discussion (or snatch a commit-bit and start hacking). Happy search()ing! =head1 Author Peter Rabbitson