NAME DBIx::Table::TestDataGenerator - Automatic test data creation, cross DBMS VERSION Version 0.0.4 SYNOPSIS use DBIx::Table::TestDataGenerator; my $generator = DBIx::Table::TestDataGenerator->new( dsn => $data_source_name, user => $db_user_name, password => $db_password, table => $target_table_name, ); #simple usage: $generator->create_testdata( target_size => $target_size, num_random => $num_random, ); #extended usage handling a self-reference of the target table: $generator->create_testdata( target_size => $target_size, num_random => $num_random, max_tree_depth => $max_tree_depth, min_children => $min_children, min_roots => $min_roots, ); DESCRIPTION There is often the need to create test data in database tables, e.g. to test database client performance. The existence of constraints on a table makes it non-trivial to come up with a way to add records to it. The current module inspects the tables' constraints and adds a desired number of records. The values of the fields either come from the table itself (possibly incremented to satisfy uniqueness constraints) or from tables referenced by foreign key constraints. The choice of the copied values is random for a number of runs the user can choose, afterwards the values are chosen randomly from a cache, reducing database traffic for performance reasons. One nice thing about this way to construct new records is that the additional data is similar to the data initially present in the table. A main goal of the module is to reduce configuration to the absolute minimum by automatically determining information about the target table, in particular its constraints. Another goal is to support as many DBMSs as possible, this has been achieved by basing it on DBIx::Class modules. In the synopsis, an extended usage has been mentioned. This refers to the common case of having a self-reference on a table, i.e. a one-column wide foreign key of a table to itself where the referenced column constitutes the primary key. Such a parent-child relationship defines a rootless tree and when generating test data it may be useful to have some control over the growth of this tree. One such case is when the parent-child relation represents a navigation tree and a client application processes this structure. In this case, one would like to have a meaningful, balanced tree structure since this corresponds to real-world examples. To control tree creation the parameters max_tree_depth, min_children and min_roots are provided. Note that the nodes are being added in a depth-first manner. SUBROUTINES/METHODS new Arguments: * dsn: required DBI data source name * user: required database user * password: required database user's password * table: required name of the target table Return value: A new TestDataGenerator object dsn Accessor for the DBI data source name. user Accessor for the database user. password Accessor for the database user's password. table Accessor for the name of the target table. create_testdata This is the main method, it creates and adds new records to the target table. In case one of the arguments max_tree_depth, min_children or min_roots has been provided, the other two must be provided as well. Arguments: * target_size The target number of rows to be reached. * num_random The first $num_random number of records use fresh random choices for their values taken from tables referenced by foreign key relations or the target table itself. These values are stored in a cache and re-used for the remaining (target_size - $num_random) records. Note that even for the remaining records there is some randomness since the combination of cached values coming from columns involved in different constraints is random. * max_tree_depth In case of a self-reference, the maximum depth at which new records will be inserted. The minimum value for this parameter is 2. * min_children In case of a self-reference, the minimum number of children each handled parent node will get. A possible exception is the last handled parent node if the execution stops before $min_children child nodes have been added to it. * min_roots In case of a self-reference, the minimum number of root elements existing after completion of the call to create_testdata. A record is considered to be a root element if the corresponding parent id is null or equal to the child id. Returns: Nothing, only called for the side-effect of adding new records to the target table. (This may change, see the section FURTHER DEVELOPMENT.) INSTALLATION AND CONFIGURATION To install this module, run the following commands: perl Build.PL ./Build ./Build test ./Build install LIMITATIONS * Currently, the module executes the inserts in one big transaction if the database handle has not set AutoCommit to true, but this will change, see the section FURTHER DEVELOPMENT. * Only uniqueness and foreign key constraints are taken into account. Constraints such as check constraints, which are very diverse and database specific, are not handled (and most probably will not be). * Uniqueness constraints involving only columns which the TableProbe class does not know how to increment cannot be handled. Typically, all string and numeric data types are supported and the set of supported data types is defined by the list provided by the TableProbe method get_type_preference_for_incrementing(). I am thinking about allowing date incrementation, too, it would be necessary then to at least add a configuration parameter defining what time incrementation step to use. * When calling create_testdata, max_tree_depth = 1 should be allowed, too, meaning that all new records will be root records. * Added records that are root node with respect to the self-reference always have the parent id equal to their pkey. It may be that in the case in question the convention is such that root nodes are identified by having the parent id set to NULL. FURTHER DEVELOPMENT * The current version handles uniqueness constraints by picking out a column involved in the constraint and incrementing it appropriately. This should be made customizable in future versions. * Support for transactions and specifying transaction sizes will be added. * It will be possible to get the SQL source of all generated inserts without having them executed on the database. * Currently one cannot specify a seed for the random selections used to define the generated records since the used class DBIx::Class::Helper::ResultSet::Random does not provide this. For reproducible tests this would be a nice feature. ACKNOWLEDGEMENTS * Version 0.001: A big thank you to all perl coders on the dbi-dev, DBIx-Class and perl-modules mailing lists and on PerlMonks who have patiently answered my questions and offered solutions, advice and encouragement, the Perl community is really outstanding. Special thanks go to Tim Bunce (module name / advice on keeping the module extensible), Jonathan Leffler (module naming discussion / relation to existing modules / multiple suggestions for features), brian d foy (module naming discussion / mailing lists / encouragement) and the following Perl monks (see the threads for user jds17 for details): chromatic, erix, technojosh, kejohm, Khen1950fx, salva, tobyink (3 of 4 discussion threads!), Your Mother. * Version 0.002: Martin J. Evans was the first developer giving me feedback and nice bug reports on Version 0.001, thanks a lot! AUTHOR Jose Diaz Seng, "" BUGS Please report any bugs or feature requests to "bug-dbix-table-testdatagenerator at rt.cpan.org", or through the web interface at . I will be notified, and then you'll automatically be notified of progress on your bug as I make changes. SUPPORT You can find documentation for this module with the perldoc command. perldoc DBIx::Table::TestDataGenerator You can also look for information at: * RT: CPAN's request tracker (report bugs here) * AnnoCPAN: Annotated CPAN documentation * CPAN Ratings * Search CPAN LICENSE AND COPYRIGHT Copyright 2012 Jose Diaz Seng. This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License. See http://dev.perl.org/licenses/ for more information.