Re: sqlmaster "nowait" and "append" functionality?

On Tue, Dec 6, 2016 at 4:54 PM Ole Tange <ole@tange.dk> wrote:

On Tue, Dec 6, 2016 at 6:10 PM, Andy Loftus <aloftus@gmail.com> wrote:

> Currently, sqlmaster appears to populate a table, first dropping any
> existing table, and waits for jobs to complete. Wondering if there is a way
> to:
>
> 1. populate the database then exit immediately without waiting?
> My particular use case is parallelizing backup tasks that are expected to
> run a long time (several hours on average).

Hmmm... maybe we should change, so you need to add '--wait' if you
want '--sqlmaster' to wait. Seems like a reasonable change.

Even better! That was my first assumption about how it worked. Took me a little bit of time to figure out why --sqlmaster never exited, haha :)

> On a related note, why does the sqlworker command require the exact same
> input as the sqlmaster? Shouldn't it be sufficient that all necessary
> information is stored in the database and then the sqlworker can just pull
> tasks from the database?

--sqlmaster only inserts the values - not the command. The problem
comes with replacement strings like {%}. You will never know in
advance which job slot a job will be run as:

parallel echo {%} ::: {a..j}

So --sqlmaster cannot store the actual command to run.

It _could_ be changed so that --sqlmaster stores the template command
in the command column, and --sqlworker fetches it from there and
replaces the command column with the actual command run when done.

It would, however, be a fairly big change to GNU Parallel: the
assumption has always been that the template command remains the same
for the whole run, and quite a bit of optimization depends on this.
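
To illustrate why {%} can only be filled in at run time, here is a minimal sketch in Python. It is not GNU Parallel's code: `expand` is a hypothetical helper that knows only the {} and {%} replacement strings, and it assumes the template lives in the command column while the arguments live in V1.

```python
def expand(template, args, slot):
    """Fill in a stored template once the job slot is known.

    Hypothetical helper: handles only {} (the arguments) and {%} (the
    job slot). Real GNU Parallel supports many more replacement strings.
    """
    # Replace {%} first so that argument text is never re-scanned.
    return template.replace("{%}", str(slot)).replace("{}", " ".join(args))


template = "echo {%}: {}"   # what the command column would hold
row_args = ["a"]            # what the V1 column would hold

# The same row yields a different command depending on the slot it lands in.
in_slot_1 = expand(template, row_args, slot=1)
in_slot_2 = expand(template, row_args, slot=2)
print(in_slot_1)
print(in_slot_2)
```

The stored row is identical in both calls; only the slot, which the worker learns when the job starts, differs.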

On a similar note: What would you expect the table should look like
when you run:

# These do not work - but what would you expect them to do to the table?
parallel --sqlandworker $DBURL -X echo {%}: {} ::: {1..10}
parallel --sqlandworker $DBURL -N3 echo {%}: {} ::: {1..10}
parallel --sqlandworker $DBURL echo {%} '{= $_=total_jobs() =}' ::: {1..10}

In general, I think the best approach is "keep it simple" and try to retain as much flexibility as possible by putting the template in the command column. Some things could be replaced, such as {#}, but contexts that don't make sense until runtime, like {%}, would have to be inserted verbatim.

Regarding the dynamic calculation of "total_jobs", several options come to mind. The first is to leave it dynamic, so each job run by a sqlworker calculates it at runtime, perhaps by looking up the number of rows in the table. If appending new jobs is allowed, the number will change depending on when each job ran relative to when appends were made to the table. I think that's okay, if it's documented that it works that way. The other approach would be to auto-fill dynamic content at sqlmaster insert time; then "total_jobs" would be fixed at the number of jobs that were added by sqlmaster. I like the first approach better (leave as much as makes sense to be figured out at runtime).

In the end, I think sqlmaster and sqlworker are very well suited for some use cases but not for others, so if it works for the majority of cases where it makes sense, that's a good place to start. As you point out, your examples don't work, but do they need to? It's hard to tell without a good use case, but it seems to me that these examples (especially #2 and #3) are less well suited to usage by sqlmaster and sqlworker.

Here are my thoughts on what the example commands might look like in the database after sqlmaster put them there but before a sqlworker executed anything:

# These do not work - but what would you expect them to do to the table?

parallel --sqlandworker $DBURL -X echo {%}: {} ::: {1..10}

Seq|...|Exitval|...|Command     |V1|...
  1|   |  -1000|   |echo {%}: {}| 1|
  2|   |  -1000|   |echo {%}: {}| 2|
  3|   |  -1000|   |echo {%}: {}| 3|
...etc
 10|   |  -1000|   |echo {%}: {}|10|

parallel --sqlandworker $DBURL -N3 echo {%}: {} ::: {1..10}

Seq|...|Exitval|...|Command     |V1|V2|V3|...
  1|   |  -1000|   |echo {%}: {}| 1| 2| 3|
  2|   |  -1000|   |echo {%}: {}| 4| 5| 6|
  3|   |  -1000|   |echo {%}: {}| 7| 8| 9|
  4|   |  -1000|   |echo {%}: {}|10|  |  |

or, for the last row:

  4|   |  -1000|   |echo {%}: {}|10|NULL|NULL|

or:

Seq|...|Exitval|...|Command     |V1   |...
  1|   |  -1000|   |echo {%}: {}|1 2 3|
  2|   |  -1000|   |echo {%}: {}|4 5 6|
  3|   |  -1000|   |echo {%}: {}|7 8 9|
  4|   |  -1000|   |echo {%}: {}|10   |
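
The two layouts above can be sketched in Python as a rough illustration. The helper names `group_args` and `join_row` are hypothetical, and `None` stands in for SQL NULL.

```python
def group_args(values, n):
    """Split values into rows of n columns, padding the last row with None."""
    rows = []
    for start in range(0, len(values), n):
        chunk = list(values[start:start + n])
        chunk += [None] * (n - len(chunk))  # None plays the role of NULL
        rows.append(chunk)
    return rows


def join_row(row):
    """Alternative layout: collapse one row into a single V1 string."""
    return " ".join(v for v in row if v is not None)


values = [str(i) for i in range(1, 11)]   # the arguments from ::: {1..10}
rows = group_args(values, 3)              # V1|V2|V3 layout
joined = [join_row(r) for r in rows]      # single-V1 layout

for seq, row in enumerate(rows, start=1):
    print(seq, row, "|", joined[seq - 1])
```

Either way ten arguments with -N3 give four rows; the layouts differ only in whether the short last row is padded or simply shorter.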

parallel --sqlandworker $DBURL echo {%} '{= $_=total_jobs() =}' ::: {1..10}

Seq|...|Exitval|...|Command                          |V1|...
  1|   |  -1000|   |echo {%} '{= $_=total_jobs() =}' | 1|
  2|   |  -1000|   |echo {%} '{= $_=total_jobs() =}' | 2|
  3|   |  -1000|   |echo {%} '{= $_=total_jobs() =}' | 3|
...etc
 10|   |  -1000|   |echo {%} '{= $_=total_jobs() =}' |10|

As mentioned earlier, let the sqlworker calculate the "total_jobs" at runtime. It may not make sense, but if it works consistently and is documented, that should be sufficient.
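
A small sketch of the "count rows at runtime" idea, using Python's sqlite3 module with an in-memory database. The table name `jobs` and the `total_jobs` function are assumptions made for this illustration, not GNU Parallel's implementation.

```python
import sqlite3


def total_jobs(conn, table="jobs"):
    """Return the number of rows currently in the job table."""
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (Seq INTEGER PRIMARY KEY, Exitval INTEGER, "
    "Command TEXT, V1 TEXT)"
)
insert = "INSERT INTO jobs (Seq, Exitval, Command, V1) VALUES (?, -1000, ?, ?)"

# The master inserts ten jobs.
conn.executemany(insert, [(i, "echo {%}", str(i)) for i in range(1, 11)])
before = total_jobs(conn)

# Someone appends five more jobs while workers are still running.
conn.executemany(insert, [(i, "echo {%}", str(i)) for i in range(11, 16)])
after = total_jobs(conn)

print(before, after)
```

A job started before the append sees the smaller count and one started afterwards sees the larger one, which is the behavior that would need to be documented.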

> 2. append new tasks to an existing database?
> I think this is more likely a feature request since the man page
> specifically says table will be clobbered. As I understand, the reason is
> that the table schema must/should match, especially the V* columns. But,
> really, isn't that ultimately a burden for the user (as opposed to the
> developer)?

I struggled with this decision, too. My reasoning was that if you run:

parallel --sqlandworker $DBURL echo ::: {1..3}

but really meant:

parallel --sqlandworker $DBURL echo ::: {1..3} ::: {4..6}

then you would have to find a way to clean the database first.

That sounds entirely like user error to me, and the user can then decide how best to proceed (i.e., either sanitize the DB table or drop it and start over). As a matter of fact, that's actually more flexible, and what I, personally, would expect anyway. I would expect the tool to fail if the DB table already exists and/or has a mismatched schema unless I specified that it's okay to clobber an existing table.
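
The "fail unless told to clobber" behavior could look roughly like this. It is a sketch using Python's sqlite3 module; `create_job_table` and its `clobber` parameter are hypothetical names, and the column set is reduced to the ones shown in the tables above.

```python
import sqlite3


def create_job_table(conn, table, n_values, clobber=False):
    """Create the job table, refusing to replace an existing one by default."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone() is not None
    if exists and not clobber:
        raise RuntimeError(f"table {table!r} already exists; refusing to drop it")
    if exists:
        conn.execute(f'DROP TABLE "{table}"')
    # One Vn column per input source (assumes n_values >= 1).
    vcols = ", ".join(f"V{i} TEXT" for i in range(1, n_values + 1))
    conn.execute(
        f'CREATE TABLE "{table}" '
        f"(Seq INTEGER PRIMARY KEY, Exitval INTEGER, Command TEXT, {vcols})"
    )


conn = sqlite3.connect(":memory:")
create_job_table(conn, "jobs", n_values=1)      # first run: table is created

try:
    create_job_table(conn, "jobs", n_values=2)  # second run: refused
    refused = False
except RuntimeError:
    refused = True

create_job_table(conn, "jobs", n_values=2, clobber=True)  # explicit opt-in
columns = [d[0] for d in conn.execute('SELECT * FROM "jobs"').description]
print(refused, columns)
```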

> Perhaps a specific flag allowing "append" operation so user can
> be duly warned and could still check that the number of V* columns matches.

Like having the DBURL start with '+': +pg://tange:mypass/tange/TBL8007

I believe this part is relatively simple to do - especially if we
allow it to die horribly if the columns do not match.

It should:

* Not drop table
* Find the max seq-number, and continue from there
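
The two points above, plus the "die if the columns do not match" check, can be sketched as follows. This is an illustration in Python with sqlite3, not the eventual GNU Parallel code; `append_jobs` is a hypothetical name and -1000 is used as the "not run yet" Exitval as in the tables earlier in this thread.

```python
import sqlite3


def append_jobs(conn, table, rows):
    """Append rows to an existing job table, continuing the Seq numbering."""
    # Read the existing column names without dropping or altering the table.
    names = [d[0] for d in conn.execute(f'SELECT * FROM "{table}"').description]
    vcols = [c for c in names if c.startswith("V") and c[1:].isdigit()]
    for row in rows:
        if len(row) != len(vcols):
            raise ValueError(
                f"row has {len(row)} values but table has {len(vcols)} V columns"
            )
    # Continue from the highest existing sequence number (0 if table is empty).
    max_seq = conn.execute(
        f'SELECT COALESCE(MAX(Seq), 0) FROM "{table}"'
    ).fetchone()[0]
    marks = ", ".join("?" for _ in range(2 + len(vcols)))
    sql = f'INSERT INTO "{table}" (Seq, Exitval, {", ".join(vcols)}) VALUES ({marks})'
    for offset, row in enumerate(rows, start=1):
        conn.execute(sql, (max_seq + offset, -1000, *row))
    return max_seq + len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (Seq INTEGER PRIMARY KEY, Exitval INTEGER, V1 TEXT)")
conn.executemany("INSERT INTO jobs VALUES (?, -1000, ?)", [(1, "a"), (2, "b"), (3, "c")])

last_seq = append_jobs(conn, "jobs", [("d",), ("e",)])
seqs = [r[0] for r in conn.execute("SELECT Seq FROM jobs ORDER BY Seq")]
print(last_seq, seqs)
```

Passing a row with the wrong number of values raises ValueError before anything is inserted, so a mismatched append leaves the table untouched.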

> A sample use case is executing a bunch of generated bash scripts (per above,
> this is how the parallel backups are handled), so the V1 column is the
> absolute path to the script and the command is simply "bash".

I can definitely see it being useful, and I believe the easy changes
would accommodate this:

# Append backup-script* to the queue in $DBURL. Exit immediately (no --wait).
parallel --sqlmaster +$DBURL bash ::: backup-script*
# or even:
chmod 755 backup-script*
parallel --sqlmaster +$DBURL ::: backup-script*

# Do the work by grabbing arguments from $DBURL
parallel --sqlworker $DBURL bash
# or even:
PATH=.:$PATH
parallel --sqlworker $DBURL
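
For readers who want to picture the worker side of such a queue, here is a rough single-worker sketch in Python with sqlite3. It is not how GNU Parallel is implemented: `run_pending` and the injected `run` callable are hypothetical, -1000 is taken to mean "not run yet" as in the tables earlier in this thread, and claiming rows safely across several concurrent workers is deliberately left out.

```python
import sqlite3

PENDING = -1000  # Exitval of a row that no worker has run yet (assumption)


def run_pending(conn, table, run):
    """Run every pending row in Seq order and store its exit value."""
    done = 0
    while True:
        row = conn.execute(
            f'SELECT Seq, V1 FROM "{table}" WHERE Exitval = ? ORDER BY Seq LIMIT 1',
            (PENDING,),
        ).fetchone()
        if row is None:
            return done          # queue is empty
        seq, script = row
        exitval = run(script)    # e.g. a wrapper that runs "bash <script>"
        conn.execute(
            f'UPDATE "{table}" SET Exitval = ? WHERE Seq = ?', (exitval, seq)
        )
        done += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (Seq INTEGER PRIMARY KEY, Exitval INTEGER, V1 TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [(1, PENDING, "/tmp/backup-script1"), (2, PENDING, "/tmp/backup-script2")],
)

# Stand-in for actually executing the scripts: the second one "fails" with 3.
fake_results = {"/tmp/backup-script1": 0, "/tmp/backup-script2": 3}
count = run_pending(conn, "jobs", lambda script: fake_results[script])
exitvals = [r[0] for r in conn.execute("SELECT Exitval FROM jobs ORDER BY Seq")]
print(count, exitvals)
```

The script paths are placeholders; in the use case above V1 would hold the absolute path of each generated backup script.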

/Ole

From:	Andy Loftus
Subject:	Re: sqlmaster "nowait" and "append" functionality?
Date:	Wed, 07 Dec 2016 05:46:10 +0000