
Re: Max Number of Records


From: Ole Tange
Subject: Re: Max Number of Records
Date: Sun, 11 Jun 2017 13:26:39 +0200

On Fri, Jun 9, 2017 at 5:47 PM, Ling, Stephen *
<Stephen.Ling@fda.hhs.gov> wrote:

> I am currently using the program to split a database that is
> 134,625,557,455 bytes in size. I’ve been trying to split the database into
> pieces of around 0.5g, 0.25g, and 0.125g. The program, however, has been
> unable to split the database completely evenly, and I am just wondering if
> there’s a certain limitation causing this.

rand | head -c 134625557455 > database
parallel -j3 -a database --pipepart --block 0.125g wc
:
488108 2767530 125000180
487866 2770031 125000013
487224 2762455 125000532
  1117    6383  296473 <--- this is the last incomplete block
489535 2766571 125000417
488926 2768247 125000247

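This is what we expect: GNU Parallel only places a split point at a record
boundary, which by default is a newline (\n), so block sizes land close to,
but not exactly on, 125000000 bytes.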
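
The idea of moving a split point forward to the next newline can be sketched in plain shell. This is only an illustration of the principle, not GNU Parallel's actual implementation; the sample data, the nominal offset of 8 bytes and the variable names are made up for the example.

```shell
# Hypothetical sample file with four newline-terminated records (26 bytes).
sample=$(mktemp)
printf 'alpha\nbravo\ncharlie\ndelta\n' > "$sample"

# Nominal split offset in bytes; it falls in the middle of "bravo".
nominal=8

# Count the bytes from the nominal offset up to and including the next newline.
extra=$(tail -c +"$((nominal + 1))" "$sample" | head -n 1 | wc -c)

# The real split point is pushed forward to just after that newline.
split=$(( nominal + extra ))

echo "nominal offset: $nominal, actual split point: $split"
head -c "$split" "$sample"    # first block: only whole records

rm -f "$sample"
```

The first block ends up longer than the nominal size because the record that straddles the offset is kept whole, which is the same kind of small deviation seen in the wc byte counts above.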

If you want exactly 125000000 bytes, use --recend '':

parallel -j3 -a database --pipepart --recend '' --block 0.125g wc
488671 2770097 125000000
487496 2763312 125000000
489431 2767600 125000000
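
As a rough check of what exact splitting implies for this file, the arithmetic below assumes that with --recend '' every block is cut at an exact multiple of 125000000 bytes. The two sizes come from the thread; the variable names are only for illustration.

```shell
# Sizes taken from the thread above.
total=134625557455    # size of the database in bytes
block=125000000       # 0.125g, as reported by wc for each block

full=$(( total / block ))    # number of complete blocks
rest=$(( total % block ))    # bytes left over for the final block

echo "full blocks: $full"
echo "last block:  $rest bytes"
```

Under that assumption the file does not divide evenly, so one final block shorter than the rest is still expected.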

/Ole


