How does gawk allocate memory for arrays?


From: Ed Morton
Subject: How does gawk allocate memory for arrays?
Date: Mon, 30 May 2022 08:54:23 -0500
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1

A question came up recently that can be reduced to comparing the line numbers associated with unique key values in two files that are each over 10G lines long. Although the key values are unique, they only need to be compared within groups of 10 contiguous lines (in reality there are other things going on that are irrelevant to this discussion).

So we had:

 * Approach 1: while reading file1, create an array of 10G entries,
   then while reading file2 just do a hash lookup for each line.
 * Approach 2: while reading file1, for every 10 lines create an array
   that holds those 10 entries, then getline 10 entries from file2 into
   a second 10-entry array, loop through all the values in the file1
   array comparing them to the file2 array, then delete both arrays
   and start over (rough sketches of both approaches are below).
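
To make this concrete, here are minimal sketches of the two approaches (simplified, not the actual scripts; I'm assuming the key value is in $1 and, for the second one, that the name of file2 is passed in with -v file2=...):

    # Approach 1: gawk -f approach1.awk file1 file2
    NR == FNR  { seen[$1] = FNR; next }      # file1: key -> line number, ~10G entries
    $1 in seen { print $1, seen[$1], FNR }   # file2: one hash lookup per line

    # Approach 2: gawk -v file2=file2 -f approach2.awk file1
    {
        a[FNR % 10] = $1                     # collect a group of 10 keys from file1
        if (FNR % 10 == 0) {
            for (i = 1; i <= 10; i++)        # getline the matching 10 lines from file2
                if ((getline line < file2) > 0) {
                    split(line, f)
                    b[i % 10] = f[1]
                }
            for (k in a)                     # compare the two small arrays
                if (a[k] == b[k])
                    print a[k]
            delete a                         # delete both arrays and start over
            delete b
        }
    }

(The second sketch ignores a final group of fewer than 10 lines, but it shows the shape of the fill/compare/delete cycle.)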

Obviously the second approach was going to use far less memory, but according to the OP it was also an order of magnitude faster than the first approach, so that got me wondering how gawk allocates memory for arrays. For example, is there a default size that gets allocated initially, with new chunks of memory allocated as needed, and if so, what is that size? I expect I could find and read the code, but I'm really hoping I don't have to do that because the design is documented somewhere, or someone can just tell me what it is.

    Ed.

