Re: [bug-gawk] memory issues


From: Andrew J. Schorr
Subject: Re: [bug-gawk] memory issues
Date: Thu, 29 Aug 2019 15:46:16 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Aug 29, 2019 at 01:47:37PM -0400, Andrew J. Schorr wrote:
> This is really tricky stuff. With my patch, the Node_dynregex re_exp
> node pointer is released immediately after the regexp search is done.
> Without my patch, the re_exp pointer is retained, and it's released
> the next time that a search is done. So in the interim, the code has
> to retain a copy. I'm not certain why in Finn's case, the nodes were
> never getting released. But it seems clear to me that releasing them
> proactively when we are done with them is a win, as it saves
> field.c:purge_record from having to copy the string and save the node.
> 
> To understand memory allocation issues better, I think the block allocation
> scheme gets in the way. So I came up with the attached patch for memory
> debugging.
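
To illustrate why a block allocator can hide leaks from valgrind, here is a minimal sketch of a pooled allocator with a compile-time switch that falls back to one malloc per object. This is not gawk's code and not necessarily what the attached patch does; node_t, get_node, put_node, NODES_PER_BLOCK, simulate_leak and the MEMDEBUG macro are all made-up names for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fixed-size object; stands in for whatever is being pooled. */
typedef struct node {
    struct node *next_free;   /* free-list link, only meaningful while free */
    int payload;
} node_t;

static long live_nodes = 0;   /* nodes handed out and not yet returned */

#ifdef MEMDEBUG
/* Debug build (assumed switch): one malloc per node, so valgrind can report
 * each node that is never freed, with the stack that allocated it. */
static node_t *get_node(void)
{
    node_t *n = malloc(sizeof(*n));
    if (n == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    live_nodes++;
    return n;
}

static void put_node(node_t *n)
{
    live_nodes--;
    free(n);
}
#else
#define NODES_PER_BLOCK 100

static node_t *free_list = NULL;

/* Pooled build: nodes are carved out of large blocks and recycled through a
 * free list. valgrind only sees the large blocks, so a node that is never
 * returned to the free list does not get a loss record of its own. */
static node_t *get_node(void)
{
    if (free_list == NULL) {
        node_t *block = malloc(NODES_PER_BLOCK * sizeof(*block));
        if (block == NULL) {
            perror("malloc");
            exit(EXIT_FAILURE);
        }
        for (int i = 0; i < NODES_PER_BLOCK; i++) {
            block[i].next_free = free_list;
            free_list = &block[i];
        }
    }
    node_t *n = free_list;
    free_list = n->next_free;
    live_nodes++;
    return n;
}

static void put_node(node_t *n)
{
    n->next_free = free_list;
    free_list = n;
    live_nodes--;
}
#endif

/* Take two nodes, return one, and drop the only pointer to the other.
 * Returns the number of nodes still outstanding afterwards. */
static long simulate_leak(void)
{
    node_t *kept = get_node();
    node_t *leaked = get_node();

    kept->payload = 1;
    leaked->payload = 2;
    put_node(kept);
    leaked = NULL;            /* the second node is now unreachable */
    (void)leaked;
    return live_nodes;
}
```

Calling simulate_leak() from a main() and running the result under valgrind --leak-check=full should show the difference: built with -DMEMDEBUG the dropped node is an allocation of its own that valgrind can flag, while in the pooled build it is buried inside a block that is still referenced through the free list.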

I have a much simpler reproducer for Finn's memory leak. When I run:

valgrind --leak-check=full --log-file=vg.log ./gawk -f test2.awk -v mibfile=10.220.33.18_153125.log.gz -vsqlrefmapfile=sqlRefMap.txt

I see the following (this is master branch gawk with the memdebug patch):

...
==20735== 502 (448 direct, 54 indirect) bytes in 4 blocks are definitely lost in loss record 360 of 406
==20735==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==20735==    by 0x45A602: emalloc_real (awk.h:1989)
==20735==    by 0x45A602: r_getblock (node.c:1043)
==20735==    by 0x443C31: grow_fields_arr (field.c:122)
==20735==    by 0x443D7A: set_field (field.c:141)
==20735==    by 0x443243: def_parse_field (field.c:585)
==20735==    by 0x444C8B: get_field (field.c:906)
==20735==    by 0x438DA4: r_get_field (eval.c:1212)
==20735==    by 0x43A042: r_interpret (interpret.h:376)
==20735==    by 0x407ED5: main (main.c:522)
==20735== 
==20735== 2,611 (2,352 direct, 259 indirect) bytes in 21 blocks are definitely lost in loss record 389 of 406
==20735==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==20735==    by 0x45A602: emalloc_real (awk.h:1989)
==20735==    by 0x45A602: r_getblock (node.c:1043)
==20735==    by 0x443DE6: purge_record (field.c:367)
==20735==    by 0x44595B: set_record (field.c:269)
==20735==    by 0x451928: do_getline_redir (io.c:2843)
==20735==    by 0x43BB62: r_interpret (interpret.h:1224)
==20735==    by 0x407ED5: main (main.c:522)
...
==20735== LEAK SUMMARY:
==20735==    definitely lost: 2,912 bytes in 26 blocks
==20735==    indirectly lost: 313 bytes in 25 blocks
==20735==      possibly lost: 12,544 bytes in 8 blocks
==20735==    still reachable: 144,265 bytes in 1,554 blocks
==20735==         suppressed: 0 bytes in 0 blocks

Note that the gsub line is required to cause this leak, and possibly the
nested getlines as well.

A simpler test case may be possible, but this at least runs quickly and gives
a better idea of what's going on.

I still think my patch is the right fix, but I haven't yet had time to
dig into why the field copying patch also fixes the leak.

Regards,
Andy

Attachment: test2.awk
Description: Text document

Attachment: memdebug.patch
Description: Text document

