Re: [libmicrohttpd] Odd Stop Behavior

On Thu, Jun 26, 2014 at 3:07 PM, Christian Grothoff <address@hidden> wrote:

Found it, very simple off-by-one (character, not index):

address@hidden:~/svn/libmicrohttpd/src/microhttpd$ svn diff
Index: daemon.c
===================================================================
--- daemon.c (revision 33841)
+++ daemon.c (working copy)
@@ -3996,7 +3996,7 @@
if ( (0 != (daemon->options & MHD_USE_THREAD_PER_CONNECTION)) &&
(MHD_YES != MHD_mutex_lock_ (&daemon->cleanup_connection_mutex)) )
MHD_PANIC ("Failed to acquire cleanup mutex\n");
- for (pos = daemon->connections_head; NULL != pos; pos = pos->nextX)
+ for (pos = daemon->connections_head; NULL != pos; pos = pos->next)
shutdown (pos->socket_fd,
(pos->read_closed == MHD_YES) ? SHUT_WR : SHUT_RDWR);
if ( (0 != (daemon->options & MHD_USE_THREAD_PER_CONNECTION)) &&

'nextX' is the linked list with connections waiting on timeout,
'next' is the linked list with all connections; by going for 'nextX'
we fail to call shutdown on 'some' FDs, and thus the threads may
not terminate unless the client terminates them. In your case, the
browser keeps the connections open for 120s.

Fixed in SVN 33874.

Thanks for reporting & the testcase!

-Christian

On 06/26/2014 07:22 PM, Kenneth Mastro wrote:
> Hi Christian et all,
>
> It took me a few weeks to get back to this, but I managed to write up a
> test case today to demonstrate this 'delayed shutdown' problem. I sent
> Christian a PM with the test case rather then sending the entire list a
> tar.gz file (if it would even let me).
>
> The problem only seems to occur when using the 'thread per connection'
> threading option. It also has nothing to do with 'POST' processing or a
> 'request completed' callback - my test case doesn't have either of those
> things. Just simple 'GET's of images - not even any query string parsing.
> Otherwise, I have nothing to offer except that it might be more likely to
> occur under heavier loads. Having the browser request 30 images (specified
> in a single HTML file) seems to get it to happen every time on my dev
> machine.
>
> Oh - and I also updated to version 0.9.37, which made no difference (I was
> using 0.9.35 when I saw the behavior). I'm running Ubuntu 12.04 LTS.
>
> Hope this helps.
>
> Christian - if you didn't get my email, please let me know. I sent it just
> prior to sending this list email out.
>
>
> Thanks again,
> Ken
>
>
>
>
> On Sat, Jun 7, 2014 at 11:05 AM, Christian Grothoff <address@hidden>
> wrote:
>
>> On 06/07/2014 04:30 PM, Kenneth Mastro wrote:
>>> Hi Christian,
>>>
>>> Thanks for the quick reply.
>>>
>>> And erk! Sorry about the OS thing. I'm on Linux. My development
>>> environment is a stock install of Ubuntu 12.04 LTS VM on a Windows 7 PC.
>>> (I doubt the VM thing matters at all, but figured I'd mention it just in
>>> case.)
>>
>> I doubt it as well.
>>
>>> Here's my MHD_start_daemon code, nothing earth shattering ( I assume
>> that's
>>> where the per-connection timeout would be set):
>>> -----------------------
>>> myDaemon = MHD_start_daemon(MHD_USE_THREAD_PER_CONNECTION, // one thread
>>> per connection
>>> WEB_TEST_PORT, // port to
>>> listen on
>>> nullptr, // accept
>>> policy callback
>>> nullptr, // extra
>> arg
>>> to accept policy callback
>>> &connectionCallbackC, // main
>>> 'connection' callback
>>> this, // extra
>>> argument to the 'connection' callback
>>> MHD_OPTION_NOTIFY_COMPLETED, //
>> specifies
>>> that the next arg is the callback to call when the connection is done
>>> &requestCompletedCallbackC, // the
>>> callback to call when the connection is done
>>> this, // extra
>> arg
>>> to the callback when the connection is done
>>> MHD_OPTION_CONNECTION_LIMIT, //
>> specifies
>>> that the next arg is the max number of simultaneous connections
>>> (unsigned int)20, // the
>>> number of permitted simultaneous connections
>>> MHD_OPTION_END); // no more
>>> options in the arg list
>>> ----------------------
>>>
>>> So - no timeout changes for MHD. I haven't changed the default timeout
>> for
>>> the kernel, so I'm guessing that 115 (awfully close to 120, i.e., 2
>>> minutes) is something else? I agree completely that it's likely some
>> kind
>>> of timeout, though.
>>
>> You might want to check /proc/sys/net/ipv4/tcp_keepalive_time, just a
>> random guess, though.
>>
>>> I don't change any thread timeouts at all (although I have some other
>>> threads in my application that I set to real-time priority for some audio
>>> processing, but I can't imagine how that would adversely affect MHD).
>>>
>>> If I didn't mention it before, I did trace this down to the 'join' of the
>>> thread in 'close_all_connections'. I assume 'MHD_join_thread_' is
>> actually
>>> pthread_join since I'm on Linux. My gut was telling me that somehow the
>>> thread isn't being told to stop as part of the shutdown process and is
>>> eventually just timing out somewhere inside MHD (or via TCP as you
>>> suggested).
>>
>> Right, but the interesting question is not where MHD_stop_daemon hangs,
>> but what that other thread is doing at the time
>> (gdb threads, thread #NUM, ba; alternatively, strace -f BINARY to see
>> what system call returns after 120s might also help a bit...).
>>
>> netstat -ntpl to see if some connection somehow is still open might also
>> be useful (alternatively, lsof -p).
>>
>>> Before I noticed this problem, I was running with
>>> 'MHD_USE_SELECT_INTERNALLY' with a thread-pool, but you had suggested in
>> a
>>> previous post (a few weeks ago) that that won't work properly with
>>> 'comet-like' requests unless I do the suspend/resume functionality.
>> That's
>>> perfectly fine and good, but it seems worth mentioning that I didn't
>> notice
>>> this problem when using that thread mode. From looking at the
>>> 'close_all_connections' code in daemon.c, I can see why the behavior
>> could
>>> be different.
>>>
>>>
>>> Anyway - I'll play around with the HAVE_LISTEN_SHUTDOWN option, see if
>> that
>>> makes any difference. Failing that, I'll see if I can create a test
>> case.
>>> That make take a couple days, though - lots of code to strip out to keep
>> it
>>> simple. What should I do once I get a concise test case ready to go?
>> Send
>>> the code to the mailing list?
>>
>> List is fine (if it's not megabytes ;-)), PM also. Whichever you're
>> comfortable with.
>>
>> Happy hacking!
>>
>> Christian
>>
>>
>

From:	Kenneth Mastro
Subject:	Re: [libmicrohttpd] Odd Stop Behavior
Date:	Fri, 27 Jun 2014 07:47:02 -0400