bug-gnu-emacs
bug#46342: 28.0.50; socks-send-command munges IP address bytes to UTF-8


From: J.P.
Subject: bug#46342: 28.0.50; socks-send-command munges IP address bytes to UTF-8
Date: Wed, 10 Feb 2021 05:16:58 -0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Eli Zaretskii <eliz@gnu.org> writes:

> what kind of string can this ADDRESS be? My reading of RFC 1928 is
> that it normally is an IP address, in which case encoding is not
> relevant, as it's an ASCII string. But it can also be a domain, right?

This patch only affects IP addresses, but I'm happy to look into the
domain name form as well.
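
For anyone skimming, here is a rough illustration of the kind of munging the subject line refers to. It's a Python sketch, not the Emacs code, and the address 192.168.1.1 is just an example I picked because it has octets above 0x7f; the latin-1 step is my assumption about how octets end up being treated as characters.

```python
import socket

# Example address (an assumption for illustration); any octet >= 0x80 shows the effect.
raw = socket.inet_aton("192.168.1.1")  # the 4 raw octets of the IPv4 address

# Treat each octet as a character, then encode those characters as UTF-8.
# Octets >= 0x80 become two bytes each, so the address field grows.
munged = raw.decode("latin-1").encode("utf-8")

print(raw.hex(), len(raw))
print(munged.hex(), len(munged))
```

Octets below 0x80 pass through unchanged, so an address like 77.88.55.66 would not show the problem.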

> If so, what form can this domain take? If the domain has non-ASCII
> characters, shouldn't it be hex-encoded, or run through IDNA? I mean,
> are non-ASCII characters in that place at all allowed?

At first glance, both tor and ssh appear to call getaddrinfo() on the
remote end without accounting for the sender's locale or passing any
special IDN-related flags. But I'm still looking into these.

For now, if we're allowing anecdotal caveman logic, I'd wager the answer
is ASCII only. Here's why:

It seems feeding tor and ssh the hostname for Яндекс.рф (Yandex) as the
UTF-8 encoded byte string

  \xd0\xaf\xd0\xbd\xd0\xb4\xd0\xb5\xd0\xba\xd1\x81.\xd1\x80\xd1\x84

results in failure both when forwarding via CONNECT and when resolving
via tor's nonstandard RESOLVE command. (This is direct, no Emacs.)
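
If it helps, the byte string above can be reproduced with a couple of lines of Python (a sketch only; the variable names are mine):

```python
host = "Яндекс.рф"

# Plain UTF-8 encoding of the hostname, with no IDNA processing at all.
utf8 = host.encode("utf-8")

print(utf8)
print(len(host), "characters ->", len(utf8), "bytes")
```

Every Cyrillic letter takes two bytes; only the dot stays a single ASCII byte.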

However, passing the Punycode form "xn--d1acpjx3f.xn--p1ai" works as
intended, forwarding to (or, in the case of RESOLVE, producing) an IP
from a Yandex-registered A record (for me, 77.88.55.66).
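
For reference, here is a minimal Python sketch of what a SOCKS5 CONNECT request carrying a domain name looks like on the wire, going by the layout in RFC 1928 (VER, CMD, RSV, ATYP, then a length-prefixed name and a big-endian port). The function name is hypothetical, and the ASCII-only check just encodes my wager above; it is not something the RFC spells out.

```python
import struct

def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request with ATYP = 0x03 (domain name)."""
    # Assumption: only ASCII names are accepted, so encode strictly.
    # Non-ASCII input raises UnicodeEncodeError here.
    name = host.encode("ascii")
    if not 1 <= len(name) <= 255:
        raise ValueError("domain name must be 1..255 bytes")
    header = bytes([0x05, 0x01, 0x00, 0x03, len(name)])  # VER CMD RSV ATYP LEN
    return header + name + struct.pack("!H", port)       # port in network order

req = socks5_connect_request("xn--d1acpjx3f.xn--p1ai", 80)
print(req.hex())
```

The single length octet is why the name is capped at 255 bytes.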

To try this at home (on separate ttys):

  $ ssh -TND 4711 my.sshd
  # tcpdump -i lo -nnX "port 4711"
  $ curl --verbose --proxy socks5h://localhost:4711 Яндекс.рф

Here's a trace for curl's actual call to the hostname conversion
function idn2_lookup_ul() [1], which is provided by GNU libidn2 [2].
It's hard to see without context, but this happens before any connection
is established (tcpdump will confirm this).

#0  Curl_idnconvert_hostname at lib/url.c:1566
#1  create_conn at lib/url.c:3583
#2  Curl_connect at lib/url.c:4027
#3  multi_runsingle at lib/multi.c:1671
#4  curl_multi_perform at lib/multi.c:2412
#5  easy_transfer at lib/easy.c:606
#6  easy_perform at lib/easy.c:696
#7  curl_easy_perform at lib/easy.c:715
#8  serial_transfers at src/tool_operate.c:2327
#9  run_all_transfers at src/tool_operate.c:2505
#10 operate at src/tool_operate.c:2621
#11 main at src/tool_main.c:277

On my machine, curl was configured to pass these flags to idn2_lookup_ul [3]:

  /* IDN2_NFC_INPUT: Normalize input string using normalization form C.
     IDN2_NONTRANSITIONAL: Perform Unicode TR46 non-transitional
     processing. */
  int flags = IDN2_NFC_INPUT | IDN2_NONTRANSITIONAL;

Apparently there are two IDNA standards: 2003 and 2008 [4]. Curl uses
the latter, but I'm not sure which, if any, puny.el favors. In the case
of Yandex,

  (puny-encode-domain "Яндекс.рф")

produces "xn--d1acpjx9e.xn--p1ai", which tor and ssh both reject (though
it's very possible I'm missing something). Anyway, passing the libidn2
version from above ("xn--d1acpjx3f.xn--p1ai") to socks-send-command works
fine.
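
As one more data point, Python's built-in "idna" codec can be used for a quick cross-check. Note the caveat: that codec implements the older IDNA 2003 rules rather than the 2008 ones curl gets from libidn2, so this is only a sanity check for this one name, not a statement about either standard in general.

```python
# Name that worked with tor and ssh in the experiment above.
libidn2_form = b"xn--d1acpjx3f.xn--p1ai"

# Python's stdlib codec: IDNA 2003 (nameprep, then Punycode per label).
encoded = "Яндекс.рф".encode("idna")

print(encoded)
print(encoded == libidn2_form)
```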

[1] https://github.com/curl/curl/blob/ec5d9b44a2e837fc7b82d1c60d5fae3f851620dc/lib/url.c#L1559
[2] https://www.gnu.org/software/libidn/libidn2/reference/libidn2-idn2.html#idn2-lookup-ul
[3] https://www.gnu.org/software/libidn/libidn2/reference/libidn2-idn2.html#idn2-flags
[4] https://www.unicode.org/reports/tr46/#Table_Example_Processing




