bug-coreutils

bug#38299: A bug while trying to decode a non encode base64


From: Erik Auerswald
Subject: bug#38299: A bug while trying to decode a non encode base64
Date: Thu, 21 Nov 2019 09:12:39 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

Hi,

On Thu, Nov 21, 2019 at 12:04:11PM +0530, vardhaman narasagoudar wrote:
> On Thu, Nov 21, 2019 at 12:51 AM Paul Eggert <address@hidden> wrote:
> > On 11/20/19 6:22 AM, Martin Schulte wrote:
> > > vardhamanbn1 is a valid encoding
> >
> > Thanks for explaining; closing the bug report.
> 
> Thanks for replying the query, but if I check online (
> https://www.base64decode.org/) for decoding  the same in online .
> 
> I get  an error  message (which is valid) e.g:-
> 
> 1) if I try to decode "777799"  I get an error message
> 
> "No printable characters found, try another source charset, or upload your
> data as a file for binary decoding."

The error message says that the decoded data is not printable.  It does
not say anything about invalid input data, although the input data is
not correctly Base64 encoded (its length, 6, is not a multiple of 4).

> Similarly we got return code as 1 "invalid input" in the terminal.
> 
> 2) Now if I try to decode "vardhamanbn1" I get the error message  (any 12
> characters or multiple of 12 characters which is a non-encoded value, if
> try to decode)
> "No printable characters found, try another source charset, or upload your
> data as a file for binary decoding."

You get the same error message about the decoded data.  This is correct.
The site even tells you that the interface you use does not support
binary, i.e., non-printable data.

> But when we try the same in terminal , we get the return code as 0 the
> symbol as per inputs given
>  "UTF-8 and thus leads to �."
> 
> Now as a work around we are using

That is not a workaround, but the necessary check for valid output data
for your application, since you seem to require a Base64 encoding of
UTF-8 data.

> a) [vardhaman@oc6085028360 ~]$ echo -n "vardhamanbn1" | base64 -d | iconv
> -f utf8
> iconv: illegal input sequence at position 0

Base64 can encode any binary data, not just valid UTF-8 text.
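To illustrate the two separate checks, here is a minimal Python sketch.
It assumes the application wants Base64 of UTF-8 text; the function
name decode_base64_utf8 is hypothetical, and the code uses Python's
standard base64 module rather than the coreutils base64 tool:

```python
import base64
import binascii


def decode_base64_utf8(text):
    """Return the decoded text, or None if either check fails.

    Hypothetical helper: check 1 is "valid Base64", check 2 is
    "decoded bytes are valid UTF-8". These are independent checks.
    """
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        raw = base64.b64decode(text, validate=True)
    except binascii.Error:
        return None  # not valid Base64 (e.g. wrong length)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # valid Base64, but the bytes are not UTF-8


print(decode_base64_utf8("aGVsbG8="))      # Base64 of UTF-8 text
print(decode_base64_utf8("vardhamanbn1"))  # valid Base64, not UTF-8
print(decode_base64_utf8("777799"))        # not valid Base64
```

Only the first call returns text; the other two return None, each for
a different reason.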

> also we tried on another sample
> 
> b) [vardhaman@oc6085028360 ~]$ echo  -n '777799' | base64 -d | iconv -f utf8
> base64: invalid input
> iconv: illegal input sequence at position 0
> 
> without using "iconv -f utf8"
> 
> [vardhaman@oc6085028360 ~]$  echo  -n '777799' | base64 -d
> ����base64: invalid input
> 
> 
> So we feel its something still with 12 & multiple of 12 characters leading
> to the issue, when we try to decode a non-decode value.

The magic number is actually 4, not 12: each symbol in a Base64 encoded
string represents 6 bits, so 4 symbols give you 24 bits, i.e., 3 bytes
of decoded data.  Any combination of Base64 symbols that forms a string
of a length divisible by 4 is a valid Base64 encoding.  This does not
give any guarantee about the decoded data.
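The arithmetic can be sketched in Python as follows.  This is only an
illustration of the 4-symbols-to-3-bytes mapping for the standard
alphabet, not how coreutils implements it; ALPHABET and decode_quad
are names made up for this example:

```python
import base64

# Standard Base64 alphabet: each symbol's index is its 6-bit value.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def decode_quad(quad):
    """Decode exactly 4 unpadded Base64 symbols into 3 bytes."""
    if len(quad) != 4:
        raise ValueError("need exactly 4 symbols")
    bits = 0
    for symbol in quad:
        # Append this symbol's 6 bits; 4 symbols make 24 bits.
        bits = (bits << 6) | ALPHABET.index(symbol)
    return bits.to_bytes(3, "big")  # 24 bits = 3 bytes


# The first 4 symbols of the example input decode to 3 arbitrary bytes.
print(decode_quad("vard").hex())
# The library decoder agrees, and 12 symbols yield 9 bytes.
print(decode_quad("vard") == base64.b64decode("vard"))
print(len(base64.b64decode("vardhamanbn1")))
```

The first decoded byte of "vard" is 0xbd, which cannot start a UTF-8
sequence, consistent with iconv reporting an illegal input sequence at
position 0.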

> Or should we think characters of multiple of 12 will be treated as a base64
> format

Yes. Actually, any multiple of 4 characters.

>  e.g when I tried decoding 24 non-encode character:-
>  [vardhaman@oc6085028360 ~]$ echo -n 'vardhamanbn1vardhamanbn1' | base64
> --decode
> ��݅�������݅�����[vardhaman@oc6085028360 ~]$ echo $?
> 0

Thanks,
Erik
-- 
The laws of mathematics are very commendable, but the only law that
applies in Australia is the law of Australia.
                        -- Australian Prime Minister Malcolm Turnbull




