Re: [Bug-apl] Question about behavior of ⍋


From: Elias Mårtenson
Subject: Re: [Bug-apl] Question about behavior of ⍋
Date: Tue, 8 Jul 2014 12:40:16 +0800

To clarify, I tried the following:

      (⎕UCS¨⍳1114111) ⍋ 'foo' 'bar' 'test'
DOMAIN ERROR
      (⎕UCS¨⍳1114111)⍋'foo' 'bar' 'test'
      ^              ^

Note of course that this is pretty insane, and there should be an easier way to do this.

Regards,
Elias


On 8 July 2014 12:38, Elias Mårtenson <address@hidden> wrote:
Right, but just having a "plain" collating order for Unicode would require me to pass a million-element array (⎕UCS¨⍳1114111) as left argument to grade.

That said, I can't even get dyadic grade to work at all, but that's a separate issue.

Regards,
Elias


On 8 July 2014 12:27, David B. Lamkins <address@hidden> wrote:
The problem with generating a permutation vector for an "arbitrary"
Unicode string is still a problem of collating order. There is no
inherent order in Unicode; someone has to decide on what makes sense as
a collating order for the subset of code points used by the application.

You should use ⎕ucs with a vector of code points to define your own
collating order for Unicode; any code points not explicitly specified in
the collating order will sort to the end.

For example (and this is an easy case) you can use this to specify a
default collating order (based upon ordinal value of the code points
themselves) for the first 256 code points (ASCII plus Latin-1):

⎕ucs ⎕io-⍨⍳256
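
As a rough illustration of the idea described above (an explicit collating sequence, with unlisted characters sorting to the end), here is a minimal Python sketch. It is not GNU APL's implementation of dyadic grade. The name grade_up and its parameters are hypothetical, and the model is simplified: it compares strings character by character and ignores the blank padding and multi-dimensional collating arrays that APL's dyadic grade supports.

```python
def grade_up(collation, items, index_origin=1):
    """Return the permutation that sorts items under a collating sequence."""
    # A character's rank is its first position in the collating sequence.
    rank = {}
    for position, ch in enumerate(collation):
        rank.setdefault(ch, position)
    # Characters missing from the sequence share one rank after all listed ones.
    unlisted = len(collation)

    def key(index):
        return [rank.get(ch, unlisted) for ch in items[index]]

    # sorted() is stable, so items with equal keys keep their original order.
    order = sorted(range(len(items)), key=key)
    return [i + index_origin for i in order]


# Code point order for the first 256 code points, analogous to the
# ⎕ucs expression above.
collation = "".join(chr(c) for c in range(256))
print(grade_up(collation, ["foo", "bar", "test"]))  # [2, 1, 3]
```

With a shorter collating sequence such as "ba", the same function ranks "b" before "a" and places any other character after both, which is the "sort to the end" behavior mentioned above.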



On Tue, 2014-07-08 at 12:09 +0800, Elias Mårtenson wrote:
> Dyadic grade doesn't make much sense in the context of Unicode though.
> How do you grade an arbitrary Unicode string?
>
>
> That issue is there even if we completely disregard all the
> other Unicode-related collating issues.
>
>
> Regards,
> Elias
>
>
> On 8 July 2014 12:00, David B. Lamkins <address@hidden> wrote:
>         Check my follow-up post.
>
>         I'm fairly certain that the issue is whether monadic grade applied
>         to a list of strings should do anything but signal a domain error.
>         The ISO spec says that monadic grade is defined only on numeric
>         arguments.
>
>         My test case appears to have monadic grade treating strings as if
>         they encode numbers in a sufficiently large base.
>
>         If you want to sort strings, use dyadic grade. The left argument
>         specifies a collating sequence.
>
>         On Tue, 2014-07-08 at 11:43 +0800, Elias Mårtenson wrote:
>         > Ordering by size first makes very little sense to me. It makes it
>         > very hard to sort any list of strings.
>         >
>         >
>         > I was hoping that the following would have done so, but it also
>         > suffers from the "length first" issue:
>         >
>         >
>         >       z[⍋ ⎕UCS¨ z←'aa' 'xx' 'aaa' 'xxx']
>         >  aa xx aaa xxx
>         >
>         >
>         > What is the proper way to sort strings given the existing
>         > semantics of grade?
>         >
>         >
>         > Regards,
>         > Elias
>         >
>         >
>         > On 8 July 2014 02:34, David Lamkins <address@hidden> wrote:
>         >         Looking at the spec, it seems that monadic grade is defined
>         >         only for numeric data.
>         >
>         >
>         >         That leaves open the question of whether my example should
>         >         have signaled a domain error.
>         >
>         >
>         >
>         >         On Mon, Jul 7, 2014 at 11:25 AM, David Lamkins <address@hidden> wrote:
>         >                 Given a list of character vectors (and scalars), grade
>         >                 appears to generate the permutation vector first by
>         >                 length then by content.
>         >
>         >                       ⍋'aaa' 'xx' 'y' 'bbb' 'cc'
>         >                 3 5 2 1 4
>         >
>         >
>         >                 This seems counterintuitive. It seems as if ⍋ treats
>         >                 character strings like numbers. Is this a bug?
>         >
>         >                 --
>         >                 "The secret to creativity is knowing how to hide your
>         >                 sources."
>         >                    Albert Einstein
>         >
>         >
>         >                 http://soundcloud.com/davidlamkins
>         >                 http://reverbnation.com/lamkins
>         >                 http://reverbnation.com/lcw
>         >                 http://lamkins-guitar.com/
>         >                 http://lamkins.net/
>         >                 http://successful-lisp.com/
>         >
>         >
>         >
>
>
>
>
>
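
The length-first ordering reported for monadic ⍋ in the quoted messages can be modeled with a short Python sketch. This only reproduces the two results shown in the thread; it makes no claim about how GNU APL actually implements grade, and the name grade_up_length_first is hypothetical.

```python
def grade_up_length_first(items, index_origin=1):
    """Order items by length first, then by code points within equal lengths."""
    def key(index):
        item = items[index]
        return (len(item), [ord(ch) for ch in item])

    # sorted() is stable, so identical items keep their original order.
    order = sorted(range(len(items)), key=key)
    return [i + index_origin for i in order]


# The first test case from the thread.
print(grade_up_length_first(["aaa", "xx", "y", "bbb", "cc"]))  # [3, 5, 2, 1, 4]
```

Applied to 'aa' 'xx' 'aaa' 'xxx', the same model leaves the list in its original order, which matches the second result quoted in the thread.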




