Re: [Bug-apl] Question about behavior of ⍋
From: David B. Lamkins
Subject: Re: [Bug-apl] Question about behavior of ⍋
Date: Mon, 07 Jul 2014 21:46:30 -0700
On Tue, 2014-07-08 at 12:38 +0800, Elias Mårtenson wrote:
> Right, but just having a "plain" collating order for Unicode would
> require me to pass a million-element array (⎕UCS¨⍳1114111) as left
> argument to grade.
>
I guess you could do that if you needed to impose a complete collating
order upon every code point. Most applications would be content, I
think, with sorting alphanumerics (in all the languages of interest)
plus common punctuation.
>
> That said, I can't even get dyadic grade to work at all, but that's a
> separate issue.
>
Here's a working example.
∇z←suffix CF⍙ls path;dir
⍝ Return a character matrix of directory entries. Left argument,
⍝ if present, filters entries by suffix.
z←0 0⍴''
dir←CF¯FILEIO[28] path
dir←(⍳↑⍴dir) ⎕io⌷dir
⍎(0≠⎕nc 'suffix')/'dir←((⊂,suffix)≡¨(-⍴,suffix)↑¨dir)/dir'
→(0=⍴dir)/0
dir←⊃dir
z←dir[(⎕ucs ⎕io-⍨⍳256)⍋dir;]
∇
CF¯FILEIO is the bound name of the lib_file_io native function.
On the last line, dir is a character matrix.
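As a further illustration, here is a minimal sketch of the same pattern
with a hand-built collating sequence instead of plain ⎕ucs order. The
names lower, upper, cs and words are made up for this example, and the
ordering (blank, digits, letters with lowercase ahead of uppercase, a
little punctuation) is only one arbitrary choice.

```apl
⍝ Hypothetical collating sequence; adjust to the characters your
⍝ application actually uses.
lower←'abcdefghijklmnopqrstuvwxyz'
upper←'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
⍝ Interleave the two alphabets (aAbBcC...) independently of ⎕io.
cs←' 0123456789',(,⍉2 26⍴lower,upper),'.,;:!?-'
⍝ Dyadic grade wants a simple character array, so disclose the
⍝ nested list into a blank-padded matrix first.
words←⊃'pear' 'Fig' 'apple' 'fig' 'figs'
⍝ Reorder the rows by the permutation vector from dyadic grade.
words[cs⍋words;]
```

Blank comes first in cs so that the padding added by ⊃ lets a shorter
word sort ahead of a longer word with the same prefix.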
>
> Regards,
> Elias
>
>
> On 8 July 2014 12:27, David B. Lamkins <address@hidden> wrote:
> The problem with generating a permutation vector for an "arbitrary"
> Unicode string is still a problem of collating order. There is no
> inherent order in Unicode; someone has to decide on what makes sense
> as a collating order for the subset of code points used by the
> application.
>
> You should use ⎕ucs with a vector of code points to define your own
> collating order for Unicode; any code points not explicitly specified
> in the collating order will sort to the end.
>
> For example (and this is an easy case) you can use this to specify a
> default collating order (based upon ordinal value of the code points
> themselves) for the first 256 code points (the 8-bit Latin-1 range):
>
> ⎕ucs ⎕io-⍨⍳256
>
>
>
> On Tue, 2014-07-08 at 12:09 +0800, Elias Mårtenson wrote:
> > Dyadic grade doesn't make much sense in the context of Unicode
> > though. How do you grade an arbitrary Unicode string?
> >
> >
> > That issue is there even if we completely disregard all the
> > other Unicode-related collating issues.
> >
> >
> > Regards,
> > Elias
> >
> >
> > On 8 July 2014 12:00, David B. Lamkins <address@hidden> wrote:
> > Check my follow-up post.
> >
> > I'm fairly certain that the issue is whether monadic grade applied
> > to a list of strings should do anything but signal a domain error.
> > The ISO spec says that monadic grade is defined only on numeric
> > arguments.
> >
> > My test case appears to have monadic grade treating strings as if
> > they encode numbers in a sufficiently large base.
> >
> > If you want to sort strings, use dyadic grade. The left argument
> > specifies a collating sequence.
> >
> > On Tue, 2014-07-08 at 11:43 +0800, Elias Mårtenson wrote:
> > > Ordering by size first makes very little sense to me. It makes it
> > > very hard to sort any list of strings.
> > >
> > > I was hoping that the following would have done so, but it also
> > > suffers from the "length first" issue:
> > >
> > > z[⍋ ⎕UCS¨ z←'aa' 'xx' 'aaa' 'xxx']
> > > aa xx aaa xxx
> > >
> > > What is the proper way to sort strings given the existing
> > > semantics of grade?
> > >
> > >
> > > Regards,
> > > Elias
> > >
> > >
> > > On 8 July 2014 02:34, David Lamkins <address@hidden> wrote:
> > > Looking at the spec, it seems that monadic grade is defined only
> > > for numeric data.
> > >
> > > That leaves open the question of whether my example should have
> > > signaled a domain error.
> > >
> > >
> > >
> > > On Mon, Jul 7, 2014 at 11:25 AM, David Lamkins <address@hidden> wrote:
> > > Given a list of character vectors (and scalars), grade appears to
> > > generate the permutation vector first by length then by content.
> > >
> > > ⍋'aaa' 'xx' 'y' 'bbb' 'cc'
> > > 3 5 2 1 4
> > >
> > > This seems counterintuitive. It seems as if ⍋ treats character
> > > strings like numbers. Is this a bug?
> > >
> > > --
> > > "The secret to creativity is knowing how to hide your sources."
> > > Albert Einstein
> > >
> > >
> > > http://soundcloud.com/davidlamkins
> > > http://reverbnation.com/lamkins
> > > http://reverbnation.com/lcw
> > > http://lamkins-guitar.com/
> > > http://lamkins.net/
> > > http://successful-lisp.com/
> > >
> > >
> > >