bug-apl
Re: [Bug-apl] Regex support


From: Hans-Peter Sorge
Subject: Re: [Bug-apl] Regex support
Date: Fri, 29 Sep 2017 11:41:25 +0200

Hi Jürgen,

The construct  regex ⎕Regex string  looks OK to me.

However, given the following regex patterns

match:       'regexm' ['modifier'] ⎕Regex string  and
substitute:  'regexs' 'regexr'  ['modifier'] ⎕Regex string

the patterns
'regexm' 'modifier' ⎕Regex string and
'regexs' 'regexr'   ⎕Regex string
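are ambiguous: each has two character vectors to the left of ⎕Regex, so a match
with a modifier cannot be told apart from a substitution without one.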

Either
'm' 'regexm' ['modifier']  ⎕Regex string and
's' 'regexs' 'regexr'  ['modifier'] ⎕Regex string

or
'regexm' '' ⎕Regex string  and
'regexs' 'regexr'  '' ⎕Regex string
would resolve this syntactic ambiguity, but the typing is a bit tedious.
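
To make the first alternative concrete, here is a small sketch in Python rather
than APL. Python's re module only stands in for whatever regex library GNU APL
would use, and the name regex_call, the list-shaped left argument and the 'i'
modifier are my assumptions for illustration, not part of any proposal:

```python
import re

def regex_call(left, string):
    """Dispatch on an explicit leading tag: 'm' = match, 's' = substitute."""
    tag, *rest = left
    if tag == 'm':
        # rest is: pattern [modifier]
        pattern = rest[0]
        modifier = rest[1] if len(rest) > 1 else ''
        flags = re.IGNORECASE if 'i' in modifier else 0
        return 1 if re.search(pattern, string, flags) else 0
    if tag == 's':
        # rest is: pattern replacement [modifier]
        pattern, replacement = rest[0], rest[1]
        modifier = rest[2] if len(rest) > 2 else ''
        flags = re.IGNORECASE if 'i' in modifier else 0
        return re.sub(pattern, replacement, string, flags=flags)
    raise ValueError("unknown tag: %r" % (tag,))

# Without the tag, ('foo', 'i') could mean "match foo with modifier i"
# or "substitute foo by i"; with the tag the two calls are distinct.
match_result = regex_call(['m', 'foo', 'i'], 'xFOOy')
subst_result = regex_call(['s', 'foo', 'i'], 'xfooy')
```

The tag costs one extra item per call, which is the tedious typing mentioned
above.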


So I would rather have the regex argument take the form 'm/.../mod' or 's/..../..../mod'

which makes expressions like
(⊂'s/..../..../mod') ⎕Regex ¨ string string string
easier to read.

(⊂'m/..../mod') ⎕Regex ¨ string string string
should return 1 for a match and 0 for a non-match, to be used in a subsequent
scan.
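
A rough sketch of how such a command string could be taken apart and applied to
each string, again in Python with re as a stand-in. The helper names
parse_command and regex are hypothetical, only the 'i' modifier is handled, and
the split on unescaped '/' is deliberately simplistic:

```python
import re

def parse_command(cmd):
    """Split 'm/pat/mod' or 's/pat/repl/mod' on unescaped '/' (simplified)."""
    parts = re.split(r'(?<!\\)/', cmd)
    op = parts[0]
    if op == 'm' and len(parts) == 3:
        return op, parts[1], None, parts[2]
    if op == 's' and len(parts) == 4:
        return op, parts[1], parts[2], parts[3]
    raise ValueError("malformed regex command: %r" % cmd)

def regex(cmd, string):
    """Return 1/0 for a match command, the rewritten string for a substitute."""
    op, pattern, replacement, mod = parse_command(cmd)
    flags = re.IGNORECASE if 'i' in mod else 0
    if op == 'm':
        return 1 if re.search(pattern, string, flags) else 0
    return re.sub(pattern, replacement, string, flags=flags)

strings = ['xfooy', 'xzooy', 'FOO']
# Counterpart of (⊂'m/f../i') ⎕Regex ¨ strings: one boolean per string.
mask = [regex('m/f../i', s) for s in strings]
```

The resulting boolean vector is what a later compress or scan would consume.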

...... (⊂'m/..../mod') ⎕Regexi ¨ string string string
could return the indexes as vector of vectors using selective
specification:  (matching_index  non_matching_index) ← .......

....... (⊂'m/..../mod') ⎕Regexc ¨ string string string
should return the content as vector of vectors using selective
specification:
(matching_content  non_matching_content) ← .......
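
The two variants could behave roughly like this Python sketch. The function
names regex_indexes and regex_content are hypothetical, indexes are 1-based on
the assumption of ⎕IO←1, and re again only stands in for the real regex library:

```python
import re

def regex_indexes(pattern, strings):
    """Return (matching_index, non_matching_index), 1-based."""
    hit, miss = [], []
    for i, s in enumerate(strings, start=1):
        (hit if re.search(pattern, s) else miss).append(i)
    return hit, miss

def regex_content(pattern, strings):
    """Return (matching_content, non_matching_content)."""
    hit, miss = [], []
    for s in strings:
        (hit if re.search(pattern, s) else miss).append(s)
    return hit, miss

strings = ['xfooy', 'xzooy', 'fab']
matching_index, non_matching_index = regex_indexes('f..', strings)
matching_content, non_matching_content = regex_content('f..', strings)
```

Both return a two-item result, so the pair can be assigned in one step as in
the specification above.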

and further:
dates ← '2017-01-02' '2017-01-03'
(⊂'s/([0-9]+)-([0-9]+)-([0-9]+)/\1 \2 \3/') ⎕Regex ¨ dates
results in
('2017' '01' '02') ('2017' '01' '03')

and
dates ← ⊃ '2017-01-02' '2017-01-03'
's/([0-9]+)-([0-9]+)-([0-9]+)/\1 \2 \3/' ⎕Regex dates
results in
'2017' '01' '02'
'2017' '01' '03'
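
One possible reading of the dates example, sketched in Python: the substitution
yields a blank-separated string, and the blanks then become the boundaries of
the nested result. That splitting step is my assumption about how the string
would turn into separate items, and substitute_to_fields is a hypothetical name:

```python
import re

def substitute_to_fields(pattern, replacement, string):
    """Apply the substitution, then split the result on blanks."""
    return re.sub(pattern, replacement, string).split(' ')

dates = ['2017-01-02', '2017-01-03']
pattern = r'([0-9]+)-([0-9]+)-([0-9]+)'
# Counterpart of (⊂'s/.../\1 \2 \3/') ⎕Regex ¨ dates
result = [substitute_to_fields(pattern, r'\1 \2 \3', d) for d in dates]
```

The matrix form of dates would give the same fields, arranged one row per date.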


Maybe I would prefer ⎕Regex['i'] over ⎕Regexi, i.e. ⎕Regex['option' 'option'],
to handle the various alternatives for transforming regex results into APL values.

FWIW

Hans-Peter Sorge


On 22.09.2017 at 23:55, Peter Teeson wrote:
> Hi Jürgen:
> Thanks for your usual gracious reply. I understand the points you present.
> 
> Perhaps my perspective is too narrow? The way I see it the key “module” is 
> the interpreter of the language.
> IMHO display of the results, means to enter and store data of various types, 
> providing an environment where the interpreter executes
> are really separate, but necessary, components.
> 
> You mentioned that rationals need to be explicitly configured. Personally I 
> would prefer that approach rather than encrusting the interpreter.
> Each capability added to the interpreter just complicates it - of course not 
> for you as the author but for us lesser mortals.
> 
> As you may recall I am on a Macintosh. One project I pickup and work on from 
> time to time is to try and
> extract only the interpreter and then use the Mac OS facilities for the rest. 
> Of course that is only of use to other Mac users (if at all).
> Separating the interpreter from the rest allows for different “models” - 
> OS’s. 
> 
> What we have right now is a monolithic code base which becomes more fragile 
> with each added feature, version of GCC, or HW box
>  - desirable as that might be.
> 
> I suppose what I am suggesting is that perhaps it’s time to take a fresh look 
> at the project architecture and ask ourselves if we can improve.
> 
> FWIW
> 
> respect….
> 
> Peter
> 
>> On Sep 22, 2017, at 11:48 AM, Juergen Sauermann <address@hidden> wrote:
>> Hi Peter,
>>
>> I mostly agree with your concerns. As you may have noticed, I already 
>> regretted some of the things that I implemented earlier
>> in GNU APL. On the other hand, you also see on the GNU APL mailing list the 
>> proposals of other GNU APL users to implement
>> certain things. I haven't really found a way out of this dilemma.
>>
>> My current thinking is this:
>>
>> 1. If a feature affects the APL language itself then it is probably a bad 
>> thing to do. Examples for this are, IMHO, changing the scoping
>>     of variables, lexical binding and stuff like that. As useful as these 
>> may be in other languages, my feeling is that they would turn GNU
>>    APL into something else which is no longer APL. For example, I am a big 
>> fan of the powerful matching capabilities in Erlang but I
>>    believe as useful as they may be, they simply do not belong in GNU APL 
>> (or any APL for that matter). Those who really need that (as
>>    opposed to only believing it would improve GNU APL) might be better off 
>> with one of the successors of APL.
>>
>> 2. Some areas, most notably FILE I/O have traditionally not been part of the 
>> APL language itself, but are unfortunately needed in the
>>     real world. I am equally concerned about a proliferation of quad 
>> functions (and most other APLs are more keen than GNU APL to
>>    move in that direction). However, regular expressions are a more 
>> fundamental concept than other "nice to have but never used"
>>    features, so that adding them as a ⎕-function should not do too much 
>> harm. Nobody is forced to use a ⎕-function that he or she
>>    does not know or like. And the only thing that gets more complicated when 
>> a ⎕ function is added is the implementation and not
>>    the language.
>>
>> Rational numbers, BTW, have to be explicitly ./configured and are not present 
>> in the default GNU APL. Same for parallel APL. I have
>> seen that some users are experimenting with these features and I believe we 
>> should allow that because chances are that these
>> experiments result in something valuable some day. Who knows? 
>>
>> Best Regards,
>> /// Jürgen
>>
>>
>> On 09/21/2017 04:19 AM, Peter Teeson wrote:
>>> It so happens that 2 of my former colleagues from I.P.Sharp came visiting 
>>> today and we were chatting about this.
>>> Ken was not in favour of making APL complicated. When I worked at IPSA my 
>>> office was next to Ken’s 
>>> and when someone suggested some form of addition to the language he would 
>>> usually ask 
>>> why we could not do it with an APL function. (These days performance can 
>>> hardly be a compelling argument
>>> with multiple many-core CPU chips.)
>>>
>>> Right now we already have a proliferation of Quad functions not to mention 
>>> lambdas and native functions.
>>> We also have divergent APLs such as Dyalog (good as it is) and so on.
>>>
>>> Complex numbers, rationals and file systems are good additions.  
>>> But IMHO we should have one simple mechanism - i.e. the libapl APL API
>>> and all the rest go through that as native functions.
>>>
>>> Jürgen’s guiding light was to make GNU APL an implementation that meets the ISO 
>>> and APL2 definitions.
>>> We have already wandered away from that. Pity.  When will it stop?
>>>
>>> Just my 02¢
>>>
>>> respect
>>>
>>> Peter
>>>> On Sep 20, 2017, at 4:30 PM, address@hidden wrote:
>>>>
>>>> <mumble> anyone who loves grep and hates perl (and i hope java too) can't 
>>>> be all bad </mumble>
>>>>
>>>> using apl like syntax is good    'aaa' ⎕REX['s'] 'bbb'      what would 
>>>> monadic   ⎕REX['s'] 'bbb'      return?
>>>>
>>>> On Wed, 20 Sep 2017 21:47:29 +0200
>>>> Juergen Sauermann <address@hidden> wrote:
>>>>
>>>>> Hi Elias,
>>>>>
>>>>> I am generally in favour of supporting regular expressions in GNU APL.
>>>>>
>>>>> We should do that in a way that is compatible with the way in which the 
>>>>> most commonly used libraries
>>>>> do that (even if they are lacking some features that more exotic 
>>>>> libraries may have). Unfortunately I do not
>>>>> have a full overview of all (or even any) existing libraries. I 
>>>>> personally love grep and hate perl (the latter not
>>>>> only because of their regexes).
>>>>>
>>>>> I would like to avoid constructs like s/aaa/bbb/ where operations are 
>>>>> kind of text-encoded into strings.
>>>>> That is, IMHO, a  hack-ish programming style and should be replaced by a 
>>>>> more APL-alike syntax such as
>>>>> 'aaa' ⎕REX['s'] 'bbb' or maybe 's' ⎕REX 'aaa' 'bbb'.
>>>>>
>>>>> Or, if the number of operations is small (perl seems to have only 2, not 
>>>>> counting the translate which is already
>>>>> covered by other APL functions), then we could also have different 
>>>>> ⎕-functions for them and thus avoiding a
>>>>> third argument.
>>>>>
>>>>> Everybody else, please feel invited to join the discussion.
>>>>>
>>>>> Best Regards,
>>>>> Jürgen Sauermann
>>>>>
>>>>>
>>>>> On 09/20/2017 05:59 AM, Elias Mårtenson wrote:
>>>>> On several occasions, I have felt that built-in regex support in GNU APL 
>>>>> would be very helpful.
>>>>>
>>>>> Implementing it should be rather simple, but I'd like to discuss how such 
>>>>> an API should look in order for it to be as useful as possible.
>>>>>
>>>>> I was thinking of the following form:
>>>>>
>>>>>       regex ⎕Regex string
>>>>>
>>>>> The way I envision this to work, is to have the function return ⍬ if 
>>>>> there is no match, or a string containing the match, if there is one:
>>>>>
>>>>>       'f..' ⎕Regex 'xzooy'
>>>>> ┏⊖┓
>>>>> ┃0┃
>>>>> ┗━┛
>>>>>       'f..' ⎕Regex 'xfooy'
>>>>> 'foo'
>>>>>
>>>>> If the regex has subexpressions, those matches should be returned as 
>>>>> individual strings:
>>>>>
>>>>>       '([0-9]+)-([0-9]+)-([0-9]+)' ⎕Regex '2017-01-02'
>>>>> ┏→━━━━━━━━━━━━━━━┓
>>>>> ┃"2017" "01" "02"┃
>>>>> ┗∊━━━━━━━━━━━━━━━┛
>>>>>
>>>>> This would be a very useful API, and reasonably easy to implement by 
>>>>> simply calling into the standard regcomp() call: 
>>>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/regcomp.html 
>>>>>
>>>>> What do you think? Is this a reasonable way to implement it? Any 
>>>>> suggestions about alternative APIs?
>>>>>
>>>>> Regards,
>>>>> Elias
>>>>>
>>>
>>>
>>
> 
> 
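
For reference, the behaviour Elias describes in the quoted original proposal
(empty result on no match, the matched text otherwise, and the subexpression
matches when the regex has groups) can be sketched in Python. This uses
Python's re module, whose syntax differs in places from POSIX regcomp(), and
the function name regex is only a stand-in for the proposed ⎕Regex:

```python
import re

def regex(pattern, string):
    """Mimic the proposed  pattern ⎕Regex string  result shapes."""
    m = re.search(pattern, string)
    if m is None:
        return []                  # stands in for ⍬ (no match)
    if m.groups():
        return list(m.groups())    # one string per subexpression
    return m.group(0)              # the matched text itself

no_match = regex('f..', 'xzooy')
whole = regex('f..', 'xfooy')
fields = regex(r'([0-9]+)-([0-9]+)-([0-9]+)', '2017-01-02')
```

Note that the result type depends on the pattern (string versus list of
strings), which is one of the things the later proposals try to make explicit.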



