bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#32372: [PATCH] Add "uuid" to thing-at-point.el


From: Noam Postavsky
Subject: bug#32372: [PATCH] Add "uuid" to thing-at-point.el
Date: Sun, 05 Aug 2018 22:31:43 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

severity 32372 wishlist
quit

Raimon Grau <raimon@konghq.com> writes:

> Subject: [PATCH] Add uuid as allowed thingatpt symbol
>
> * lisp/thingatpt.el (thing-at-point-uuid-regexp): Add regexp for uuid.

I guess you should mention something about the ops as well here.  Though
it's not 100% clear what kind of format you should use for those.  Maybe
just (top-level): Add 'bounds-of-thing-at-point' operation for 'uuid'.

> +;; UUID
> +
> +(defvar thing-at-point-uuid-regexp
> +  (rx (and bow

Using rx is okay, I think.  There was some discussion about it on
emacs-devel a little time ago, with most people saying the increased
verbosity made them not want to use it, but I kind of like it myself.
However, Stefan made the point that `and' is potentially a bit
confusing, because it could be misread as intersection.  It's better to
use one of the synonyms `seq' or `:'.

> +           (or
> +            "00000000-0000-0000-0000-000000000000"
> +            (and
> +             (repeat 8 hex-digit) "-"
> +             (repeat 4 hex-digit) "-"
> +             (or "1" "2" "3" "4" "5")
> +             (repeat 3 hex-digit) "-"
> +             (or "8" "9" "a" "b" "A" "B")
> +             (repeat 3 hex-digit) "-"
> +             (repeat 12 hex-digit)))
> +           eow))
> +  "A regular expression matching a UUID from versions 1 to 5.
> +
> +  More info on uuid's format in
> +  https://tools.ietf.org/html/rfc4122."; )

So, in that RFC I see this grammar

      UUID                   = time-low "-" time-mid "-"
                               time-high-and-version "-"
                               clock-seq-and-reserved
                               clock-seq-low "-" node
      time-low               = 4hexOctet
      time-mid               = 2hexOctet
      time-high-and-version  = 2hexOctet
      clock-seq-and-reserved = hexOctet
      clock-seq-low          = hexOctet
      node                   = 6hexOctet
      hexOctet               = hexDigit hexDigit
      hexDigit =
            "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
            "a" / "b" / "c" / "d" / "e" / "f" /
            "A" / "B" / "C" / "D" / "E" / "F"

It looks like you crafted a regexp which is a tighter match for just the
UUID versions currently in use.  I think we're better off with the
looser definition though, that way it will continue to be correct even
as new versions come out.

Furthermore, I would guess a human user is going to be surprised if
(thing-at-point 'uuid) picks up this

    12345678-1234-1234-8123-123456789012

but not this:

    12345678-1234-1234-5123-123456789012


> +(put 'uuid 'thing-at-point
> +     (lambda ()
> +       (let ((boundary-pair (bounds-of-thing-at-point 'uuid)))
> +         (if boundary-pair
> +             (buffer-substring-no-properties
> +              (car boundary-pair) (cdr boundary-pair))))))

I think this isn't needed, because the `thing-at-point' function already
does this for you:

  (let ((text
         (if (get thing 'thing-at-point)
             (funcall (get thing 'thing-at-point))
           (let ((bounds (bounds-of-thing-at-point thing)))
             (when bounds
               (buffer-substring (car bounds) (cdr bounds)))))))





reply via email to

[Prev in Thread] Current Thread [Next in Thread]