Re: Help with sxml simple parser for the quicklisp importer

guix-devel

From:	Ricardo Wurmus
Subject:	Re: Help with sxml simple parser for the quicklisp importer
Date:	Wed, 23 Jan 2019 17:41:32 +0100
User-agent:	mu4e 1.0; emacs 26.1

swedebugia <address@hidden> writes:

> On 2019-01-23 16:58, Ricardo Wurmus wrote:
>>
>> swedebugia <address@hidden> writes:
>>
>>>> The second “link” tag opens but is never closed.  This may be valid
>>>> HTML, but it is not valid XML, which is what xml->sxml expects.
>>>
>>> Thanks for the quick answer!
>>> I will try to remove this line before handing it over to the parser.
>>
>> I would recommend looking for a better source of package information.
>> Parsing HTML is not fun and is often brittle.
>
> I understand. Hm. Will try asking the author.
>
> Got a little further. Added this:
>
> (define (sanitize-html html)
>   "Correct an offending invalid line from the html source"
>   (let* ((html1 (regexp-substitute #f (string-match "main.css\">" html)
>                                    'pre "main.css\" />" 'post))
>          (result (regexp-substitute #f (string-match "utf-8\">" html1)
>                                     'pre "utf-8\" />" 'post)))
>     result))

It’s generally a bad idea to use regular expressions on HTML or XML.  Be
careful.

> sxml/simple.scm:143:4: In procedure loop:
> Throw to key `parser-error' with args `(#<input: string 24fdaf0>
> "[wf-entdeclared] broken for " copy)'.

I guess this is about the &copy; entity.  You may have to tell xml->sxml
about these HTML entities.
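
A minimal sketch of what that could look like, assuming the `#:entities`
keyword argument of xml->sxml from (sxml simple), which takes an alist
mapping entity names to replacement strings.  The names `%html-entities`
and `html->sxml*` are made up for illustration and do not come from the
importer; the table only covers the entity from the error above and would
need more entries for other pages.

```scheme
(use-modules (sxml simple))

;; Hypothetical table of HTML entities that XML does not predefine.
;; Only &copy; is known to occur here; add further entries as needed.
(define %html-entities
  '((copy . "©")))

;; Hypothetical wrapper: parse STR as XML, expanding the entities above
;; instead of throwing `parser-error' with [wf-entdeclared].
(define (html->sxml* str)
  (xml->sxml str #:entities %html-entities))

(html->sxml* "<p>&copy;</p>")
;; ⇒ (*TOP* (p "©"))
```

This only addresses undeclared entities; unclosed tags such as the “link”
element discussed earlier would still need to be dealt with separately.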

--
Ricardo
