Re: automatic URLs for plain text format?

From: Dmitry Borodaenko
Subject: Re: automatic URLs for plain text format?
Date: Mon, 4 Dec 2006 13:37:14 +0000

On 12/4/06, boud <address@hidden> wrote:
> There remains a bug for URLs of the type:
>
>
> i.e. if a ":" is anywhere in the URL, but i think this probably happens
> elsewhere when further "cleaning" up the content. The resulting html is
>
>
> i'm not sure if this is a bug or a feature, since URLs with port
> numbers are not very common any more and are probably not recommended
> for usability, and i don't know whether the colon is considered to be
> a standard character to be allowed in URLs.

It is a bug in Samizdat::Sanitize. This fragment of xhtml.yaml is to blame:

&uri !ruby/regexp /\A(http:|https:|ftp:|mailto:)?[^:]+\z/i

This needs to remain more restrictive than the regexps in uri/common.rb
(to make sure no JavaScript invocation can creep in), but a colon is
certainly allowed within URLs, so the [^:] part should be replaced. Any
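
To illustrate the problem, here is a minimal Ruby sketch. CURRENT_URI is
the pattern from xhtml.yaml quoted above. CANDIDATE_URI, the allowed?
helper and the example.org sample URLs are assumptions made up for this
illustration; this is one possible direction, not the fix that went into
Samizdat. The candidate allows any non-whitespace characters after a
whitelisted scheme, and for scheme-less references only forbids a colon
before the first "/", "?" or "#".

```ruby
# Pattern currently in xhtml.yaml: no colon allowed after the scheme.
CURRENT_URI = /\A(http:|https:|ftp:|mailto:)?[^:]+\z/i

# Hypothetical replacement (an assumption, not the actual Samizdat fix):
# - whitelisted scheme followed by any non-whitespace characters, or
# - a scheme-less reference whose first segment contains no colon.
CANDIDATE_URI =
  /\A(?=\S)(?:(?:https?|ftp|mailto):\S+|[^:\/?#\s]*(?:[\/?#]\S*)?)\z/i

# True if the whole string is accepted by the given pattern.
def allowed?(pattern, uri)
  !pattern.match(uri).nil?
end

samples = [
  "http://example.org/about.html",
  "http://example.org:8080/about.html", # port number, the reported bug
  "about.html",                         # relative reference
  "mailto:someone@example.org",
  "javascript:alert(1)",                # must stay rejected
  "java\tscript:alert(1)"               # whitespace trick, must stay rejected
]

samples.each do |uri|
  puts format("%-40s current=%-5s candidate=%s",
              uri.inspect,
              allowed?(CURRENT_URI, uri),
              allowed?(CANDIDATE_URI, uri))
end
```

With these samples the URL with a port number is rejected by the current
pattern and accepted by the candidate, while both javascript: variants
are rejected by both.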

> BTW: Quite a big discussion on looking at cms'es for the "IMC Alternatives"
> collective/website is going on at:

I know that spam filtering is a major concern for Chuck, so I think we
should get anti-abuse measures working in Samizdat before approaching

> My guess is the actual imc-cms discussion process is de facto
> suspended since looking for servers is a huge priority issue right
> now. On the other hand, people wanting a new cms are not going to wait
> for the imc-cms group to come up with a formal, structured decision,
> and IMHO they're not going to try samizdat unless someone "techie"
> helps them.  i'm unlikely to have time, but i thought i'd mention it

Setting up another Samizdat site on a machine that already has the
software installed is really trivial; I can do that in 15-30 minutes.
The hard part, which I haven't fully automated yet, is the
backup/mirroring scripts: that needs some arrangement on the receiving
end of the backups. I also wonder how Samizdat will cope with real-world
high-traffic sites: synthetic tests can only get you to a certain
point, the real world is always different...

> BTW(2): i made my very first samizdat cvs commit - on the file
> about.html .  Maybe cvs is not so complicated anyway, it's just
>
> info cvs
>
> which gives a huge amount of info and warnings. :)

CVS is not complicated at all: until you get to file conflicts and
branching, you only need to care about the 'cvs up' and 'cvs ci' commands.

Dmitry Borodaenko
