www-commits
www/server/source/sitemap-generator sitemap-gen...


From: Pavel Kharitonov
Subject: www/server/source/sitemap-generator sitemap-gen...
Date: Fri, 11 Apr 2014 17:48:07 +0000

CVSROOT:        /web/www
Module name:    www
Changes by:     Pavel Kharitonov <ineiev>       14/04/11 17:48:07

Modified files:
        server/source/sitemap-generator: sitemap-generator.py 

Log message:
        Skip HTML comments when extracting titles.

CVSWeb URLs:
http://web.cvs.savannah.gnu.org/viewcvs/www/server/source/sitemap-generator/sitemap-generator.py?cvsroot=www&r1=1.9&r2=1.10

Patches:
Index: sitemap-generator.py
===================================================================
RCS file: /web/www/www/server/source/sitemap-generator/sitemap-generator.py,v
retrieving revision 1.9
retrieving revision 1.10
diff -u -b -r1.9 -r1.10
--- sitemap-generator.py        2 Sep 2013 12:38:50 -0000       1.9
+++ sitemap-generator.py        11 Apr 2014 17:48:06 -0000      1.10
@@ -2,7 +2,7 @@
 #
 # Sitemap generator
 # Copyright © 2011-2012 Wacław Jacek
-# Copyright © 2013 Free Software Foundation, Inc.
+# Copyright © 2014 Free Software Foundation, Inc.
 #
 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -233,8 +233,17 @@
        if match:
                return replacement_titles[match.re.pattern]
        text = read_file(os.path.join(TOP_DIRECTORY, path))
-       title = extract_tags(text, ['h1', 'h2', 'h3'])
        encoding = determine_file_encoding(text, path)
+       idx = text.find('<!--')
+       while idx >= 0:
+               head = text[:idx]
+               tail = text[idx:]
+               idx = tail.find('-->')
+               if idx >= 0:
+                       tail = tail[idx:]
+                       text = head + tail
+                       idx = text.find('<!--')
+       title = extract_tags(text, ['h1', 'h2', 'h3'])
        if title:
                title = re.sub('<CODE>', '<code>', title)
                title = re.sub('</CODE>', '</code>', title)
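
For reference, the comment-skipping step described in the log message can be sketched as a standalone function. This is a minimal illustration, not the code committed in revision 1.10: the name strip_html_comments is hypothetical, and unlike the loop in the patch (where tail[idx:] still begins with the closing "-->"), this sketch also drops the terminator. An unterminated comment is left in place, as in the patch.

```python
def strip_html_comments(text):
    """Remove every complete <!-- ... --> span from text.

    Hypothetical helper for illustration; not part of sitemap-generator.py.
    """
    start = text.find('<!--')
    while start >= 0:
        # Look for the terminator after the four-character opener.
        end = text.find('-->', start + 4)
        if end < 0:
            # Unterminated comment: leave the remainder untouched.
            break
        # Drop the comment, including the three-character terminator.
        text = text[:start] + text[end + 3:]
        # Continue searching from where the comment used to begin.
        start = text.find('<!--', start)
    return text


sample = '<!-- <h1>Old title</h1> --><h2>Real title</h2>'
print(strip_html_comments(sample))
```

With the commented-out heading removed first, a title extractor that scans for h1, h2 and h3 would only see the live h2 in this sample.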


