
[Eliot-dev] eliot/dic compdic.cpp


From: eliot-dev
Subject: [Eliot-dev] eliot/dic compdic.cpp
Date: Fri, 22 Feb 2008 20:30:01 +0000

CVSROOT:        /cvsroot/eliot
Module name:    eliot
Changes by:     Olivier Teulière <ipkiss>      08/02/22 20:30:01

Modified files:
        dic            : compdic.cpp 

Log message:
         - Handle the BOM properly if it is present
         - Ignore empty lines even if the file format is not the native one

CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/eliot/dic/compdic.cpp?cvsroot=eliot&r1=1.2&r2=1.3

Patches:
Index: compdic.cpp
===================================================================
RCS file: /cvsroot/eliot/eliot/dic/compdic.cpp,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -b -r1.2 -r1.3
--- compdic.cpp 8 Jan 2008 13:52:33 -0000       1.2
+++ compdic.cpp 22 Feb 2008 20:30:01 -0000      1.3
@@ -124,6 +124,10 @@
     string line;
     while (getline(in, line))
     {
+        // Ignore empty lines
+        if (line == "" || line == "\r" || line == "\n")
+            continue;
+
         // Split the lines on space characters
         vector<string> tokens;
         boost::char_separator<char> sep(" ");
@@ -134,17 +138,13 @@
             tokens.push_back(*it);
         }
 
-        // Ignore empty lines
-        if (tokens.empty())
-            continue;
-
         // We expect 5 fields on the line, and the first one is a letter, so
         // it cannot exceed 4 bytes
         if (tokens.size() != 5 || tokens[0].size() > 4)
         {
             ostringstream ss;
             ss << "readLetters: Invalid line in " << iFileName;
-            ss << " (line " << lineNb;
+            ss << " (line " << lineNb << ")";
             throw DicException(ss.str());
         }
 
@@ -156,10 +156,22 @@
 
         if (letter.size() != 1)
         {
+            // On the first line, there could be the BOM...
+            if (lineNb == 1 && tokens[0].size() > 3 &&
+                (uint8_t)tokens[0][0] == 0xEF &&
+                (uint8_t)tokens[0][1] == 0xBB &&
+                (uint8_t)tokens[0][2] == 0xBF)
+            {
+                // BOM detected, remove the first char in the wide string
+                letter.erase(0, 1);
+            }
+            else
+            {
             ostringstream ss;
             ss << "readLetters: Invalid letter at line " << lineNb;
             throw DicException(ss.str());
         }
+        }
 #undef MAX_SIZE
 
         ioHeaderInfo.letters += towupper(letter[0]);
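
The two behaviours named in the log message can also be sketched in isolation. This is a minimal standalone illustration, not the code from compdic.cpp: the helper names stripUtf8Bom and readNonEmptyLines are hypothetical, and unlike the patch (which drops the first character of the already-converted wide string) this version strips the three raw BOM bytes before any conversion.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Eliot): remove a leading UTF-8 BOM
// (bytes EF BB BF) from the string in place.
// Returns true if a BOM was found and removed.
bool stripUtf8Bom(std::string &ioLine)
{
    if (ioLine.size() >= 3 &&
        static_cast<unsigned char>(ioLine[0]) == 0xEF &&
        static_cast<unsigned char>(ioLine[1]) == 0xBB &&
        static_cast<unsigned char>(ioLine[2]) == 0xBF)
    {
        ioLine.erase(0, 3);
        return true;
    }
    return false;
}

// Hypothetical helper (not part of Eliot): read all lines from the stream,
// dropping a BOM on the first line, a trailing '\r' left by non-native
// line endings, and any line that is empty afterwards.
std::vector<std::string> readNonEmptyLines(std::istream &in)
{
    std::vector<std::string> lines;
    std::string line;
    bool firstLine = true;
    while (std::getline(in, line))
    {
        if (firstLine)
        {
            // The BOM can only appear at the very start of the file
            stripUtf8Bom(line);
            firstLine = false;
        }
        // getline() removes '\n' but keeps the '\r' of a "\r\n" ending
        if (!line.empty() && line.back() == '\r')
            line.pop_back();
        // Ignore empty lines
        if (line.empty())
            continue;
        lines.push_back(line);
    }
    return lines;
}
```

Stripping the BOM at the byte level keeps the check independent of the later multibyte-to-wide conversion, at the cost of assuming the input is UTF-8.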



