www/software/perl/manual index.html perldoc-all...

www-commits

From:	karl
Subject:	www/software/perl/manual index.html perldoc-all...
Date:	Sun, 14 Jun 2015 22:49:58 +0000
CVSROOT:        /web/www
Module name:    www
Changes by:     karl <karl>     15/06/14 22:49:57

Modified files:
        software/perl/manual: index.html perldoc-all.dvi.gz 
                              perldoc-all.html perldoc-all.html.gz 
                              perldoc-all.html_chapter.tar.gz 
                              perldoc-all.info.tar.gz perldoc-all.pdf 
                              perldoc-all.texi.tar.gz 

Log message:
        perl 5.22.0 (from contrib/perldoc-all in texinfo)

CVSWeb URLs:
http://web.cvs.savannah.gnu.org/viewcvs/www/software/perl/manual/index.html?cvsroot=www&r1=1.9&r2=1.10
http://web.cvs.savannah.gnu.org/viewcvs/www/software/perl/manual/perldoc-all.dvi.gz?cvsroot=www&rev=1.9
http://web.cvs.savannah.gnu.org/viewcvs/www/software/perl/manual/perldoc-all.html?cvsroot=www&r1=1.8&r2=1.9
http://web.cvs.savannah.gnu.org/viewcvs/www/software/perl/manual/perldoc-all.html.gz?cvsroot=www&rev=1.9
http://web.cvs.savannah.gnu.org/viewcvs/www/software/perl/manual/perldoc-all.html_chapter.tar.gz?cvsroot=www&rev=1.9
http://web.cvs.savannah.gnu.org/viewcvs/www/software/perl/manual/perldoc-all.info.tar.gz?cvsroot=www&rev=1.9
http://web.cvs.savannah.gnu.org/viewcvs/www/software/perl/manual/perldoc-all.pdf?cvsroot=www&rev=1.9
http://web.cvs.savannah.gnu.org/viewcvs/www/software/perl/manual/perldoc-all.texi.tar.gz?cvsroot=www&rev=1.9

Patches:
Index: index.html
===================================================================
RCS file: /web/www/www/software/perl/manual/index.html,v
retrieving revision 1.9
retrieving revision 1.10
diff -u -b -r1.9 -r1.10
--- index.html  4 Apr 2015 16:14:51 -0000       1.9
+++ index.html  14 Jun 2015 22:49:49 -0000      1.10
@@ -4,7 +4,7 @@
 <h2>Perl documentation in Texinfo</h2>
 
 <address>GNU Project</address>
-<address>last updated April 04, 2015</address>
+<address>last updated June 14, 2015</address>
 
 <p>This translation of the <a href="http://perldoc.perl.org/">Perl
 documentation</a> from POD to Texinfo is not official, and not endorsed
@@ -21,23 +21,23 @@
 
 <ul>
 <li><a href="perldoc-all.html">HTML
-    (5468K bytes)</a> - entirely on one web page.</li>
+    (5600K bytes)</a> - entirely on one web page.</li>
 <li><a href="html_chapter/index.html">HTML</a> - with one web page per
     chapter.</li>
 <li><a href="perldoc-all.html.gz">HTML compressed
-    (1324K gzipped characters)</a> - entirely on
+    (1360K gzipped characters)</a> - entirely on
     one web page.</li>
 <li><a href="perldoc-all.html_chapter.tar.gz">HTML compressed
-    (1768K gzipped tar file)</a> -
+    (1820K gzipped tar file)</a> -
     with one web page per chapter.</li>
 <li><a href="perldoc-all.info.tar.gz">Info document
-    (1188K bytes gzipped tar file)</a>.</li>
+    (1220K bytes gzipped tar file)</a>.</li>
 <li><a href="perldoc-all.dvi.gz">TeX dvi file
-    (1752K bytes gzipped)</a>.</li>
+    (1800K bytes gzipped)</a>.</li>
 <li><a href="perldoc-all.pdf">PDF file
-    (3988K bytes)</a>.</li>
+    (4100K bytes)</a>.</li>
 <li><a href="perldoc-all.texi.tar.gz">Texinfo source
-    (1132K bytes gzipped tar file).</a></li>
+    (1164K bytes gzipped tar file).</a></li>
 </ul>
 
 <p>You can <a href="http://shop.fsf.org/">buy printed copies of

Index: perldoc-all.dvi.gz
===================================================================
RCS file: /web/www/www/software/perl/manual/perldoc-all.dvi.gz,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -b -r1.8 -r1.9
Binary files /tmp/cvs63uiZj and /tmp/cvsigUJV1 differ

Index: perldoc-all.html
===================================================================
RCS file: /web/www/www/software/perl/manual/perldoc-all.html,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -b -r1.8 -r1.9
--- perldoc-all.html    4 Apr 2015 16:14:52 -0000       1.8
+++ perldoc-all.html    14 Jun 2015 22:49:50 -0000      1.9
@@ -1,6 +1,6 @@
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" 
"http://www.w3.org/TR/html4/loose.dtd">
 <html>
-<!-- Created by Texinfo 5.9.90+dev, http://www.gnu.org/software/texinfo/ -->
+<!-- Created by Texinfo 5.9.93+dev, http://www.gnu.org/software/texinfo/ -->
 <head>
 <title>Perl pod documentation</title>
 
@@ -41,6 +41,7 @@
 ul.no-bullet {list-style: none}
 -->
 </style>
+<link rel="stylesheet" type="text/css" href="/software/gnulib/manual.css">
 
 
 </head>
@@ -322,10 +323,11 @@
       <li><a name="toc-Scalar-values" href="#perldata-Scalar-values">11.2.4 
Scalar values</a></li>
       <li><a name="toc-Scalar-value-constructors" 
href="#perldata-Scalar-value-constructors">11.2.5 Scalar value constructors</a>
       <ul class="no-bullet">
-        <li><a name="toc-Version-Strings" 
href="#perldata-Version-Strings">11.2.5.1 Version Strings</a></li>
-        <li><a name="toc-Special-Literals" 
href="#perldata-Special-Literals">11.2.5.2 Special Literals</a></li>
-        <li><a name="toc-Barewords" href="#perldata-Barewords">11.2.5.3 
Barewords</a></li>
-        <li><a name="toc-Array-Interpolation" 
href="#perldata-Array-Interpolation">11.2.5.4 Array Interpolation</a></li>
+        <li><a 
name="toc-Special-floating-point_003a-infinity-_0028Inf_0029-and-not_002da_002dnumber-_0028NaN_0029"
 
href="#perldata-Special-floating-point_003a-infinity-_0028Inf_0029-and-not_002da_002dnumber-_0028NaN_0029">11.2.5.1
 Special floating point: infinity (Inf) and not-a-number (NaN)</a></li>
+        <li><a name="toc-Version-Strings" 
href="#perldata-Version-Strings">11.2.5.2 Version Strings</a></li>
+        <li><a name="toc-Special-Literals" 
href="#perldata-Special-Literals">11.2.5.3 Special Literals</a></li>
+        <li><a name="toc-Barewords" href="#perldata-Barewords">11.2.5.4 
Barewords</a></li>
+        <li><a name="toc-Array-Interpolation" 
href="#perldata-Array-Interpolation">11.2.5.5 Array Interpolation</a></li>
       </ul></li>
       <li><a name="toc-List-value-constructors" 
href="#perldata-List-value-constructors">11.2.6 List value constructors</a></li>
       <li><a name="toc-Subscripts" href="#perldata-Subscripts">11.2.7 
Subscripts</a></li>
@@ -491,13 +493,16 @@
       <li><a name="toc-EBCDIC" href="#perlebcdic-EBCDIC">19.3.4 EBCDIC</a>
       <ul class="no-bullet">
         <li><a name="toc-The-13-variant-characters" 
href="#perlebcdic-The-13-variant-characters">19.3.4.1 The 13 variant 
characters</a></li>
+        <li><a name="toc-EBCDIC-code-sets-recognized-by-Perl" 
href="#perlebcdic-EBCDIC-code-sets-recognized-by-Perl">19.3.4.2 EBCDIC code 
sets recognized by Perl</a></li>
       </ul></li>
       <li><a name="toc-Unicode-code-points-versus-EBCDIC-code-points" 
href="#perlebcdic-Unicode-code-points-versus-EBCDIC-code-points">19.3.5 Unicode 
code points versus EBCDIC code points</a></li>
-      <li><a name="toc-Remaining-Perl-Unicode-problems-in-EBCDIC" 
href="#perlebcdic-Remaining-Perl-Unicode-problems-in-EBCDIC">19.3.6 Remaining 
Perl Unicode problems in EBCDIC</a></li>
-      <li><a name="toc-Unicode-and-UTF" 
href="#perlebcdic-Unicode-and-UTF">19.3.7 Unicode and UTF</a></li>
-      <li><a name="toc-Using-Encode" href="#perlebcdic-Using-Encode">19.3.8 
Using Encode</a></li>
+      <li><a name="toc-Unicode-and-UTF" 
href="#perlebcdic-Unicode-and-UTF">19.3.6 Unicode and UTF</a></li>
+      <li><a name="toc-Using-Encode" href="#perlebcdic-Using-Encode">19.3.7 
Using Encode</a></li>
+    </ul></li>
+    <li><a name="toc-SINGLE-OCTET-TABLES" 
href="#perlebcdic-SINGLE-OCTET-TABLES">19.4 SINGLE OCTET TABLES</a>
+    <ul class="no-bullet">
+      <li><a name="toc-Table-in-hex_002c-sorted-in-1047-order" 
href="#perlebcdic-Table-in-hex_002c-sorted-in-1047-order">19.4.1 Table in hex, 
sorted in 1047 order</a></li>
     </ul></li>
-    <li><a name="toc-SINGLE-OCTET-TABLES" 
href="#perlebcdic-SINGLE-OCTET-TABLES">19.4 SINGLE OCTET TABLES</a></li>
     <li><a name="toc-IDENTIFYING-CHARACTER-CODE-SETS" 
href="#perlebcdic-IDENTIFYING-CHARACTER-CODE-SETS">19.5 IDENTIFYING CHARACTER 
CODE SETS</a></li>
     <li><a name="toc-CONVERSIONS" href="#perlebcdic-CONVERSIONS">19.6 
CONVERSIONS</a>
     <ul class="no-bullet">
@@ -513,8 +518,8 @@
     <li><a name="toc-SORTING" href="#perlebcdic-SORTING">19.11 SORTING</a>
     <ul class="no-bullet">
       <li><a name="toc-Ignore-ASCII-vs_002e-EBCDIC-sort-differences_002e" 
href="#perlebcdic-Ignore-ASCII-vs_002e-EBCDIC-sort-differences_002e">19.11.1 
Ignore ASCII vs. EBCDIC sort differences.</a></li>
-      <li><a name="toc-MONO-CASE-then-sort-data_002e" 
href="#perlebcdic-MONO-CASE-then-sort-data_002e">19.11.2 MONO CASE then sort 
data.</a></li>
-      <li><a name="toc-Convert_002c-sort-data_002c-then-re-convert_002e" 
href="#perlebcdic-Convert_002c-sort-data_002c-then-re-convert_002e">19.11.3 
Convert, sort data, then re convert.</a></li>
+      <li><a name="toc-Use-a-sort-helper-function" 
href="#perlebcdic-Use-a-sort-helper-function">19.11.2 Use a sort helper 
function</a></li>
+      <li><a 
name="toc-MONO-CASE-then-sort-data-_0028for-non_002ddigits_002c-non_002dunderscore_0029"
 
href="#perlebcdic-MONO-CASE-then-sort-data-_0028for-non_002ddigits_002c-non_002dunderscore_0029">19.11.3
 MONO CASE then sort data (for non-digits, non-underscore)</a></li>
       <li><a name="toc-Perform-sorting-on-one-type-of-platform-only_002e" 
href="#perlebcdic-Perform-sorting-on-one-type-of-platform-only_002e">19.11.4 
Perform sorting on one type of platform only.</a></li>
     </ul></li>
     <li><a name="toc-TRANSFORMATION-FORMATS" 
href="#perlebcdic-TRANSFORMATION-FORMATS">19.12 TRANSFORMATION FORMATS</a>
@@ -590,10 +595,11 @@
     <li><a name="toc-WRITING-A-SOURCE-FILTER-IN-PERL" 
href="#perlfilter-WRITING-A-SOURCE-FILTER-IN-PERL">22.8 WRITING A SOURCE FILTER 
IN PERL</a></li>
     <li><a name="toc-USING-CONTEXT_003a-THE-DEBUG-FILTER" 
href="#perlfilter-USING-CONTEXT_003a-THE-DEBUG-FILTER">22.9 USING CONTEXT: THE 
DEBUG FILTER</a></li>
     <li><a name="toc-CONCLUSION" href="#perlfilter-CONCLUSION">22.10 
CONCLUSION</a></li>
-    <li><a name="toc-THINGS-TO-LOOK-OUT-FOR" 
href="#perlfilter-THINGS-TO-LOOK-OUT-FOR">22.11 THINGS TO LOOK OUT FOR</a></li>
-    <li><a name="toc-REQUIREMENTS" href="#perlfilter-REQUIREMENTS">22.12 
REQUIREMENTS</a></li>
-    <li><a name="toc-AUTHOR-9" href="#perlfilter-AUTHOR">22.13 AUTHOR</a></li>
-    <li><a name="toc-Copyrights" href="#perlfilter-Copyrights">22.14 
Copyrights</a></li>
+    <li><a name="toc-LIMITATIONS" href="#perlfilter-LIMITATIONS">22.11 
LIMITATIONS</a></li>
+    <li><a name="toc-THINGS-TO-LOOK-OUT-FOR" 
href="#perlfilter-THINGS-TO-LOOK-OUT-FOR">22.12 THINGS TO LOOK OUT FOR</a></li>
+    <li><a name="toc-REQUIREMENTS" href="#perlfilter-REQUIREMENTS">22.13 
REQUIREMENTS</a></li>
+    <li><a name="toc-AUTHOR-9" href="#perlfilter-AUTHOR">22.14 AUTHOR</a></li>
+    <li><a name="toc-Copyrights" href="#perlfilter-Copyrights">22.15 
Copyrights</a></li>
   </ul></li>
   <li><a name="toc-perlfork-1" href="#perlfork">23 perlfork</a>
   <ul class="no-bullet">
@@ -770,7 +776,8 @@
       <li><a name="toc-How-does-UTF_002d8-represent-Unicode-characters_003f" 
href="#perlguts-How-does-UTF_002d8-represent-Unicode-characters_003f">28.11.3 
How does UTF-8 represent Unicode characters?</a></li>
       <li><a name="toc-How-does-Perl-store-UTF_002d8-strings_003f" 
href="#perlguts-How-does-Perl-store-UTF_002d8-strings_003f">28.11.4 How does 
Perl store UTF-8 strings?</a></li>
       <li><a name="toc-How-do-I-convert-a-string-to-UTF_002d8_003f" 
href="#perlguts-How-do-I-convert-a-string-to-UTF_002d8_003f">28.11.5 How do I 
convert a string to UTF-8?</a></li>
-      <li><a name="toc-Is-there-anything-else-I-need-to-know_003f" 
href="#perlguts-Is-there-anything-else-I-need-to-know_003f">28.11.6 Is there 
anything else I need to know?</a></li>
+      <li><a name="toc-How-do-I-compare-strings_003f" 
href="#perlguts-How-do-I-compare-strings_003f">28.11.6 How do I compare 
strings?</a></li>
+      <li><a name="toc-Is-there-anything-else-I-need-to-know_003f" 
href="#perlguts-Is-there-anything-else-I-need-to-know_003f">28.11.7 Is there 
anything else I need to know?</a></li>
     </ul></li>
     <li><a name="toc-Custom-Operators" href="#perlguts-Custom-Operators">28.12 
Custom Operators</a></li>
     <li><a name="toc-AUTHORS-2" href="#perlguts-AUTHORS">28.13 AUTHORS</a></li>
@@ -835,6 +842,7 @@
       <ul class="no-bullet">
         <li><a name="toc-Other-environment-variables-that-may-influence-tests" 
href="#perlhack-Other-environment-variables-that-may-influence-tests">29.8.4.1 
Other environment variables that may influence tests</a></li>
       </ul></li>
+      <li><a name="toc-Performance-testing" 
href="#perlhack-Performance-testing">29.8.5 Performance testing</a></li>
     </ul></li>
     <li><a name="toc-MORE-READING-FOR-GUTS-HACKERS" 
href="#perlhack-MORE-READING-FOR-GUTS-HACKERS">29.9 MORE READING FOR GUTS 
HACKERS</a></li>
     <li><a name="toc-CPAN-TESTERS-AND-PERL-SMOKERS" 
href="#perlhack-CPAN-TESTERS-AND-PERL-SMOKERS">29.10 CPAN TESTERS AND PERL 
SMOKERS</a></li>
@@ -888,10 +896,11 @@
       <li><a name="toc-PERL_005fDESTRUCT_005fLEVEL" 
href="#perlhacktips-PERL_005fDESTRUCT_005fLEVEL">30.8.1 
PERL_DESTRUCT_LEVEL</a></li>
       <li><a name="toc-PERL_005fMEM_005fLOG" 
href="#perlhacktips-PERL_005fMEM_005fLOG">30.8.2 PERL_MEM_LOG</a></li>
       <li><a name="toc-DDD-over-gdb" href="#perlhacktips-DDD-over-gdb">30.8.3 
DDD over gdb</a></li>
-      <li><a name="toc-Poison" href="#perlhacktips-Poison">30.8.4 
Poison</a></li>
-      <li><a name="toc-Read_002donly-optrees" 
href="#perlhacktips-Read_002donly-optrees">30.8.5 Read-only optrees</a></li>
-      <li><a name="toc-When-is-a-bool-not-a-bool_003f" 
href="#perlhacktips-When-is-a-bool-not-a-bool_003f">30.8.6 When is a bool not a 
bool?</a></li>
-      <li><a name="toc-The-_002ei-Targets" 
href="#perlhacktips-The-_002ei-Targets">30.8.7 The .i Targets</a></li>
+      <li><a name="toc-C-backtrace" href="#perlhacktips-C-backtrace">30.8.4 C 
backtrace</a></li>
+      <li><a name="toc-Poison" href="#perlhacktips-Poison">30.8.5 
Poison</a></li>
+      <li><a name="toc-Read_002donly-optrees" 
href="#perlhacktips-Read_002donly-optrees">30.8.6 Read-only optrees</a></li>
+      <li><a name="toc-When-is-a-bool-not-a-bool_003f" 
href="#perlhacktips-When-is-a-bool-not-a-bool_003f">30.8.7 When is a bool not a 
bool?</a></li>
+      <li><a name="toc-The-_002ei-Targets" 
href="#perlhacktips-The-_002ei-Targets">30.8.8 The .i Targets</a></li>
     </ul></li>
     <li><a name="toc-AUTHOR-12" href="#perlhacktips-AUTHOR">30.9 
AUTHOR</a></li>
   </ul></li>
@@ -1046,7 +1055,7 @@
     <li><a name="toc-PREPARING-TO-USE-LOCALES" 
href="#perllocale-PREPARING-TO-USE-LOCALES">38.4 PREPARING TO USE 
LOCALES</a></li>
     <li><a name="toc-USING-LOCALES" href="#perllocale-USING-LOCALES">38.5 
USING LOCALES</a>
     <ul class="no-bullet">
-      <li><a name="toc-The-use-locale-pragma" 
href="#perllocale-The-use-locale-pragma">38.5.1 The use locale pragma</a></li>
+      <li><a name="toc-The-_0022use-locale_0022-pragma" 
href="#perllocale-The-_0022use-locale_0022-pragma">38.5.1 The <code>&quot;use 
locale&quot;</code> pragma</a></li>
       <li><a name="toc-The-setlocale-function" 
href="#perllocale-The-setlocale-function">38.5.2 The setlocale function</a></li>
       <li><a name="toc-Finding-locales" 
href="#perllocale-Finding-locales">38.5.3 Finding locales</a></li>
       <li><a name="toc-LOCALE-PROBLEMS" 
href="#perllocale-LOCALE-PROBLEMS">38.5.4 LOCALE PROBLEMS</a></li>
@@ -1147,6 +1156,7 @@
       <li><a name="toc-Has-it-been-done-before_003f" 
href="#perlmodstyle-Has-it-been-done-before_003f">42.4.1 Has it been done 
before?</a></li>
       <li><a name="toc-Do-one-thing-and-do-it-well" 
href="#perlmodstyle-Do-one-thing-and-do-it-well">42.4.2 Do one thing and do it 
well</a></li>
       <li><a name="toc-What_0027s-in-a-name_003f" 
href="#perlmodstyle-What_0027s-in-a-name_003f">42.4.3 What&rsquo;s in a 
name?</a></li>
+      <li><a name="toc-Get-feedback-before-publishing" 
href="#perlmodstyle-Get-feedback-before-publishing">42.4.4 Get feedback before 
publishing</a></li>
     </ul></li>
     <li><a name="toc-DESIGNING-AND-WRITING-YOUR-MODULE" 
href="#perlmodstyle-DESIGNING-AND-WRITING-YOUR-MODULE">42.5 DESIGNING AND 
WRITING YOUR MODULE</a>
     <ul class="no-bullet">
@@ -1752,7 +1762,8 @@
     <ul class="no-bullet">
       <li><a name="toc-Postfix-Reference-Slicing" 
href="#perlref-Postfix-Reference-Slicing">62.5.1 Postfix Reference 
Slicing</a></li>
     </ul></li>
-    <li><a name="toc-SEE-ALSO-31" href="#perlref-SEE-ALSO">62.6 SEE 
ALSO</a></li>
+    <li><a name="toc-Assigning-to-References" 
href="#perlref-Assigning-to-References">62.6 Assigning to References</a></li>
+    <li><a name="toc-SEE-ALSO-31" href="#perlref-SEE-ALSO">62.7 SEE 
ALSO</a></li>
   </ul></li>
   <li><a name="toc-perlreftut-1" href="#perlreftut">63 perlreftut</a>
   <ul class="no-bullet">
@@ -1847,6 +1858,7 @@
       <li><a name="toc-More-matching" href="#perlrequick-More-matching">66.3.7 
More matching</a></li>
       <li><a name="toc-Search-and-replace" 
href="#perlrequick-Search-and-replace">66.3.8 Search and replace</a></li>
       <li><a name="toc-The-split-operator" 
href="#perlrequick-The-split-operator">66.3.9 The split operator</a></li>
+      <li><a name="toc-use-re-_0027strict_0027" 
href="#perlrequick-use-re-_0027strict_0027">66.3.10 <code>use re 
'strict'</code></a></li>
     </ul></li>
     <li><a name="toc-BUGS-7" href="#perlrequick-BUGS">66.4 BUGS</a></li>
     <li><a name="toc-SEE-ALSO-33" href="#perlrequick-SEE-ALSO">66.5 SEE 
ALSO</a></li>
@@ -2163,40 +2175,38 @@
     <ul class="no-bullet">
       <li><a name="toc-Important-Caveats" 
href="#perlunicode-Important-Caveats">81.2.1 Important Caveats</a></li>
       <li><a name="toc-Byte-and-Character-Semantics" 
href="#perlunicode-Byte-and-Character-Semantics">81.2.2 Byte and Character 
Semantics</a></li>
-      <li><a name="toc-Effects-of-Character-Semantics" 
href="#perlunicode-Effects-of-Character-Semantics">81.2.3 Effects of Character 
Semantics</a></li>
-      <li><a name="toc-Unicode-Character-Properties" 
href="#perlunicode-Unicode-Character-Properties">81.2.4 Unicode Character 
Properties</a>
-      <ul class="no-bullet">
-        <li><a name="toc-General_005fCategory" 
href="#perlunicode-General_005fCategory">81.2.4.1 
<strong>General_Category</strong></a></li>
-        <li><a name="toc-Bidirectional-Character-Types" 
href="#perlunicode-Bidirectional-Character-Types">81.2.4.2 
<strong>Bidirectional Character Types</strong></a></li>
-        <li><a name="toc-Scripts" href="#perlunicode-Scripts">81.2.4.3 
<strong>Scripts</strong></a></li>
-        <li><a name="toc-Use-of-the-_0022Is_0022-Prefix" 
href="#perlunicode-Use-of-the-_0022Is_0022-Prefix">81.2.4.4 <strong>Use of the 
<code>&quot;Is&quot;</code> Prefix</strong></a></li>
-        <li><a name="toc-Blocks" href="#perlunicode-Blocks">81.2.4.5 
<strong>Blocks</strong></a></li>
-        <li><a name="toc-Other-Properties" 
href="#perlunicode-Other-Properties">81.2.4.6 <strong>Other 
Properties</strong></a></li>
-      </ul></li>
-      <li><a name="toc-User_002dDefined-Character-Properties" 
href="#perlunicode-User_002dDefined-Character-Properties">81.2.5 User-Defined 
Character Properties</a></li>
-      <li><a 
name="toc-User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029" 
href="#perlunicode-User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029">81.2.6
 User-Defined Case Mappings (for serious hackers only)</a></li>
-      <li><a name="toc-Character-Encodings-for-Input-and-Output" 
href="#perlunicode-Character-Encodings-for-Input-and-Output">81.2.7 Character 
Encodings for Input and Output</a></li>
-      <li><a name="toc-Unicode-Regular-Expression-Support-Level" 
href="#perlunicode-Unicode-Regular-Expression-Support-Level">81.2.8 Unicode 
Regular Expression Support Level</a></li>
-      <li><a name="toc-Unicode-Encodings" 
href="#perlunicode-Unicode-Encodings">81.2.9 Unicode Encodings</a></li>
-      <li><a name="toc-Non_002dcharacter-code-points" 
href="#perlunicode-Non_002dcharacter-code-points">81.2.10 Non-character code 
points</a></li>
-      <li><a name="toc-Beyond-Unicode-code-points" 
href="#perlunicode-Beyond-Unicode-code-points">81.2.11 Beyond Unicode code 
points</a></li>
-      <li><a name="toc-Security-Implications-of-Unicode" 
href="#perlunicode-Security-Implications-of-Unicode">81.2.12 Security 
Implications of Unicode</a></li>
-      <li><a name="toc-Unicode-in-Perl-on-EBCDIC" 
href="#perlunicode-Unicode-in-Perl-on-EBCDIC">81.2.13 Unicode in Perl on 
EBCDIC</a></li>
-      <li><a name="toc-Locales" href="#perlunicode-Locales">81.2.14 
Locales</a></li>
-      <li><a name="toc-When-Unicode-Does-Not-Happen" 
href="#perlunicode-When-Unicode-Does-Not-Happen">81.2.15 When Unicode Does Not 
Happen</a></li>
-      <li><a name="toc-The-_0022Unicode-Bug_0022" 
href="#perlunicode-The-_0022Unicode-Bug_0022">81.2.16 The &quot;Unicode 
Bug&quot;</a></li>
-      <li><a 
name="toc-Forcing-Unicode-in-Perl-_0028Or-Unforcing-Unicode-in-Perl_0029" 
href="#perlunicode-Forcing-Unicode-in-Perl-_0028Or-Unforcing-Unicode-in-Perl_0029">81.2.17
 Forcing Unicode in Perl (Or Unforcing Unicode in Perl)</a></li>
-      <li><a name="toc-Using-Unicode-in-XS" 
href="#perlunicode-Using-Unicode-in-XS">81.2.18 Using Unicode in XS</a></li>
-      <li><a 
name="toc-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029"
 
href="#perlunicode-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029">81.2.19
 Hacking Perl to work on earlier Unicode versions (for very serious hackers 
only)</a></li>
+      <li><a name="toc-ASCII-Rules-versus-Unicode-Rules" 
href="#perlunicode-ASCII-Rules-versus-Unicode-Rules">81.2.3 ASCII Rules versus 
Unicode Rules</a></li>
+      <li><a 
name="toc-Extended-Grapheme-Clusters-_0028Logical-characters_0029" 
href="#perlunicode-Extended-Grapheme-Clusters-_0028Logical-characters_0029">81.2.4
 Extended Grapheme Clusters (Logical characters)</a></li>
+      <li><a name="toc-Unicode-Character-Properties" 
href="#perlunicode-Unicode-Character-Properties">81.2.5 Unicode Character 
Properties</a>
+      <ul class="no-bullet">
+        <li><a name="toc-General_005fCategory" 
href="#perlunicode-General_005fCategory">81.2.5.1 
<strong>General_Category</strong></a></li>
+        <li><a name="toc-Bidirectional-Character-Types" 
href="#perlunicode-Bidirectional-Character-Types">81.2.5.2 
<strong>Bidirectional Character Types</strong></a></li>
+        <li><a name="toc-Scripts" href="#perlunicode-Scripts">81.2.5.3 
<strong>Scripts</strong></a></li>
+        <li><a name="toc-Use-of-the-_0022Is_0022-Prefix" 
href="#perlunicode-Use-of-the-_0022Is_0022-Prefix">81.2.5.4 <strong>Use of the 
<code>&quot;Is&quot;</code> Prefix</strong></a></li>
+        <li><a name="toc-Blocks" href="#perlunicode-Blocks">81.2.5.5 
<strong>Blocks</strong></a></li>
+        <li><a name="toc-Other-Properties" 
href="#perlunicode-Other-Properties">81.2.5.6 <strong>Other 
Properties</strong></a></li>
+      </ul></li>
+      <li><a name="toc-User_002dDefined-Character-Properties" 
href="#perlunicode-User_002dDefined-Character-Properties">81.2.6 User-Defined 
Character Properties</a></li>
+      <li><a 
name="toc-User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029" 
href="#perlunicode-User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029">81.2.7
 User-Defined Case Mappings (for serious hackers only)</a></li>
+      <li><a name="toc-Character-Encodings-for-Input-and-Output" 
href="#perlunicode-Character-Encodings-for-Input-and-Output">81.2.8 Character 
Encodings for Input and Output</a></li>
+      <li><a name="toc-Unicode-Regular-Expression-Support-Level" 
href="#perlunicode-Unicode-Regular-Expression-Support-Level">81.2.9 Unicode 
Regular Expression Support Level</a></li>
+      <li><a name="toc-Unicode-Encodings" 
href="#perlunicode-Unicode-Encodings">81.2.10 Unicode Encodings</a></li>
+      <li><a name="toc-Noncharacter-code-points" 
href="#perlunicode-Noncharacter-code-points">81.2.11 Noncharacter code 
points</a></li>
+      <li><a name="toc-Beyond-Unicode-code-points" 
href="#perlunicode-Beyond-Unicode-code-points">81.2.12 Beyond Unicode code 
points</a></li>
+      <li><a name="toc-Security-Implications-of-Unicode" 
href="#perlunicode-Security-Implications-of-Unicode">81.2.13 Security 
Implications of Unicode</a></li>
+      <li><a name="toc-Unicode-in-Perl-on-EBCDIC" 
href="#perlunicode-Unicode-in-Perl-on-EBCDIC">81.2.14 Unicode in Perl on 
EBCDIC</a></li>
+      <li><a name="toc-Locales" href="#perlunicode-Locales">81.2.15 
Locales</a></li>
+      <li><a name="toc-When-Unicode-Does-Not-Happen" 
href="#perlunicode-When-Unicode-Does-Not-Happen">81.2.16 When Unicode Does Not 
Happen</a></li>
+      <li><a name="toc-The-_0022Unicode-Bug_0022" 
href="#perlunicode-The-_0022Unicode-Bug_0022">81.2.17 The &quot;Unicode 
Bug&quot;</a></li>
+      <li><a 
name="toc-Forcing-Unicode-in-Perl-_0028Or-Unforcing-Unicode-in-Perl_0029" 
href="#perlunicode-Forcing-Unicode-in-Perl-_0028Or-Unforcing-Unicode-in-Perl_0029">81.2.18
 Forcing Unicode in Perl (Or Unforcing Unicode in Perl)</a></li>
+      <li><a name="toc-Using-Unicode-in-XS" 
href="#perlunicode-Using-Unicode-in-XS">81.2.19 Using Unicode in XS</a></li>
+      <li><a 
name="toc-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029"
 
href="#perlunicode-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029">81.2.20
 Hacking Perl to work on earlier Unicode versions (for very serious hackers 
only)</a></li>
+      <li><a name="toc-Porting-code-from-perl_002d5_002e6_002eX" 
href="#perlunicode-Porting-code-from-perl_002d5_002e6_002eX">81.2.21 Porting 
code from perl-5.6.X</a></li>
     </ul></li>
     <li><a name="toc-BUGS-10" href="#perlunicode-BUGS">81.3 BUGS</a>
     <ul class="no-bullet">
-      <li><a name="toc-Interaction-with-Locales" 
href="#perlunicode-Interaction-with-Locales">81.3.1 Interaction with 
Locales</a></li>
-      <li><a 
name="toc-Problems-with-characters-in-the-Latin_002d1-Supplement-range" 
href="#perlunicode-Problems-with-characters-in-the-Latin_002d1-Supplement-range">81.3.2
 Problems with characters in the Latin-1 Supplement range</a></li>
-      <li><a name="toc-Interaction-with-Extensions" 
href="#perlunicode-Interaction-with-Extensions">81.3.3 Interaction with 
Extensions</a></li>
-      <li><a name="toc-Speed" href="#perlunicode-Speed">81.3.4 Speed</a></li>
-      <li><a name="toc-Problems-on-EBCDIC-platforms" 
href="#perlunicode-Problems-on-EBCDIC-platforms">81.3.5 Problems on EBCDIC 
platforms</a></li>
-      <li><a name="toc-Porting-code-from-perl_002d5_002e6_002eX" 
href="#perlunicode-Porting-code-from-perl_002d5_002e6_002eX">81.3.6 Porting 
code from perl-5.6.X</a></li>
+      <li><a name="toc-Interaction-with-Extensions" 
href="#perlunicode-Interaction-with-Extensions">81.3.1 Interaction with 
Extensions</a></li>
+      <li><a name="toc-Speed" href="#perlunicode-Speed">81.3.2 Speed</a></li>
     </ul></li>
     <li><a name="toc-SEE-ALSO-40" href="#perlunicode-SEE-ALSO">81.4 SEE 
ALSO</a></li>
   </ul></li>
@@ -2244,7 +2254,10 @@
       <li><a name="toc-Perl_0027s-Unicode-Support" 
href="#perluniintro-Perl_0027s-Unicode-Support">83.2.2 Perl&rsquo;s Unicode 
Support</a></li>
       <li><a name="toc-Perl_0027s-Unicode-Model" 
href="#perluniintro-Perl_0027s-Unicode-Model">83.2.3 Perl&rsquo;s Unicode 
Model</a></li>
       <li><a name="toc-Unicode-and-EBCDIC" 
href="#perluniintro-Unicode-and-EBCDIC">83.2.4 Unicode and EBCDIC</a></li>
-      <li><a name="toc-Creating-Unicode" 
href="#perluniintro-Creating-Unicode">83.2.5 Creating Unicode</a></li>
+      <li><a name="toc-Creating-Unicode" 
href="#perluniintro-Creating-Unicode">83.2.5 Creating Unicode</a>
+      <ul class="no-bullet">
+        <li><a name="toc-Earlier-releases-caveats" 
href="#perluniintro-Earlier-releases-caveats">83.2.5.1 Earlier releases 
caveats</a></li>
+      </ul></li>
       <li><a name="toc-Handling-Unicode" 
href="#perluniintro-Handling-Unicode">83.2.6 Handling Unicode</a></li>
       <li><a name="toc-Legacy-Encodings" 
href="#perluniintro-Legacy-Encodings">83.2.7 Legacy Encodings</a></li>
       <li><a name="toc-Unicode-I_002fO" 
href="#perluniintro-Unicode-I_002fO">83.2.8 Unicode I/O</a></li>
@@ -2880,7 +2893,9 @@
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 Scalar value constructors
 
-</pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perldata-Version-Strings">perldata Version 
Strings</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perldata-Special-floating-point_003a-infinity-_0028Inf_0029-and-not_002da_002dnumber-_0028NaN_0029">perldata
 Special floating point: infinity (Inf) and not-a-number 
(NaN)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perldata-Version-Strings">perldata Version 
Strings</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perldata-Special-Literals">perldata Special 
Literals</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -3201,8 +3216,6 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Unicode-code-points-versus-EBCDIC-code-points">perlebcdic 
Unicode code points versus EBCDIC code points</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Remaining-Perl-Unicode-problems-in-EBCDIC">perlebcdic 
Remaining Perl Unicode problems in EBCDIC</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
-</td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Unicode-and-UTF">perlebcdic Unicode and 
UTF</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Using-Encode">perlebcdic Using 
Encode</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
@@ -3212,6 +3225,13 @@
 
 </pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-The-13-variant-characters">perlebcdic The 13 variant 
characters</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-EBCDIC-code-sets-recognized-by-Perl">perlebcdic EBCDIC code 
sets recognized by Perl</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
+<tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
+SINGLE OCTET TABLES
+
+</pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Table-in-hex_002c-sorted-in-1047-order">perlebcdic Table in 
hex, sorted in 1047 order</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 CONVERSIONS
 
@@ -3228,9 +3248,9 @@
 
 </pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Ignore-ASCII-vs_002e-EBCDIC-sort-differences_002e">perlebcdic 
Ignore ASCII vs. EBCDIC sort differences.</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-MONO-CASE-then-sort-data_002e">perlebcdic MONO CASE then sort 
data.</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Use-a-sort-helper-function">perlebcdic Use a sort helper 
function</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Convert_002c-sort-data_002c-then-re-convert_002e">perlebcdic 
Convert, sort data, then re convert.</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-MONO-CASE-then-sort-data-_0028for-non_002ddigits_002c-non_002dunderscore_0029">perlebcdic
 MONO CASE then sort data (for non-digits, 
non-underscore)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Perform-sorting-on-one-type-of-platform-only_002e">perlebcdic 
Perform sorting on one type of platform only.</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
@@ -3347,6 +3367,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlfilter-CONCLUSION">perlfilter 
CONCLUSION</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlfilter-LIMITATIONS">perlfilter 
LIMITATIONS</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlfilter-THINGS-TO-LOOK-OUT-FOR">perlfilter THINGS TO LOOK OUT 
FOR</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlfilter-REQUIREMENTS">perlfilter 
REQUIREMENTS</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
@@ -3683,6 +3705,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlguts-How-do-I-convert-a-string-to-UTF_002d8_003f">perlguts How do I 
convert a string to UTF-8?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlguts-How-do-I-compare-strings_003f">perlguts How do I compare 
strings?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlguts-Is-there-anything-else-I-need-to-know_003f">perlguts Is there 
anything else I need to know?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
@@ -3800,6 +3824,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlhack-Using-t_002fharness-for-testing">perlhack Using 
<samp>t/harness</samp> for testing</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlhack-Performance-testing">perlhack Performance 
testing</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 Using <samp>t/harness</samp> for testing
 
@@ -3895,6 +3921,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-DDD-over-gdb">perlhacktips DDD over 
gdb</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-C-backtrace">perlhacktips C 
backtrace</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-Poison">perlhacktips 
Poison</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-Read_002donly-optrees">perlhacktips Read-only 
optrees</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
@@ -4188,7 +4216,7 @@
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 USING LOCALES
 
-</pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perllocale-The-use-locale-pragma">perllocale The use locale 
pragma</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perllocale-The-_0022use-locale_0022-pragma">perllocale The 
<code>&quot;use locale&quot;</code> pragma</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perllocale-The-setlocale-function">perllocale The setlocale 
function</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -4365,6 +4393,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlmodstyle-What_0027s-in-a-name_003f">perlmodstyle What's in a 
name?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlmodstyle-Get-feedback-before-publishing">perlmodstyle Get feedback 
before publishing</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 DESIGNING AND WRITING YOUR MODULE
 
@@ -5422,6 +5452,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlref-Postfix-Dereference-Syntax">perlref Postfix Dereference 
Syntax</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlref-Assigning-to-References">perlref Assigning to 
References</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 <tr><td align="left" valign="top">&bull; <a href="#perlref-SEE-ALSO">perlref 
SEE ALSO</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
@@ -5618,6 +5650,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlrequick-The-split-operator">perlrequick The split 
operator</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlrequick-use-re-_0027strict_0027">perlrequick <code>use re 
'strict'</code></a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 AUTHOR AND COPYRIGHT
 
@@ -6178,7 +6212,9 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Byte-and-Character-Semantics">perlunicode Byte and Character 
Semantics</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Effects-of-Character-Semantics">perlunicode Effects of 
Character Semantics</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-ASCII-Rules-versus-Unicode-Rules">perlunicode ASCII Rules 
versus Unicode Rules</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Extended-Grapheme-Clusters-_0028Logical-characters_0029">perlunicode
 Extended Grapheme Clusters (Logical 
characters)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Unicode-Character-Properties">perlunicode Unicode Character 
Properties</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -6192,7 +6228,7 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Unicode-Encodings">perlunicode Unicode 
Encodings</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Non_002dcharacter-code-points">perlunicode Non-character 
code points</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Noncharacter-code-points">perlunicode Noncharacter code 
points</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Beyond-Unicode-code-points">perlunicode Beyond Unicode code 
points</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -6212,6 +6248,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029">perlunicode
 Hacking Perl to work on earlier Unicode versions (for very serious hackers 
only)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Porting-code-from-perl_002d5_002e6_002eX">perlunicode 
Porting code from perl-5.6.X</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 Unicode Character Properties
 
@@ -6230,18 +6268,10 @@
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 BUGS
 
-</pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Interaction-with-Locales">perlunicode Interaction with 
Locales</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Problems-with-characters-in-the-Latin_002d1-Supplement-range">perlunicode
 Problems with characters in the Latin-1 Supplement 
range</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Interaction-with-Extensions">perlunicode Interaction with 
Extensions</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Interaction-with-Extensions">perlunicode Interaction with 
Extensions</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Speed">perlunicode Speed</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Problems-on-EBCDIC-platforms">perlunicode Problems on EBCDIC 
platforms</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Porting-code-from-perl_002d5_002e6_002eX">perlunicode 
Porting code from perl-5.6.X</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
-</td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 perlunifaq
 
@@ -6356,6 +6386,11 @@
 <tr><td align="left" valign="top">&bull; <a 
href="#perluniintro-Further-Resources">perluniintro Further 
Resources</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
+Creating Unicode
+
+</pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perluniintro-Earlier-releases-caveats">perluniintro Earlier releases 
caveats</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
+<tr><th colspan="3" align="left" valign="top"><pre class="menu-comment">
 perlunitut
 
 </pre></th></tr><tr><td align="left" valign="top">&bull; <a 
href="#perlunitut-NAME">perlunitut NAME</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
@@ -6807,6 +6842,19 @@
 
     perlhist            Perl history records
     perldelta           Perl changes since previous version
+    perl52111delta      Perl changes in version 5.21.11
+    perl52110delta      Perl changes in version 5.21.10
+    perl5219delta       Perl changes in version 5.21.9
+    perl5218delta       Perl changes in version 5.21.8
+    perl5217delta       Perl changes in version 5.21.7
+    perl5216delta       Perl changes in version 5.21.6
+    perl5215delta       Perl changes in version 5.21.5
+    perl5214delta       Perl changes in version 5.21.4
+    perl5213delta       Perl changes in version 5.21.3
+    perl5212delta       Perl changes in version 5.21.2
+    perl5211delta       Perl changes in version 5.21.1
+    perl5210delta       Perl changes in version 5.21.0
+    perl5202delta       Perl changes in version 5.20.2
     perl5201delta       Perl changes in version 5.20.1
     perl5200delta       Perl changes in version 5.20.0
     perl5184delta       Perl changes in version 5.18.4
@@ -8127,6 +8175,7 @@
 <pre class="verbatim">        by Tom Christiansen and Nathan Torkington,
             with Foreword by Larry Wall
         ISBN 978-0-596-00313-5 [2nd Edition August 2003]
+        ISBN 978-0-596-15888-0 [ebook]
         http://oreilly.com/catalog/9780596003135/
 </pre>
 </dd>
@@ -8141,7 +8190,8 @@
 <dd><a name="perlbook-Learning-Perl-_0028the-_0022Llama-Book_0022_0029"></a>
 <pre class="verbatim">        by Randal L. Schwartz, Tom Phoenix, and brian d 
foy
         ISBN 978-1-4493-0358-7 [6th edition June 2011]
-        http://oreilly.com/catalog/0636920018452
+        ISBN 978-1-4493-0458-4 [ebook]
+        http://www.learning-perl.com/
 </pre>
 </dd>
 </dl>
@@ -8156,7 +8206,8 @@
 <pre class="verbatim">        by Randal L. Schwartz and brian d foy, with Tom 
Phoenix
                 foreword by Damian Conway
         ISBN 978-1-4493-9309-0 [2nd edition August 2012]
-        http://oreilly.com/catalog/0636920012689/
+        ISBN 978-1-4493-0459-1 [ebook]
+        http://www.intermediateperl.com/
 </pre>
 </dd>
 </dl>
@@ -8185,13 +8236,15 @@
 <dd><a name="perlbook-Perl-Debugger-Pocket-Reference"></a>
 <pre class="verbatim">        by Richard Foley
         ISBN 978-0-596-00503-0 [1st edition January 2004]
+        ISBN 978-0-596-55625-9 [ebook]
         http://oreilly.com/catalog/9780596005030/
 </pre>
 </dd>
 <dt><em>Regular Expression Pocket Reference</em></dt>
 <dd><a name="perlbook-Regular-Expression-Pocket-Reference"></a>
 <pre class="verbatim">        by Tony Stubblebine
-        ISBN 978-0-596-51427-3 [July 2007]
+        ISBN 978-0-596-51427-3 [2nd edition July 2007]
+        ISBN 978-0-596-55782-9 [ebook]
         http://oreilly.com/catalog/9780596514273/
 </pre>
 </dd>
@@ -8210,30 +8263,33 @@
 <dt><em>Beginning Perl</em></dt>
 <dd><a name="perlbook-Beginning-Perl"></a>
 <pre class="verbatim">        by James Lee
-        ISBN 1-59059-391-X [3rd edition April 2010]
+        ISBN 1-59059-391-X [3rd edition April 2010 &amp; ebook]
         http://www.apress.com/9781430227939
 </pre>
 </dd>
-<dt><em>Learning Perl</em></dt>
-<dd><a name="perlbook-Learning-Perl"></a>
+<dt><em>Learning Perl</em> (the &quot;Llama Book&quot;)</dt>
+<dd><a name="perlbook-Learning-Perl-_0028the-_0022Llama-Book_0022_0029-1"></a>
 <pre class="verbatim">        by Randal L. Schwartz, Tom Phoenix, and brian d 
foy
-        ISBN 978-0-596-52010-6 [5th edition June 2008]
-        http://oreilly.com/catalog/9780596520106
+        ISBN 978-1-4493-0358-7 [6th edition June 2011]
+        ISBN 978-1-4493-0458-4 [ebook]
+        http://www.learning-perl.com/
 </pre>
 </dd>
 <dt><em>Intermediate Perl</em> (the &quot;Alpaca Book&quot;)</dt>
 <dd><a 
name="perlbook-Intermediate-Perl-_0028the-_0022Alpaca-Book_0022_0029-1"></a>
 <pre class="verbatim">        by Randal L. Schwartz and brian d foy, with Tom 
Phoenix
                 foreword by Damian Conway
-        ISBN 0-596-10206-2 [1st edition March 2006]
-        http://oreilly.com/catalog/9780596102067
+        ISBN 978-1-4493-9309-0 [2nd edition August 2012]
+        ISBN 978-1-4493-0459-1 [ebook]
+        http://www.intermediateperl.com/
 </pre>
 </dd>
 <dt><em>Mastering Perl</em></dt>
 <dd><a name="perlbook-Mastering-Perl"></a>
 <pre class="verbatim">        by brian d foy
-        ISBN 978-0-596-10206-7 [1st edition July 2007]
-        http://www.oreilly.com/catalog/9780596527242
+        ISBN 978-1-4493-9311-3 [2nd edition January 2014]
+        ISBN 978-1-4493-6487-8 [ebook]
+        http://www.masteringperl.org/
 </pre>
 </dd>
 <dt><em>Effective Perl Programming</em></dt>
@@ -8258,29 +8314,31 @@
 <dt><em>Writing Perl Modules for CPAN</em></dt>
 <dd><a name="perlbook-Writing-Perl-Modules-for-CPAN"></a>
 <pre class="verbatim">        by Sam Tregar
-        ISBN 1-59059-018-X [1st edition August 2002]
+        ISBN 1-59059-018-X [1st edition August 2002 &amp; ebook]
         http://www.apress.com/9781590590188
 </pre>
 </dd>
 <dt><em>The Perl Cookbook</em></dt>
 <dd><a name="perlbook-The-Perl-Cookbook"></a>
-<pre class="verbatim">        by Tom Christiansen and Nathan Torkington
-            with foreword by Larry Wall
-        ISBN 1-56592-243-3 [2nd edition August 2003]
-        http://oreilly.com/catalog/9780596003135
+<pre class="verbatim">        by Tom Christiansen and Nathan Torkington,
+            with Foreword by Larry Wall
+        ISBN 978-0-596-00313-5 [2nd Edition August 2003]
+        ISBN 978-0-596-15888-0 [ebook]
+        http://oreilly.com/catalog/9780596003135/
 </pre>
 </dd>
 <dt><em>Automating System Administration with Perl</em></dt>
 <dd><a name="perlbook-Automating-System-Administration-with-Perl"></a>
 <pre class="verbatim">        by David N. Blank-Edelman
         ISBN 978-0-596-00639-6 [2nd edition May 2009]
+        ISBN 978-0-596-80251-6 [ebook]
         http://oreilly.com/catalog/9780596006396
 </pre>
 </dd>
 <dt><em>Real World SQL Server Administration with Perl</em></dt>
 <dd><a name="perlbook-Real-World-SQL-Server-Administration-with-Perl"></a>
 <pre class="verbatim">        by Linchi Shea
-        ISBN 1-59059-097-X [1st edition July 2003]
+        ISBN 1-59059-097-X [1st edition July 2003 &amp; ebook]
         http://www.apress.com/9781590590973
 </pre>
 </dd>
@@ -8299,28 +8357,32 @@
 <dt><em>Regular Expressions Cookbook</em></dt>
 <dd><a name="perlbook-Regular-Expressions-Cookbook"></a>
 <pre class="verbatim">        by Jan Goyvaerts and Steven Levithan
-        ISBN 978-0-596-52069-4 [May 2009]
-        http://oreilly.com/catalog/9780596520694
+        ISBN 978-1-4493-1943-4 [2nd edition August 2012]
+        ISBN 978-1-4493-2747-7 [ebook]
+        http://shop.oreilly.com/product/0636920023630.do
 </pre>
 </dd>
 <dt><em>Programming the Perl DBI</em></dt>
 <dd><a name="perlbook-Programming-the-Perl-DBI"></a>
 <pre class="verbatim">        by Tim Bunce and Alligator Descartes
         ISBN 978-1-56592-699-8 [February 2000]
+        ISBN 978-1-4493-8670-2 [ebook]
         http://oreilly.com/catalog/9781565926998
 </pre>
 </dd>
 <dt><em>Perl Best Practices</em></dt>
 <dd><a name="perlbook-Perl-Best-Practices"></a>
 <pre class="verbatim">        by Damian Conway
-        ISBN: 978-0-596-00173-5 [1st edition July 2005]
+        ISBN 978-0-596-00173-5 [1st edition July 2005]
+        ISBN 978-0-596-15900-9 [ebook]
         http://oreilly.com/catalog/9780596001735
 </pre>
 </dd>
 <dt><em>Higher-Order Perl</em></dt>
 <dd><a name="perlbook-Higher_002dOrder-Perl"></a>
 <pre class="verbatim">        by Mark-Jason Dominus
-        ISBN: 1-55860-701-3 [1st edition March 2005]
+        ISBN 1-55860-701-3 [1st edition March 2005]
+        free ebook http://hop.perl.plover.com/book/
         http://hop.perl.plover.com/
 </pre>
 </dd>
@@ -8328,6 +8390,7 @@
 <dd><a name="perlbook-Mastering-Regular-Expressions"></a>
 <pre class="verbatim">        by Jeffrey E. F. Friedl
         ISBN 978-0-596-52812-6 [3rd edition August 2006]
+        ISBN 978-0-596-55899-4 [ebook]
         http://oreilly.com/catalog/9780596528126
 </pre>
 </dd>
@@ -8342,6 +8405,7 @@
 <dd><a name="perlbook-Perl-Template-Toolkit"></a>
 <pre class="verbatim">        by Darren Chamberlain, Dave Cross, and Andy 
Wardley
         ISBN 978-0-596-00476-7 [December 2003]
+        ISBN 978-1-4493-8647-4 [ebook]
         http://oreilly.com/catalog/9780596004767
 </pre>
 </dd>
@@ -8349,14 +8413,14 @@
 <dd><a name="perlbook-Object-Oriented-Perl"></a>
 <pre class="verbatim">        by Damian Conway
             with foreword by Randal L. Schwartz
-        ISBN 1-884777-79-1 [1st edition August 1999]
+        ISBN 1-884777-79-1 [1st edition August 1999 &amp; ebook]
         http://www.manning.com/conway/
 </pre>
 </dd>
 <dt><em>Data Munging with Perl</em></dt>
 <dd><a name="perlbook-Data-Munging-with-Perl"></a>
 <pre class="verbatim">        by Dave Cross
-        ISBN 1-930110-00-6 [1st edition 2001]
+        ISBN 1-930110-00-6 [1st edition 2001 &amp; ebook]
         http://www.manning.com/cross
 </pre>
 </dd>
@@ -8364,20 +8428,21 @@
 <dd><a name="perlbook-Mastering-Perl_002fTk"></a>
 <pre class="verbatim">        by Steve Lidie and Nancy Walsh
         ISBN 978-1-56592-716-2 [1st edition January 2002]
+        ISBN 978-0-596-10344-6 [ebook]
         http://oreilly.com/catalog/9781565927162
 </pre>
 </dd>
 <dt><em>Extending and Embedding Perl</em></dt>
 <dd><a name="perlbook-Extending-and-Embedding-Perl"></a>
 <pre class="verbatim">        by Tim Jenness and Simon Cozens
-        ISBN 1-930110-82-0 [1st edition August 2002]
+        ISBN 1-930110-82-0 [1st edition August 2002 &amp; ebook]
         http://www.manning.com/jenness
 </pre>
 </dd>
 <dt><em>Pro Perl Debugging</em></dt>
 <dd><a name="perlbook-Pro-Perl-Debugging"></a>
 <pre class="verbatim">        by Richard Foley with Andy Lester
-        ISBN 1-59059-454-1 [1st edition July 2005]
+        ISBN 1-59059-454-1 [1st edition July 2005 &amp; ebook]
         http://www.apress.com/9781590594544
 </pre>
 </dd>
@@ -10788,7 +10853,7 @@
   , =&gt;            /a ASCII    /aa safe  {3,7}  repeat in range
   list ops        /l locale   /d  dual  |      alternation
   not             /u Unicode            []     character class
-  and             /e evaluate /ee rpts  \b     word boundary
+  and             /e evaluate /ee rpts  \b     boundary
   or xor          /g global             \z     string end
                   /o compile pat once   ()     capture
   DEBUG                                 (?:p)  no capture
@@ -11072,6 +11137,9 @@
  strcmp(s1, s2)                 strLE(s1, s2) / strEQ(s1, s2)
                                               / strGT(s1,s2)
  strncmp(s1, s2, n)             strnNE(s1, s2, n) / strnEQ(s1, s2, n)
+
+ memcmp(p1, p2, n)              memNE(p1, p2, n)
+ !memcmp(p1, p2, n)             memEQ(p1, p2, n)
 </pre>
 <p>Notice the different order of arguments to <code>Copy</code> and 
<code>Move</code> than used
 in <code>memcpy</code> and <code>memmove</code>.
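
The two `memcmp` rows added in this hunk map a raw three-way comparison onto boolean-style tests. A minimal standalone sketch of that idea follows. The `my_memEQ`/`my_memNE` macros and the `same_bytes` helper are hypothetical stand-ins written for this note; they are not Perl's own definitions, which live in Perl's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT Perl's own definitions. */
#define my_memEQ(p1, p2, n) (memcmp((p1), (p2), (n)) == 0)
#define my_memNE(p1, p2, n) (memcmp((p1), (p2), (n)) != 0)

/* True when the first n bytes of a and b are identical. */
static bool same_bytes(const void *a, const void *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return my_memEQ(a, b, n);
}
```

Naming the intent (equal / not equal) avoids the easy-to-misread `!memcmp(...)` idiom that the table lists on its left-hand side.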
@@ -11115,7 +11183,7 @@
 The only ones described here are those that directly correspond to C
 library functions that operate on 8-bit characters, but there are
 equivalents that operate on wide characters, and UTF-8 encoded strings.
-All are more fully described in <a 
href="perlapi.html#Character-classes">(perlapi)Character classes</a> and
+All are more fully described in <a 
href="perlapi.html#Character-classification">(perlapi)Character 
classification</a> and
 <a href="perlapi.html#Character-case-changing">(perlapi)Character case 
changing</a>.
 </p>
 <p>The C library routines listed in the table below return values based on
@@ -11168,14 +11236,30 @@
 <pre class="verbatim"> Instead Of:                 Use:
 
  atof(s)                     Atof(s)
- atol(s)                     Atol(s)
+ atoi(s)                     grok_atoUV(s, &amp;uv, &amp;e)
+ atol(s)                     grok_atoUV(s, &amp;uv, &amp;e)
  strtod(s, &amp;p)               Nothing.  Just don't use it.
- strtol(s, &amp;p, n)            Strtol(s, &amp;p, n)
- strtoul(s, &amp;p, n)           Strtoul(s, &amp;p, n)
+ strtol(s, &amp;p, n)            grok_atoUV(s, &amp;uv, &amp;e)
+ strtoul(s, &amp;p, n)           grok_atoUV(s, &amp;uv, &amp;e)
+</pre>
+<p>Typical use is to do range checks on <code>uv</code> before casting:
+</p>
+<pre class="verbatim">  int i; UV uv; char* end_ptr;
+  if (grok_atoUV(input, &amp;uv, &amp;end_ptr)
+      &amp;&amp; uv &lt;= INT_MAX) {
+    i = (int)uv;
+    ... /* continue parsing from end_ptr */
+  } else {
+    ... /* parse error: not a decimal integer in range 0 .. INT_MAX */
+  }
 </pre>
 <p>Notice also the <code>grok_bin</code>, <code>grok_hex</code>, and 
<code>grok_oct</code> functions in
 <samp>numeric.c</samp> for converting strings representing numbers in the 
respective
-bases into <code>NV</code>s.
+bases into <code>NV</code>s.  Note that grok_atoUV() doesn&rsquo;t handle 
negative inputs,
+or leading whitespace (being purposefully strict).
+</p>
+<p>Note that strtol() and strtoul() may be disguised as Strtol(), Strtoul(),
+Atol(), Atoul().  Avoid those, too.
 </p>
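
The strictness described in this hunk (no sign, no leading whitespace, range check before narrowing) can be sketched outside of Perl. The code below is an assumption-laden illustration: `strict_atou` and `parse_int` are hypothetical helpers, they use `unsigned long` rather than Perl's `UV`, and they model only the behaviour the text mentions, not the actual `grok_atoUV` implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, NOT Perl's grok_atoUV: parses an unsigned decimal
 * prefix of s strictly (no leading whitespace, no sign).  On success it
 * stores the value in *out, the first unparsed character in *end, and
 * returns true. */
static bool strict_atou(const char *s, unsigned long *out, const char **end)
{
    unsigned long val = 0;
    const char *p = s;

    assert(s != NULL && out != NULL && end != NULL);

    if (*p < '0' || *p > '9')
        return false;               /* rejects "", " 42", "-42", "+42" */

    for (; *p >= '0' && *p <= '9'; p++) {
        unsigned long digit = (unsigned long)(*p - '0');
        if (val > (ULONG_MAX - digit) / 10)
            return false;           /* would overflow unsigned long */
        val = val * 10 + digit;
    }
    *out = val;
    *end = p;
    return true;
}

/* Range-check the wide value before casting it down to int. */
static bool parse_int(const char *s, int *result, const char **end)
{
    unsigned long uv;
    if (strict_atou(s, &uv, end) && uv <= (unsigned long)INT_MAX) {
        *result = (int)uv;
        return true;
    }
    return false;
}
```

A caller that needs negative numbers or leading whitespace has to handle them itself before calling a parser of this shape, which is the trade-off the text calls "purposefully strict".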
 <p>In theory <code>Strtol</code> and <code>Strtoul</code> may not be defined 
if the machine perl is
 built on doesn&rsquo;t actually have strtol and strtoul. But as those 2
@@ -11187,10 +11271,10 @@
                                PL_srand_called = TRUE; }
 
  exit(n)                     my_exit(n)
- system(s)                   Don't. Look at pp_system or use my_popen
+ system(s)                   Don't. Look at pp_system or use my_popen.
 
  getenv(s)                   PerlEnv_getenv(s)
- setenv(s, val)              my_putenv(s, val)
+ setenv(s, val)              my_setenv(s, val)
 </pre>
 <hr>
 <a name="perlclib-Miscellaneous-functions"></a>
@@ -11780,23 +11864,42 @@
 by Perl.  Because they have special parsing rules, these generally can&rsquo;t 
be
 fully-qualified.  They come in four forms:
 </p>
-<dl compact="compact">
-<dt>A sigil, followed solely by digits matching \p{POSIX_Digit}, like 
<code>$0</code>, <code>$1</code>, or <code>$10000</code>.</dt>
-<dd><a 
name="perldata-A-sigil_002c-followed-solely-by-digits-matching-_005cp_007bPOSIX_005fDigit_007d_002c-like-_00240_002c-_00241_002c-or-_002410000_002e"></a>
-</dd>
-<dt>A sigil, followed by either a caret and a single POSIX uppercase letter, 
like <code>$^V</code> or <code>$^W</code>, or a sigil followed by a literal 
control character matching the <code>\p{POSIX_Cntrl}</code> property. Due to a 
historical oddity, if not running under <code>use utf8</code>, the 128 extra 
controls in the <code>[0x80-0xff]</code> range may also be used in length one 
variables.  The use of a literal control character is deprecated.  Support for 
this form will be removed in a future version of perl.</dt>
-<dd><a 
name="perldata-A-sigil_002c-followed-by-either-a-caret-and-a-single-POSIX-uppercase-letter_002c-like-_0024_005eV-or-_0024_005eW_002c-or-a-sigil-followed-by-a-literal-control-character-matching-the-_005cp_007bPOSIX_005fCntrl_007d-property_002e-Due-to-a-historical-oddity_002c-if-not-running-under-use-utf8_002c-the-128-extra-controls-in-the-_005b0x80_002d0xff_005d-range-may-also-be-used-in-length-one-variables_002e-The-use-of-a-literal-control-character-is-deprecated_002e-Support-for-this-form-will-be-removed-in-a-future-version-of-perl_002e"></a>
-</dd>
-<dt>Similar to the above, a sigil, followed by bareword text in brackets, 
where the first character is either a caret followed by an uppercase letter, or 
a literal control, like <code>${^GLOBAL_PHASE}</code> or 
<code>${\7LOBAL_PHASE}</code>.  The use of a literal control character is 
deprecated.  Support for this form will be removed in a future version of 
perl.</dt>
-<dd><a 
name="perldata-Similar-to-the-above_002c-a-sigil_002c-followed-by-bareword-text-in-brackets_002c-where-the-first-character-is-either-a-caret-followed-by-an-uppercase-letter_002c-or-a-literal-control_002c-like-_0024_007b_005eGLOBAL_005fPHASE_007d-or-_0024_007b_005c7LOBAL_005fPHASE_007d_002e-The-use-of-a-literal-control-character-is-deprecated_002e-Support-for-this-form-will-be-removed-in-a-future-version-of-perl_002e"></a>
-</dd>
-<dt>A sigil followed by a single character matching the 
<code>\p{POSIX_Punct}</code> property, like <code>$!</code> or 
<code>%+</code>.</dt>
-<dd><a 
name="perldata-A-sigil-followed-by-a-single-character-matching-the-_005cp_007bPOSIX_005fPunct_007d-property_002c-like-_0024_0021-or-_0025_002b_002e"></a>
-</dd>
-</dl>
+<ul>
+<li> A sigil, followed solely by digits matching <code>\p{POSIX_Digit}</code>, 
like
+<code>$0</code>, <code>$1</code>, or <code>$10000</code>.
+
+</li><li> A sigil, followed by either a caret and a single POSIX uppercase 
letter,
+like <code>$^V</code> or <code>$^W</code>, or a sigil followed by a literal 
non-space,
+non-<code>NUL</code> control character matching the 
<code>\p{POSIX_Cntrl}</code> property.
+Due to a historical oddity, if not running under <code>use utf8</code>, the 128
+characters in the <code>[0x80-0xff]</code> range are considered to be controls,
+and may also be used in length-one variables.  However, the use of
+non-graphical characters is deprecated as of v5.22, and support for them
+will be removed in a future version of perl.  ASCII space characters and
+<code>NUL</code> already aren&rsquo;t allowed, so this means that a 
single-character
+variable name with that name being any other C0 control 
<code>[0x01-0x1F]</code>,
+or <code>DEL</code> will generate a deprecation warning.  Already, under 
<code>&quot;use
+utf8&quot;</code>, non-ASCII characters must match <code>Perl_XIDS</code>.  As 
of v5.22, when
+not under <code>&quot;use utf8&quot;</code> C1 controls 
<code>[0x80-0x9F]</code>, NO BREAK SPACE, and
+SOFT HYPHEN (<code>SHY</code>) generate a deprecation warning.
+
+</li><li> Similar to the above, a sigil, followed by bareword text in brackets,
+where the first character is either a caret followed by an uppercase
+letter, like <code>${^GLOBAL_PHASE}</code> or a non-<code>NUL</code>, 
non-space literal
+control like <code>${\7LOBAL_PHASE}</code>.  Like the above, when not under
+<code>&quot;use utf8&quot;</code>, the characters in <code>[0x80-0xFF]</code> 
are considered controls, but as
+of v5.22, the use of any that are non-graphical is deprecated, and as
+of v5.20 the use of any ASCII-range literal control is deprecated.
+Support for these will be removed in a future version of perl.
+
+</li><li> A sigil followed by a single character matching the 
<code>\p{POSIX_Punct}</code>
+property, like <code>$!</code> or <code>%+</code>, except the character 
<code>&quot;{&quot;</code> doesn&rsquo;t work.
+
+</li></ul>
 
 <p>Note that as of Perl 5.20, literal control characters in variable names
-are deprecated.
+are deprecated; and as of Perl 5.22, any other non-graphic characters
+are also deprecated.
 </p>
 <hr>
 <a name="perldata-Context"></a>
@@ -11999,6 +12102,7 @@
     0xdead_beef         # more hex   
     0377                # octal (only numbers, begins with 0)
     0b011011            # binary
+    0x1.999ap-4         # hexadecimal floating point (the 'p' is required)
 </pre>
 <p>You are allowed to use underscores (underbars) in numeric literals
 between digits for legibility (but not multiple underscores in a row:
@@ -12020,6 +12124,17 @@
 representation.  The hex() and oct() functions make these conversions
 for you.  See <a href="#perlfunc-hex">perlfunc hex</a> and <a 
href="#perlfunc-oct">perlfunc oct</a> for more details.
 </p>
+<p>Hexadecimal floating point can start just like a hexadecimal literal,
+and it can be followed by an optional fractional hexadecimal part,
+but it must be followed by <code>p</code>, an optional sign, and a power of 
two.
+The format is useful for accurately presenting floating point values,
+avoiding conversions to or from decimal floating point, and therefore
+avoiding possible loss in precision.  Notice that while most current
+platforms use the 64-bit IEEE 754 floating point, not all do.  Another
+potential source of (low-order) differences is the floating point
+rounding modes, which can differ between CPUs, operating systems,
+and compilers, and which Perl doesn&rsquo;t control.
+</p>
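
To make the notation in this added paragraph concrete: C99 accepts the same literal form, so the arithmetic can be checked in C. The sketch below is illustrative only; `scale_pow2` and `hexfloat_value` are hypothetical helpers, and the code assumes the platform's `double` is a binary floating point type in which small powers of two are exact.

```c
#include <assert.h>

/* Multiply x by 2^e using only doublings and halvings, which are exact
 * in binary floating point as long as the result stays in normal range. */
static double scale_pow2(double x, int e)
{
    for (; e > 0; e--) x *= 2.0;
    for (; e < 0; e++) x /= 2.0;
    return x;
}

/* Hypothetical helper: value of a hex float whose hex digits (with the
 * point removed) are `digits`, with `frac_digits` of them after the
 * point, and a binary exponent of `exp2` following the 'p'.
 * Each hex digit after the point contributes 4 bits. */
static double hexfloat_value(unsigned long digits, int frac_digits, int exp2)
{
    assert(frac_digits >= 0);
    return scale_pow2((double)digits, exp2 - 4 * frac_digits);
}
```

Read this way, `0x1.999ap-4` is `0x1999a` divided by 2^20, roughly 0.1000004. That is close to, but not the same value as, the decimal literal `0.1`, which is the point of the paragraph: the hex form records exactly the bits that were written.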
 <p>You can also embed newlines directly in your strings, i.e., they can end
 on a different line than they begin.  This is nice, but if you forget
 your trailing quote, the error will not be reported until Perl finds
@@ -12066,24 +12181,64 @@
 equivalent to <code>$version{2}++</code>, not to 
<code>$version{'2.0'}++</code>.
 </p>
 <table class="menu" border="0" cellspacing="0">
-<tr><td align="left" valign="top">&bull; <a href="#perldata-Version-Strings" 
accesskey="1">perldata Version Strings</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perldata-Special-floating-point_003a-infinity-_0028Inf_0029-and-not_002da_002dnumber-_0028NaN_0029"
 accesskey="1">perldata Special floating point: infinity (Inf) and not-a-number 
(NaN)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a href="#perldata-Special-Literals" 
accesskey="2">perldata Special Literals</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a href="#perldata-Version-Strings" 
accesskey="2">perldata Version Strings</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a href="#perldata-Barewords" 
accesskey="3">perldata Barewords</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+<tr><td align="left" valign="top">&bull; <a href="#perldata-Special-Literals" 
accesskey="3">perldata Special Literals</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perldata-Array-Interpolation" accesskey="4">perldata Array 
Interpolation</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a href="#perldata-Barewords" 
accesskey="4">perldata Barewords</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perldata-Array-Interpolation" accesskey="5">perldata Array 
Interpolation</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 </table>
 
 <hr>
+<a 
name="perldata-Special-floating-point_003a-infinity-_0028Inf_0029-and-not_002da_002dnumber-_0028NaN_0029"></a>
+<div class="header">
+<p>
+Next: <a href="#perldata-Version-Strings" accesskey="n" rel="next">perldata 
Version Strings</a>, Up: <a href="#perldata-Scalar-value-constructors" 
accesskey="u" rel="up">perldata Scalar value constructors</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+</div>
+<a 
name="Special-floating-point_003a-infinity-_0028Inf_0029-and-not_002da_002dnumber-_0028NaN_0029"></a>
+<h4 class="subsubsection">11.2.5.1 Special floating point: infinity (Inf) and 
not-a-number (NaN)</h4>
+
+<p>Floating point values include the special values <code>Inf</code> and <code>NaN</code>,
+for infinity and not-a-number.  Infinity can also be negative.
+</p>
+<p>Infinity is the result of certain math operations that overflow
+the floating point range, like 9**9**9.  Not-a-number is what you get
+when the result of an operation is undefined or unrepresentable.  Note, though,
+that you cannot get <code>NaN</code> from some common &quot;undefined&quot; or
+&quot;out-of-range&quot; operations like dividing by zero or taking the square root of
+a negative number, since Perl generates fatal errors for those.
+</p>
+<p>Infinity and not-a-number have their own special arithmetic rules.
+The general rule is that they are &quot;contagious&quot;: <code>Inf</code> plus one is
+<code>Inf</code>, and <code>NaN</code> plus one is <code>NaN</code>.  Things get interesting
+when you combine infinities and not-a-numbers: <code>Inf</code> minus <code>Inf</code>
+and <code>Inf</code> divided by <code>Inf</code> are <code>NaN</code> (while <code>Inf</code> plus <code>Inf</code> is
+<code>Inf</code> and <code>Inf</code> times <code>Inf</code> is <code>Inf</code>).  <code>NaN</code> is also curious
+in that it does not equal any number, <em>including</em> itself:
+<code>NaN</code> != <code>NaN</code>.
+</p>
+<p>Perl doesn&rsquo;t understand <code>Inf</code> and <code>NaN</code> as 
numeric literals, but
+you can have them as strings, and Perl will convert them as needed:
+&quot;Inf&quot; + 1.  (You can, however, import them from the POSIX extension;
+<code>use POSIX qw(Inf NaN);</code> and then use them as literals.)
+</p>
+<p>Note that on input (string to number) Perl accepts <code>Inf</code> and <code>NaN</code>
+in many forms.  Case is ignored, and the Win32-specific forms like
+<code>1.#INF</code> are understood, but on output the values are normalized to
+<code>Inf</code> and <code>NaN</code>.
+</p>
+<hr>
 <a name="perldata-Version-Strings"></a>
 <div class="header">
 <p>
-Next: <a href="#perldata-Special-Literals" accesskey="n" rel="next">perldata 
Special Literals</a>, Up: <a href="#perldata-Scalar-value-constructors" 
accesskey="u" rel="up">perldata Scalar value constructors</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perldata-Special-Literals" accesskey="n" rel="next">perldata 
Special Literals</a>, Previous: <a 
href="#perldata-Special-floating-point_003a-infinity-_0028Inf_0029-and-not_002da_002dnumber-_0028NaN_0029"
 accesskey="p" rel="prev">perldata Special floating point: infinity (Inf) and 
not-a-number (NaN)</a>, Up: <a href="#perldata-Scalar-value-constructors" 
accesskey="u" rel="up">perldata Scalar value constructors</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Version-Strings"></a>
-<h4 class="subsubsection">11.2.5.1 Version Strings</h4>
+<h4 class="subsubsection">11.2.5.2 Version Strings</h4>
 
 <p>A literal of the form <code>v1.20.300.4000</code> is parsed as a string 
composed
 of characters with the specified ordinals.  This form, known as
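Annotation on the hunk above: a short Perl sketch of the Inf/NaN arithmetic rules from the new "Special floating point" subsection. It assumes a platform with IEEE 754 style infinities and NaNs; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Overflowing the floating point range yields infinity.
my $inf = 9**9**9;
my $nan = $inf - $inf;          # Inf minus Inf is NaN

my $contagious   = ($inf + 1 == $inf);      # Inf plus one is still Inf
my $neg_inf_ok   = (-$inf < 0);             # infinity can be negative
my $nan_unequal  = ($nan != $nan);          # NaN never equals itself
my $ratio_is_nan = (($inf / $inf) != ($inf / $inf));  # Inf/Inf is NaN too

print "Inf: $inf, NaN: $nan\n";
```

Testing a value against itself with `!=` is the usual portable way to detect NaN, since no other number has that property.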
@@ -12118,7 +12273,7 @@
 Next: <a href="#perldata-Barewords" accesskey="n" rel="next">perldata 
Barewords</a>, Previous: <a href="#perldata-Version-Strings" accesskey="p" 
rel="prev">perldata Version Strings</a>, Up: <a 
href="#perldata-Scalar-value-constructors" accesskey="u" rel="up">perldata 
Scalar value constructors</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Special-Literals"></a>
-<h4 class="subsubsection">11.2.5.2 Special Literals</h4>
+<h4 class="subsubsection">11.2.5.3 Special Literals</h4>
 
 <p>The special literals __FILE__, __LINE__, and __PACKAGE__
 represent the current filename, line number, and package name at that
@@ -12159,7 +12314,7 @@
 Next: <a href="#perldata-Array-Interpolation" accesskey="n" 
rel="next">perldata Array Interpolation</a>, Previous: <a 
href="#perldata-Special-Literals" accesskey="p" rel="prev">perldata Special 
Literals</a>, Up: <a href="#perldata-Scalar-value-constructors" accesskey="u" 
rel="up">perldata Scalar value constructors</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Barewords"></a>
-<h4 class="subsubsection">11.2.5.3 Barewords</h4>
+<h4 class="subsubsection">11.2.5.4 Barewords</h4>
 
 <p>A word that has no other interpretation in the grammar will
 be treated as if it were a quoted string.  These are known as
@@ -12187,7 +12342,7 @@
 Previous: <a href="#perldata-Barewords" accesskey="p" rel="prev">perldata 
Barewords</a>, Up: <a href="#perldata-Scalar-value-constructors" accesskey="u" 
rel="up">perldata Scalar value constructors</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Array-Interpolation"></a>
-<h4 class="subsubsection">11.2.5.4 Array Interpolation</h4>
+<h4 class="subsubsection">11.2.5.5 Array Interpolation</h4>
 
 <p>Arrays and slices are interpolated into double-quoted strings
 by joining the elements with the delimiter specified in the 
<code>$&quot;</code>
@@ -12319,6 +12474,10 @@
 </p>
 <pre class="verbatim">    ($dev, $ino, undef, undef, $uid, $gid) = stat($file);
 </pre>
+<p>As of Perl 5.22, you can also use <code>(undef)x2</code> instead of <code>undef, undef</code>.
+(You can also do <code>($x) x 2</code>, which is less useful, because it assigns to
+the same variable twice, clobbering the first value assigned.)
+</p>
 <p>List assignment in scalar context returns the number of elements
 produced by the expression on the right side of the assignment:
 </p>
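Annotation on the hunk above: a hedged sketch of the `(undef) x 2` placeholder form on the left-hand side of a list assignment, assuming perl 5.22 or later. The `@fields` array is a hypothetical stand-in for the list returned by `stat($file)`, so that the example is deterministic.

```perl
use strict;
use warnings;

# Stand-in for a six-element list such as the first fields of stat().
my @fields = (10, 20, 30, 40, 50, 60);

# (undef) x 2 on the left-hand side skips two values (perl 5.22+).
my ($dev, $ino, $uid, $gid);
($dev, $ino, (undef) x 2, $uid, $gid) = @fields;

# List assignment in scalar context counts the right-hand elements.
my $count = () = @fields;

print "$dev $ino $uid $gid ($count fields)\n";
```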
@@ -12534,26 +12693,21 @@
         s/(\w+)/\u\L$1/g;   # &quot;titlecase&quot; words
     }
 </pre>
-<p>A slice of an empty list is still an empty list.  Thus:
+<p>As a special exception, when you slice a list (but not an array or a hash),
+if the list evaluates to empty, then taking a slice of that empty list will
+always yield the empty list in turn.  Thus:
 </p>
-<pre class="verbatim">    @a = ()[1,0];           # @a has no elements
+<pre class="verbatim">    @a = ()[0,1];          # @a has no elements
     @b = (@a)[0,1];         # @b has no elements
-</pre>
-<p>But:
-</p>
-<pre class="verbatim">    @a = (1)[1,0];          # @a has two elements
-    @b = (1,undef)[1,0,2];  # @b has three elements
-</pre>
-<p>More generally, a slice yields the empty list if it indexes only
-beyond the end of a list:
-</p>
-<pre class="verbatim">    @a = (1)[  1,2];        # @a has no elements
-    @b = (1)[0,1,2];        # @b has three elements
+    @c = (sub{}-&gt;())[0,1]; # @c has no elements
+    @d = ('a','b')[0,1];   # @d has two elements
+    @e = (@d)[0,1,8,9];    # @e has four elements
+    @f = (@d)[8,9];        # @f has two elements
 </pre>
 <p>This makes it easy to write loops that terminate when a null list
 is returned:
 </p>
-<pre class="verbatim">    while ( ($home, $user) = (getpwent)[7,0]) {
+<pre class="verbatim">    while ( ($home, $user) = (getpwent)[7,0] ) {
         printf &quot;%-8s %s\n&quot;, $user, $home;
     }
 </pre>
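Annotation on the hunk above: a small Perl sketch that counts the elements produced by each kind of list slice described in the revised text. The counts reflect the behaviour documented for perl 5.22; older perls returned an empty list for a slice that indexed only past the end, so treat the version as an assumption.

```perl
use strict;
use warnings;

# Slicing an empty list always gives the empty list.
my @a = ()[0,1];
my @b = (@a)[0,1];

# Slicing a non-empty list gives one element per index; indexes past
# the end produce undef (behaviour as documented for perl 5.22+).
my @d = ('a', 'b')[0,1];
my @e = (@d)[0,1,8,9];
my @f = (@d)[8,9];

printf "a=%d b=%d d=%d e=%d f=%d\n",
    scalar(@a), scalar(@b), scalar(@d), scalar(@e), scalar(@f);
```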
@@ -13604,7 +13758,7 @@
 <dt><code>anchored(TYPE)</code></dt>
 <dd><a name="perldebguts-anchored_0028TYPE_0029"></a>
 <p>If the pattern may match only at a handful of places, with <code>TYPE</code>
-being <code>BOL</code>, <code>MBOL</code>, or <code>GPOS</code>.  See the 
table below.
+being <code>SBOL</code>, <code>MBOL</code>, or <code>GPOS</code>.  See the 
table below.
 </p>
 </dd>
 </dl>
@@ -13640,39 +13794,45 @@
  END             no         End of program.
  SUCCEED         no         Return from a subroutine, basically.
 
- # Anchors:
+ # Line Start Anchors:
+ SBOL            no         Match &quot;&quot; at beginning of line: /^/, /\A/
+ MBOL            no         Same, assuming multiline: /^/m
+
+ # Line End Anchors:
+ SEOL            no         Match &quot;&quot; at end of line: /$/
+ MEOL            no         Same, assuming multiline: /$/m
+ EOS             no         Match &quot;&quot; at end of string: /\z/
 
- BOL             no         Match &quot;&quot; at beginning of line.
- MBOL            no         Same, assuming multiline.
- SBOL            no         Same, assuming singleline.
- EOS             no         Match &quot;&quot; at end of string.
- EOL             no         Match &quot;&quot; at end of line.
- MEOL            no         Same, assuming multiline.
- SEOL            no         Same, assuming singleline.
- BOUND           no         Match &quot;&quot; at any word boundary using 
native
-                            charset rules for non-utf8
- BOUNDL          no         Match &quot;&quot; at any locale word boundary
- BOUNDU          no         Match &quot;&quot; at any word boundary using 
Unicode
-                            rules
- BOUNDA          no         Match &quot;&quot; at any word boundary using ASCII
-                            rules
- NBOUND          no         Match &quot;&quot; at any word non-boundary using
-                            native charset rules for non-utf8
- NBOUNDL         no         Match &quot;&quot; at any locale word non-boundary
- NBOUNDU         no         Match &quot;&quot; at any word non-boundary using
-                            Unicode rules
- NBOUNDA         no         Match &quot;&quot; at any word non-boundary using
-                            ASCII rules
+ # Match Start Anchors:
  GPOS            no         Matches where last m//g left off.
 
- # [Special] alternatives:
+ # Word Boundary Opcodes:
+ BOUND           no         Like BOUNDA for non-utf8, otherwise match 
&quot;&quot;
+                            between any Unicode \w\W or \W\w
+ BOUNDL          no         Like BOUND/BOUNDU, but \w and \W are defined
+                            by current locale
+ BOUNDU          no         Match &quot;&quot; at any boundary of a given type
+                            using Unicode rules
+ BOUNDA          no         Match &quot;&quot; at any boundary between \w\W or
+                            \W\w, where \w is [_a-zA-Z0-9]
+ NBOUND          no         Like NBOUNDA for non-utf8, otherwise match
+                            &quot;&quot; between any Unicode \w\w or \W\W
+ NBOUNDL         no         Like NBOUND/NBOUNDU, but \w and \W are
+                            defined by current locale
+ NBOUNDU         no         Match &quot;&quot; at any non-boundary of a given type
+                            using Unicode rules
+ NBOUNDA         no         Match &quot;&quot; between any \w\w or \W\W, where \w
+                            is [_a-zA-Z0-9]
 
+ # [Special] alternatives:
  REG_ANY         no         Match any one character (except newline).
  SANY            no         Match any one character.
  CANY            no         Match any one byte.
- ANYOF           sv         Match character in (or not in) this class,
+ ANYOF           sv 1       Match character in (or not in) this class,
                             single char match only
+ ANYOFL          sv 1       Like ANYOF, but /l is in effect
 
+ # POSIX Character Classes:
  POSIXD          none       Some [[:class:]] under /d; the FLAGS field
                             gives which one
  POSIXL          none       Some [[:class:]] under /l; the FLAGS field
@@ -13701,16 +13861,10 @@
  #
  BRANCH          node       Match this alternative, or the next...
 
- # Back pointer
-
- # BACK          Normal &quot;next&quot; pointers all implicitly point forward;
- #               BACK exists to make loop structures possible.
- # not used
- BACK            no         Match &quot;&quot;, &quot;next&quot; ptr points 
backward.
-
  # Literals
 
  EXACT           str        Match this string (preceded by length).
+ EXACTL          str        Like EXACT, but /l is in effect.
  EXACTF          str        Match this non-UTF-8 string (not guaranteed
                             to be folded) using /id rules (w/len).
  EXACTFL         str        Match this string (not guaranteed to be
@@ -13720,9 +13874,13 @@
                             UTF-8) using /iu rules (w/len).
  EXACTFA         str        Match this string (not guaranteed to be
                             folded) using /iaa rules (w/len).
+
  EXACTFU_SS      str        Match this string (folded iff in UTF-8,
                             length in folding may change even if not in
                             UTF-8) using /iu rules (w/len).
+ EXACTFLU8       str        Rare circumstances: like EXACTFU, but is
+                            under /l, UTF-8, folded, and everything in
+                            it is above 255.
  EXACTFA_NO_TRIE str        Match this string (which is not trie-able;
                             not guaranteed to be folded) using /iaa
                             rules (w/len).
@@ -13737,7 +13895,7 @@
  # Loops
 
  # STAR,PLUS    '?', and complex '*' and '+', are implemented as
- #               circular BRANCH structures using BACK.  Simple cases
+ #               circular BRANCH structures.  Simple cases
  #               (one character per match) are implemented with STAR
  #               and PLUS for speed and to minimize recursive plunges.
  #
@@ -13781,20 +13939,21 @@
                             unicode rules for non-utf8, no mixing ASCII,
                             non-ASCII
 
+ # Support for long RE
+ LONGJMP         off 1 1    Jump far away.
+ BRANCHJ         off 1 1    BRANCH with long offset.
+
+ # Special Case Regops
  IFMATCH         off 1 2    Succeeds if the following matches.
  UNLESSM         off 1 2    Fails if the following matches.
  SUSPEND         off 1 1    &quot;Independent&quot; sub-RE.
  IFTHEN          off 1 1    Switch, should be preceded by switcher.
  GROUPP          num 1      Whether the group matched.
 
- # Support for long RE
-
- LONGJMP         off 1 1    Jump far away.
- BRANCHJ         off 1 1    BRANCH with long offset.
-
  # The heavy worker
 
- EVAL            evl 1      Execute some Perl code.
+ EVAL            evl/flags  Execute some Perl code.
+                 2L
 
  # Modifiers
 
@@ -16444,6 +16603,20 @@
 &lsquo;perlfunc accept&rsquo;.
 </p>
 </dd>
+<dt>Aliasing via reference is experimental</dt>
+<dd><a name="perldiag-Aliasing-via-reference-is-experimental"></a>
+<p>(S experimental::refaliasing) This warning is emitted if you use
+a reference constructor on the left-hand side of an assignment to
+alias one variable to another.  Simply suppress the warning if you
+want to use the feature, but know that in doing so you are taking
+the risk of using an experimental feature which may change or be
+removed in a future Perl version:
+</p>
+<pre class="verbatim">    no warnings &quot;experimental::refaliasing&quot;;
+    use feature &quot;refaliasing&quot;;
+    \$x = \$y;
+</pre>
+</dd>
 <dt>Allocation too large: %x</dt>
 <dd><a name="perldiag-Allocation-too-large_003a-_0025x"></a>
 <p>(X) You can&rsquo;t allocate more than 64K on an MS-DOS machine.
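Annotation on the hunk above: a minimal sketch of the experimental aliasing-via-reference feature that the new diagnostic refers to, assuming perl 5.22 or later. The variable names are illustrative only, and the feature may change or be removed as the diagnostic warns.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my $y = 1;
my $x;
\$x = \$y;      # $x is now another name for $y

$x = 42;        # writing through the alias changes $y

my @src = (1, 2, 3);
\my @alias = \@src;
push @alias, 4; # @src now has four elements

print "y=$y src=@src\n";
```

Both sides of the assignment must be references of the same type, which is what the "Assigned value is not %s reference" diagnostic enforces.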
@@ -16584,6 +16757,11 @@
 that expected a numeric value instead.  If you&rsquo;re fortunate the message
 will identify which operator was so unfortunate.
 </p>
+<p>Note that for the <code>Inf</code> and <code>NaN</code> (infinity and 
not-a-number) the
+definition of &quot;numeric&quot; is somewhat unusual: the strings themselves
+(like &quot;Inf&quot;) are considered numeric, and anything following them is
+considered non-numeric.
+</p>
 </dd>
 <dt>Argument list not closed for PerlIO layer &quot;%s&quot;</dt>
 <dd><a 
name="perldiag-Argument-list-not-closed-for-PerlIO-layer-_0022_0025s_0022"></a>
@@ -16596,19 +16774,11 @@
 result of the value of the environment variable PERLIO.
 </p>
 </dd>
-<dt>Array @%s missing the @ in argument %d of %s()</dt>
-<dd><a 
name="perldiag-Array-_0040_0025s-missing-the-_0040-in-argument-_0025d-of-_0025s_0028_0029"></a>
-<p>(D deprecated) Really old Perl let you omit the @ on array names in some
-spots.  This is now heavily deprecated.
-</p>
-</dd>
-<dt>A sequence of multiple spaces in a charnames alias definition is 
deprecated</dt>
-<dd><a 
name="perldiag-A-sequence-of-multiple-spaces-in-a-charnames-alias-definition-is-deprecated"></a>
-<p>(D deprecated) You defined a character name which had multiple space
-characters in a row.  Change them to single spaces.  Usually these
-names are defined in the <code>:alias</code> import argument to <code>use 
charnames</code>, but
-they could be defined by a translator installed into 
<code>$^H{charnames}</code>.
-See <a href="charnames.html#CUSTOM-ALIASES">(charnames)CUSTOM ALIASES</a>.
+<dt>Argument &quot;%s&quot; treated as 0 in increment (++)</dt>
+<dd><a 
name="perldiag-Argument-_0022_0025s_0022-treated-as-0-in-increment-_0028_002b_002b_0029"></a>
+<p>(W numeric) The indicated string was fed as an argument to the 
<code>++</code>
+operator which expects either a number or a string matching
+<code>/^[a-zA-Z]*[0-9]*\z/</code>.  See <a 
href="#perlop-Auto_002dincrement-and-Auto_002ddecrement">perlop Auto-increment 
and Auto-decrement</a> for details.
 </p>
 </dd>
 <dt>assertion botched: %s</dt>
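Annotation on the hunk above: a sketch of the string auto-increment rule behind the new "treated as 0 in increment (++)" warning. The sample strings are made up; the warning itself is only expected on perl 5.22 or later, so the code captures warnings rather than relying on them.

```perl
use strict;
use warnings;

# Strings matching /^[a-zA-Z]*[0-9]*\z/ get the "magic" string increment.
my $s1 = "az";  $s1++;    # "ba"
my $s2 = "Zz";  $s2++;    # "AAa"
my $s3 = "a9";  $s3++;    # "b0"

# Anything else is treated as a number. "ab-1" numifies to 0, so the
# result is 1, and perl 5.22+ emits the warning documented above.
my @warnings;
my $s4 = "ab-1";
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $s4++;
}

print "$s1 $s2 $s3 $s4\n";
```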
@@ -16621,6 +16791,25 @@
 <p>(X) A general assertion failed.  The file in question must be examined.
 </p>
 </dd>
+<dt>Assigned value is not a reference</dt>
+<dd><a name="perldiag-Assigned-value-is-not-a-reference"></a>
+<p>(F) You tried to assign something that was not a reference to an lvalue
+reference (e.g., <code>\$x = $y</code>).  If you meant to make $x an alias to 
$y, use
+<code>\$x = \$y</code>.
+</p>
+</dd>
+<dt>Assigned value is not %s reference</dt>
+<dd><a name="perldiag-Assigned-value-is-not-_0025s-reference"></a>
+<p>(F) You tried to assign a reference to a reference constructor, but the
+two references were not of the same type.  You cannot alias a scalar to
+an array, or an array to a hash; the two types must match.
+</p>
+<pre class="verbatim">    \$x = \@y;  # error
+    \@x = \%y;  # error
+     $y = [];
+    \$x = $y;   # error; did you mean \$y?
+</pre>
+</dd>
 <dt>Assigning non-zero to $[ is no longer possible</dt>
 <dd><a 
name="perldiag-Assigning-non_002dzero-to-_0024_005b-is-no-longer-possible"></a>
 <p>(F) When the &quot;array_base&quot; feature is disabled (e.g., under 
<code>use v5.16;</code>)
@@ -16634,6 +16823,12 @@
 know which context to supply to the right side.
 </p>
 </dd>
+<dt>&lt;&gt; at require-statement should be quotes</dt>
+<dd><a 
name="perldiag-_003c_003e-at-require_002dstatement-should-be-quotes"></a>
+<p>(F) You wrote <code>require &lt;file&gt;</code> when you should have written
+<code>require 'file'</code>.
+</p>
+</dd>
 <dt>Attempt to access disallowed key &rsquo;%s&rsquo; in a restricted hash</dt>
 <dd><a 
name="perldiag-Attempt-to-access-disallowed-key-_0027_0025s_0027-in-a-restricted-hash"></a>
 <p>(F) The failing code has attempted to get or set a key which is not in
@@ -16969,18 +17164,6 @@
 Check your control flow and number of arguments.
 </p>
 </dd>
-<dt>&quot;\b{&quot; is deprecated; use &quot;\b\{&quot; or &quot;\b[{]&quot; 
instead in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
-<dd><a 
name="perldiag-_0022_005cb_007b_0022-is-deprecated_003b-use-_0022_005cb_005c_007b_0022-or-_0022_005cb_005b_007b_005d_0022-instead-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
-</dd>
-<dt>&quot;\B{&quot; is deprecated; use &quot;\B\{&quot; or &quot;\B[{]&quot; 
instead in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
-<dd><a 
name="perldiag-_0022_005cB_007b_0022-is-deprecated_003b-use-_0022_005cB_005c_007b_0022-or-_0022_005cB_005b_007b_005d_0022-instead-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
-<p>(D deprecated) Use of an unescaped &quot;{&quot; immediately following
-a <code>\b</code> or <code>\B</code> is now deprecated so as to reserve its 
use for Perl
-itself in a future release.  You can either precede the brace
-with a backslash, or enclose it in square brackets; the latter
-is the way to go if the pattern delimiters are <code>{}</code>.
-</p>
-</dd>
 <dt>Bit vector size &gt; 32 non-portable</dt>
 <dd><a name="perldiag-Bit-vector-size-_003e-32-non_002dportable"></a>
 <p>(W portable) Using bit vector sizes larger than 32 is non-portable.
@@ -16998,6 +17181,22 @@
 encountered an invalid data type.
 </p>
 </dd>
+<dt>Both or neither range ends should be Unicode in regex; marked by 
&lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a 
name="perldiag-Both-or-neither-range-ends-should-be-Unicode-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(W regexp) (only under <code>use&nbsp;re&nbsp;'strict'<!-- /@w --></code> 
or within <code>(?[...])</code>)
+</p>
+<p>In a bracketed character class in a regular expression pattern, you
+had a range with exactly one end specified using <code>\N{}</code> and
+the other end specified using a non-portable mechanism.  Perl treats
+the range as a Unicode range; that is, all the characters in it are
+considered to be Unicode characters, which may have different code
+points on some platforms Perl runs on.  For example, <code>[\N{U+06}-\x08]</code>
+is treated as if you had instead said <code>[\N{U+06}-\N{U+08}]</code>; that is, it
+matches the characters whose code points in Unicode are 6, 7, and 8.
+But that <code>\x08</code> might indicate that you meant something different, so
+the warning gets raised.
+</p>
+</dd>
 <dt>Buffer overflow in prime_env_iter: %s</dt>
 <dd><a 
name="perldiag-Buffer-overflow-in-prime_005fenv_005fiter_003a-_0025s"></a>
 <p>(W internal) A warning peculiar to VMS.  While Perl was preparing to
@@ -17029,11 +17228,22 @@
 the function&rsquo;s name in <a href="POSIX.html#Top">(POSIX)</a> for details.
 </p>
 </dd>
+<dt>Cannot chr %f</dt>
+<dd><a name="perldiag-Cannot-chr-_0025f"></a>
+<p>(F) You passed an invalid number (like an infinity or not-a-number) to 
<code>chr</code>.
+</p>
+</dd>
+<dt>Cannot compress %f in pack</dt>
+<dd><a name="perldiag-Cannot-compress-_0025f-in-pack"></a>
+<p>(F) You tried compressing an infinity or not-a-number as an unsigned
+integer with BER, which makes no sense.
+</p>
+</dd>
 <dt>Cannot compress integer in pack</dt>
 <dd><a name="perldiag-Cannot-compress-integer-in-pack"></a>
-<p>(F) An argument to pack(&quot;w&quot;,...) was too large to compress.  The 
BER
-compressed integer format can only be used with positive integers, and you
-attempted to compress Infinity or a very large number (&gt; 1e308).
+<p>(F) An argument to pack(&quot;w&quot;,...) was too large to compress.
+The BER compressed integer format can only be used with positive
+integers, and you attempted to compress a very large number (&gt; 1e308).
 See &lsquo;perlfunc pack&rsquo;.
 </p>
 </dd>
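Annotation on the hunk above: a hedged sketch showing how the new fatal Inf/NaN diagnostics can be trapped with `eval`. It assumes perl 5.22 or later; the exact message text after "Cannot chr" and "Cannot printf" is not relied on.

```perl
use strict;
use warnings;

my $inf = 9**9**9;

# chr() of an infinity is a fatal error; eval turns it into $@.
my $chr_ok  = eval { my $c = chr($inf); 1 };
my $chr_err = $chr_ok ? '' : $@;

# Formatting an infinity with %c is likewise fatal.
my $fmt_ok  = eval { my $s = sprintf("%c", $inf); 1 };
my $fmt_err = $fmt_ok ? '' : $@;

# Stringifying is fine and is what the diagnostic suggests instead.
my $text = sprintf("%s", $inf);

print "chr: $chr_err", "printf: $fmt_err", "as string: $text\n";
```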
@@ -17063,6 +17273,18 @@
 either with open() or binmode().
 </p>
 </dd>
+<dt>Cannot pack %f with &rsquo;%c&rsquo;</dt>
+<dd><a name="perldiag-Cannot-pack-_0025f-with-_0027_0025c_0027"></a>
+<p>(F) You tried converting an infinity or not-a-number to an integer,
+which makes no sense.
+</p>
+</dd>
+<dt>Cannot printf %f with &rsquo;%c&rsquo;</dt>
+<dd><a name="perldiag-Cannot-printf-_0025f-with-_0027_0025c_0027"></a>
+<p>(F) You tried printing an infinity or not-a-number as a character (%c),
+which makes no sense.  Maybe you meant &rsquo;%s&rsquo;, or just stringifying 
it?
+</p>
+</dd>
 <dt>Cannot set tied @DB::args</dt>
 <dd><a name="perldiag-Cannot-set-tied-_0040DB_003a_003aargs"></a>
 <p>(F) <code>caller</code> tried to set <code>@DB::args</code>, but found it 
tied.  Tying <code>@DB::args</code>
@@ -17224,6 +17446,33 @@
 inplace editing with the <strong>-i</strong> switch.  The file was ignored.
 </p>
 </dd>
+<dt>Can&rsquo;t do %s(&quot;%s&quot;) on non-UTF-8 locale; resolved to 
&quot;%s&quot;.</dt>
+<dd><a 
name="perldiag-Can_0027t-do-_0025s_0028_0022_0025s_0022_0029-on-non_002dUTF_002d8-locale_003b-resolved-to-_0022_0025s_0022_002e"></a>
+<p>(W locale) All of the following hold: 1) you are running under &quot;<code>use locale</code>&quot;; 2) the current
+locale is not a UTF-8 one; 3) you tried to do the designated case-change
+operation on the specified Unicode character; and 4) the result of this
+operation would mix Unicode and locale rules, which likely conflict.
+Mixing of different rule types is forbidden, so the operation was not
+done; instead the result is the indicated value, which is the best
+available that uses entirely Unicode rules.  That turns out to almost
+always be the original character, unchanged.
+</p>
+<p>It is generally a bad idea to mix non-UTF-8 locales and Unicode, and
+this issue is one of the reasons why.  This warning is raised when
+Unicode rules would normally cause the result of this operation to
+contain a character that is in the range specified by the locale,
+0..255, and hence is subject to the locale&rsquo;s rules, not Unicode&rsquo;s.
+</p>
+<p>If you are using locale purely for its characteristics related to things
+like its numeric and time formatting (and not <code>LC_CTYPE</code>), consider
+using a restricted form of the locale pragma (see <a 
href="#perllocale-The-_0022use-locale_0022-pragma">perllocale The 
<code>&quot;use locale&quot;</code> pragma</a>) like 
&quot;<code>use&nbsp;locale&nbsp;<span 
class="nolinebreak">':not_characters'</span></code><!-- /@w -->&quot;.
+</p>
+<p>Note that failed case-changing operations done as a result of
+case-insensitive <code>/i</code> regular expression matching will show up in this
+warning as having the <code>fc</code> operation (as that is what the regular
+expression engine calls behind the scenes).
+</p>
+</dd>
 <dt>Can&rsquo;t do waitpid with flags</dt>
 <dd><a name="perldiag-Can_0027t-do-waitpid-with-flags"></a>
 <p>(F) This machine doesn&rsquo;t have either waitpid() or wait4(), so only
@@ -17554,6 +17803,30 @@
 such.  See <a href="#perlsub-Lvalue-subroutines">perlsub Lvalue 
subroutines</a>.
 </p>
 </dd>
+<dt>Can&rsquo;t modify reference to %s in %s assignment</dt>
+<dd><a 
name="perldiag-Can_0027t-modify-reference-to-_0025s-in-_0025s-assignment"></a>
+<p>(F) Only a limited number of constructs can be used as the argument to a
+reference constructor on the left-hand side of an assignment, and what
+you used was not one of them.  See <a 
href="#perlref-Assigning-to-References">perlref Assigning to References</a>.
+</p>
+</dd>
+<dt>Can&rsquo;t modify reference to localized parenthesized array in list 
assignment</dt>
+<dd><a 
name="perldiag-Can_0027t-modify-reference-to-localized-parenthesized-array-in-list-assignment"></a>
+<p>(F) Assigning to <code>\local(@array)</code> or <code>\(local @array)</code> is not supported, as
+it is not clear exactly what it should do.  If you meant to make @array
+refer to some other array, use <code>\@array = \@other_array</code>.  If you want to
+make the elements of @array aliases of the scalars referenced on the
+right-hand side, use <code>\(@array) = @scalar_refs</code>.
+</p>
+</dd>
+<dt>Can&rsquo;t modify reference to parenthesized hash in list assignment</dt>
+<dd><a 
name="perldiag-Can_0027t-modify-reference-to-parenthesized-hash-in-list-assignment"></a>
+<p>(F) Assigning to <code>\(%hash)</code> is not supported.  If you meant to make %hash
+refer to some other hash, use <code>\%hash = \%other_hash</code>.  If you want to
+make the elements of %hash into aliases of the scalars referenced on the
+right-hand side, use a hash slice: <code>\@hash{@keys} = @those_scalar_refs</code>.
+</p>
+</dd>
 <dt>Can&rsquo;t msgrcv to read-only var</dt>
 <dd><a name="perldiag-Can_0027t-msgrcv-to-read_002donly-var"></a>
 <p>(F) The target of a msgrcv must be modifiable to be used as a receive
@@ -17570,13 +17843,6 @@
 once.  See <a href="#perlfunc-next">perlfunc next</a>.
 </p>
 </dd>
-<dt>Can&rsquo;t open %s</dt>
-<dd><a name="perldiag-Can_0027t-open-_0025s"></a>
-<p>(F) You tried to run a perl built with MAD support with
-the PERL_XMLDUMP environment variable set, but the file
-named by that variable could not be opened.
-</p>
-</dd>
 <dt>Can&rsquo;t open %s: %s</dt>
 <dd><a name="perldiag-Can_0027t-open-_0025s_003a-_0025s"></a>
 <p>(S inplace) The implicit opening of a file through use of the 
<code>&lt;&gt;</code>
@@ -17683,6 +17949,14 @@
 to reopen it to accept binary data.  Alas, it failed.
 </p>
 </dd>
+<dt>Can&rsquo;t represent character for Ox%X on this platform</dt>
+<dd><a 
name="perldiag-Can_0027t-represent-character-for-Ox_0025X-on-this-platform"></a>
+<p>(F) There is a hard limit to how big a character code point can be due
+to the fundamental properties of UTF-8, especially on EBCDIC
+platforms.  The given code point exceeds that.  The only work-around is
+to not use such a large code point.
+</p>
+</dd>
 <dt>Can&rsquo;t reset %ENV on this system</dt>
 <dd><a name="perldiag-Can_0027t-reset-_0025ENV-on-this-system"></a>
 <p>(F) You called <code>reset('E')</code> or similar, which tried to reset
@@ -17762,6 +18036,22 @@
 other than &quot;=&quot; after the module name.
 </p>
 </dd>
+<dt>Can&rsquo;t use a hash as a reference</dt>
+<dd><a name="perldiag-Can_0027t-use-a-hash-as-a-reference"></a>
+<p>(F) You tried to use a hash as a reference, as in
+<code>%foo-&gt;{&quot;bar&quot;}</code> or 
<code>%$ref-&gt;{&quot;hello&quot;}</code>.  Versions of perl
+&lt;= 5.22.0 used to allow this syntax, but shouldn&rsquo;t
+have.  This was deprecated in perl 5.6.1.
+</p>
+</dd>
+<dt>Can&rsquo;t use an array as a reference</dt>
+<dd><a name="perldiag-Can_0027t-use-an-array-as-a-reference"></a>
+<p>(F) You tried to use an array as a reference, as in
+<code>@foo-&gt;[23]</code> or <code>@$ref-&gt;[99]</code>.  Versions of perl 
&lt;= 5.22.0
+used to allow this syntax, but shouldn&rsquo;t have.  This
+was deprecated in perl 5.6.1.
+</p>
+</dd>
 <dt>Can&rsquo;t use anonymous symbol table for method lookup</dt>
 <dd><a name="perldiag-Can_0027t-use-anonymous-symbol-table-for-method-lookup"></a>
 <p>(F) The internal routine that does method lookup was handed a symbol
@@ -17795,10 +18085,39 @@
 allowed.  See &lsquo;perlfunc pack&rsquo;.
 </p>
 </dd>
+<dt>Can&rsquo;t use &rsquo;defined(@array)&rsquo; (Maybe you should just omit the defined()?)</dt>
+<dd><a name="perldiag-Can_0027t-use-_0027defined_0028_0040array_0029_0027-_0028Maybe-you-should-just-omit-the-defined_0028_0029_003f_0029"></a>
+<p>(F) defined() is not useful on arrays because it
+checks for an undefined <em>scalar</em> value.  If you want to see if the
+array is empty, just use <code>if (@array) { # not empty }</code> for example.
+</p>
+</dd>
+<dt>Can&rsquo;t use &rsquo;defined(%hash)&rsquo; (Maybe you should just omit the defined()?)</dt>
+<dd><a name="perldiag-Can_0027t-use-_0027defined_0028_0025hash_0029_0027-_0028Maybe-you-should-just-omit-the-defined_0028_0029_003f_0029"></a>
+<p>(F) <code>defined()</code> is not usually right on hashes.
+</p>
+<p>Although <code>defined %hash</code> is false on a plain not-yet-used hash, it
+becomes true in several non-obvious circumstances, including iterators,
+weak references, stash names, even remaining true after <code>undef %hash</code>.
+These things make <code>defined %hash</code> fairly useless in practice, so it now
+generates a fatal error.
+</p>
+<p>If a check for non-empty is what you wanted then just put it in boolean
+context (see <a href="#perldata-Scalar-values">perldata Scalar values</a>):
+</p>
+<pre class="verbatim">    if (%hash) {
+       # not empty
+    }
+</pre>
+<p>If you had <code>defined %Foo::Bar::QUUX</code> to check whether such a package
+variable exists then that&rsquo;s never really been reliable, and isn&rsquo;t
+a good way to enquire about the features of a package, or whether
+it&rsquo;s loaded, etc.
+</p>
+</dd>
 <dt>Can&rsquo;t use %s for loop variable</dt>
 <dd><a name="perldiag-Can_0027t-use-_0025s-for-loop-variable"></a>
-<p>(F) Only a simple scalar variable may be used as a loop variable on a
-foreach.
+<p>(P) The parser got confused when trying to parse a <code>foreach</code> loop.
 </p>
 </dd>
 <dt>Can&rsquo;t use global %s in &quot;%s&quot;</dt>
@@ -17888,15 +18207,10 @@
 </dd>
 <dt>Character following &quot;\c&quot; must be printable ASCII</dt>
 <dd><a name="perldiag-Character-following-_0022_005cc_0022-must-be-printable-ASCII"></a>
-<p>(F)(D deprecated, syntax) In <code>\c<em>X</em></code>, <em>X</em> must be a printable
-(non-control) ASCII character.  This is fatal starting in v5.20 for
-non-ASCII characters, and it is planned to make this fatal in all
-instances in Perl v5.22.  In
-the cases where it isn&rsquo;t fatal, the character this evaluates to is
-derived by exclusive or&rsquo;ing the code point of this character with 0x40.
+<p>(F) In <code>\c<em>X</em></code>, <em>X</em> must be a printable (non-control) ASCII character.
 </p>
 <p>Note that ASCII characters that don&rsquo;t map to control characters are
-discouraged here as well, and will generate the warning (when enabled)
+discouraged, and will generate the warning (when enabled)
 <a href="#perldiag-_0022_005cc_0025c_0022-is-more-clearly-written-simply-as-_0022_0025s_0022">&quot;\c%c&quot; is more clearly written simply as &quot;%s&quot;</a>.
 </p>
 </dd>
@@ -17984,6 +18298,35 @@
 <pre class="verbatim">   unpack(&quot;s&quot;, &quot;\x{f3}b&quot;)
 </pre>
 </dd>
+<dt>charnames alias definitions may not contain a sequence of multiple spaces</dt>
+<dd><a name="perldiag-charnames-alias-definitions-may-not-contain-a-sequence-of-multiple-spaces"></a>
+<p>(F) You defined a character name which had multiple space characters
+in a row.  Change them to single spaces.  Usually these names are
+defined in the <code>:alias</code> import argument to <code>use charnames</code>, but they
+could be defined by a translator installed into <code>$^H{charnames}</code>.  See
+<a href="charnames.html#CUSTOM-ALIASES">(charnames)CUSTOM ALIASES</a>.
+</p>
+</dd>
+<dt>charnames alias definitions may not contain trailing white-space</dt>
+<dd><a name="perldiag-charnames-alias-definitions-may-not-contain-trailing-white_002dspace"></a>
+<p>(F) You defined a character name which ended in a space
+character.  Remove the trailing space(s).  Usually these names are
+defined in the <code>:alias</code> import argument to <code>use charnames</code>, but they
+could be defined by a translator installed into <code>$^H{charnames}</code>.
+See <a href="charnames.html#CUSTOM-ALIASES">(charnames)CUSTOM ALIASES</a>.
+</p>
+</dd>
+<dt>\C is deprecated in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a name="perldiag-_005cC-is-deprecated-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(D deprecated, regexp) The \C character class is deprecated, and will
+become a compile-time error in a future release of perl (tentatively
+v5.24).  This construct allows you to match a single byte of what makes
+up a single multi-byte UTF-8 character, and breaks encapsulation.  It is
+currently also very buggy.  If you really need to process the individual
+bytes, you probably want to convert your string to one where each
+underlying byte is stored as a character, with utf8::encode().
+</p>
+</dd>
 <dt>&quot;\c%c&quot; is more clearly written simply as &quot;%s&quot;</dt>
 <dd><a name="perldiag-_0022_005cc_0025c_0022-is-more-clearly-written-simply-as-_0022_0025s_0022"></a>
 <p>(W syntax) The <code>\c<em>X</em></code> construct is intended to be a way to specify
@@ -18038,8 +18381,8 @@
 <dt>%s: Command not found</dt>
 <dd><a name="perldiag-_0025s_003a-Command-not-found"></a>
 <p>(A) You&rsquo;ve accidentally run your script through <strong>csh</strong> or another shell
-instead of Perl.  Check the #! line, or manually feed your script
-into Perl yourself.  The #! line at the top of your file could look like
+instead of Perl.  Check the #! line, or manually feed your script into
+Perl yourself.  The #! line at the top of your file could look like
 </p>
 <pre class="verbatim">  #!/usr/bin/perl -w
 </pre>
@@ -18094,6 +18437,42 @@
 See <a href="#perlsub-Constant-Functions">perlsub Constant Functions</a> and <a href="constant.html#Top">(constant)</a>.
 </p>
 </dd>
+<dt>Constants from lexical variables potentially modified elsewhere are deprecated</dt>
+<dd><a name="perldiag-Constants-from-lexical-variables-potentially-modified-elsewhere-are-deprecated"></a>
+<p>(D deprecated) You wrote something like
+</p>
+<pre class="verbatim">    my $var;
+    $sub = sub () { $var };
+</pre>
+<p>but $var is referenced elsewhere and could be modified after the <code>sub</code>
+expression is evaluated.  Either it is explicitly modified elsewhere
+(<code>$var = 3</code>) or it is passed to a subroutine or to an operator like
+<code>printf</code> or <code>map</code>, which may or may not modify the variable.
+</p>
+<p>Traditionally, Perl has captured the value of the variable at that
+point and turned the subroutine into a constant eligible for inlining.
+In those cases where the variable can be modified elsewhere, this
+breaks the behavior of closures, in which the subroutine captures
+the variable itself, rather than its value, so future changes to the
+variable are reflected in the subroutine&rsquo;s return value.
+</p>
+<p>This usage is deprecated, because the behavior is likely to change
+in a future version of Perl.
+</p>
+<p>If you intended for the subroutine to be eligible for inlining, then
+make sure the variable is not referenced elsewhere, possibly by
+copying it:
+</p>
+<pre class="verbatim">    my $var2 = $var;
+    $sub = sub () { $var2 };
+</pre>
+<p>If you do want this subroutine to be a closure that reflects future
+changes to the variable that it closes over, add an explicit <code>return</code>:
+</p>
+<pre class="verbatim">    my $var;
+    $sub = sub () { return $var };
+</pre>
+</dd>
 <dt>Constant subroutine %s redefined</dt>
 <dd><a name="perldiag-Constant-subroutine-_0025s-redefined"></a>
 <p>(W redefine)(S) You redefined a subroutine which had previously
@@ -18113,7 +18492,22 @@
 <p>(F) The parser found inconsistencies either while attempting
 to define an overloaded constant, or when trying to find the
 character name specified in the <code>\N{...}</code> escape.  Perhaps you
-forgot to load the corresponding <a href="overload.html#Top">(overload)</a> pragma?.
+forgot to load the corresponding <a href="overload.html#Top">(overload)</a> pragma?
+</p>
+</dd>
+<dt>:const is experimental</dt>
+<dd><a name="perldiag-_003aconst-is-experimental"></a>
+<p>(S experimental::const_attr) The &quot;const&quot; attribute is experimental.
+If you want to use the feature, disable the warning with <code>no warnings
+'experimental::const_attr'</code>, but know that in doing so you are taking
+the risk that your code may break in a future Perl version.
+</p>
+</dd>
+<dt>:const is not permitted on named subroutines</dt>
+<dd><a name="perldiag-_003aconst-is-not-permitted-on-named-subroutines"></a>
+<p>(F) The &quot;const&quot; attribute causes an anonymous subroutine to be run and
+its value captured at the time that it is cloned.  Named subroutines are
+not cloned like this, so the attribute does not make sense on them.
 </p>
 </dd>
 <dt>Copy method did not return a reference</dt>
@@ -18183,43 +18577,13 @@
 setting the C pre-processor macro <code>PERL_SUB_DEPTH_WARN</code> to the desired value.
 </p>
 </dd>
-<dt>defined(@array) is deprecated</dt>
-<dd><a name="perldiag-defined_0028_0040array_0029-is-deprecated"></a>
-<p>(D deprecated) defined() is not usually useful on arrays because it
-checks for an undefined <em>scalar</em> value.  If you want to see if the
-array is empty, just use <code>if (@array) { # not empty }</code> for example.
-</p>
-</dd>
-<dt>defined(%hash) is deprecated</dt>
-<dd><a name="perldiag-defined_0028_0025hash_0029-is-deprecated"></a>
-<p>(D deprecated) <code>defined()</code> is not usually right on hashes and has been
-discouraged since 5.004.
-</p>
-<p>Although <code>defined %hash</code> is false on a plain not-yet-used hash, it
-becomes true in several non-obvious circumstances, including iterators,
-weak references, stash names, even remaining true after <code>undef %hash</code>.
-These things make <code>defined %hash</code> fairly useless in practice.
-</p>
-<p>If a check for non-empty is what you wanted then just put it in boolean
-context (see <a href="#perldata-Scalar-values">perldata Scalar values</a>):
-</p>
-<pre class="verbatim">    if (%hash) {
-       # not empty
-    }
-</pre>
-<p>If you had <code>defined %Foo::Bar::QUUX</code> to check whether such a package
-variable exists then that&rsquo;s never really been reliable, and isn&rsquo;t
-a good way to enquire about the features of a package, or whether
-it&rsquo;s loaded, etc.
-</p>
-</dd>
 <dt>(?(DEFINE)....) does not allow branches in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
 <dd><a name="perldiag-_0028_003f_0028DEFINE_0029_002e_002e_002e_002e_0029-does-not-allow-branches-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) You used something like <code>(?(DEFINE)...|..)</code> which is illegal.  The
 most likely cause of this error is that you left out a parenthesis inside
 of the <code>....</code> part.
 </p>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular expression the problem was
 discovered.
 </p>
 </dd>
@@ -18445,24 +18809,6 @@
 conversion routines don&rsquo;t handle.  Drat.
 </p>
 </dd>
-<dt>Escape literal pattern white space under /x</dt>
-<dd><a name="perldiag-Escape-literal-pattern-white-space-under-_002fx"></a>
-<p>(D deprecated) You compiled a regular expression pattern with <code>/x</code> to
-ignore white space, and you used, as a literal, one of the characters
-that Perl plans to eventually treat as white space.  The character must
-be escaped somehow, or it will work differently on a future Perl that
-does treat it as white space.  The easiest way is to insert a backslash
-immediately before it, or to enclose it with square brackets.  This
-change is to bring Perl into conformance with Unicode recommendations.
-Here are the five characters that generate this warning:
-U+0085 NEXT LINE,
-U+200E LEFT-TO-RIGHT MARK,
-U+200F RIGHT-TO-LEFT MARK,
-U+2028 LINE SEPARATOR,
-and
-U+2029 PARAGRAPH SEPARATOR.
-</p>
-</dd>
 <dt>Eval-group in insecure regular expression</dt>
 <dd><a name="perldiag-Eval_002dgroup-in-insecure-regular-expression"></a>
 <p>(F) Perl detected tainted data when trying to compile a regular
@@ -18493,7 +18839,7 @@
 <p>(F) You used a pattern that nested too many EVAL calls without consuming
 any text.  Restructure the pattern so that text is consumed.
 </p>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular expression the problem was
 discovered.
 </p>
 </dd>
@@ -18586,6 +18932,15 @@
 <a href="#perlrecharclass-Extended-Bracketed-Character-Classes">perlrecharclass Extended Bracketed Character Classes</a>.
 </p>
 </dd>
+<dt>Experimental aliasing via reference not enabled</dt>
+<dd><a name="perldiag-Experimental-aliasing-via-reference-not-enabled"></a>
+<p>(F) To do aliasing via references, you must first enable the feature:
+</p>
+<pre class="verbatim">    no warnings &quot;experimental::refaliasing&quot;;
+    use feature &quot;refaliasing&quot;;
+    \$x = \$y;
+</pre>
+</dd>
 <dt>Experimental subroutine signatures not enabled</dt>
 <dd><a name="perldiag-Experimental-subroutine-signatures-not-enabled"></a>
 <p>(F) To use subroutine signatures, you must first enable them:
@@ -18786,8 +19141,8 @@
 <a href="#perlsyn-Experimental-Details-on-given-and-when">perlsyn Experimental Details on given and when</a>.
 </p>
 </dd>
-<dt>Global symbol &quot;%s&quot; requires explicit package name</dt>
-<dd><a name="perldiag-Global-symbol-_0022_0025s_0022-requires-explicit-package-name"></a>
+<dt>Global symbol &quot;%s&quot; requires explicit package name (did you forget to declare &quot;my %s&quot;?)</dt>
+<dd><a name="perldiag-Global-symbol-_0022_0025s_0022-requires-explicit-package-name-_0028did-you-forget-to-declare-_0022my-_0025s_0022_003f_0029"></a>
 <p>(F) You&rsquo;ve said &quot;use strict&quot; or &quot;use strict vars&quot;, which indicates 
 that all variables must either be lexically scoped (using &quot;my&quot; or &quot;state&quot;), 
 declared beforehand using &quot;our&quot;, or explicitly qualified to say 
@@ -18881,18 +19236,59 @@
 created on an emergency basis to prevent a core dump.
 </p>
 </dd>
-<dt>Hash %%s missing the % in argument %d of %s()</dt>
-<dd><a name="perldiag-Hash-_0025_0025s-missing-the-_0025-in-argument-_0025d-of-_0025s_0028_0029"></a>
-<p>(D deprecated) Really old Perl let you omit the % on hash names in some
-spots.  This is now heavily deprecated.
-</p>
-</dd>
 <dt>%s has too many errors</dt>
 <dd><a name="perldiag-_0025s-has-too-many-errors"></a>
 <p>(F) The parser has given up trying to parse the program after 10 errors.
 Further error messages would likely be uninformative.
 </p>
 </dd>
+<dt>Having more than one /%c regexp modifier is deprecated</dt>
+<dd><a name="perldiag-Having-more-than-one-_002f_0025c-regexp-modifier-is-deprecated"></a>
+<p>(D deprecated, regexp) You used the indicated regular expression pattern
+modifier at least twice in a string of modifiers.  It is deprecated to
+do this with this particular modifier, to allow future extensions to the
+Perl language.
+</p>
+</dd>
+<dt>Hexadecimal float: exponent overflow</dt>
+<dd><a name="perldiag-Hexadecimal-float_003a-exponent-overflow"></a>
+<p>(W overflow) The hexadecimal floating point has a larger exponent
+than the floating point supports.
+</p>
+</dd>
+<dt>Hexadecimal float: exponent underflow</dt>
+<dd><a name="perldiag-Hexadecimal-float_003a-exponent-underflow"></a>
+<p>(W overflow) The hexadecimal floating point has a smaller exponent
+than the floating point supports.
+</p>
+</dd>
+<dt>Hexadecimal float: internal error</dt>
+<dd><a name="perldiag-Hexadecimal-float_003a-internal-error"></a>
+<p>(F) Something went horribly wrong in hexadecimal float handling.
+</p>
+</dd>
+<dt>Hexadecimal float: mantissa overflow</dt>
+<dd><a name="perldiag-Hexadecimal-float_003a-mantissa-overflow"></a>
+<p>(W overflow) The hexadecimal floating point literal had more bits in
+the mantissa (the part between the 0x and the exponent, also known as
+the fraction or the significand) than the floating point supports.
+</p>
+</dd>
+<dt>Hexadecimal float: precision loss</dt>
+<dd><a name="perldiag-Hexadecimal-float_003a-precision-loss"></a>
+<p>(W overflow) The hexadecimal floating point had internally more
+digits than could be output.  This can be caused by unsupported
+long double formats, or by 64-bit integers not being available
+(needed to retrieve the digits under some configurations).
+</p>
+</dd>
+<dt>Hexadecimal float: unsupported long double format</dt>
+<dd><a name="perldiag-Hexadecimal-float_003a-unsupported-long-double-format"></a>
+<p>(F) You have configured Perl to use long doubles but
+the internals of the long double format are unknown;
+therefore the hexadecimal float output is impossible.
+</p>
+</dd>
 <dt>Hexadecimal number &gt; 0xffffffff non-portable</dt>
 <dd><a name="perldiag-Hexadecimal-number-_003e-0xffffffff-non_002dportable"></a>
 <p>(W portable) The hexadecimal number you specified is larger than 2**32-1
@@ -18911,9 +19307,9 @@
 <dt>Ignoring zero length \N{} in character class in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
 <dd><a name="perldiag-Ignoring-zero-length-_005cN_007b_007d-in-character-class-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(W regexp) Named Unicode character escapes (<code>\N{...}</code>) may return a
-zero-length sequence.  When such an escape is used in a character class
-its behaviour is not well defined.  Check that the correct escape has
-been used, and the correct charname handler is in scope.
+zero-length sequence.  When such an escape is used in a character
+class its behavior is not well defined.  Check that the correct
+escape has been used, and the correct charname handler is in scope.
 </p>
 </dd>
 <dt>Illegal binary digit %s</dt>
@@ -19013,6 +19409,11 @@
 <a href="#perlre-_0028_003fPARNO_0029-_0028_003f_002dPARNO_0029-_0028_003f_002bPARNO_0029-_0028_003fR_0029-_0028_003f0_0029"><code>(?<em>PARNO</em>)</code></a>.
 </p>
 </dd>
+<dt>Illegal suidscript</dt>
+<dd><a name="perldiag-Illegal-suidscript"></a>
+<p>(F) The script run under suidperl was somehow illegal.
+</p>
+</dd>
 <dt>Illegal switch in PERL5OPT: -%c</dt>
 <dd><a name="perldiag-Illegal-switch-in-PERL5OPT_003a-_002d_0025c"></a>
 <p>(X) The PERL5OPT environment variable may only be used to set the
@@ -19144,17 +19545,6 @@
 See <a href="#perlunicode-User_002dDefined-Character-Properties">perlunicode User-Defined Character Properties</a> and <a href="#perlsec-NAME">perlsec NAME</a>.
 </p>
 </dd>
-<dt>In &rsquo;(?...)&rsquo;, splitting the initial &rsquo;(?&rsquo; is deprecated in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
-<dd><a name="perldiag-In-_0027_0028_003f_002e_002e_002e_0029_0027_002c-splitting-the-initial-_0027_0028_003f_0027-is-deprecated-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
-<p>(D regexp, deprecated) The two-character sequence <code>&quot;(?&quot;</code> in
-this context in a regular expression pattern should be an
-indivisible token, with nothing intervening between the <code>&quot;(&quot;</code>
-and the <code>&quot;?&quot;</code>, but you separated them.  Due to an accident of
-implementation, this prohibition was not enforced, but we do
-plan to forbid it in a future Perl version.  This message
-serves as giving you fair warning of this pending change.
-</p>
-</dd>
 <dt>Integer overflow in format string for %s</dt>
 <dd><a name="perldiag-Integer-overflow-in-format-string-for-_0025s"></a>
 <p>(F) The indexes and widths specified in the format string of <code>printf()</code>
@@ -19239,6 +19629,14 @@
 <a href="#perlop-Terms-and-List-Operators-_0028Leftward_0029">perlop Terms and List Operators (Leftward)</a>.
 </p>
 </dd>
+<dt>In &rsquo;(?...)&rsquo;, the &rsquo;(&rsquo; and &rsquo;?&rsquo; must be adjacent in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a name="perldiag-In-_0027_0028_003f_002e_002e_002e_0029_0027_002c-the-_0027_0028_0027-and-_0027_003f_0027-must-be-adjacent-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(F) The two-character sequence <code>&quot;(?&quot;</code> in this context in a regular
+expression pattern should be an indivisible token, with nothing
+intervening between the <code>&quot;(&quot;</code> and the <code>&quot;?&quot;</code>, but you separated them
+with whitespace.
+</p>
+</dd>
 <dt>Invalid %s attribute: %s</dt>
 <dd><a name="perldiag-Invalid-_0025s-attribute_003a-_0025s"></a>
 <p>(F) The indicated attribute for a subroutine or variable was not recognized
@@ -19328,6 +19726,14 @@
 See also <a href="#perlrun-_002dDletters">perlrun <strong>-D</strong><em>letters</em></a>.
 </p>
 </dd>
+<dt>Invalid quantifier in {,} in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a name="perldiag-Invalid-quantifier-in-_007b_002c_007d-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(F) The pattern looks like a {min,max} quantifier, but the min or max
+could not be parsed as a valid number - either it has leading zeroes,
+or it represents too big a number to cope with.  The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows
+where in the regular expression the problem was discovered.  See <a href="#perlre-NAME">perlre NAME</a>.
+</p>
+</dd>
 <dt>Invalid [] range &quot;%s&quot; in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
 <dd><a name="perldiag-Invalid-_005b_005d-range-_0022_0025s_0022-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) The range specified in a character class had a minimum character
@@ -19399,15 +19805,12 @@
 an arbitrary reference was blessed into the &quot;version&quot; class.
 </p>
 </dd>
-<dt>In &rsquo;(*VERB...)&rsquo;, splitting the initial &rsquo;(*&rsquo; is deprecated in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
-<dd><a name="perldiag-In-_0027_0028_002aVERB_002e_002e_002e_0029_0027_002c-splitting-the-initial-_0027_0028_002a_0027-is-deprecated-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
-<p>(D regexp, deprecated) The two-character sequence <code>&quot;(*&quot;</code> in
+<dt>In &rsquo;(*VERB...)&rsquo;, the &rsquo;(&rsquo; and &rsquo;*&rsquo; must be adjacent in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a name="perldiag-In-_0027_0028_002aVERB_002e_002e_002e_0029_0027_002c-the-_0027_0028_0027-and-_0027_002a_0027-must-be-adjacent-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(F) The two-character sequence <code>&quot;(*&quot;</code> in
 this context in a regular expression pattern should be an
 indivisible token, with nothing intervening between the <code>&quot;(&quot;</code>
-and the <code>&quot;*&quot;</code>, but you separated them.  Due to an accident of
-implementation, this prohibition was not enforced, but we do
-plan to forbid it in a future Perl version.  This message
-serves as giving you fair warning of this pending change.
+and the <code>&quot;*&quot;</code>, but you separated them.
 </p>
 </dd>
 <dt>ioctl is not implemented</dt>
@@ -19435,6 +19838,22 @@
 neither as a system call nor an ioctl call (SIOCATMARK).
 </p>
 </dd>
+<dt>&rsquo;%s&rsquo; is an unknown bound type in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a name="perldiag-_0027_0025s_0027-is-an-unknown-bound-type-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(F) You used <code>\b{...}</code> or <code>\B{...}</code> and the <code>...</code> is not known to
+Perl.  The current valid ones are given in
+<a href="#perlrebackslash-_005cb_007b_007d_002c-_005cb_002c-_005cB_007b_007d_002c-_005cB">perlrebackslash \b{}, \b, \B{}, \B</a>.
+</p>
+</dd>
+<dt>&quot;%s&quot; is more clearly written simply as &quot;%s&quot; in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a name="perldiag-_0022_0025s_0022-is-more-clearly-written-simply-as-_0022_0025s_0022-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(W regexp) (only under <code>use&nbsp;re&nbsp;'strict'<!-- /@w --></code> or within <code>(?[...])</code>)
+</p>
+<p>You specified a character that has the given plainer way of writing it,
+and which is also portable to platforms running with different character
+sets.
+</p>
+</dd>
 <dt>$* is no longer supported</dt>
 <dd><a name="perldiag-_0024_002a-is-no-longer-supported"></a>
 <p>(D deprecated, syntax) The special variable <code>$*</code>, deprecated in older
@@ -19570,6 +19989,46 @@
 Use the two-argument <code>open($pipe, '|prog arg1 arg2...')</code> form instead.
 </p>
 </dd>
+<dt>%s: loadable library and perl binaries are mismatched (got handshake key %p, needed %p)</dt>
+<dd><a name="perldiag-_0025s_003a-loadable-library-and-perl-binaries-are-mismatched-_0028got-handshake-key-_0025p_002c-needed-_0025p_0029"></a>
+<p>(P) A dynamically loaded library (<code>.so</code> or <code>.dll</code>) was being loaded into a
+process whose perl build differs from the build of perl that the
+library was compiled against.  Reinstalling the XS module will
+likely fix this error.
+</p>
+</dd>
+<dt>Locale &rsquo;%s&rsquo; may not work well.%s</dt>
+<dd><a name="perldiag-Locale-_0027_0025s_0027-may-not-work-well_002e_0025s"></a>
+<p>(W locale) You are using the named locale, which is a non-UTF-8 one, and
+which perl has determined is not fully compatible with what it can
+handle.  The second <code>%s</code> gives a reason.
+</p>
+<p>By far the most common reason is that the locale has characters in it
+that are represented by more than one byte.  The only such locales that
+Perl can handle are the UTF-8 locales.  Most likely the specified locale
+is a non-UTF-8 one for an East Asian language such as Chinese or
+Japanese.  If the locale is a superset of ASCII, the ASCII portion of it
+may work in Perl.
+</p>
+<p>Some essentially obsolete locales that aren&rsquo;t supersets of ASCII, mainly
+those in ISO 646 or other 7-bit locales, such as ASMO 449, can also have
+problems, depending on what portions of the ASCII character set get
+changed by the locale and are also used by the program.
+The warning message lists the determinable conflicting characters.
+</p>
+<p>Note that not all incompatibilities are found.
+</p>
+<p>If this happens to you, there&rsquo;s not much you can do except switch to use a
+different locale or use <a href="Encode.html#Top">(Encode)</a> to translate from the locale into
+UTF-8; if that&rsquo;s impracticable, you have been warned that some things
+may break.
+</p>
+<p>This message is output once each time a bad locale is switched into
+within the scope of <code>use&nbsp;locale<!-- /@w --></code>, or on the first possibly-affected
+operation if the <code>use&nbsp;locale<!-- /@w --></code> inherits a bad one.  It is not raised
+for any operations from the <a href="POSIX.html#Top">(POSIX)</a> module.
+</p>
+</dd>
 <dt>localtime(%f) failed</dt>
 <dd><a name="perldiag-localtime_0028_0025f_0029-failed"></a>
 <p>(W overflow) You called <code>localtime</code> with a number that it could not handle:
@@ -19833,8 +20292,13 @@
 </dd>
 <dt>Missing argument in %s</dt>
 <dd><a name="perldiag-Missing-argument-in-_0025s"></a>
-<p>(W uninitialized) A printf-type format required more arguments than were
-supplied.
+<p>(W missing) You called a function with fewer arguments than other
+arguments you supplied indicated would be needed.
+</p>
+<p>Currently only emitted when a printf-type format required more
+arguments than were supplied, but might be used in the future for
+other cases where we can statically determine that arguments to
+functions are missing, e.g. for the &lsquo;perlfunc pack&rsquo; function.
 </p>
 </dd>
 <dt>Missing argument to -%c</dt>
@@ -19903,6 +20367,13 @@
 &quot;%s found where operator expected&quot;.  Often the missing operator is a comma.
 </p>
 </dd>
+<dt>Missing or undefined argument to require</dt>
+<dd><a name="perldiag-Missing-or-undefined-argument-to-require"></a>
+<p>(F) You tried to call require with no argument or with an undefined
+value as an argument.  Require expects either a package name or a
+file-specification as an argument.  See <a href="#perlfunc-require">perlfunc require</a>.
+</p>
+</dd>
 <dt>Missing right brace on \%c{} in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
 <dd><a name="perldiag-Missing-right-brace-on-_005c_0025c_007b_007d-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) Missing right brace in <code>\x{...}</code>, <code>\p{...}</code>, <code>\P{...}</code>, or <code>\N{...}</code>.
@@ -20056,6 +20527,12 @@
 that yet.
 </p>
 </dd>
+<dt>&quot;my&quot; subroutine %s can&rsquo;t be in a package</dt>
+<dd><a name="perldiag-_0022my_0022-subroutine-_0025s-can_0027t-be-in-a-package"></a>
+<p>(F) Lexically scoped subroutines aren&rsquo;t in a package, so it doesn&rsquo;t make
+sense to try to declare one with a package qualifier on the front.
+</p>
+</dd>
 <dt>&quot;my %s&quot; used in sort comparison</dt>
 <dd><a name="perldiag-_0022my-_0025s_0022-used-in-sort-comparison"></a>
 <p>(W syntax) The package variables $a and $b are used for sort comparisons.
@@ -20079,10 +20556,10 @@
 just mention it again somehow to suppress the message.  The <code>our</code>
 declaration is also provided for this purpose.
 </p>
-<p>NOTE: This warning detects package symbols that have been used only
-once. This means lexical variables will never trigger this warning.
-It also means that all of the package variables $c, @c, %c, as well
-as *c, &amp;c, sub c{}, c(), and c (the filehandle or
+<p>NOTE: This warning detects package symbols that have been used
+only once.  This means lexical variables will never trigger this
+warning.  It also means that all of the package variables $c, @c,
+%c, as well as *c, &amp;c, sub c{}, c(), and c (the filehandle or
 format) are considered the same; if a program uses $c only once
 but also uses any of the others it will not trigger this warning.
 Symbols beginning with an underscore and symbols using special
@@ -20127,6 +20604,13 @@
 greater than or equal to zero.
 </p>
 </dd>
+<dt>Negative repeat count does nothing</dt>
+<dd><a name="perldiag-Negative-repeat-count-does-nothing"></a>
+<p>(W numeric) You tried to execute the
+<a href="#perlop-Multiplicative-Operators"><code>x</code></a> repetition operator fewer than 0
+times, which doesn&rsquo;t make sense.
+</p>
+</dd>
 <dt>Nested quantifiers in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
 <dd><a name="perldiag-Nested-quantifiers-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) You can&rsquo;t quantify a quantifier without intervening parentheses.
@@ -20158,14 +20642,22 @@
 probably not what you want.
 </p>
 </dd>
-<dt>\N{} in character class restricted to one character in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
-<dd><a name="perldiag-_005cN_007b_007d-in-character-class-restricted-to-one-character-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<dt>\N{} in inverted character class or as a range end-point is restricted to one character in regex; marked by &lt;&ndash; HERE in m/%s/</dt>
+<dd><a name="perldiag-_005cN_007b_007d-in-inverted-character-class-or-as-a-range-end_002dpoint-is-restricted-to-one-character-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) Named Unicode character escapes (<code>\N{...}</code>) may return a
-multi-character sequence.  Such an escape may not be used in
-a character class, because character classes always match one
-character of input.  Check that the correct escape has been used,
-and the correct charname handler is in scope.  The &lt;&ndash;&nbsp;HERE<!-- 
/@w --> shows
-whereabouts in the regular expression the problem was discovered.
+multi-character sequence.  Even though a character class is
+supposed to match just one character of input, perl will match the
+whole thing correctly, except when the class is inverted (<code>[^...]</code>),
+or the escape is the beginning or final end point of a range.  The
+mathematically logical behavior for what matches when inverting
+is very different from what people expect, so we have decided to
+forbid it.  Similarly unclear is what should be generated when the
+<code>\N{...}</code> is used as one of the end points of the range, such as in
+</p>
+<pre class="verbatim"> [\x{41}-\N{ARABIC SEQUENCE YEH WITH HAMZA ABOVE WITH AE}]
+</pre>
+<p>What is meant here is unclear, as the <code>\N{...}</code> escape is a sequence
+of code points, so this is made an error.
 </p>
 </dd>
 <dt>\N{NAME} must be resolved by the lexer in regex; marked by 
&lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
@@ -20208,6 +20700,15 @@
 securable.  See <a href="#perlsec-NAME">perlsec NAME</a>.
 </p>
 </dd>
+<dt>NO-BREAK SPACE in a charnames alias definition is deprecated</dt>
+<dd><a 
name="perldiag-NO_002dBREAK-SPACE-in-a-charnames-alias-definition-is-deprecated"></a>
+<p>(D deprecated) You defined a character name which contained a no-break
+space character.  Change it to a regular space.  Usually these names are
+defined in the <code>:alias</code> import argument to <code>use 
charnames</code>, but they
+could be defined by a translator installed into <code>$^H{charnames}</code>.  
See
+<a href="charnames.html#CUSTOM-ALIASES">(charnames)CUSTOM ALIASES</a>.
+</p>
+</dd>
 <dt>No code specified for -%c</dt>
 <dd><a name="perldiag-No-code-specified-for-_002d_0025c"></a>
 <p>(F) Perl&rsquo;s <strong>-e</strong> and <strong>-E</strong> command-line 
options require an argument.  If
@@ -20301,6 +20802,13 @@
 or <code>next::can</code>.  See <a href="mro.html#Top">(mro)</a>.
 </p>
 </dd>
+<dt>Non-finite repeat count does nothing</dt>
+<dd><a name="perldiag-Non_002dfinite-repeat-count-does-nothing"></a>
+<p>(W numeric) You tried to execute the
+<a href="#perlop-Multiplicative-Operators"><code>x</code></a> repetition operator <code>Inf</code> (or
+<code>-Inf</code>) or <code>NaN</code> times, which doesn&rsquo;t make sense.
+</p>
+</dd>
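Reviewer note: the two new repeat-count warnings ("Negative repeat count does nothing" and "Non-finite repeat count does nothing") can be observed with a short script. This is a minimal sketch, not taken from the patch; the variable names are illustrative, and it assumes a Perl 5.22 or later build where `9**9**9` overflows to `Inf`.

```perl
use 5.022;      # both warnings are new in 5.22; older perls handle Inf differently
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Variables (not literals) keep the repetition from being folded at compile time.
my $negative = -3;
my $infinite = 9**9**9;    # assumed to overflow to Inf on this build

my $a_str = "ab" x $negative;   # should warn: Negative repeat count does nothing
my $b_str = "ab" x $infinite;   # should warn: Non-finite repeat count does nothing

printf "results: '%s' '%s'\n", $a_str, $b_str;
print  "warning: $_" for @warnings;
```

Both warnings belong to the `numeric` category, so `no warnings 'numeric';` in the enclosing scope should silence them if the zero-length result is intended.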
 <dt>Non-hex character in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in 
m/%s/</dt>
 <dd><a 
name="perldiag-Non_002dhex-character-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) In a regular expression, there was a non-hexadecimal character where
@@ -20356,7 +20864,7 @@
 <dd><a 
name="perldiag-No-package-name-allowed-for-variable-_0025s-in-_0022our_0022"></a>
 <p>(F) Fully qualified variable names are not allowed in &quot;our&quot;
 declarations, because that doesn&rsquo;t make much sense under existing
-semantics.  Such syntax is reserved for future extensions.
+rules.  Such syntax is reserved for future extensions.
 </p>
 </dd>
 <dt>No Perl script found in input</dt>
@@ -20504,12 +21012,6 @@
 need to be added to UTC to get local time.
 </p>
 </dd>
-<dt>Null filename used</dt>
-<dd><a name="perldiag-Null-filename-used"></a>
-<p>(F) You can&rsquo;t require the null filename, especially because on many
-machines that means the current directory!  See <a 
href="#perlfunc-require">perlfunc require</a>.
-</p>
-</dd>
 <dt>NULL OP IN RUN</dt>
 <dd><a name="perldiag-NULL-OP-IN-RUN"></a>
 <p>(S debugging) Some internal routine called run() with a null opcode
@@ -20596,7 +21098,7 @@
 imagine.  The sole exceptions to this are that zero padding will
 take place when going past the end of the string when either
 <code>sysread()</code>ing a file, or when seeking past the end of a scalar 
opened
-for I/O (in anticipation of future reads and to imitate the behaviour
+for I/O (in anticipation of future reads and to imitate the behavior
 with real files).
 </p>
 </dd>
@@ -20659,7 +21161,7 @@
 </dd>
 <dt>Operation &quot;%s&quot; returns its argument for non-Unicode code point 
0x%X</dt>
 <dd><a 
name="perldiag-Operation-_0022_0025s_0022-returns-its-argument-for-non_002dUnicode-code-point-0x_0025X"></a>
-<p>(S non_unicode) You performed an operation requiring Unicode semantics
+<p>(S non_unicode) You performed an operation requiring Unicode rules
 on a code point that is not in Unicode, so what it should do is not
 defined.  Perl has chosen to have it do nothing, and warn you.
 </p>
@@ -20673,9 +21175,9 @@
 <dt>Operation &quot;%s&quot; returns its argument for UTF-16 surrogate 
U+%X</dt>
 <dd><a 
name="perldiag-Operation-_0022_0025s_0022-returns-its-argument-for-UTF_002d16-surrogate-U_002b_0025X"></a>
 <p>(S surrogate) You performed an operation requiring Unicode
-semantics on a Unicode surrogate.  Unicode frowns upon the use
+rules on a Unicode surrogate.  Unicode frowns upon the use
 of surrogates for anything but storing strings in UTF-16, but
-semantics are (reluctantly) defined for the surrogates, and
+rules are (reluctantly) defined for the surrogates, and
 they are to do nothing for this operation.  Because the use of
 surrogates can be dangerous, Perl warns.
 </p>
@@ -20888,8 +21390,8 @@
 failure was caught.
 </p>
 </dd>
-<dt>panic: frexp</dt>
-<dd><a name="perldiag-panic_003a-frexp"></a>
+<dt>panic: frexp: %f</dt>
+<dd><a name="perldiag-panic_003a-frexp_003a-_0025f"></a>
 <p>(P) The library function frexp() failed, making printf(&quot;%f&quot;) 
impossible.
 </p>
 </dd>
@@ -20971,7 +21473,8 @@
 </dd>
 <dt>panic: pad_free po</dt>
 <dd><a name="perldiag-panic_003a-pad_005ffree-po"></a>
-<p>(P) An invalid scratch pad offset was detected internally.
+<p>(P) A zero scratch pad offset was detected internally.  An attempt was
+made to free a target that had not been allocated to begin with.
 </p>
 </dd>
 <dt>panic: pad_reset curpad, %p!=%p</dt>
@@ -20982,7 +21485,9 @@
 </dd>
 <dt>panic: pad_sv po</dt>
 <dd><a name="perldiag-panic_003a-pad_005fsv-po"></a>
-<p>(P) An invalid scratch pad offset was detected internally.
+<p>(P) A zero scratch pad offset was detected internally.  Most likely
+an operator needed a target but that target had not been allocated
+for whatever reason.
 </p>
 </dd>
 <dt>panic: pad_swipe curpad, %p!=%p</dt>
@@ -21139,6 +21644,12 @@
 redirected it with select().)
 </p>
 </dd>
+<dt>Perl API version %s of %s does not match %s</dt>
+<dd><a 
name="perldiag-Perl-API-version-_0025s-of-_0025s-does-not-match-_0025s"></a>
+<p>(F) The XS module in question was compiled against a different, incompatible
+version of Perl than the one that has loaded the XS module.
+</p>
+</dd>
 <dt>Perl folding rules are not up-to-date for 0x%X; please use the perlbug  
utility to report; in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in 
m/%s/</dt>
 <dd><a 
name="perldiag-Perl-folding-rules-are-not-up_002dto_002ddate-for-0x_0025X_003b-please-use-the-perlbug-utility-to-report_003b-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(S regexp) You used a regular expression with case-insensitive matching,
@@ -21147,6 +21658,15 @@
 Please report this as a bug using the <a href="perlbug.html#Top">(perlbug)</a> 
utility.
 </p>
 </dd>
+<dt>PerlIO layer &rsquo;:win32&rsquo; is experimental</dt>
+<dd><a name="perldiag-PerlIO-layer-_0027_003awin32_0027-is-experimental"></a>
+<p>(S experimental::win32_perlio) The <code>:win32</code> PerlIO layer is
+experimental.  If you want to take the risk of using this layer,
+simply disable this warning:
+</p>
+<pre class="verbatim">    no warnings &quot;experimental::win32_perlio&quot;;
+</pre>
+</dd>
 <dt>Perl_my_%s() not available</dt>
 <dd><a name="perldiag-Perl_005fmy_005f_0025s_0028_0029-not-available"></a>
 <p>(F) Your platform has very uncommon byte-order and integer size,
@@ -21259,7 +21779,7 @@
 <pre class="verbatim">    no warnings &quot;experimental::autoderef&quot;;
 </pre>
 </dd>
-<dt>POSIX class [:%s:] unknown in regex; marked by 
&lt;&ndash;&nbsp;HERE&nbsp;in&nbsp;m/%s/<!-- /@w --></dt>
+<dt>POSIX class [:%s:] unknown in regex; marked by &lt;&ndash;&nbsp;HERE<!-- 
/@w --> in m/%s/</dt>
 <dd><a 
name="perldiag-POSIX-class-_005b_003a_0025s_003a_005d-unknown-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) The class in the character class [: :] syntax is unknown.  The 
&lt;&ndash;&nbsp;HERE<!-- /@w -->
 shows whereabouts in the regular expression the problem was discovered.
@@ -21382,8 +21902,8 @@
 <pre class="verbatim">    sub { 1 if die; }
 </pre>
 </dd>
-<dt>Possible precedence problem on bitwise %c operator</dt>
-<dd><a 
name="perldiag-Possible-precedence-problem-on-bitwise-_0025c-operator"></a>
+<dt>Possible precedence problem on bitwise %s operator</dt>
+<dd><a 
name="perldiag-Possible-precedence-problem-on-bitwise-_0025s-operator"></a>
 <p>(W precedence) Your program uses a bitwise logical operator in conjunction
 with a numeric comparison operator, like this:
 </p>
@@ -21513,31 +22033,6 @@
 from the attribute before it&rsquo;s ever used.
 </p>
 </dd>
-<dt>\p{} uses Unicode rules, not locale rules</dt>
-<dd><a 
name="perldiag-_005cp_007b_007d-uses-Unicode-rules_002c-not-locale-rules"></a>
-<p>(W) You compiled a regular expression that contained a Unicode property
-match (<code>\p</code> or <code>\P</code>), but the regular expression is also 
being told to
-use the run-time locale, not Unicode.  Instead, use a POSIX character
-class, which should know about the locale&rsquo;s rules.
-(See <a href="#perlrecharclass-POSIX-Character-Classes">perlrecharclass POSIX 
Character Classes</a>.)
-</p>
-<p>Even if the run-time locale is ISO 8859-1 (Latin1), which is a subset of
-Unicode, some properties will give results that are not valid for that
-subset.
-</p>
-<p>Here are a couple of examples to help you see what&rsquo;s going on.  If the
-locale is ISO 8859-7, the character at code point 0xD7 is the &quot;GREEK
-CAPITAL LETTER CHI&quot;.  But in Unicode that code point means the
-&quot;MULTIPLICATION SIGN&quot; instead, and <code>\p</code> always uses the 
Unicode
-meaning.  That means that <code>\p{Alpha}</code> won&rsquo;t match, but 
<code>[[:alpha:]]</code>
-should.  Only in the Latin1 locale are all the characters in the same
-positions as they are in Unicode.  But, even here, some properties give
-incorrect results.  An example is <code>\p{Changes_When_Uppercased}</code> 
which
-is true for &quot;LATIN SMALL LETTER Y WITH DIAERESIS&quot;, but since the 
upper
-case of that character is not in Latin1, in that locale it doesn&rsquo;t
-change when upper cased.
-</p>
-</dd>
 <dt>push on reference is experimental</dt>
 <dd><a name="perldiag-push-on-reference-is-experimental"></a>
 <p>(S experimental::autoderef) <code>push</code> with a scalar argument is 
experimental
@@ -21547,7 +22042,7 @@
 <pre class="verbatim">    no warnings &quot;experimental::autoderef&quot;;
 </pre>
 </dd>
-<dt>Quantifier follows nothing in regex; marked by 
&lt;&ndash;&nbsp;HERE&nbsp;in&nbsp;m/%s/<!-- /@w --></dt>
+<dt>Quantifier follows nothing in regex; marked by &lt;&ndash;&nbsp;HERE<!-- 
/@w --> in m/%s/</dt>
 <dd><a 
name="perldiag-Quantifier-follows-nothing-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) You started a regular expression with a quantifier.  Backslash it if
 you meant it literally.  The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows 
whereabouts in the regular
@@ -21570,17 +22065,14 @@
 want your regexp to match something 0 times, just put {0}.
 </p>
 </dd>
-<dt>Quantifier unexpected on zero-length expression in regex; marked by 
&lt;&ndash;  HERE in m/%s/</dt>
-<dd><a 
name="perldiag-Quantifier-unexpected-on-zero_002dlength-expression-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<dt>Quantifier unexpected on zero-length expression in regex m/%s/</dt>
+<dd><a 
name="perldiag-Quantifier-unexpected-on-zero_002dlength-expression-in-regex-m_002f_0025s_002f"></a>
 <p>(W regexp) You applied a regular expression quantifier in a place where
 it makes no sense, such as on a zero-width assertion.  Try putting the
 quantifier inside the assertion instead.  For example, the way to match
 &quot;abc&quot; provided that it is followed by three repetitions of 
&quot;xyz&quot; is
 <code>/abc(?=(?:xyz){3})/</code>, not <code>/abc(?=xyz){3}/</code>.
 </p>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the 
problem was
-discovered.
-</p>
 </dd>
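Reviewer note: the two patterns contrasted in the "Quantifier unexpected on zero-length expression" entry can be compared directly. A minimal sketch using only the pattern from the entry; the test strings are made up for illustration.

```perl
use strict;
use warnings;

# Quantifier placed inside the lookahead, as the entry recommends.
my $re = qr/abc(?=(?:xyz){3})/;

for my $text ('abcxyzxyzxyz', 'abcxyzxyz') {
    if ($text =~ $re) {
        # The lookahead is zero-width, so only "abc" is consumed.
        print "$text: matched '$&'\n";
    } else {
        print "$text: no match\n";
    }
}
```

The first string has three repetitions of "xyz" after "abc" and matches; the second has only two and does not.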
 <dt>Range iterator outside integer range</dt>
 <dd><a name="perldiag-Range-iterator-outside-integer-range"></a>
@@ -21590,6 +22082,45 @@
 by prepending &quot;0&quot; to your numbers.
 </p>
 </dd>
+<dt>Ranges of ASCII printables should be some subset of &quot;0-9&quot;, 
&quot;A-Z&quot;, or &quot;a-z&quot; in regex; marked by 
&lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a 
name="perldiag-Ranges-of-ASCII-printables-should-be-some-subset-of-_00220_002d9_0022_002c-_0022A_002dZ_0022_002c-or-_0022a_002dz_0022-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(W regexp) (only under <code>use&nbsp;re&nbsp;'strict'<!-- /@w --></code> 
or within <code>(?[...])</code>)
+</p>
+<p>Stricter rules help to find typos and other errors.  Perhaps you didn&rsquo;t
+even intend a range here, if the <code>&quot;-&quot;</code> was meant to be some other
+character, or should have been escaped (like <code>&quot;\-&quot;</code>).  If you did
+intend a range, the one that was used is not portable between ASCII and
+EBCDIC platforms, and doesn&rsquo;t have an obvious meaning to a casual
+reader.
+</p>
+<pre class="verbatim"> [3-7]    # OK; Obvious and portable
+ [d-g]    # OK; Obvious and portable
+ [A-Y]    # OK; Obvious and portable
+ [A-z]    # WRONG; Not portable; not clear what is meant
+ [a-Z]    # WRONG; Not portable; not clear what is meant
+ [%-.]    # WRONG; Not portable; not clear what is meant
+ [\x41-Z] # WRONG; Not portable; not obvious to non-geek
+</pre>
+<p>(You can force portability by specifying a Unicode range, which means that
+the endpoints are specified by
+<a href="#perlrecharclass-Character-Ranges"><code>\N{...}</code></a>, but the 
meaning may
+still not be obvious.)
+The stricter rules require that ranges that start or stop with an ASCII
+character that is not a control have all their endpoints be the literal
+character, and not some escape sequence (like <code>&quot;\x41&quot;</code>), 
and the ranges
+must be all digits, or all uppercase letters, or all lowercase letters.
+</p>
+</dd>
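Reviewer note: the rule stated in the "Ranges of ASCII printables" entry (both endpoints literal, and both digits, both uppercase, or both lowercase) can be approximated in plain Perl. This is a rough sketch only: `range_is_obvious` is a hypothetical helper, not part of Perl or the `re` pragma, and it covers literal single-character ASCII endpoints, not escapes such as `\x41` or `\N{...}`.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented rule for literal endpoints.
sub range_is_obvious {
    my ($lo, $hi) = @_;
    return 0 unless length($lo) == 1 && length($hi) == 1;
    return 0 if ord($lo) > ord($hi);                       # reversed range
    return 1 if $lo =~ /\A[0-9]\z/ && $hi =~ /\A[0-9]\z/;  # all digits
    return 1 if $lo =~ /\A[A-Z]\z/ && $hi =~ /\A[A-Z]\z/;  # all uppercase
    return 1 if $lo =~ /\A[a-z]\z/ && $hi =~ /\A[a-z]\z/;  # all lowercase
    return 0;                                              # mixed groups
}

for my $pair (['3', '7'], ['d', 'g'], ['A', 'Y'], ['A', 'z'], ['%', '.']) {
    my ($lo, $hi) = @$pair;
    printf "[%s-%s] %s\n", $lo, $hi,
        range_is_obvious($lo, $hi) ? 'OK' : 'WRONG';
}
```

The classification agrees with the OK/WRONG table in the entry for the literal cases listed there; the real check is done by the regex compiler under `use re 'strict'` or inside `(?[...])`.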
+<dt>Ranges of digits should be from the same group in regex; marked by 
&lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a 
name="perldiag-Ranges-of-digits-should-be-from-the-same-group-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(W regexp) (only under <code>use&nbsp;re&nbsp;'strict'<!-- /@w --></code> 
or within <code>(?[...])</code>)
+</p>
+<p>Stricter rules help to find typos and other errors.  You included a
+range, and at least one of the end points is a decimal digit.  Under the
+stricter rules, when this happens, both end points should be digits in
+the same group of 10 consecutive digits.
+</p>
+</dd>
 <dt>readdir() attempted on invalid dirhandle %s</dt>
 <dd><a 
name="perldiag-readdir_0028_0029-attempted-on-invalid-dirhandle-_0025s"></a>
 <p>(W io) The dirhandle you&rsquo;re reading from is either closed or not 
really
@@ -21645,6 +22176,14 @@
 crude check that bails out after 100 levels of <code>@ISA</code> depth.
 </p>
 </dd>
+<dt>Redundant argument in %s</dt>
+<dd><a name="perldiag-Redundant-argument-in-_0025s"></a>
+<p>(W redundant) You called a function with more arguments than were needed,
+as indicated by the other arguments you supplied.  Currently only
+emitted when a printf-type format required fewer arguments than were
+supplied, but might be used in the future for e.g. &lsquo;perlfunc pack&rsquo;.
+</p>
+</dd>
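Reviewer note: a printf-type call with a surplus argument is the only case the "Redundant argument in %s" entry says is currently reported. A minimal sketch, assuming Perl 5.22 or later for the warning itself; the format and arguments are made up for illustration.

```perl
use strict;
use warnings;

# Collect warnings so the message can be shown after the call.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Two conversions, three arguments: "c" is never consumed.
my $out = sprintf "%s-%s", "a", "b", "c";

print "formatted: $out\n";
print "warning: $_" for @warnings;   # expected on 5.22+, empty on older perls
```

The formatted result is unaffected by the extra argument. The category is `redundant`, so `no warnings 'redundant';` should suppress the message where surplus arguments are deliberate.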
 <dt>refcnt_dec: fd %d%s</dt>
 <dd><a name="perldiag-refcnt_005fdec_003a-fd-_0025d_0025s"></a>
 </dd>
@@ -21691,7 +22230,7 @@
 you wanted to have the character with ordinal 7 inserted into the regular
 expression, prepend zeroes to make it three digits long: <code>\007</code>
 </p>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the 
problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular 
expression the problem was
 discovered.
 </p>
 </dd>
@@ -21702,7 +22241,7 @@
 such as <code>(?'NAME'...)</code> or <code>(?&lt;NAME&gt;...)</code>.  Check 
if the name has been
 spelled correctly both in the backreference and the declaration.
 </p>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the 
problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular 
expression the problem was
 discovered.
 </p>
 </dd>
@@ -21712,7 +22251,7 @@
 are not at least seven sets of closed capturing parentheses in the
 expression before where the <code>\g{-7}</code> was located.
 </p>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the 
problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular 
expression the problem was
 discovered.
 </p>
 </dd>
@@ -21853,17 +22392,6 @@
 misparsed by pre-5.10.0 Perls as a non-terminated search pattern.
 </p>
 </dd>
-<dt>Search pattern not terminated or ternary operator parsed as search 
pattern</dt>
-<dd><a 
name="perldiag-Search-pattern-not-terminated-or-ternary-operator-parsed-as-search-pattern"></a>
-<p>(F) The lexer couldn&rsquo;t find the final delimiter of a 
<code>?PATTERN?</code>
-construct.
-</p>
-<p>The question mark is also used as part of the ternary operator (as in
-<code>foo ? 0 : 1</code>) leading to some ambiguous constructions being wrongly
-parsed.  One way to disambiguate the parsing is to put parentheses around
-the conditional expression, i.e. <code>(foo) ? 0 : 1</code>.
-</p>
-</dd>
 <dt>seekdir() attempted on invalid dirhandle %s</dt>
 <dd><a 
name="perldiag-seekdir_0028_0029-attempted-on-invalid-dirhandle-_0025s"></a>
 <p>(W io) The dirhandle you are doing a seekdir() on is either closed or not
@@ -21910,6 +22438,15 @@
 before now.  Check your control flow.
 </p>
 </dd>
+<dt>Sequence &quot;\c{&quot; invalid</dt>
+<dd><a name="perldiag-Sequence-_0022_005cc_007b_0022-invalid"></a>
+<p>(F) These three characters may not appear in sequence in a
+double-quotish context.  This message is raised only on non-ASCII
+platforms (a different error message is output on ASCII ones).  If you
+were intending to specify a control character with this sequence, you&rsquo;ll
+have to use a different way to specify it.
+</p>
+</dd>
 <dt>Sequence (? incomplete in regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w 
--> in m/%s/</dt>
 <dd><a 
name="perldiag-Sequence-_0028_003f-incomplete-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) A regular expression ended with an incomplete extension (?.  The
@@ -22055,9 +22592,15 @@
 &lsquo;perlfunc setsockopt&rsquo;.
 </p>
 </dd>
+<dt>Setting ${^ENCODING} is deprecated</dt>
+<dd><a name="perldiag-Setting-_0024_007b_005eENCODING_007d-is-deprecated"></a>
+<p>(D deprecated) You assigned a non-<code>undef</code> value to 
<code>${^ENCODING}</code>.
+This is deprecated; see <code><a 
href="#perlvar-_0024_007b_005eENCODING_007d">perlvar ${^ENCODING}</a></code> 
for details.
+</p>
+</dd>
 <dt>Setting $/ to a reference to %s as a form of slurp is deprecated, treating 
as undef</dt>
 <dd><a 
name="perldiag-Setting-_0024_002f-to-a-reference-to-_0025s-as-a-form-of-slurp-is-deprecated_002c-treating-as-undef"></a>
-<p>(W deprecated) You assigned a reference to a scalar to <code>$/</code> 
where the
+<p>(D deprecated) You assigned a reference to a scalar to <code>$/</code> 
where the
 referenced item is not a positive integer.  In older perls this 
<strong>appeared</strong>
 to work the same as setting it to <code>undef</code> but was in fact internally
 different, less efficient and with very bad luck could have resulted in
@@ -22102,12 +22645,6 @@
 operators: probably not what you intended.
 </p>
 </dd>
-<dt>&lt;&gt; should be quotes</dt>
-<dd><a name="perldiag-_003c_003e-should-be-quotes"></a>
-<p>(F) You wrote <code>require &lt;file&gt;</code> when you should have written
-<code>require 'file'</code>.
-</p>
-</dd>
 <dt>/%s/ should probably be written as &quot;%s&quot;</dt>
 <dd><a 
name="perldiag-_002f_0025s_002f-should-probably-be-written-as-_0022_0025s_0022"></a>
 <p>(W syntax) You have used a pattern where Perl expected to find a string,
@@ -22221,6 +22758,12 @@
 a block by itself.
 </p>
 </dd>
+<dt>&quot;state&quot; subroutine %s can&rsquo;t be in a package</dt>
+<dd><a 
name="perldiag-_0022state_0022-subroutine-_0025s-can_0027t-be-in-a-package"></a>
+<p>(F) Lexically scoped subroutines aren&rsquo;t in a package, so it 
doesn&rsquo;t make
+sense to try to declare one with a package qualifier on the front.
+</p>
+</dd>
 <dt>&quot;state %s&quot; used in sort comparison</dt>
 <dd><a name="perldiag-_0022state-_0025s_0022-used-in-sort-comparison"></a>
 <p>(W syntax) The package variables $a and $b are used for sort comparisons.
@@ -22307,6 +22850,26 @@
     }
 </pre>
 </dd>
+<dt>Subroutine &quot;%s&quot; will not stay shared</dt>
+<dd><a name="perldiag-Subroutine-_0022_0025s_0022-will-not-stay-shared"></a>
+<p>(W closure) An inner (nested) <em>named</em> subroutine is referencing a 
&quot;my&quot;
+subroutine defined in an outer named subroutine.
+</p>
+<p>When the inner subroutine is called, it will see the value of the outer
+subroutine&rsquo;s lexical subroutine as it was before and during the *first*
+call to the outer subroutine; in this case, after the first call to the
+outer subroutine is complete, the inner and outer subroutines will no
+longer share a common value for the lexical subroutine.  In other words,
+it will no longer be shared.  This will especially make a difference
+if the lexical subroutine accesses lexical variables declared in its
+surrounding scope.
+</p>
+<p>This problem can usually be solved by making the inner subroutine
+anonymous, using the <code>sub {}</code> syntax.  When inner anonymous subs 
that
+reference lexical subroutines in outer subroutines are created, they
+are automatically rebound to the current values of such lexical subs.
+</p>
+</dd>
 <dt>Substitution loop</dt>
 <dd><a name="perldiag-Substitution-loop"></a>
 <p>(P) The substitution was looping infinitely.  (Obviously, a substitution
@@ -22378,10 +22941,17 @@
  (R&amp;NAME)           true if directly inside named capture
  (DEFINE)           always false; for defining named subpatterns
 </pre>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the 
problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular 
expression the problem was
 discovered.  See <a href="#perlre-NAME">perlre NAME</a>.
 </p>
 </dd>
+<dt>Switch (?(condition)... not terminated in regex; marked by 
&lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a 
name="perldiag-Switch-_0028_003f_0028condition_0029_002e_002e_002e-not-terminated-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(F) You omitted to close a (?(condition)...) block somewhere
+in the pattern.  Add a closing parenthesis in the appropriate
+position.  See <a href="#perlre-NAME">perlre NAME</a>.
+</p>
+</dd>
 <dt>switching effective %s is not implemented</dt>
 <dd><a name="perldiag-switching-effective-_0025s-is-not-implemented"></a>
 <p>(F) While under the <code>use filetest</code> pragma, we cannot switch the 
real
@@ -22499,6 +23069,19 @@
 from under another module inadvertently.  See <a 
href="#perlvar-_0024_005b">perlvar $[</a> and <a 
href="arybase.html#Top">(arybase)</a>.
 </p>
 </dd>
+<dt>The bitwise feature is experimental</dt>
+<dd><a name="perldiag-The-bitwise-feature-is-experimental"></a>
+<p>(S experimental::bitwise) This warning is emitted if you use bitwise
+operators (<code>&amp; | ^ ~ &amp;. |. ^. ~.</code>) with the 
&quot;bitwise&quot; feature enabled.
+Simply suppress the warning if you want to use the feature, but know
+that in doing so you are taking the risk of using an experimental
+feature which may change or be removed in a future Perl version:
+</p>
+<pre class="verbatim">    no warnings &quot;experimental::bitwise&quot;;
+    use feature &quot;bitwise&quot;;
+    $x |.= $y;
+</pre>
+</dd>
 <dt>The crypt() function is unimplemented due to excessive paranoia.</dt>
 <dd><a 
name="perldiag-The-crypt_0028_0029-function-is-unimplemented-due-to-excessive-paranoia_002e"></a>
 <p>(F) Configure couldn&rsquo;t find the crypt() function on your machine,
@@ -22699,15 +23282,6 @@
 Backslash it.   See <a href="#perlre-NAME">perlre NAME</a>.
 </p>
 </dd>
-<dt>Trailing white-space in a charnames alias definition is deprecated</dt>
-<dd><a 
name="perldiag-Trailing-white_002dspace-in-a-charnames-alias-definition-is-deprecated"></a>
-<p>(D deprecated) You defined a character name which ended in a space
-character.  Remove the trailing space(s).  Usually these names are
-defined in the <code>:alias</code> import argument to <code>use 
charnames</code>, but they
-could be defined by a translator installed into <code>$^H{charnames}</code>.
-See <a href="charnames.html#CUSTOM-ALIASES">(charnames)CUSTOM ALIASES</a>.
-</p>
-</dd>
 <dt>Transliteration pattern not terminated</dt>
 <dd><a name="perldiag-Transliteration-pattern-not-terminated"></a>
 <p>(F) The lexer couldn&rsquo;t find the interior delimiter of a tr/// or 
tr[][]
@@ -22841,6 +23415,18 @@
 Check the #! line, or manually feed your script into Perl yourself.
 </p>
 </dd>
+<dt>Unescaped left brace in regex is deprecated, passed through in regex; 
marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a 
name="perldiag-Unescaped-left-brace-in-regex-is-deprecated_002c-passed-through-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(D deprecated, regexp) You used a literal <code>&quot;{&quot;</code> character in a regular
+expression pattern.  You should change to use <code>&quot;\{&quot;</code> instead, because a
+future version of Perl (tentatively v5.26) will consider this to be a
+syntax error.  If the pattern delimiters are also braces, any matching
+right brace (<code>&quot;}&quot;</code>) should also be escaped to avoid confusing the parser,
+for example,
+</p>
+<pre class="verbatim">    qr{abc\{def\}ghi}
+</pre>
+</dd>
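Reviewer note: the escaped form recommended by the "Unescaped left brace" entry can be checked against a literal string, and `quotemeta` is one way to get the escaping when the pattern is built from arbitrary text. A minimal sketch; the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# The escaped pattern from the entry, with brace delimiters.
my $re = qr{abc\{def\}ghi};
print "literal braces matched\n" if 'abc{def}ghi' =~ $re;

# quotemeta() backslashes the brace when a pattern is built from plain text.
my $needle  = 'x{y';
my $escaped = quotemeta $needle;
print "escaped needle matched\n" if 'wx{yz' =~ /$escaped/;
```

Inside a pattern, `\Q...\E` applies the same escaping as `quotemeta` to an interpolated variable.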
 <dt>unexec of %s into %s failed!</dt>
 <dd><a name="perldiag-unexec-of-_0025s-into-_0025s-failed_0021"></a>
 <p>(F) The unexec() routine failed for some reason.  See your local FSF
@@ -22915,16 +23501,16 @@
 <dt>Unicode non-character U+%X is illegal for open interchange</dt>
 <dd><a 
name="perldiag-Unicode-non_002dcharacter-U_002b_0025X-is-illegal-for-open-interchange"></a>
 <p>(S nonchar) Certain codepoints, such as U+FFFE and U+FFFF, are
-defined by the Unicode standard to be non-characters.  Those are
-legal codepoints, but are reserved for internal use; so, applications
-shouldn&rsquo;t attempt to exchange them.  An application may not be
-expecting any of these characters at all, and receiving them
-may lead to bugs.  If you know what you are doing
-you can turn off this warning by <code>no warnings 'nonchar';</code>.
-</p>
-<p>This is not really a &quot;serious&quot; error, but it is supposed to be 
raised
-by default even if warnings are not enabled, and currently the only
-way to do that in Perl is to mark it as serious.
+defined by the Unicode standard to be non-characters.  Those
+are legal codepoints, but are reserved for internal use; so,
+applications shouldn&rsquo;t attempt to exchange them.  An application
+may not be expecting any of these characters at all, and receiving
+them may lead to bugs.  If you know what you are doing you can
+turn off this warning by <code>no warnings 'nonchar';</code>.
+</p>
+<p>This is not really a &quot;severe&quot; error, but it is supposed to be
+raised by default even if warnings are not enabled, and currently
+the only way to do that in Perl is to mark it as severe.
 </p>
 </dd>
 <dt>Unicode surrogate U+%X is illegal in UTF-8</dt>
@@ -23017,7 +23603,7 @@
  (R&amp;NAME)           true if directly inside named capture
  (DEFINE)           always false; for defining named subpatterns
 </pre>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the 
problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular 
expression the problem was
 discovered.  See <a href="#perlre-NAME">perlre NAME</a>.
 </p>
 </dd>
@@ -23330,7 +23916,7 @@
 </p>
 <pre class="verbatim">    if ($string =~ /$pattern/) { ... }
 </pre>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the 
problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular 
expression the problem was
 discovered.  See <a href="#perlre-NAME">perlre NAME</a>.
 </p>
 </dd>
@@ -23352,10 +23938,18 @@
 </p>
 <pre class="verbatim">    if ($string =~ /$pattern/o) { ... }
 </pre>
-<p>The &lt;&ndash; HERE shows whereabouts in the regular expression the 
problem was
+<p>The &lt;&ndash;&nbsp;HERE<!-- /@w --> shows whereabouts in the regular 
expression the problem was
 discovered.  See <a href="#perlre-NAME">perlre NAME</a>.
 </p>
 </dd>
+<dt>Useless use of attribute &quot;const&quot;</dt>
+<dd><a name="perldiag-Useless-use-of-attribute-_0022const_0022"></a>
+<p>(W misc) The &quot;const&quot; attribute has no effect except
+on anonymous closure prototypes.  You applied it to
+a subroutine via <a href="attributes.html#Top">(attributes)attributes.pm</a>.  This is only useful
+inside an attribute handler for an anonymous subroutine.
+</p>
+</dd>
 <dt>Useless use of /d modifier in transliteration operator</dt>
 <dd><a 
name="perldiag-Useless-use-of-_002fd-modifier-in-transliteration-operator"></a>
 <p>(W misc) You have used the /d modifier where the searchlist has the
@@ -23363,32 +23957,6 @@
 about the /d modifier.
 </p>
 </dd>
-<dt>Useless use of &rsquo;\&rsquo;; doesn&rsquo;t escape metacharacter 
&rsquo;%c&rsquo;</dt>
-<dd><a 
name="perldiag-Useless-use-of-_0027_005c_0027_003b-doesn_0027t-escape-metacharacter-_0027_0025c_0027"></a>
-<p>(D deprecated) You wrote a regular expression pattern something like
-one of these:
-</p>
-<pre class="verbatim"> m{ \x\{FF\} }x
- m{foo\{1,3\}}
- qr(foo\(bar\))
- s[foo\[a-z\]bar][baz]
-</pre>
-<p>The interior braces, square brackets, and parentheses are treated as
-metacharacters even though they are backslashed; instead write:
-</p>
-<pre class="verbatim"> m{ \x{FF} }x
- m{foo{1,3}}
- qr(foo(bar))
- s[foo[a-z]bar][baz]
-</pre>
-<p>The backslashes have no effect when a regular expression pattern is
-delimited by <code>{}</code>, <code>[]</code>, or <code>()</code>, which 
ordinarily are
-metacharacters, and the delimiters are also used, paired, within the
-interior of the pattern.  It is planned that a future Perl release will
-change the meaning of constructs like these so that the backslashes
-will have an effect, so remove them from your code.
-</p>
-</dd>
 <dt>Useless use of \E</dt>
 <dd><a name="perldiag-Useless-use-of-_005cE"></a>
 <p>(W misc) You have a \E in a double-quotish string without a <code>\U</code>,
@@ -23497,6 +24065,16 @@
 here-document.
 </p>
 </dd>
+<dt>Use of \b{} for non-UTF-8 locale is wrong.  Assuming a UTF-8 locale</dt>
+<dd><a 
name="perldiag-Use-of-_005cb_007b_007d-for-non_002dUTF_002d8-locale-is-wrong_002e-Assuming-a-UTF_002d8-locale"></a>
+<p>(W locale)  You are matching a regular expression using locale rules,
+and a Unicode boundary is being matched, but the locale is not a Unicode
+one.  This doesn&rsquo;t make sense.  Perl will continue, assuming a Unicode
+(UTF-8) locale, but the results could well be wrong unless the locale
+happens to be ISO-8859-1 (Latin1), in which case this message is spurious
+and can be ignored.
+</p>
+</dd>
 <dt>Use of chdir(&rdquo;) or chdir(undef) as chdir() deprecated</dt>
 <dd><a 
name="perldiag-Use-of-chdir_0028_0027_0027_0029-or-chdir_0028undef_0029-as-chdir_0028_0029-deprecated"></a>
 <p>(D deprecated) chdir() with no arguments is documented to change to
@@ -23618,10 +24196,19 @@
 </dd>
 <dt>Use of literal control characters in variable names is deprecated</dt>
 <dd><a 
name="perldiag-Use-of-literal-control-characters-in-variable-names-is-deprecated"></a>
-<p>(D deprecated) Using literal control characters in the source to refer
-to the ^FOO variables, like <code>$^X</code> and <code>${^GLOBAL_PHASE}</code> 
is now
-deprecated.  This only affects code like <code>$\cT</code>, where \cT is a 
control in
-the source code: <code>${&quot;\cT&quot;}</code> and <code>$^T</code> remain 
valid.
+</dd>
+<dt>Use of literal non-graphic characters in variable names is deprecated</dt>
+<dd><a 
name="perldiag-Use-of-literal-non_002dgraphic-characters-in-variable-names-is-deprecated"></a>
+<p>(D deprecated) Using literal non-graphic (including control)
+characters in the source to refer to the ^FOO variables, like <code>$^X</code> and
+<code>${^GLOBAL_PHASE}</code> is now deprecated.  (We use <code>^X</code> and <code>^G</code> here for
+legibility.  They actually represent the non-printable control
+characters, code points 0x18 and 0x07, respectively; <code>^A</code> would mean
+the control character whose code point is 0x01.) This only affects
+code like <code>$\cT</code>, where <code>\cT</code> is a control in the source code; <code>${&quot;\cT&quot;}</code> and
+<code>$^T</code> remain valid.  Things that are non-controls and also not graphic
+are NO-BREAK SPACE and SOFT HYPHEN, which were previously only allowed
+for historical reasons.
 </p>
 </dd>
 <dt>Use of -l on filehandle%s</dt>
@@ -23650,16 +24237,6 @@
 message, you must be using an older version.
 </p>
 </dd>
-<dt>Use of ?PATTERN? without explicit operator is deprecated</dt>
-<dd><a 
name="perldiag-Use-of-_003fPATTERN_003f-without-explicit-operator-is-deprecated"></a>
-<p>(D deprecated) You have written something like <code>?\w?</code>, for a 
regular
-expression that matches only once.  Starting this term directly with
-the question mark delimiter is now deprecated, so that the question mark
-will be available for use in new operators in the future.  Write 
<code>m?\w?</code>
-instead, explicitly using the <code>m</code> operator: the question mark 
delimiter
-still invokes match-once behaviour.
-</p>
-</dd>
 <dt>Use of reference &quot;%s&quot; as array index</dt>
 <dd><a name="perldiag-Use-of-reference-_0022_0025s_0022-as-array-index"></a>
 <p>(W misc) You tried to use a reference as an array index; this probably
@@ -23704,6 +24281,15 @@
 your program.
 </p>
 </dd>
+<dt>&quot;use re &rsquo;strict&rsquo;&quot; is experimental</dt>
+<dd><a name="perldiag-_0022use-re-_0027strict_0027_0022-is-experimental"></a>
+<p>(S experimental::re_strict) The things that are different when a regular
+expression pattern is compiled under <code>'strict'</code> are subject to change
+in future Perl releases in incompatible ways.  This means that a pattern
+that compiles today may not in a future Perl release.  This warning is
+to alert you to that risk.
+</p>
+</dd>
 <dt>Use \x{...} for more than two hex characters in regex; marked by 
&lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
 <dd><a 
name="perldiag-Use-_005cx_007b_002e_002e_002e_007d-for-more-than-two-hex-characters-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
 <p>(F) In a regular expression, you said something like
@@ -23721,33 +24307,32 @@
 <p>You need to add either braces or blanks to disambiguate.
 </p>
 </dd>
-<dt>Using a hash as a reference is deprecated</dt>
-<dd><a name="perldiag-Using-a-hash-as-a-reference-is-deprecated"></a>
-<p>(D deprecated) You tried to use a hash as a reference, as in
-<code>%foo-&gt;{&quot;bar&quot;}</code> or 
<code>%$ref-&gt;{&quot;hello&quot;}</code>.  Versions of perl &lt;= 5.6.1
-used to allow this syntax, but shouldn&rsquo;t have.  It is now
-deprecated, and will be removed in a future version.
-</p>
-</dd>
-<dt>Using an array as a reference is deprecated</dt>
-<dd><a name="perldiag-Using-an-array-as-a-reference-is-deprecated"></a>
-<p>(D deprecated) You tried to use an array as a reference, as in
-<code>@foo-&gt;[23]</code> or <code>@$ref-&gt;[99]</code>.  Versions of perl 
&lt;= 5.6.1 used to
-allow this syntax, but shouldn&rsquo;t have.  It is now deprecated,
-and will be removed in a future version.
-</p>
-</dd>
 <dt>Using just the first character returned by \N{} in character class in  
regex; marked by &lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
 <dd><a 
name="perldiag-Using-just-the-first-character-returned-by-_005cN_007b_007d-in-character-class-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
-<p>(W regexp) A charnames handler may return a sequence of more than one
-character.  Currently all but the first one are discarded when used in
-a regular expression pattern bracketed character class.
+<p>(W regexp) Named Unicode character escapes <code>(\N{...})</code> may return
+a multi-character sequence.  Even though a character class is
+supposed to match just one character of input, perl will match
+the whole thing correctly, except when the class is inverted
+(<code>[^...]</code>), or the escape is the beginning or final end point of
+a range.  For these, what should happen isn&rsquo;t clear at all.  In
+these circumstances, Perl discards all but the first character
+of the returned sequence, which is not likely what you want.
+</p>
+</dd>
+<dt>Using /u for &rsquo;%s&rsquo; instead of /%s in regex; marked by 
&lt;&ndash;&nbsp;HERE<!-- /@w --> in m/%s/</dt>
+<dd><a 
name="perldiag-Using-_002fu-for-_0027_0025s_0027-instead-of-_002f_0025s-in-regex_003b-marked-by-_003c_002d_002d-HERE-in-m_002f_0025s_002f"></a>
+<p>(W regexp) You used a Unicode boundary (<code>\b{...}</code> or <code>\B{...}</code>) in a
+portion of a regular expression where the character set modifiers <code>/a</code>
+or <code>/aa</code> are in effect.  These two modifiers indicate an ASCII
+interpretation, and this doesn&rsquo;t make sense for a Unicode definition.
+The generated regular expression will compile so that the boundary uses
+all of Unicode.  No other portion of the regular expression is affected.
 </p>
 </dd>
 <dt>Using !~ with %s doesn&rsquo;t make sense</dt>
 <dd><a name="perldiag-Using-_0021_007e-with-_0025s-doesn_0027t-make-sense"></a>
 <p>(F) Using the <code>!~</code> operator with <code>s///r</code>, 
<code>tr///r</code> or <code>y///r</code> is
-currently reserved for future use, as the exact behaviour has not
+currently reserved for future use, as the exact behavior has not
 been decided.  (Simply returning the boolean opposite of the
 modified string is usually not particularly useful.)
 </p>
@@ -23929,6 +24514,15 @@
 space.
 </p>
 </dd>
+<dt>Warning: unable to close filehandle properly: %s</dt>
+<dd><a 
name="perldiag-Warning_003a-unable-to-close-filehandle-properly_003a-_0025s"></a>
+</dd>
+<dt>Warning: unable to close filehandle %s properly: %s</dt>
+<dd><a 
name="perldiag-Warning_003a-unable-to-close-filehandle-_0025s-properly_003a-_0025s"></a>
+<p>(S io) An error occurred when Perl implicitly closed a filehandle.  This
+usually indicates your file system ran out of disk space.
+</p>
+</dd>
 <dt>Warning: Use of &quot;%s&quot; without parentheses is ambiguous</dt>
 <dd><a 
name="perldiag-Warning_003a-Use-of-_0022_0025s_0022-without-parentheses-is-ambiguous"></a>
 <p>(S ambiguous) You wrote a unary operator followed by something that
@@ -23969,6 +24563,21 @@
 filehandle with an encoding, see <a href="open.html#Top">(open)</a> and 
&lsquo;perlfunc binmode&rsquo;.
 </p>
 </dd>
+<dt>Wide character (U+%X) in %s</dt>
+<dd><a name="perldiag-Wide-character-_0028U_002b_0025X_0029-in-_0025s"></a>
+<p>(W locale) While in a single-byte locale (<em>i.e.</em>, a non-UTF-8
+one), a multi-byte character was encountered.   Perl considers this
+character to be the specified Unicode code point.  Combining non-UTF-8
+locales and Unicode is dangerous.  Almost certainly some characters
+will have two different representations.  For example, in the ISO 8859-7
+(Greek) locale, the code point 0xC3 represents a Capital Gamma.  But so
+also does 0x393.  This will make string comparisons unreliable.
+</p>
+<p>You likely need to figure out how this multi-byte character got mixed up
+with your single-byte locale (or perhaps you thought you had a UTF-8
+locale, but Perl disagrees).
+</p>
+</dd>
 <dt>Within []-length &rsquo;%c&rsquo; not allowed</dt>
 <dd><a 
name="perldiag-Within-_005b_005d_002dlength-_0027_0025c_0027-not-allowed"></a>
 <p>(F) The count in the (un)pack template may be replaced by 
<code>[TEMPLATE]</code>
@@ -24120,9 +24729,9 @@
 <p>Perl lets us have complex data structures.  You can write something like
 this and all of a sudden, you&rsquo;d have an array with three dimensions!
 </p>
-<pre class="verbatim">    for $x (1 .. 10) {
-        for $y (1 .. 10) {
-            for $z (1 .. 10) {
+<pre class="verbatim">    for my $x (1 .. 10) {
+        for my $y (1 .. 10) {
+            for my $z (1 .. 10) {
                 $AoA[$x][$y][$z] =
                     $x ** $y + $z;
             }
@@ -24210,7 +24819,7 @@
 out your array in with a simple print() function, you&rsquo;ll get something
 that doesn&rsquo;t look very nice, like this:
 </p>
-<pre class="verbatim">    @AoA = ( [2, 3], [4, 5, 7], [0] );
+<pre class="verbatim">    my @AoA = ( [2, 3], [4, 5, 7], [0] );
     print $AoA[1][2];
   7
     print @AoA;
@@ -24237,8 +24846,8 @@
 repeatedly.  Here&rsquo;s the case where you just get the count instead
 of a nested array:
 </p>
-<pre class="verbatim">    for $i (1..10) {
-        @array = somefunc($i);
+<pre class="verbatim">    for my $i (1..10) {
+        my @array = somefunc($i);
         $AoA[$i] = @array;      # WRONG!
     }
 </pre>
@@ -24246,15 +24855,18 @@
 its element count.  If that&rsquo;s what you really and truly want, then you
 might do well to consider being a tad more explicit about it, like this:
 </p>
-<pre class="verbatim">    for $i (1..10) {
-        @array = somefunc($i);
+<pre class="verbatim">    for my $i (1..10) {
+        my @array = somefunc($i);
         $counts[$i] = scalar @array;
     }
 </pre>
 <p>Here&rsquo;s the case of taking a reference to the same memory location
 again and again:
 </p>
-<pre class="verbatim">    for $i (1..10) {
+<pre class="verbatim">    # Either without strict or having an outer-scope my @array;
+    # declaration.
+
+    for my $i (1..10) {
         @array = somefunc($i);
         $AoA[$i] = \@array;     # WRONG!
     }
@@ -24289,7 +24901,10 @@
 hash constructor <code>{}</code> instead.   Here&rsquo;s the right way to do 
the preceding
 broken code fragments:
 </p>
-<pre class="verbatim">    for $i (1..10) {
+<pre class="verbatim">    # Either without strict or having an outer-scope my @array;
+    # declaration.
+
+    for my $i (1..10) {
         @array = somefunc($i);
         $AoA[$i] = [ @array ];
     }
@@ -24301,7 +24916,9 @@
 <p>Note that this will produce something similar, but it&rsquo;s
 much harder to read:
 </p>
-<pre class="verbatim">    for $i (1..10) {
+<pre class="verbatim">    # Either without strict or having an outer-scope my @array;
+    # declaration.
+    for my $i (1..10) {
         @array = 0 .. $i;
         @{$AoA[$i]} = @array;
     }
@@ -24335,7 +24952,7 @@
 <p>Surprisingly, the following dangerous-looking construct will
 actually work out fine:
 </p>
-<pre class="verbatim">    for $i (1..10) {
+<pre class="verbatim">    for my $i (1..10) {
         my @array = somefunc($i);
         $AoA[$i] = \@array;
     }
@@ -24936,7 +25553,9 @@
 
 
  # print the whole thing sorted by number of members
- foreach $family ( sort { keys %{$HoH{$b}} &lt;=&gt; keys %{$HoH{$a}} } keys %HoH ) {
+ foreach $family ( sort { keys %{$HoH{$b}} &lt;=&gt; keys %{$HoH{$a}} }
+                                                             keys %HoH )
+ {
      print &quot;$family: { &quot;;
      for $role ( sort keys %{ $HoH{$family} } ) {
          print &quot;$role=$HoH{$family}{$role} &quot;;
@@ -24949,10 +25568,14 @@
  for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }
 
  # now print the whole thing sorted by number of members
- foreach $family ( sort { keys %{ $HoH{$b} } &lt;=&gt; keys %{ $HoH{$a} } } keys %HoH ) {
+ foreach $family ( sort { keys %{ $HoH{$b} } &lt;=&gt; keys %{ $HoH{$a} } }
+                                                             keys %HoH )
+ {
      print &quot;$family: { &quot;;
      # and print these according to rank order
-     for $role ( sort { $rank{$a} &lt;=&gt; $rank{$b} }  keys %{ $HoH{$family} } ) {
+     for $role ( sort { $rank{$a} &lt;=&gt; $rank{$b} }
+                                               keys %{ $HoH{$family} } )
+     {
          print &quot;$role=$HoH{$family}{$role} &quot;;
      }
      print &quot;}\n&quot;;
@@ -25588,16 +26211,53 @@
 <h3 class="section">19.2 DESCRIPTION</h3>
 
 <p>An exploration of some of the issues facing Perl programmers
-on EBCDIC based computers.  We do not cover localization,
-internationalization, or multi-byte character set issues other
-than some discussion of UTF-8 and UTF-EBCDIC.
+on EBCDIC based computers.
+</p>
+<p>Portions of this document that are still incomplete are marked with XXX.
 </p>
-<p>Portions that are still incomplete are marked with XXX.
+<p>Early Perl versions worked on some EBCDIC machines, but the last known
+version that ran on EBCDIC was v5.8.7, until v5.22, when the Perl core
+again works on z/OS.  Theoretically, it could work on OS/400 or Siemens&rsquo;
+BS2000 (or their successors), but this is untested.  In v5.22, not all of
+the modules that are found on CPAN but shipped with core Perl work on z/OS.
 </p>
-<p>Perl used to work on EBCDIC machines, but there are now areas of the code 
where
-it doesn&rsquo;t.  If you want to use Perl on an EBCDIC machine, please let us 
know
+<p>If you want to use Perl on a non-z/OS EBCDIC machine, please let us know
 by sending mail to address@hidden
 </p>
+<p>Writing Perl on an EBCDIC platform is really no different than writing
+on an <a href="#perlebcdic-ASCII">ASCII</a> one, but with different underlying numbers, as we&rsquo;ll see
+shortly.  You&rsquo;ll have to know something about those <a href="#perlebcdic-ASCII">ASCII</a> platforms
+because the documentation is biased and will frequently use example
+numbers that don&rsquo;t apply to EBCDIC.  There are also very few CPAN
+modules that are written for EBCDIC and which don&rsquo;t work on ASCII;
+instead the vast majority of CPAN modules are written for ASCII, and
+some may happen to work on EBCDIC, while a few have been designed to
+portably work on both.
+</p>
+<p>If your code just uses the 52 letters A-Z and a-z, plus SPACE, the
+digits 0-9, and the punctuation characters that Perl uses, plus a few
+controls that are denoted by escape sequences like <code>\n</code> and <code>\t</code>, then
+there&rsquo;s nothing special about using Perl, and your code may very well
+work on an ASCII machine without change.
+</p>
+<p>But if you write code that uses <code>\005</code> to mean a TAB or <code>\xC1</code> to mean
+an &quot;A&quot;, or <code>\xDF</code> to mean a &quot;&yuml;&quot; (small <code>&quot;y&quot;</code> with a diaeresis),
+then your code may well work on your EBCDIC platform, but not on an
+ASCII one.  That&rsquo;s fine to do if no one will ever want to run your code
+on an ASCII platform; but the bias in this document will be in writing
+code portable between EBCDIC and ASCII systems.  Again, if every
+character you care about is easily enterable from your keyboard, you
+don&rsquo;t have to know anything about ASCII, but many keyboards don&rsquo;t easily
+allow you to directly enter, say, the character <code>\xDF</code>, so you have to
+specify it indirectly, such as by using the <code>&quot;\xDF&quot;</code> escape sequence.
+In those cases it&rsquo;s easiest to know something about the ASCII/Unicode
+character sets.  If you know that the small &quot;&yuml;&quot; is <code>U+00FF</code>, then
+you can instead specify it as <code>&quot;\N{U+FF}&quot;</code>, and have the computer
+automatically translate it to <code>\xDF</code> on your platform, and leave it as
+<code>\xFF</code> on ASCII ones.  Or you could specify it by name, <code>\N{LATIN
+SMALL LETTER Y WITH DIAERESIS}</code> and not have to know the numbers.
+Either way works, but requires familiarity with Unicode.
+</p>
 <hr>
 <a name="perlebcdic-COMMON-CHARACTER-CODE-SETS"></a>
 <div class="header">
@@ -25618,11 +26278,9 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Unicode-code-points-versus-EBCDIC-code-points" 
accesskey="5">perlebcdic Unicode code points versus EBCDIC code 
points</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Remaining-Perl-Unicode-problems-in-EBCDIC" 
accesskey="6">perlebcdic Remaining Perl Unicode problems in 
EBCDIC</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a href="#perlebcdic-Unicode-and-UTF" 
accesskey="7">perlebcdic Unicode and UTF</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a href="#perlebcdic-Unicode-and-UTF" 
accesskey="6">perlebcdic Unicode and UTF</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a href="#perlebcdic-Using-Encode" 
accesskey="8">perlebcdic Using Encode</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a href="#perlebcdic-Using-Encode" 
accesskey="7">perlebcdic Using Encode</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
 </table>
 
@@ -25635,23 +26293,31 @@
 <a name="ASCII"></a>
 <h4 class="subsection">19.3.1 ASCII</h4>
 
-<p>The American Standard Code for Information Interchange (ASCII or US-ASCII) 
is a
-set of
-integers running from 0 to 127 (decimal) that imply character
-interpretation by the display and other systems of computers.
+<p>The American Standard Code for Information Interchange (ASCII or
+US-ASCII) is a set of
+integers running from 0 to 127 (decimal) that have standardized
+interpretations by the computers which use ASCII.  For example, 65 means
+the letter &quot;A&quot;.
 The range 0..127 can be covered by setting the bits in a 7-bit binary
 digit, hence the set is sometimes referred to as &quot;7-bit ASCII&quot;.
 ASCII was described by the American National Standards Institute
 document ANSI X3.4-1986.  It was also described by ISO 646:1991
 (with localization for currency symbols).  The full ASCII set is
-given in the table below as the first 128 elements.  Languages that
+given in the table <a href="#perlebcdic-recipe-3">below</a> as the first 128 elements.
+Languages that
 can be written adequately with the characters in ASCII include
 English, Hawaiian, Indonesian, Swahili and some Native American
 languages.
 </p>
-<p>There are many character sets that extend the range of integers
-from 0..2**7-1 up to 2**8-1, or 8 bit bytes (octets if you prefer).
-One common one is the ISO 8859-1 character set.
+<p>Most non-EBCDIC character sets are supersets of ASCII.  That is, the
+integers 0-127 mean what ASCII says they mean.  But integers 128 and
+above are specific to the character set.
+</p>
+<p>Many of these fit entirely into 8 bits, using ASCII as 0-127, while
+specifying what 128-255 mean, and not using anything above 255.
+Thus, these are single-byte (or octet if you prefer) character sets.
+One important one (since Unicode is a superset of it) is the ISO 8859-1
+character set.
 </p>
 <hr>
 <a name="perlebcdic-ISO-8859"></a>
@@ -25662,10 +26328,13 @@
 <a name="ISO-8859"></a>
 <h4 class="subsection">19.3.2 ISO 8859</h4>
 
-<p>The ISO 8859-$n are a collection of character code sets from the
-International Organization for Standardization (ISO), each of which
-adds characters to the ASCII set that are typically found in European
+<p>The ISO 8859-<em><strong>$n</strong></em> are a collection of character 
code sets from the
+International Organization for Standardization (ISO), each of which adds
+characters to the ASCII set that are typically found in various
 languages, many of which are based on the Roman, or Latin, alphabet.
+Most are for European languages, but there are also ones for Arabic,
+Greek, Hebrew, and Thai.  There are good references on the web about
+all these.
 </p>
 <hr>
 <a name="perlebcdic-Latin-1-_0028ISO-8859_002d1_0029"></a>
@@ -25685,7 +26354,7 @@
 German can use ISO 8859-1 but must do so without German-style
 quotation marks.  This set is based on Western European extensions
 to ASCII and is commonly encountered in world wide web work.
-In IBM character code set identification terminology ISO 8859-1 is
+In IBM character code set identification terminology, ISO 8859-1 is
 also known as CCSID 819 (or sometimes 0819 or even 00819).
 </p>
 <hr>
@@ -25699,12 +26368,15 @@
 
 <p>The Extended Binary Coded Decimal Interchange Code refers to a
 large collection of single- and multi-byte coded character sets that are
-different from ASCII or ISO 8859-1 and are all slightly different from each
-other; they typically run on host computers.  The EBCDIC encodings derive from
-8-bit byte extensions of Hollerith punched card encodings.  The layout on the
-cards was such that high bits were set for the upper and lower case alphabet
-characters [a-z] and [A-Z], but there were gaps within each Latin alphabet
-range.
+quite different from ASCII and ISO 8859-1, and are all slightly
+different from each other; they typically run on host computers.  The
+EBCDIC encodings derive from 8-bit byte extensions of Hollerith punched
+card encodings, which long predate ASCII.  The layout on the
+cards was such that high bits were set for the upper and lower case
+alphabetic
+characters <code>[a-z]</code> and <code>[A-Z]</code>, but there were gaps within each Latin
+alphabet range, visible in the table <a href="#perlebcdic-recipe-3">below</a>.  These gaps can
+cause complications.
 </p>
 <p>Some IBM EBCDIC character sets may be known by character code set
 identification numbers (CCSID numbers) or code page numbers.
@@ -25715,13 +26387,15 @@
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-The-13-variant-characters" accesskey="1">perlebcdic The 13 
variant characters</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-EBCDIC-code-sets-recognized-by-Perl" accesskey="2">perlebcdic 
EBCDIC code sets recognized by Perl</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
+</td></tr>
 </table>
 
 <hr>
 <a name="perlebcdic-The-13-variant-characters"></a>
 <div class="header">
 <p>
-Up: <a href="#perlebcdic-EBCDIC" accesskey="u" rel="up">perlebcdic EBCDIC</a> 
&nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlebcdic-EBCDIC-code-sets-recognized-by-Perl" accesskey="n" 
rel="next">perlebcdic EBCDIC code sets recognized by Perl</a>, Up: <a 
href="#perlebcdic-EBCDIC" accesskey="u" rel="up">perlebcdic EBCDIC</a> &nbsp; 
[<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="The-13-variant-characters"></a>
 <h4 class="subsubsection">19.3.4.1 The 13 variant characters</h4>
@@ -25732,13 +26406,21 @@
 </p>
 <pre class="verbatim">    \ [ ] { } ^ ~ ! # | $ @ `
 </pre>
-<p>When Perl is compiled for a platform, it looks at some of these characters 
to
+<p>When Perl is compiled for a platform, it looks at all of these characters to
 guess which EBCDIC character set the platform uses, and adapts itself
 accordingly to that platform.  If the platform uses a character set that is not
 one of the three Perl knows about, Perl will either fail to compile, or
 mistakenly and silently choose one of the three.
-They are:
 </p>
+<hr>
+<a name="perlebcdic-EBCDIC-code-sets-recognized-by-Perl"></a>
+<div class="header">
+<p>
+Previous: <a href="#perlebcdic-The-13-variant-characters" accesskey="p" 
rel="prev">perlebcdic The 13 variant characters</a>, Up: <a 
href="#perlebcdic-EBCDIC" accesskey="u" rel="up">perlebcdic EBCDIC</a> &nbsp; 
[<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+</div>
+<a name="EBCDIC-code-sets-recognized-by-Perl"></a>
+<h4 class="subsubsection">19.3.4.2 EBCDIC code sets recognized by Perl</h4>
+
 <dl compact="compact">
 <dt><strong>0037</strong></dt>
 <dd><a name="perlebcdic-0037"></a>
@@ -25746,7 +26428,7 @@
 characters (i.e. ISO 8859-1) to an EBCDIC set.  0037 is used
 in North American English locales on the OS/400 operating system
 that runs on AS/400 computers.  CCSID 0037 differs from ISO 8859-1
-in 237 places, in other words they agree on only 19 code point values.
+in 236 places; in other words they agree on only 20 code point values.
 </p>
 </dd>
 <dt><strong>1047</strong></dt>
@@ -25754,13 +26436,16 @@
 <p>Character code set ID 1047 is also a mapping of the ASCII plus
 Latin-1 characters (i.e. ISO 8859-1) to an EBCDIC set.  1047 is
 used under Unix System Services for OS/390 or z/OS, and OpenEdition
-for VM/ESA.  CCSID 1047 differs from CCSID 0037 in eight places.
+for VM/ESA.  CCSID 1047 differs from CCSID 0037 in eight places,
+and from ISO 8859-1 in 236.
 </p>
 </dd>
 <dt><strong>POSIX-BC</strong></dt>
 <dd><a name="perlebcdic-POSIX_002dBC"></a>
 <p>The EBCDIC code page in use on Siemens&rsquo; BS2000 system is distinct from
 1047 and 0037.  It is identified below as the POSIX-BC set.
+Like 0037 and 1047, it is the same as ISO 8859-1 in 20 code point
+values.
 </p>
 </dd>
 </dl>
@@ -25769,80 +26454,112 @@
 <a name="perlebcdic-Unicode-code-points-versus-EBCDIC-code-points"></a>
 <div class="header">
 <p>
-Next: <a href="#perlebcdic-Remaining-Perl-Unicode-problems-in-EBCDIC" 
accesskey="n" rel="next">perlebcdic Remaining Perl Unicode problems in 
EBCDIC</a>, Previous: <a href="#perlebcdic-EBCDIC" accesskey="p" 
rel="prev">perlebcdic EBCDIC</a>, Up: <a 
href="#perlebcdic-COMMON-CHARACTER-CODE-SETS" accesskey="u" rel="up">perlebcdic 
COMMON CHARACTER CODE SETS</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlebcdic-Unicode-and-UTF" accesskey="n" 
rel="next">perlebcdic Unicode and UTF</a>, Previous: <a 
href="#perlebcdic-EBCDIC" accesskey="p" rel="prev">perlebcdic EBCDIC</a>, Up: 
<a href="#perlebcdic-COMMON-CHARACTER-CODE-SETS" accesskey="u" 
rel="up">perlebcdic COMMON CHARACTER CODE SETS</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Unicode-code-points-versus-EBCDIC-code-points"></a>
 <h4 class="subsection">19.3.5 Unicode code points versus EBCDIC code 
points</h4>
 
 <p>In Unicode terminology a <em>code point</em> is the number assigned to a
 character: for example, in EBCDIC the character &quot;A&quot; is usually assigned
-the number 193.  In Unicode the character &quot;A&quot; is assigned the number 65.
-This causes a problem with the semantics of the pack/unpack &quot;U&quot;, 
which
-are supposed to pack Unicode code points to characters and back to numbers.
-The problem is: which code points to use for code points less than 256?
-(for 256 and over there&rsquo;s no problem: Unicode code points are used)
-In EBCDIC, for the low 256 the EBCDIC code points are used.  This
-means that the equivalences
-</p>
-<pre class="verbatim">    pack(&quot;U&quot;, ord($character)) eq $character
-    unpack(&quot;U&quot;, $character) == ord $character
-</pre>
-<p>will hold.  (If Unicode code points were applied consistently over
-all the possible code points, pack(&quot;U&quot;,ord(&quot;A&quot;)) would in 
EBCDIC
-equal <em>A with acute</em> or chr(101), and unpack(&quot;U&quot;, 
&quot;A&quot;) would equal
-65, or <em>non-breaking space</em>, not 193, or ord &quot;A&quot;.)
+the number 193.  In Unicode, the character &quot;A&quot; is assigned the number 65.
+All the code points in ASCII and Latin-1 (ISO 8859-1) have the same
+meaning in Unicode.  All three of the recognized EBCDIC code sets have
+256 code points, and in each code set, all 256 code points are mapped to
+equivalent Latin1 code points.  Obviously, &quot;A&quot; will map to &quot;A&quot;, &quot;B&quot; =&gt;
+&quot;B&quot;, &quot;%&quot; =&gt; &quot;%&quot;, etc., for all printable characters in Latin1 and these
+code pages.
+</p>
+<p>It also turns out that EBCDIC has nearly precise equivalents for the
+ASCII/Latin1 C0 controls and the DELETE control.  (The C0 controls are
+those whose ASCII code points are 0..0x1F; things like TAB, ACK, BEL,
+etc.)  A mapping is set up between these ASCII/EBCDIC controls.  There
+isn&rsquo;t such a precise mapping between the C1 controls on ASCII platforms
+and the remaining EBCDIC controls.  What has been done is to map these
+controls, mostly arbitrarily, to some otherwise unmatched character in
+the other character set.  Most of these are very very rarely used
+nowadays in EBCDIC anyway, and their names have been dropped, without
+much complaint.  For example the EO (Eight Ones) EBCDIC control
+(consisting of eight one bits = 0xFF) is mapped to the C1 APC control
+(0x9F), and you can&rsquo;t use the name &quot;EO&quot;.
+</p>
+<p>The EBCDIC controls provide three possible line terminator characters,
+CR (0x0D), LF (0x25), and NL (0x15).  On ASCII platforms, the symbols
+&quot;NL&quot; and &quot;LF&quot; refer to the same character, but in strict EBCDIC
+terminology they are different ones.  The EBCDIC NL is mapped to the C1
+control called &quot;NEL&quot; (&quot;Next Line&quot;; here&rsquo;s a case where the mapping makes
+quite a bit of sense, and hence isn&rsquo;t just arbitrary).  On some EBCDIC
+platforms, this NL or NEL is the typical line terminator.  This is true
+of z/OS and BS2000.  On these platforms, the C compilers will swap the
+LF and NEL code points, so that <code>&quot;\n&quot;</code> is 0x15, and refers to NL.  Perl
+does that too; you can see it in the code chart <a href="#perlebcdic-recipe-3">below</a>.
+This makes things generally &quot;just work&quot; without you even having to be
+aware that there is a swap.
 </p>
 <hr>
-<a name="perlebcdic-Remaining-Perl-Unicode-problems-in-EBCDIC"></a>
-<div class="header">
-<p>
-Next: <a href="#perlebcdic-Unicode-and-UTF" accesskey="n" 
rel="next">perlebcdic Unicode and UTF</a>, Previous: <a 
href="#perlebcdic-Unicode-code-points-versus-EBCDIC-code-points" accesskey="p" 
rel="prev">perlebcdic Unicode code points versus EBCDIC code points</a>, Up: <a 
href="#perlebcdic-COMMON-CHARACTER-CODE-SETS" accesskey="u" rel="up">perlebcdic 
COMMON CHARACTER CODE SETS</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
-</div>
-<a name="Remaining-Perl-Unicode-problems-in-EBCDIC"></a>
-<h4 class="subsection">19.3.6 Remaining Perl Unicode problems in EBCDIC</h4>
-
-<ul>
-<li> Many of the remaining problems seem to be related to case-insensitive 
matching
-
-</li><li> The extensions Unicode::Collate and Unicode::Normalized are not
-supported under EBCDIC, likewise for the encoding pragma.
-
-</li></ul>
-
-<hr>
 <a name="perlebcdic-Unicode-and-UTF"></a>
 <div class="header">
 <p>
-Next: <a href="#perlebcdic-Using-Encode" accesskey="n" rel="next">perlebcdic Using Encode</a>, Previous: <a href="#perlebcdic-Remaining-Perl-Unicode-problems-in-EBCDIC" accesskey="p" rel="prev">perlebcdic Remaining Perl Unicode problems in EBCDIC</a>, Up: <a href="#perlebcdic-COMMON-CHARACTER-CODE-SETS" accesskey="u" rel="up">perlebcdic COMMON CHARACTER CODE SETS</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlebcdic-Using-Encode" accesskey="n" rel="next">perlebcdic Using Encode</a>, Previous: <a href="#perlebcdic-Unicode-code-points-versus-EBCDIC-code-points" accesskey="p" rel="prev">perlebcdic Unicode code points versus EBCDIC code points</a>, Up: <a href="#perlebcdic-COMMON-CHARACTER-CODE-SETS" accesskey="u" rel="up">perlebcdic COMMON CHARACTER CODE SETS</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Unicode-and-UTF"></a>
-<h4 class="subsection">19.3.7 Unicode and UTF</h4>
+<h4 class="subsection">19.3.6 Unicode and UTF</h4>
 
-<p>UTF stands for <code>Unicode Transformation Format</code>.
+<p>UTF stands for &quot;Unicode Transformation Format&quot;.
 UTF-8 is an encoding of Unicode into a sequence of 8-bit byte chunks, based on
 ASCII and Latin-1.
 The length of a sequence required to represent a Unicode code point
 depends on the ordinal number of that code point,
 with larger numbers requiring more bytes.
 UTF-EBCDIC is like UTF-8, but based on EBCDIC.
+They are enough alike that often, casual usage will conflate the two
+terms, and use &quot;UTF-8&quot; to mean both the UTF-8 found on ASCII platforms,
+and the UTF-EBCDIC found on EBCDIC ones.
 </p>
-<p>You may see the term <code>invariant</code> character or code point.
+<p>You may see the term &quot;invariant&quot; character or code point.
 This simply means that the character has the same numeric
-value when encoded as when not.
-(Note that this is a very different concept from <a href="#perlebcdic-The-13-variant-characters">The 13 variant characters</a>
-mentioned above.)
-For example, the ordinal value of &rsquo;A&rsquo; is 193 in most EBCDIC code pages,
-and also is 193 when encoded in UTF-EBCDIC.
-All variant code points occupy at least two bytes when encoded.
-In UTF-8, the code points corresponding to the lowest 128
+value and representation when encoded in UTF-8 (or UTF-EBCDIC) as when
+not.  (Note that this is a very different concept from <a href="#perlebcdic-The-13-variant-characters">The 13 variant
+characters</a> mentioned above.  Careful prose will use the term &quot;UTF-8
+invariant&quot; instead of just &quot;invariant&quot;, but most often you&rsquo;ll see just
+&quot;invariant&quot;.) For example, the ordinal value of &quot;A&quot; is 193 in most
+EBCDIC code pages, and also is 193 when encoded in UTF-EBCDIC.  All
+UTF-8 (or UTF-EBCDIC) variant code points occupy at least two bytes when
+encoded in UTF-8 (or UTF-EBCDIC); by definition, the UTF-8 (or
+UTF-EBCDIC) invariant code points are exactly one byte whether encoded
+in UTF-8 (or UTF-EBCDIC), or not.  (By now you see why people typically
+just say &quot;UTF-8&quot; when they also mean &quot;UTF-EBCDIC&quot;.  For the rest of this
+document, we&rsquo;ll mostly be casual about it too.)
+In ASCII UTF-8, the code points corresponding to the lowest 128
 ordinal numbers (0 - 127: the ASCII characters) are invariant.
 In UTF-EBCDIC, there are 160 invariant characters.
 (If you care, the EBCDIC invariants are those characters
 which have ASCII equivalents, plus those that correspond to
-the C1 controls (80..9f on ASCII platforms).)
+the C1 controls (128 - 159 on ASCII platforms).)
 </p>
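The notion of a UTF-8 invariant code point described above can be checked directly. The following is a minimal sketch, assuming an ASCII platform and the standard Encode module; the helper names utf8_length and is_utf8_invariant are illustrative and do not appear in the manual.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Return the number of bytes a code point occupies once encoded as UTF-8.
sub utf8_length {
    my ($code_point) = @_;
    return length encode('UTF-8', chr $code_point);
}

# A code point is UTF-8 invariant when its encoding is the single
# byte whose value equals the code point itself.
sub is_utf8_invariant {
    my ($code_point) = @_;
    my $bytes = encode('UTF-8', chr $code_point);
    return length($bytes) == 1 && ord($bytes) == $code_point;
}

# Report a few sample code points on either side of the 0x7F boundary.
for my $cp (0x41, 0x7F, 0x80, 0xE9) {
    printf "U+%04X: %d byte(s), %s\n", $cp, utf8_length($cp),
        is_utf8_invariant($cp) ? 'invariant' : 'variant';
}

# Count the invariants among the first 256 code points.
my $invariants = grep { is_utf8_invariant($_) } 0 .. 255;
print "invariant code points below 256: $invariants\n";
```

On an ASCII platform the count should come out as the 128 invariants the text mentions. The 160 figure for UTF-EBCDIC cannot be reproduced this way, since Encode's 'UTF-8' here is the ASCII-based encoding.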
 <p>A string encoded in UTF-EBCDIC may be longer (but never shorter) than
-one encoded in UTF-8.
+one encoded in UTF-8.  Perl extends UTF-8 so that it can encode code
+points above the Unicode maximum of U+10FFFF.  It extends UTF-EBCDIC as
+well, but due to the inherent limitations in UTF-EBCDIC, the maximum
+code point expressible is U+7FFF_FFFF, even if the word size is more
+than 32 bits.
+</p>
+<p>UTF-EBCDIC is defined by
+<a href="http://www.unicode.org/reports/tr16">Unicode Technical Report #16</a>.
+It is defined based on CCSID 1047, not allowing for the differences for
+other code pages.  This allows for easy interchange of text between
+computers running different code pages, but makes it unusable, without
+adaptation, for Perl on those other code pages.
+</p>
+<p>The reason for this unusability is that a fundamental assumption of Perl
+is that the characters it cares about for parsing and lexical analysis
+are the same whether or not the text is in UTF-8.  For example, Perl
+expects the character <code>&quot;[&quot;</code> to have the same representation, no matter
+if the string containing it (or program text) is UTF-8 encoded or not.
+To ensure this, Perl adapts UTF-EBCDIC to the particular code page so
+that all characters it expects to be UTF-8 invariant are in fact UTF-8
+invariant.  This means that text generated on a computer running one
+version of Perl&rsquo;s UTF-EBCDIC has to be translated to be intelligible to
+a computer running another.
 </p>
 <hr>
 <a name="perlebcdic-Using-Encode"></a>
@@ -25851,9 +26568,9 @@
 Previous: <a href="#perlebcdic-Unicode-and-UTF" accesskey="p" rel="prev">perlebcdic Unicode and UTF</a>, Up: <a href="#perlebcdic-COMMON-CHARACTER-CODE-SETS" accesskey="u" rel="up">perlebcdic COMMON CHARACTER CODE SETS</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Using-Encode"></a>
-<h4 class="subsection">19.3.8 Using Encode</h4>
+<h4 class="subsection">19.3.7 Using Encode</h4>
 
-<p>Starting from Perl 5.8 you can use the standard new module Encode
+<p>Starting from Perl 5.8 you can use the standard module Encode
 to translate from EBCDIC to Latin-1 code points.
 Encode knows about more EBCDIC character sets than Perl can currently
 be compiled to run on.
@@ -25879,7 +26596,7 @@
 <p>For doing I/O it is suggested that you use the autotranslating features
 of PerlIO, see <a href="#perluniintro-NAME">perluniintro NAME</a>.
 </p>
-<p>Since version 5.8 Perl uses the new PerlIO I/O library.  This enables
+<p>Since version 5.8 Perl uses the PerlIO I/O library.  This enables
 you to use different encodings per IO channel.  For example you may use
 </p>
 <pre class="verbatim">    use Encode;
@@ -25897,7 +26614,7 @@
 characters were printed), and
 UTF-EBCDIC (in this example identical to normal EBCDIC since only characters
 that don&rsquo;t differ between EBCDIC and UTF-EBCDIC were printed).  See the
-documentation of Encode::PerlIO for details.
+documentation of <a href="Encode-PerlIO.html#Top">(Encode-PerlIO)</a> for details.
 </p>
 <p>As the PerlIO layer uses raw IO (bytes) internally, all this totally
 ignores things like the type of your filesystem (ASCII or EBCDIC).
@@ -25917,12 +26634,13 @@
 table names of the Latin 1
 extensions to ASCII have been labelled with character names roughly
 corresponding to <em>The Unicode Standard, Version 6.1</em> albeit with
-substitutions such as s/LATIN// and s/VULGAR// in all cases, s/CAPITAL
-LETTER// in some cases, and s/SMALL LETTER ([A-Z])/\l$1/ in some other
+substitutions such as <code>s/LATIN//</code> and <code>s/VULGAR//</code> in all cases;
+<code>s/CAPITAL&nbsp;LETTER//</code><!-- /@w --> in some cases; and
+<code>s/SMALL&nbsp;LETTER&nbsp;<span class="nolinebreak">([A-Z])/\l$1/</span></code><!-- /@w --> in some other
 cases.  Controls are listed using their Unicode 6.2 abbreviations.
 The differences between the 0037 and 1047 sets are
-flagged with **.  The differences between the 1047 and POSIX-BC sets
-are flagged with ##.  All ord() numbers listed are decimal.  If you
+flagged with <code>**</code>.  The differences between the 1047 and POSIX-BC sets
+are flagged with <code>##.</code>  All <code>ord()</code> numbers listed are decimal.  If you
 would rather see this table listing octal values, then run the table
 (that is, the pod source text of this document, since this recipe may not
 work with a pod2_other_format translation) through:
@@ -25948,7 +26666,8 @@
 
 <pre class="verbatim"> open(FH,&quot;&lt;perlebcdic.pod&quot;) or die &quot;Could not open perlebcdic.pod: $!&quot;;
  while (&lt;FH&gt;) {
-     if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)\s+(\d+)\.?(\d*)/)
+     if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)
+                                                     \s+(\d+)\.?(\d*)/x)
      {
          if ($7 ne '' &amp;&amp; $9 ne '') {
              printf(
@@ -25989,7 +26708,8 @@
 
 <pre class="verbatim"> open(FH,&quot;&lt;perlebcdic.pod&quot;) or die &quot;Could not open perlebcdic.pod: $!&quot;;
  while (&lt;FH&gt;) {
-     if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)\s+(\d+)\.?(\d*)/)
+     if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)
+                                                     \s+(\d+)\.?(\d*)/x)
      {
          if ($7 ne '' &amp;&amp; $9 ne '') {
              printf(
@@ -26009,8 +26729,8 @@
 
 
                           ISO
-                         8859-1             POS-
-                         CCSID  CCSID CCSID IX-
+                         8859-1             POS-         CCSID
+                         CCSID  CCSID CCSID IX-          1047
   chr                     0819   0037 1047  BC  UTF-8  UTF-EBCDIC
  ---------------------------------------------------------------------
  &lt;NUL&gt;                       0    0    0    0    0        0
@@ -26303,7 +27023,7 @@
     -e '          map{[$_,substr($_,39,3)]}@l;}' perlebcdic.pod
 </pre>
 <p>If you would rather see it in POSIX-BC order then change the number
-39 in the last line to 44, like this:
+34 in the last line to 44, like this:
 </p>
 <dl compact="compact">
 <dt>recipe 6</dt>
@@ -26318,6 +27038,288 @@
      -e '          sort{$a-&gt;[1] &lt;=&gt; $b-&gt;[1]}' \
      -e '          map{[$_,substr($_,44,3)]}@l;}' perlebcdic.pod
 </pre>
+<table class="menu" border="0" cellspacing="0">
+<tr><td align="left" valign="top">&bull; <a href="#perlebcdic-Table-in-hex_002c-sorted-in-1047-order" accesskey="1">perlebcdic Table in hex, sorted in 1047 order</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
+</table>
+
+<hr>
+<a name="perlebcdic-Table-in-hex_002c-sorted-in-1047-order"></a>
+<div class="header">
+<p>
+Up: <a href="#perlebcdic-SINGLE-OCTET-TABLES" accesskey="u" rel="up">perlebcdic SINGLE OCTET TABLES</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+</div>
+<a name="Table-in-hex_002c-sorted-in-1047-order"></a>
+<h4 class="subsection">19.4.1 Table in hex, sorted in 1047 order</h4>
+
+<p>Since this document was first written, the convention has become more
+and more to use hexadecimal notation for code points.  To do this with
+the recipes and to also sort is a multi-step process, so here, for
+convenience, is the table from above, re-sorted to be in Code Page 1047
+order, and using hex notation.
+</p>
+<pre class="verbatim">                          ISO
+                         8859-1             POS-         CCSID
+                         CCSID  CCSID CCSID IX-          1047
+  chr                     0819   0037 1047  BC  UTF-8  UTF-EBCDIC
+ ---------------------------------------------------------------------
+ &lt;NUL&gt;                       00   00   00   00   00       00
+ &lt;SOH&gt;                       01   01   01   01   01       01
+ &lt;STX&gt;                       02   02   02   02   02       02
+ &lt;ETX&gt;                       03   03   03   03   03       03
+ &lt;ST&gt;                        9C   04   04   04   C2.9C    04
+ &lt;HT&gt;                        09   05   05   05   09       05
+ &lt;SSA&gt;                       86   06   06   06   C2.86    06
+ &lt;DEL&gt;                       7F   07   07   07   7F       07
+ &lt;EPA&gt;                       97   08   08   08   C2.97    08
+ &lt;RI&gt;                        8D   09   09   09   C2.8D    09
+ &lt;SS2&gt;                       8E   0A   0A   0A   C2.8E    0A
+ &lt;VT&gt;                        0B   0B   0B   0B   0B       0B
+ &lt;FF&gt;                        0C   0C   0C   0C   0C       0C
+ &lt;CR&gt;                        0D   0D   0D   0D   0D       0D
+ &lt;SO&gt;                        0E   0E   0E   0E   0E       0E
+ &lt;SI&gt;                        0F   0F   0F   0F   0F       0F
+ &lt;DLE&gt;                       10   10   10   10   10       10
+ &lt;DC1&gt;                       11   11   11   11   11       11
+ &lt;DC2&gt;                       12   12   12   12   12       12
+ &lt;DC3&gt;                       13   13   13   13   13       13
+ &lt;OSC&gt;                       9D   14   14   14   C2.9D    14
+ &lt;LF&gt;                        0A   25   15   15   0A       15    **
+ &lt;BS&gt;                        08   16   16   16   08       16
+ &lt;ESA&gt;                       87   17   17   17   C2.87    17
+ &lt;CAN&gt;                       18   18   18   18   18       18
+ &lt;EOM&gt;                       19   19   19   19   19       19
+ &lt;PU2&gt;                       92   1A   1A   1A   C2.92    1A
+ &lt;SS3&gt;                       8F   1B   1B   1B   C2.8F    1B
+ &lt;FS&gt;                        1C   1C   1C   1C   1C       1C
+ &lt;GS&gt;                        1D   1D   1D   1D   1D       1D
+ &lt;RS&gt;                        1E   1E   1E   1E   1E       1E
+ &lt;US&gt;                        1F   1F   1F   1F   1F       1F
+ &lt;PAD&gt;                       80   20   20   20   C2.80    20
+ &lt;HOP&gt;                       81   21   21   21   C2.81    21
+ &lt;BPH&gt;                       82   22   22   22   C2.82    22
+ &lt;NBH&gt;                       83   23   23   23   C2.83    23
+ &lt;IND&gt;                       84   24   24   24   C2.84    24
+ &lt;NEL&gt;                       85   15   25   25   C2.85    25     **
+ &lt;ETB&gt;                       17   26   26   26   17       26
+ &lt;ESC&gt;                       1B   27   27   27   1B       27
+ &lt;HTS&gt;                       88   28   28   28   C2.88    28
+ &lt;HTJ&gt;                       89   29   29   29   C2.89    29
+ &lt;VTS&gt;                       8A   2A   2A   2A   C2.8A    2A
+ &lt;PLD&gt;                       8B   2B   2B   2B   C2.8B    2B
+ &lt;PLU&gt;                       8C   2C   2C   2C   C2.8C    2C
+ &lt;ENQ&gt;                       05   2D   2D   2D   05       2D
+ &lt;ACK&gt;                       06   2E   2E   2E   06       2E
+ &lt;BEL&gt;                       07   2F   2F   2F   07       2F
+ &lt;DCS&gt;                       90   30   30   30   C2.90    30
+ &lt;PU1&gt;                       91   31   31   31   C2.91    31
+ &lt;SYN&gt;                       16   32   32   32   16       32
+ &lt;STS&gt;                       93   33   33   33   C2.93    33
+ &lt;CCH&gt;                       94   34   34   34   C2.94    34
+ &lt;MW&gt;                        95   35   35   35   C2.95    35
+ &lt;SPA&gt;                       96   36   36   36   C2.96    36
+ &lt;EOT&gt;                       04   37   37   37   04       37
+ &lt;SOS&gt;                       98   38   38   38   C2.98    38
+ &lt;SGC&gt;                       99   39   39   39   C2.99    39
+ &lt;SCI&gt;                       9A   3A   3A   3A   C2.9A    3A
+ &lt;CSI&gt;                       9B   3B   3B   3B   C2.9B    3B
+ &lt;DC4&gt;                       14   3C   3C   3C   14       3C
+ &lt;NAK&gt;                       15   3D   3D   3D   15       3D
+ &lt;PM&gt;                        9E   3E   3E   3E   C2.9E    3E
+ &lt;SUB&gt;                       1A   3F   3F   3F   1A       3F
+ &lt;SPACE&gt;                     20   40   40   40   20       40
+ &lt;NON-BREAKING SPACE&gt;        A0   41   41   41   C2.A0    80.41
+ &lt;a WITH CIRCUMFLEX&gt;         E2   42   42   42   C3.A2    8B.43
+ &lt;a WITH DIAERESIS&gt;          E4   43   43   43   C3.A4    8B.45
+ &lt;a WITH GRAVE&gt;              E0   44   44   44   C3.A0    8B.41
+ &lt;a WITH ACUTE&gt;              E1   45   45   45   C3.A1    8B.42
+ &lt;a WITH TILDE&gt;              E3   46   46   46   C3.A3    8B.44
+ &lt;a WITH RING ABOVE&gt;         E5   47   47   47   C3.A5    8B.46
+ &lt;c WITH CEDILLA&gt;            E7   48   48   48   C3.A7    8B.48
+ &lt;n WITH TILDE&gt;              F1   49   49   49   C3.B1    8B.58
+ &lt;CENT SIGN&gt;                 A2   4A   4A   B0   C2.A2    80.43  ##
+ .                           2E   4B   4B   4B   2E       4B
+ &lt;                           3C   4C   4C   4C   3C       4C
+ (                           28   4D   4D   4D   28       4D
+ +                           2B   4E   4E   4E   2B       4E
+ |                           7C   4F   4F   4F   7C       4F
+ &amp;                           26   50   50   50   26       50
+ &lt;e WITH ACUTE&gt;              E9   51   51   51   C3.A9    8B.4A
+ &lt;e WITH CIRCUMFLEX&gt;         EA   52   52   52   C3.AA    8B.51
+ &lt;e WITH DIAERESIS&gt;          EB   53   53   53   C3.AB    8B.52
+ &lt;e WITH GRAVE&gt;              E8   54   54   54   C3.A8    8B.49
+ &lt;i WITH ACUTE&gt;              ED   55   55   55   C3.AD    8B.54
+ &lt;i WITH CIRCUMFLEX&gt;         EE   56   56   56   C3.AE    8B.55
+ &lt;i WITH DIAERESIS&gt;          EF   57   57   57   C3.AF    8B.56
+ &lt;i WITH GRAVE&gt;              EC   58   58   58   C3.AC    8B.53
+ &lt;SMALL LETTER SHARP S&gt;      DF   59   59   59   C3.9F    8A.73
+ !                           21   5A   5A   5A   21       5A
+ $                           24   5B   5B   5B   24       5B
+ *                           2A   5C   5C   5C   2A       5C
+ )                           29   5D   5D   5D   29       5D
+ ;                           3B   5E   5E   5E   3B       5E
+ ^                           5E   B0   5F   6A   5E       5F     ** ##
+ -                           2D   60   60   60   2D       60
+ /                           2F   61   61   61   2F       61
+ &lt;A WITH CIRCUMFLEX&gt;         C2   62   62   62   C3.82    8A.43
+ &lt;A WITH DIAERESIS&gt;          C4   63   63   63   C3.84    8A.45
+ &lt;A WITH GRAVE&gt;              C0   64   64   64   C3.80    8A.41
+ &lt;A WITH ACUTE&gt;              C1   65   65   65   C3.81    8A.42
+ &lt;A WITH TILDE&gt;              C3   66   66   66   C3.83    8A.44
+ &lt;A WITH RING ABOVE&gt;         C5   67   67   67   C3.85    8A.46
+ &lt;C WITH CEDILLA&gt;            C7   68   68   68   C3.87    8A.48
+ &lt;N WITH TILDE&gt;              D1   69   69   69   C3.91    8A.58
+ &lt;BROKEN BAR&gt;                A6   6A   6A   D0   C2.A6    80.47  ##
+ ,                           2C   6B   6B   6B   2C       6B
+ %                           25   6C   6C   6C   25       6C
+ _                           5F   6D   6D   6D   5F       6D
+ &gt;                           3E   6E   6E   6E   3E       6E
+ ?                           3F   6F   6F   6F   3F       6F
+ &lt;o WITH STROKE&gt;             F8   70   70   70   C3.B8    8B.67
+ &lt;E WITH ACUTE&gt;              C9   71   71   71   C3.89    8A.4A
+ &lt;E WITH CIRCUMFLEX&gt;         CA   72   72   72   C3.8A    8A.51
+ &lt;E WITH DIAERESIS&gt;          CB   73   73   73   C3.8B    8A.52
+ &lt;E WITH GRAVE&gt;              C8   74   74   74   C3.88    8A.49
+ &lt;I WITH ACUTE&gt;              CD   75   75   75   C3.8D    8A.54
+ &lt;I WITH CIRCUMFLEX&gt;         CE   76   76   76   C3.8E    8A.55
+ &lt;I WITH DIAERESIS&gt;          CF   77   77   77   C3.8F    8A.56
+ &lt;I WITH GRAVE&gt;              CC   78   78   78   C3.8C    8A.53
+ `                           60   79   79   4A   60       79     ##
+ :                           3A   7A   7A   7A   3A       7A
+ #                           23   7B   7B   7B   23       7B
+ @                           40   7C   7C   7C   40       7C
+ '                           27   7D   7D   7D   27       7D
+ =                           3D   7E   7E   7E   3D       7E
+ &quot;                           22   7F   7F   7F   22       7F
+ &lt;O WITH STROKE&gt;             D8   80   80   80   C3.98    8A.67
+ a                           61   81   81   81   61       81
+ b                           62   82   82   82   62       82
+ c                           63   83   83   83   63       83
+ d                           64   84   84   84   64       84
+ e                           65   85   85   85   65       85
+ f                           66   86   86   86   66       86
+ g                           67   87   87   87   67       87
+ h                           68   88   88   88   68       88
+ i                           69   89   89   89   69       89
+ &lt;LEFT POINTING GUILLEMET&gt;   AB   8A   8A   8A   C2.AB    80.52
+ &lt;RIGHT POINTING GUILLEMET&gt;  BB   8B   8B   8B   C2.BB    80.6A
+ &lt;SMALL LETTER eth&gt;          F0   8C   8C   8C   C3.B0    8B.57
+ &lt;y WITH ACUTE&gt;              FD   8D   8D   8D   C3.BD    8B.71
+ &lt;SMALL LETTER thorn&gt;        FE   8E   8E   8E   C3.BE    8B.72
+ &lt;PLUS-OR-MINUS SIGN&gt;        B1   8F   8F   8F   C2.B1    80.58
+ &lt;DEGREE SIGN&gt;               B0   90   90   90   C2.B0    80.57
+ j                           6A   91   91   91   6A       91
+ k                           6B   92   92   92   6B       92
+ l                           6C   93   93   93   6C       93
+ m                           6D   94   94   94   6D       94
+ n                           6E   95   95   95   6E       95
+ o                           6F   96   96   96   6F       96
+ p                           70   97   97   97   70       97
+ q                           71   98   98   98   71       98
+ r                           72   99   99   99   72       99
+ &lt;FEMININE ORDINAL&gt;          AA   9A   9A   9A   C2.AA    80.51
+ &lt;MASC. ORDINAL INDICATOR&gt;   BA   9B   9B   9B   C2.BA    80.69
+ &lt;SMALL LIGATURE ae&gt;         E6   9C   9C   9C   C3.A6    8B.47
+ &lt;CEDILLA&gt;                   B8   9D   9D   9D   C2.B8    80.67
+ &lt;CAPITAL LIGATURE AE&gt;       C6   9E   9E   9E   C3.86    8A.47
+ &lt;CURRENCY SIGN&gt;             A4   9F   9F   9F   C2.A4    80.45
+ &lt;MICRO SIGN&gt;                B5   A0   A0   A0   C2.B5    80.64
+ ~                           7E   A1   A1   FF   7E       A1     ##
+ s                           73   A2   A2   A2   73       A2
+ t                           74   A3   A3   A3   74       A3
+ u                           75   A4   A4   A4   75       A4
+ v                           76   A5   A5   A5   76       A5
+ w                           77   A6   A6   A6   77       A6
+ x                           78   A7   A7   A7   78       A7
+ y                           79   A8   A8   A8   79       A8
+ z                           7A   A9   A9   A9   7A       A9
+ &lt;INVERTED &quot;!&quot; &gt;             A1   AA   AA   AA   C2.A1    80.42
+ &lt;INVERTED QUESTION MARK&gt;    BF   AB   AB   AB   C2.BF    80.73
+ &lt;CAPITAL LETTER ETH&gt;        D0   AC   AC   AC   C3.90    8A.57
+ [                           5B   BA   AD   BB   5B       AD     ** ##
+ &lt;CAPITAL LETTER THORN&gt;      DE   AE   AE   AE   C3.9E    8A.72
+ &lt;REGISTERED TRADE MARK&gt;     AE   AF   AF   AF   C2.AE    80.55
+ &lt;NOT SIGN&gt;                  AC   5F   B0   BA   C2.AC    80.53  ** ##
+ &lt;POUND SIGN&gt;                A3   B1   B1   B1   C2.A3    80.44
+ &lt;YEN SIGN&gt;                  A5   B2   B2   B2   C2.A5    80.46
+ &lt;MIDDLE DOT&gt;                B7   B3   B3   B3   C2.B7    80.66
+ &lt;COPYRIGHT SIGN&gt;            A9   B4   B4   B4   C2.A9    80.4A
+ &lt;SECTION SIGN&gt;              A7   B5   B5   B5   C2.A7    80.48
+ &lt;PARAGRAPH SIGN&gt;            B6   B6   B6   B6   C2.B6    80.65
+ &lt;FRACTION ONE QUARTER&gt;      BC   B7   B7   B7   C2.BC    80.70
+ &lt;FRACTION ONE HALF&gt;         BD   B8   B8   B8   C2.BD    80.71
+ &lt;FRACTION THREE QUARTERS&gt;   BE   B9   B9   B9   C2.BE    80.72
+ &lt;Y WITH ACUTE&gt;              DD   AD   BA   AD   C3.9D    8A.71  ** ##
+ &lt;DIAERESIS&gt;                 A8   BD   BB   79   C2.A8    80.49  ** ##
+ &lt;MACRON&gt;                    AF   BC   BC   A1   C2.AF    80.56  ##
+ ]                           5D   BB   BD   BD   5D       BD     **
+ &lt;ACUTE ACCENT&gt;              B4   BE   BE   BE   C2.B4    80.63
+ &lt;MULTIPLICATION SIGN&gt;       D7   BF   BF   BF   C3.97    8A.66
+ {                           7B   C0   C0   FB   7B       C0     ##
+ A                           41   C1   C1   C1   41       C1
+ B                           42   C2   C2   C2   42       C2
+ C                           43   C3   C3   C3   43       C3
+ D                           44   C4   C4   C4   44       C4
+ E                           45   C5   C5   C5   45       C5
+ F                           46   C6   C6   C6   46       C6
+ G                           47   C7   C7   C7   47       C7
+ H                           48   C8   C8   C8   48       C8
+ I                           49   C9   C9   C9   49       C9
+ &lt;SOFT HYPHEN&gt;               AD   CA   CA   CA   C2.AD    80.54
+ &lt;o WITH CIRCUMFLEX&gt;         F4   CB   CB   CB   C3.B4    8B.63
+ &lt;o WITH DIAERESIS&gt;          F6   CC   CC   CC   C3.B6    8B.65
+ &lt;o WITH GRAVE&gt;              F2   CD   CD   CD   C3.B2    8B.59
+ &lt;o WITH ACUTE&gt;              F3   CE   CE   CE   C3.B3    8B.62
+ &lt;o WITH TILDE&gt;              F5   CF   CF   CF   C3.B5    8B.64
+ }                           7D   D0   D0   FD   7D       D0     ##
+ J                           4A   D1   D1   D1   4A       D1
+ K                           4B   D2   D2   D2   4B       D2
+ L                           4C   D3   D3   D3   4C       D3
+ M                           4D   D4   D4   D4   4D       D4
+ N                           4E   D5   D5   D5   4E       D5
+ O                           4F   D6   D6   D6   4F       D6
+ P                           50   D7   D7   D7   50       D7
+ Q                           51   D8   D8   D8   51       D8
+ R                           52   D9   D9   D9   52       D9
+ &lt;SUPERSCRIPT ONE&gt;           B9   DA   DA   DA   C2.B9    80.68
+ &lt;u WITH CIRCUMFLEX&gt;         FB   DB   DB   DB   C3.BB    8B.6A
+ &lt;u WITH DIAERESIS&gt;          FC   DC   DC   DC   C3.BC    8B.70
+ &lt;u WITH GRAVE&gt;              F9   DD   DD   C0   C3.B9    8B.68  ##
+ &lt;u WITH ACUTE&gt;              FA   DE   DE   DE   C3.BA    8B.69
+ &lt;y WITH DIAERESIS&gt;          FF   DF   DF   DF   C3.BF    8B.73
+ \                           5C   E0   E0   BC   5C       E0     ##
+ &lt;DIVISION SIGN&gt;             F7   E1   E1   E1   C3.B7    8B.66
+ S                           53   E2   E2   E2   53       E2
+ T                           54   E3   E3   E3   54       E3
+ U                           55   E4   E4   E4   55       E4
+ V                           56   E5   E5   E5   56       E5
+ W                           57   E6   E6   E6   57       E6
+ X                           58   E7   E7   E7   58       E7
+ Y                           59   E8   E8   E8   59       E8
+ Z                           5A   E9   E9   E9   5A       E9
+ &lt;SUPERSCRIPT TWO&gt;           B2   EA   EA   EA   C2.B2    80.59
+ &lt;O WITH CIRCUMFLEX&gt;         D4   EB   EB   EB   C3.94    8A.63
+ &lt;O WITH DIAERESIS&gt;          D6   EC   EC   EC   C3.96    8A.65
+ &lt;O WITH GRAVE&gt;              D2   ED   ED   ED   C3.92    8A.59
+ &lt;O WITH ACUTE&gt;              D3   EE   EE   EE   C3.93    8A.62
+ &lt;O WITH TILDE&gt;              D5   EF   EF   EF   C3.95    8A.64
+ 0                           30   F0   F0   F0   30       F0
+ 1                           31   F1   F1   F1   31       F1
+ 2                           32   F2   F2   F2   32       F2
+ 3                           33   F3   F3   F3   33       F3
+ 4                           34   F4   F4   F4   34       F4
+ 5                           35   F5   F5   F5   35       F5
+ 6                           36   F6   F6   F6   36       F6
+ 7                           37   F7   F7   F7   37       F7
+ 8                           38   F8   F8   F8   38       F8
+ 9                           39   F9   F9   F9   39       F9
+ &lt;SUPERSCRIPT THREE&gt;         B3   FA   FA   FA   C2.B3    80.62
+ &lt;U WITH CIRCUMFLEX&gt;         DB   FB   FB   DD   C3.9B    8A.6A  ##
+ &lt;U WITH DIAERESIS&gt;          DC   FC   FC   FC   C3.9C    8A.70
+ &lt;U WITH GRAVE&gt;              D9   FD   FD   E0   C3.99    8A.68  ##
+ &lt;U WITH ACUTE&gt;              DA   FE   FE   FE   C3.9A    8A.69
+ &lt;APC&gt;                       9F   FF   FF   5F   C2.9F    FF     ##
+</pre>
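Individual rows of the hex table can be spot-checked with the Encode module. This is a minimal sketch rather than part of the manual: it assumes the encoding names 'iso-8859-1', 'cp37', 'cp1047' and 'posix-bc' as registered by Encode, and the helper name table_row is illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Encoding names as registered by the Encode module, in the same
# order as the first four numeric columns of the table.
my @code_pages = ('iso-8859-1', 'cp37', 'cp1047', 'posix-bc');

# Return the hex byte each code page uses for a single character.
sub table_row {
    my ($char) = @_;
    return map { sprintf '%02X', ord encode($_, $char) } @code_pages;
}

# Print one row per sample character, laid out like the table.
for my $char ('A', '[', '^') {
    printf "%-3s %s\n", $char, join '   ', table_row($char);
}
```

Each printed row should agree with the corresponding line of the table; a mismatch would point at a difference between Encode's mapping tables and the code page definitions the table was built from.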
 <hr>
 <a name="perlebcdic-IDENTIFYING-CHARACTER-CODE-SETS"></a>
 <div class="header">
@@ -26327,26 +27329,31 @@
 <a name="IDENTIFYING-CHARACTER-CODE-SETS"></a>
 <h3 class="section">19.5 IDENTIFYING CHARACTER CODE SETS</h3>
 
-<p>To determine the character set you are running under from perl one
-could use the return value of ord() or chr() to test one or more
-character values.  For example:
+<p>It is possible to determine which character set you are operating under.
+But first you need to be really really sure you need to do this.  Your
+code will be simpler and probably just as portable if you don&rsquo;t have
+to test the character set and do different things, depending.  There are
+actually only very few circumstances where it&rsquo;s not easy to write
+straight-line code portable to all character sets.  See
+<a href="#perluniintro-Unicode-and-EBCDIC">perluniintro Unicode and EBCDIC</a> for how to portably specify
+characters.
+</p>
+<p>But there are some cases where you may want to know which character set
+you are running under.  One possible example is doing
+<a href="#perlebcdic-SORTING">sorting</a> in inner loops where performance is critical.
+</p>
+<p>To determine if you are running under ASCII or EBCDIC, you can use the
+return value of <code>ord()</code> or <code>chr()</code> to test one or more character
+values.  For example:
 </p>
 <pre class="verbatim">    $is_ascii  = &quot;A&quot; eq chr(65);
     $is_ebcdic = &quot;A&quot; eq chr(193);
+    $is_ascii  = ord(&quot;A&quot;) == 65;
+    $is_ebcdic = ord(&quot;A&quot;) == 193;
 </pre>
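The tests above can be wrapped in a single helper when the result is needed in more than one place. This is a minimal sketch; the sub name charset_family is hypothetical and is not provided by Perl or by the manual.

```perl
use strict;
use warnings;

# Classify the native character set by the ordinal of "A", using the
# same values as the tests above (65 for ASCII, 193 for EBCDIC).
sub charset_family {
    my $ord = ord 'A';
    return $ord == 65  ? 'ASCII'
         : $ord == 193 ? 'EBCDIC'
         :               'unknown';
}

print 'Running under: ', charset_family(), "\n";
```

Returning a string rather than setting two booleans keeps the two cases mutually exclusive and leaves room for an explicit 'unknown' result.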
-<p>Also, &quot;\t&quot; is a <code>HORIZONTAL TABULATION</code> character so that:
-</p>
-<pre class="verbatim">    $is_ascii  = ord(&quot;\t&quot;) == 9;
-    $is_ebcdic = ord(&quot;\t&quot;) == 5;
-</pre>
-<p>To distinguish EBCDIC code pages try looking at one or more of
-the characters that differ between them.  For example:
-</p>
-<pre class="verbatim">    $is_ebcdic_37   = &quot;\n&quot; eq chr(37);
-    $is_ebcdic_1047 = &quot;\n&quot; eq chr(21);
-</pre>
-<p>Or better still choose a character that is uniquely encoded in any
-of the code sets, e.g.:
+<p>There&rsquo;s even less need to distinguish between EBCDIC code pages, but to
+do so try looking at one or more of the characters that differ between
+them.
 </p>
 <pre class="verbatim">    $is_ascii           = ord('[') == 91;
     $is_ebcdic_37       = ord('[') == 186;
@@ -26358,11 +27365,12 @@
 <pre class="verbatim">    $is_ascii = &quot;\r&quot; ne chr(13);  #  WRONG
     $is_ascii = &quot;\n&quot; ne chr(10);  #  ILL ADVISED
 </pre>
-<p>Obviously the first of these will fail to distinguish most ASCII platforms
-from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC platform since &quot;\r&quot; eq
-chr(13) under all of those coded character sets.  But note too that
-because &quot;\n&quot; is chr(13) and &quot;\r&quot; is chr(10) on the Macintosh (which is an
-ASCII platform) the second <code>$is_ascii</code> test will lead to trouble there.
+<p>Obviously the first of these will fail to distinguish most ASCII
+platforms from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC
+platform since <code>&quot;\r&quot;&nbsp;eq&nbsp;chr(13)</code><!-- /@w --> 
under all of those coded character
+sets.  But note too that because <code>&quot;\n&quot;</code> is 
<code>chr(13)</code> and <code>&quot;\r&quot;</code> is
+<code>chr(10)</code> on old Macintosh (which is an ASCII platform) the second
+<code>$is_ascii</code> test will lead to trouble there.
 </p>
 <p>To determine whether or not perl was built under an EBCDIC
 code page you can use the Config module like so:
@@ -26402,6 +27410,8 @@
 <p>These functions take an input numeric code point in one encoding and
 return what its equivalent value is in the other.
 </p>
+<p>See <a href="utf8.html#Top">(utf8)</a>.
+</p>
 <hr>
 <a name="perlebcdic-tr_002f_002f_002f"></a>
 <div class="header">
@@ -26413,14 +27423,14 @@
 
 <p>In order to convert a string of characters from one character set to
 another a simple list of numbers, such as in the right columns in the
-above table, along with perl&rsquo;s tr/// operator is all that is needed.
+above table, along with Perl&rsquo;s <code>tr///</code> operator is all that 
is needed.
 The data in the table are in ASCII/Latin1 order, hence the EBCDIC columns
 provide easy-to-use ASCII/Latin1 to EBCDIC operations that are also easily
 reversed.
 </p>
 <p>For example, to convert ASCII/Latin1 to code page 037 take the output of the
-second numbers column from the output of recipe 2 (modified to add 
&rsquo;\&rsquo;
-characters), and use it in tr/// like so:
+second numbers column from the output of recipe 2 (modified to add
+<code>&quot;\&quot;</code> characters), and use it in <code>tr///</code> like 
so:
 </p>
 <pre class="verbatim">    $cp_037 =
     '\x00\x01\x02\x03\x37\x2D\x2E\x2F\x16\x05\x25\x0B\x0C\x0D\x0E\x0F' .
@@ -26471,7 +27481,7 @@
 available from the shell or from the C library.  Consult your system&rsquo;s
 documentation for information on iconv.
 </p>
-<p>On OS/390 or z/OS see the iconv(1) manpage.  One way to invoke the iconv
+<p>On OS/390 or z/OS see the <a 
href="http://man.he.net/man1/iconv">iconv(1)</a> manpage.  One way to invoke 
the <code>iconv</code>
 shell utility from within perl would be to:
 </p>
 <pre class="verbatim">    # OS/390 or z/OS example
@@ -26482,7 +27492,7 @@
 <pre class="verbatim">    # OS/390 or z/OS example
     $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`
 </pre>
-<p>For other perl-based conversion options see the Convert::* modules on CPAN.
+<p>For other Perl-based conversion options see the <code>Convert::*</code> 
modules on CPAN.
 </p>
 <hr>
 <a name="perlebcdic-C-RTL"></a>
@@ -26493,7 +27503,7 @@
 <a name="C-RTL"></a>
 <h4 class="subsection">19.6.4 C RTL</h4>
 
-<p>The OS/390 and z/OS C run-time libraries provide _atoe() and _etoa() 
functions.
+<p>The OS/390 and z/OS C run-time libraries provide <code>_atoe()</code> and 
<code>_etoa()</code> functions.
 </p>
 <hr>
 <a name="perlebcdic-OPERATOR-DIFFERENCES"></a>
@@ -26512,7 +27522,7 @@
 <pre class="verbatim">    @alphabet = ('A'..'Z');   #  $#alphabet == 25
 </pre>
 <p>The bitwise operators such as &amp; ^ | may return different results
-when operating on string or character data in a perl program running
+when operating on string or character data in a Perl program running
 on an EBCDIC platform than when run on an ASCII platform.  Here is
 an example adapted from the one in <a href="#perlop-NAME">perlop NAME</a>:
 </p>
@@ -26524,14 +27534,14 @@
 </pre>
 <p>An interesting property of the 32 C0 control characters
 in the ASCII table is that they can &quot;literally&quot; be constructed
-as control characters in perl, e.g. <code>(chr(0) eq \c@)</code>,
+as control characters in Perl, e.g. <code>(chr(0) eq \c@)</code>,
 <code>(chr(1) eq \cA)</code>, and so on.  Perl on EBCDIC 
platforms has been
-ported to take <code>\c@</code> to chr(0) and <code>\cA</code> to chr(1), etc. 
as well, but the
+ported to take <code>\c@</code> to <code>chr(0)</code> and <code>\cA</code> to 
<code>chr(1)</code>, etc. as well, but the
 characters that result depend on which code page you are
 using.  The table below uses the standard acronyms for the controls.
 The POSIX-BC and 1047 sets are
 identical throughout this range and differ from the 0037 set at only
-one spot (21 decimal).  Note that the <code>LINE FEED</code> character
+one spot (21 decimal).  Note that the line terminator character
 may be generated by <code>\cJ</code> on ASCII platforms but by 
<code>\cU</code> on 1047 or POSIX-BC
 platforms and cannot be generated as a <code>&quot;\c.letter.&quot;</code> 
control character on
 0037 platforms.  Note also that <code>\c\</code> cannot be the final element 
in a string
@@ -26539,7 +27549,9 @@
 SEPARATOR</code> concatenated with <em>X</em> for all <em>X</em>.
 The outlier <code>\c?</code> on ASCII, which yields a non-C0 control 
<code>DEL</code>,
 yields the outlier control <code>APC</code> on EBCDIC, the one that 
isn&rsquo;t in the
-block of contiguous controls.
+block of contiguous controls.  Note that a subtlety of this is that
+<code>\c?</code> on ASCII platforms is an ASCII character, while it isn&rsquo;t
+equivalent to any ASCII character in EBCDIC platforms.
 </p>
 <pre class="verbatim"> chr   ord   8859-1    0037    1047 &amp;&amp; POSIX-BC
  -----------------------------------------------------------------------
@@ -26579,8 +27591,8 @@
 </pre>
 <p><code>*</code> Note: <code>\c?</code> maps to ordinal 127 
(<code>DEL</code>) on ASCII platforms, but
 since ordinal 127 is not a control character on EBCDIC machines,
-<code>\c?</code> instead maps to <code>APC</code>, which is 255 in 0037 and 
1047, and 95 in
-POSIX-BC.
+<code>\c?</code> instead maps on them to <code>APC</code>, which is 255 in 
0037 and 1047,
+and 95 in POSIX-BC.
 </p>
 <hr>
 <a name="perlebcdic-FUNCTION-DIFFERENCES"></a>
@@ -26592,25 +27604,29 @@
 <h3 class="section">19.8 FUNCTION DIFFERENCES</h3>
 
 <dl compact="compact">
-<dt>chr()</dt>
+<dt><code>chr()</code></dt>
 <dd><a name="perlebcdic-chr_0028_0029"></a>
-<p>chr() must be given an EBCDIC code number argument to yield a desired
+<p><code>chr()</code> must be given an EBCDIC code number argument to yield a 
desired
 character return value on an EBCDIC platform.  For example:
 </p>
 <pre class="verbatim">    $CAPITAL_LETTER_A = chr(193);
 </pre>
+<p>The largest code point that is representable in UTF-EBCDIC is
+U+7FFF_FFFF.  If you do <code>chr()</code> on a larger value, a runtime error
+(similar to division by 0) will happen.
+</p>
 </dd>
-<dt>ord()</dt>
+<dt><code>ord()</code></dt>
 <dd><a name="perlebcdic-ord_0028_0029"></a>
-<p>ord() will return EBCDIC code number values on an EBCDIC platform.
+<p><code>ord()</code> will return EBCDIC code number values on an EBCDIC 
platform.
 For example:
 </p>
 <pre class="verbatim">    $the_number_193 = ord(&quot;A&quot;);
 </pre>
 </dd>
-<dt>pack()</dt>
+<dt><code>pack()</code></dt>
 <dd><a name="perlebcdic-pack_0028_0029"></a>
-<p>The c and C templates for pack() are dependent upon character set
+<p>The <code>&quot;c&quot;</code> and <code>&quot;C&quot;</code> templates for 
<code>pack()</code> are dependent upon character set
 encoding.  Examples of usage on EBCDIC include:
 </p>
 <pre class="verbatim">    $foo = pack(&quot;CCCC&quot;,193,194,195,196);
@@ -26621,30 +27637,47 @@
     $foo = pack(&quot;ccxxcc&quot;,193,194,195,196);
     # $foo eq &quot;AB\0\0CD&quot;
 </pre>
+<p>The <code>&quot;U&quot;</code> template has been ported to mean 
&quot;Unicode&quot; on all platforms so
+that
+</p>
+<pre class="verbatim">    pack(&quot;U&quot;, 65) eq 'A'
+</pre>
+<p>is true on all platforms.  If you want native code points for the low
+256, use the <code>&quot;W&quot;</code> template.  This means that the 
equivalences
+</p>
+<pre class="verbatim">    pack(&quot;W&quot;, ord($character)) eq $character
+    unpack(&quot;W&quot;, $character) == ord $character
+</pre>
+<p>will hold.
+</p>
+<p>The largest code point that is representable in UTF-EBCDIC is
+U+7FFF_FFFF.  If you try to pack a larger value into a character, a
+runtime error (similar to division by 0) will happen.
+</p>
 </dd>
-<dt>print()</dt>
+<dt><code>print()</code></dt>
 <dd><a name="perlebcdic-print_0028_0029"></a>
 <p>One must be careful with scalars and strings that are passed to
 print that contain ASCII encodings.  One common place
 for this to occur is in the output of the MIME type header for
-CGI script writing.  For example, many perl programming guides
+CGI script writing.  For example, many Perl programming guides
 recommend something similar to:
 </p>
 <pre class="verbatim">    print 
&quot;Content-type:\ttext/html\015\012\015\012&quot;;
     # this may be wrong on EBCDIC
 </pre>
-<p>Under the IBM OS/390 USS Web Server or WebSphere on z/OS for example
-you should instead write that as:
+<p>You can instead write
 </p>
 <pre class="verbatim">    print &quot;Content-type:\ttext/html\r\n\r\n&quot;; 
# OK for DGW et al
 </pre>
+<p>and have it work portably.
+</p>
 <p>That is because the translation from EBCDIC to ASCII is done
-by the web server in this case (such code will not be appropriate for
-the Macintosh however).  Consult your web server&rsquo;s documentation for
+by the web server in this case.  Consult your web server&rsquo;s documentation 
for
 further details.
 </p>
 </dd>
-<dt>printf()</dt>
+<dt><code>printf()</code></dt>
 <dd><a name="perlebcdic-printf_0028_0029"></a>
 <p>The formats that can convert characters to numbers and vice versa
 will be different from their ASCII counterparts when executed
@@ -26653,27 +27686,37 @@
 <pre class="verbatim">    printf(&quot;%c%c%c&quot;,193,194,195);  # prints ABC
 </pre>
 </dd>
-<dt>sort()</dt>
+<dt><code>sort()</code></dt>
 <dd><a name="perlebcdic-sort_0028_0029"></a>
 <p>EBCDIC sort results may differ from ASCII sort results especially for
-mixed case strings.  This is discussed in more detail below.
+mixed case strings.  This is discussed in more detail <a 
href="#perlebcdic-SORTING">below</a>.
 </p>
 </dd>
-<dt>sprintf()</dt>
+<dt><code>sprintf()</code></dt>
 <dd><a name="perlebcdic-sprintf_0028_0029"></a>
-<p>See the discussion of printf() above.  An example of the use
+<p>See the discussion of <code><a 
href="#perlebcdic-printf_0028_0029">printf()</a></code> above.  An example of 
the use
 of sprintf would be:
 </p>
 <pre class="verbatim">    $CAPITAL_LETTER_A = sprintf(&quot;%c&quot;,193);
 </pre>
 </dd>
-<dt>unpack()</dt>
+<dt><code>unpack()</code></dt>
 <dd><a name="perlebcdic-unpack_0028_0029"></a>
-<p>See the discussion of pack() above.
+<p>See the discussion of <code><a 
href="#perlebcdic-pack_0028_0029">pack()</a></code> above.
 </p>
 </dd>
 </dl>
 
+<p>Note that it is possible to write portable code for these by specifying
+things in Unicode numbers, and using a conversion function:
+</p>
+<pre class="verbatim">    printf(&quot;%c&quot;,utf8::unicode_to_native(65));  
# prints A on all
+                                               # platforms
+    print utf8::native_to_unicode(ord(&quot;A&quot;));   # Likewise, prints 65
+</pre>
+<p>See <a href="#perluniintro-Unicode-and-EBCDIC">perluniintro Unicode and 
EBCDIC</a> and <a href="#perlebcdic-CONVERSIONS">CONVERSIONS</a>
+for other options.
+</p>
 <hr>
 <a name="perlebcdic-REGULAR-EXPRESSION-DIFFERENCES"></a>
 <div class="header">
@@ -26683,22 +27726,35 @@
 <a name="REGULAR-EXPRESSION-DIFFERENCES"></a>
 <h3 class="section">19.9 REGULAR EXPRESSION DIFFERENCES</h3>
 
-<p>As of perl 5.005_03 the letter range regular expressions such as
-[A-Z] and [a-z] have been especially coded to not pick up gap
-characters.  For example, characters such as ô <code>o WITH 
CIRCUMFLEX</code>
-that lie between I and J would not be matched by the
-regular expression range <code>/[H-K]/</code>.  This works in
-the other direction, too, if either of the range end points is
-explicitly numeric: <code>[\x89-\x91]</code> will match <code>\x8e</code>, even
-though <code>\x89</code> is <code>i</code> and <code>\x91 </code> is 
<code>j</code>, and <code>\x8e</code>
-is a gap character from the alphabetic viewpoint.
-</p>
-<p>If you do want to match the alphabet gap characters in a single octet
-regular expression try matching the hex or octal code such
-as <code>/\313/</code> on EBCDIC or <code>/\364/</code> on ASCII platforms to
-have your regular expression match <code>o WITH CIRCUMFLEX</code>.
+<p>You can write your regular expressions just like someone on an ASCII
+platform would do.  But keep in mind that using octal or hex notation to
+specify a particular code point will give you the character that the
+EBCDIC code page natively maps to it.   (This is also true of all
+double-quoted strings.)  If you want to write portably, just use the
+<code>\N{U+...}</code> notation everywhere you would have used 
<code>\x{...}</code>,
+and don&rsquo;t use octal notation at all.
+</p>
+<p>Starting in Perl v5.22, this applies to ranges in bracketed character
+classes.  If you say, for example, <code>qr/[\N{U+20}-\N{U+7E}]/</code>, it 
means
+the characters <code>\N{U+20}</code>, <code>\N{U+21}</code>, ..., 
<code>\N{U+7E}</code>.  This range
+is all the printable characters that the ASCII character set contains.
+</p>
+<p>Prior to v5.22, you couldn&rsquo;t specify any ranges portably, except
+(starting in Perl v5.5.3) all subsets of the <code>[A-Z]</code> and 
<code>[a-z]</code>
+ranges are specially coded to not pick up gap characters.  For example,
+characters such as &quot;ô&quot; (<code>o WITH CIRCUMFLEX</code>) that lie 
between
+&quot;I&quot; and &quot;J&quot; would not be matched by the regular expression 
range
+<code>/[H-K]/</code>.  But if either of the range end points is explicitly 
numeric
+(and neither is specified by <code>\N{U+...}</code>), the gap characters are
+matched:
+</p>
+<pre class="verbatim">    /[\x89-\x91]/
+</pre>
+<p>will match <code>\x8e</code>, even though <code>\x89</code> is 
&quot;i&quot; and <code>\x91</code> is &quot;j&quot;,
+and <code>\x8e</code> is a gap character, from the alphabetic viewpoint.
 </p>
-<p>Another construct to be wary of is the inappropriate use of hex or
+<p>Another construct to be wary of is the inappropriate use of hex (unless
+you use <code>\N{U+...}</code>) or
 octal constants in regular expressions.  Consider the following
 set of subs:
 </p>
@@ -26727,22 +27783,56 @@
         $char =~ /[\240-\377]/;
     }
 </pre>
-<p>These are valid only on ASCII platforms, but can be easily rewritten to
-work on any platform as follows:
+<p>These are valid only on ASCII platforms.  Starting in Perl v5.22, simply
+changing the octal constants to equivalent <code>\N{U+...}</code> values makes
+them portable:
+</p>
+<pre class="verbatim">    sub is_c0 {
+        my $char = substr(shift,0,1);
+        $char =~ /[\N{U+00}-\N{U+1F}]/;
+    }
+
+    sub is_print_ascii {
+        my $char = substr(shift,0,1);
+        $char =~ /[\N{U+20}-\N{U+7E}]/;
+    }
+
+    sub is_delete {
+        my $char = substr(shift,0,1);
+        $char eq &quot;\N{U+7F}&quot;;
+    }
+
+    sub is_c1 {
+        my $char = substr(shift,0,1);
+        $char =~ /[\N{U+80}-\N{U+9F}]/;
+    }
+
+    sub is_latin_1 {    # But not ASCII; not C1
+        my $char = substr(shift,0,1);
+        $char =~ /[\N{U+A0}-\N{U+FF}]/;
+    }
+</pre>
+<p>And here are some alternative portable ways to write them:
 </p>
 <pre class="verbatim">    sub Is_c0 {
         my $char = substr(shift,0,1);
-        return $char =~ /[[:cntrl:]]/
-               &amp;&amp; $char =~ /[[:ascii:]]/
-               &amp;&amp; ! Is_delete($char);
+        return $char =~ /[[:cntrl:]]/a &amp;&amp; ! Is_delete($char);
+
+        # Alternatively:
+        # return $char =~ /[[:cntrl:]]/
+        #        &amp;&amp; $char =~ /[[:ascii:]]/
+        #        &amp;&amp; ! Is_delete($char);
     }
 
     sub Is_print_ascii {
         my $char = substr(shift,0,1);
 
-        return $char =~ /[[:print:]]/ &amp;&amp; $char =~ /[[:ascii:]]/;
+        return $char =~ /[[:print:]]/a;
 
         # Alternatively:
+        # return $char =~ /[[:print:]]/ &amp;&amp; $char =~ /[[:ascii:]]/;
+
+        # Or
         # return $char
         #      =~ /[ 
!&quot;\#\$%&amp;'()*+,\-.\/0-9:;&lt;=&gt;address@hidden|}~]/;
     }
@@ -26762,8 +27852,8 @@
         use feature 'unicode_strings';
         my $char = substr(shift,0,1);
         return ord($char) &lt; 256
-               &amp;&amp; $char !~ [[:ascii:]]
-               &amp;&amp; $char !~ [[:cntrl:]];
+               &amp;&amp; $char !~ /[[:ascii:]]/
+               &amp;&amp; $char !~ /[[:cntrl:]]/;
     }
 </pre>
 <p>Another way to write <code>Is_latin_1()</code> would be
@@ -26771,12 +27861,22 @@
 </p>
 <pre class="verbatim">    sub Is_latin_1 {
         my $char = substr(shift,0,1);
-        $char =~ /[ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]/;
+        $char =~ /[ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ]
+                 |[ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]/x;
     }
 </pre>
 <p>Although that form may run into trouble in network transit (due to the
-presence of 8 bit characters) or on non ISO-Latin character sets.
+presence of 8 bit characters) or on non ISO-Latin character sets.  But
+it does allow <code>Is_c1</code> to be rewritten so it works on Perls that 
don&rsquo;t
+have <code>'unicode_strings'</code> (earlier than v5.14):
 </p>
+<pre class="verbatim">    sub Is_c1 {    # Not ASCII; not a Latin-1 non-control
+        my $char = substr(shift,0,1);
+        return ord($char) &lt; 256
+               &amp;&amp; $char !~ /[[:ascii:]]/
+               &amp;&amp; ! Is_latin_1($char);
+    }
+</pre>
 <hr>
 <a name="perlebcdic-SOCKETS"></a>
 <div class="header">
@@ -26802,8 +27902,13 @@
 <h3 class="section">19.11 SORTING</h3>
 
 <p>One big difference between ASCII-based character sets and EBCDIC ones
-is the relative positions of upper and lower case letters and the
-letters compared to the digits.  If sorted on an ASCII-based platform the
+is the relative positions of the characters when sorted in native
+order.  Of most concern are the upper- and lowercase letters, the
+digits, and the underscore (<code>&quot;_&quot;</code>).  On ASCII platforms 
the native sort
+order has the digits come before the uppercase letters which come before
+the underscore which comes before the lowercase letters.  On EBCDIC, the
+underscore comes first, then the lowercase letters, then the uppercase
+ones, and the digits last.  If sorted on an ASCII-based platform, the
 two-letter abbreviation for a physician comes before the two letter
 abbreviation for drive; that is:
 </p>
@@ -26812,13 +27917,14 @@
 </pre>
 <p>The property of lowercase before uppercase letters in EBCDIC is
 even carried to the Latin 1 EBCDIC pages such as 0037 and 1047.
-An example would be that Ë <code>E WITH DIAERESIS</code> (203) comes
-before ë <code>e WITH DIAERESIS</code> (235) on an ASCII platform, but
+An example would be that &quot;Ë&quot; (<code>E WITH DIAERESIS</code>, 203) 
comes
+before &quot;ë&quot; (<code>e WITH DIAERESIS</code>, 235) on an ASCII 
platform, but
 the latter (83) comes before the former (115) on an EBCDIC platform.
-(Astute readers will note that the uppercase version of ß
-<code>SMALL LETTER SHARP S</code> is simply &quot;SS&quot; and that the upper 
case version of
-ÿ <code>y WITH DIAERESIS</code> is not in the 0..255 range but it is
-at U+x0178 in Unicode, or <code>&quot;\x{178}&quot;</code> in a Unicode 
enabled Perl).
+(Astute readers will note that the uppercase version of &quot;ß&quot;
+<code>SMALL LETTER SHARP S</code> is simply &quot;SS&quot; and that the upper 
case versions
+of &quot;ÿ&quot; (small <code>y WITH DIAERESIS</code>) and &quot;µ&quot; 
(<code>MICRO SIGN</code>)
+are not in the 0..255 range but are in Unicode, in a Unicode enabled
+Perl).
 </p>
 <p>The sort order will cause differences between results obtained on
 ASCII platforms versus EBCDIC platforms.  What follows are some suggestions
@@ -26827,9 +27933,9 @@
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Ignore-ASCII-vs_002e-EBCDIC-sort-differences_002e" 
accesskey="1">perlebcdic Ignore ASCII vs. EBCDIC sort 
differences.</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-MONO-CASE-then-sort-data_002e" accesskey="2">perlebcdic MONO 
CASE then sort data.</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Use-a-sort-helper-function" accesskey="2">perlebcdic Use a 
sort helper function</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Convert_002c-sort-data_002c-then-re-convert_002e" 
accesskey="3">perlebcdic Convert, sort data, then re 
convert.</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-MONO-CASE-then-sort-data-_0028for-non_002ddigits_002c-non_002dunderscore_0029"
 accesskey="3">perlebcdic MONO CASE then sort data (for non-digits, 
non-underscore)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlebcdic-Perform-sorting-on-one-type-of-platform-only_002e" 
accesskey="4">perlebcdic Perform sorting on one type of platform 
only.</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -26839,7 +27945,7 @@
 <a name="perlebcdic-Ignore-ASCII-vs_002e-EBCDIC-sort-differences_002e"></a>
 <div class="header">
 <p>
-Next: <a href="#perlebcdic-MONO-CASE-then-sort-data_002e" accesskey="n" 
rel="next">perlebcdic MONO CASE then sort data.</a>, Up: <a 
href="#perlebcdic-SORTING" accesskey="u" rel="up">perlebcdic SORTING</a> &nbsp; 
[<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlebcdic-Use-a-sort-helper-function" accesskey="n" 
rel="next">perlebcdic Use a sort helper function</a>, Up: <a 
href="#perlebcdic-SORTING" accesskey="u" rel="up">perlebcdic SORTING</a> &nbsp; 
[<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Ignore-ASCII-vs_002e-EBCDIC-sort-differences_002e"></a>
 <h4 class="subsection">19.11.1 Ignore ASCII vs. EBCDIC sort differences.</h4>
@@ -26848,53 +27954,88 @@
 some user education.
 </p>
 <hr>
-<a name="perlebcdic-MONO-CASE-then-sort-data_002e"></a>
+<a name="perlebcdic-Use-a-sort-helper-function"></a>
 <div class="header">
 <p>
-Next: <a href="#perlebcdic-Convert_002c-sort-data_002c-then-re-convert_002e" 
accesskey="n" rel="next">perlebcdic Convert, sort data, then re convert.</a>, 
Previous: <a 
href="#perlebcdic-Ignore-ASCII-vs_002e-EBCDIC-sort-differences_002e" 
accesskey="p" rel="prev">perlebcdic Ignore ASCII vs. EBCDIC sort 
differences.</a>, Up: <a href="#perlebcdic-SORTING" accesskey="u" 
rel="up">perlebcdic SORTING</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
+Next: <a 
href="#perlebcdic-MONO-CASE-then-sort-data-_0028for-non_002ddigits_002c-non_002dunderscore_0029"
 accesskey="n" rel="next">perlebcdic MONO CASE then sort data (for non-digits, 
non-underscore)</a>, Previous: <a 
href="#perlebcdic-Ignore-ASCII-vs_002e-EBCDIC-sort-differences_002e" 
accesskey="p" rel="prev">perlebcdic Ignore ASCII vs. EBCDIC sort 
differences.</a>, Up: <a href="#perlebcdic-SORTING" accesskey="u" 
rel="up">perlebcdic SORTING</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
 </div>
-<a name="MONO-CASE-then-sort-data_002e"></a>
-<h4 class="subsection">19.11.2 MONO CASE then sort data.</h4>
+<a name="Use-a-sort-helper-function"></a>
+<h4 class="subsection">19.11.2 Use a sort helper function</h4>
 
-<p>In order to minimize the expense of mono casing mixed-case text, try to
-<code>tr///</code> towards the character set case most employed within the 
data.
-If the data are primarily UPPERCASE non Latin 1 then apply tr/[a-z]/[A-Z]/
-then sort().  If the data are primarily lowercase non Latin 1 then
-apply tr/[A-Z]/[a-z]/ before sorting.  If the data are primarily UPPERCASE
-and include Latin-1 characters then apply:
-</p>
-<pre class="verbatim">   tr/[a-z]/[A-Z]/;
-   tr/[àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ]/[ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ/;
-   s/ß/SS/g;
-</pre>
-<p>then sort().  Do note however that such Latin-1 manipulation does not
-address the ÿ <code>y WITH DIAERESIS</code> character that will remain at
-code point 255 on ASCII platforms, but 223 on most EBCDIC platforms
-where it will sort to a place less than the EBCDIC numerals.  With a
-Unicode-enabled Perl you might try:
+<p>This is completely general, but the most computationally expensive
+strategy.  Choose one or the other character set and transform to that
+for every sort comparison.  Here&rsquo;s a complete example that transforms
+to ASCII sort order:
 </p>
-<pre class="verbatim">    tr/^?/\x{178}/;
+<pre class="verbatim"> sub native_to_uni($) {
+    my $string = shift;
+
+    # Saves time on an ASCII platform
+    return $string if ord 'A' ==  65;
+
+    my $output = &quot;&quot;;
+    for my $i (0 .. length($string) - 1) {
+        $output
+           .= chr(utf8::native_to_unicode(ord(substr($string, $i, 1))));
+    }
+
+    # Preserve utf8ness of input onto the output, even if it didn't need
+    # to be utf8
+    utf8::upgrade($output) if utf8::is_utf8($string);
+
+    return $output;
+ }
+
+ sub ascii_order {   # Sort helper
+    return native_to_uni($a) cmp native_to_uni($b);
+ }
+
+ sort ascii_order @list;
 </pre>
-<p>The strategy of mono casing data before sorting does not preserve the case
-of the data and may not be acceptable for that reason.
-</p>
 <hr>
-<a name="perlebcdic-Convert_002c-sort-data_002c-then-re-convert_002e"></a>
+<a 
name="perlebcdic-MONO-CASE-then-sort-data-_0028for-non_002ddigits_002c-non_002dunderscore_0029"></a>
 <div class="header">
 <p>
-Next: <a href="#perlebcdic-Perform-sorting-on-one-type-of-platform-only_002e" 
accesskey="n" rel="next">perlebcdic Perform sorting on one type of platform 
only.</a>, Previous: <a href="#perlebcdic-MONO-CASE-then-sort-data_002e" 
accesskey="p" rel="prev">perlebcdic MONO CASE then sort data.</a>, Up: <a 
href="#perlebcdic-SORTING" accesskey="u" rel="up">perlebcdic SORTING</a> &nbsp; 
[<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlebcdic-Perform-sorting-on-one-type-of-platform-only_002e" 
accesskey="n" rel="next">perlebcdic Perform sorting on one type of platform 
only.</a>, Previous: <a href="#perlebcdic-Use-a-sort-helper-function" 
accesskey="p" rel="prev">perlebcdic Use a sort helper function</a>, Up: <a 
href="#perlebcdic-SORTING" accesskey="u" rel="up">perlebcdic SORTING</a> &nbsp; 
[<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
-<a name="Convert_002c-sort-data_002c-then-re-convert_002e"></a>
-<h4 class="subsection">19.11.3 Convert, sort data, then re convert.</h4>
+<a 
name="MONO-CASE-then-sort-data-_0028for-non_002ddigits_002c-non_002dunderscore_0029"></a>
+<h4 class="subsection">19.11.3 MONO CASE then sort data (for non-digits, 
non-underscore)</h4>
 
-<p>This is the most expensive proposition that does not employ a network
-connection.
+<p>If you don&rsquo;t care about where digits and underscore sort to, you can 
do
+something like this:
+</p>
+<pre class="verbatim"> sub case_insensitive_order {   # Sort helper
+    return lc($a) cmp lc($b)
+ }
+
+ sort case_insensitive_order @list;
+</pre>
+<p>If performance is an issue, and you don&rsquo;t care if the output is in the
+same case as the input, use <code>tr///</code> to transform to the case most
+employed within the data.  If the data are primarily UPPERCASE
+non-Latin1, then apply <code>tr/[a-z]/[A-Z]/</code>, and then 
<code>sort()</code>.  If the
+data are primarily lowercase non-Latin1, then apply <code>tr/[A-Z]/[a-z]/</code>
+before sorting.  If the data are primarily UPPERCASE and include Latin-1
+characters then apply:
 </p>
+<pre class="verbatim">   tr/[a-z]/[A-Z]/;
+   tr/[àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ]/[ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ]/;
+   s/ß/SS/g;
+</pre>
+<p>then <code>sort()</code>.  If you have a choice, it&rsquo;s better to 
lowercase things
+to avoid the problems of the two Latin-1 characters whose uppercase is
+outside Latin-1: &quot;ÿ&quot; (small <code>y WITH DIAERESIS</code>) and 
&quot;µ&quot;
+(<code>MICRO SIGN</code>).  If you do need to uppercase, you can; with a
+Unicode-enabled Perl, do:
+</p>
+<pre class="verbatim">    tr/ÿ/\x{178}/;
+    tr/µ/\x{39C}/;
+</pre>
 <hr>
 <a name="perlebcdic-Perform-sorting-on-one-type-of-platform-only_002e"></a>
 <div class="header">
 <p>
-Previous: <a 
href="#perlebcdic-Convert_002c-sort-data_002c-then-re-convert_002e" 
accesskey="p" rel="prev">perlebcdic Convert, sort data, then re convert.</a>, 
Up: <a href="#perlebcdic-SORTING" accesskey="u" rel="up">perlebcdic SORTING</a> 
&nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Previous: <a 
href="#perlebcdic-MONO-CASE-then-sort-data-_0028for-non_002ddigits_002c-non_002dunderscore_0029"
 accesskey="p" rel="prev">perlebcdic MONO CASE then sort data (for non-digits, 
non-underscore)</a>, Up: <a href="#perlebcdic-SORTING" accesskey="u" 
rel="up">perlebcdic SORTING</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Perform-sorting-on-one-type-of-platform-only_002e"></a>
 <h4 class="subsection">19.11.4 Perform sorting on one type of platform 
only.</h4>
@@ -26948,66 +28089,25 @@
 
     http://www.pvhp.com/%7epvhp/
 </pre>
-<p>where 7E is the hexadecimal ASCII code point for &rsquo;~&rsquo;.  Here is 
an example
-of decoding such a URL under CCSID 1047:
+<p>where 7E is the hexadecimal ASCII code point for &quot;~&quot;.  Here is an 
example
+of decoding such a URL in any EBCDIC code page:
 </p>
 <pre class="verbatim">    $url = 'http://www.pvhp.com/%7Epvhp/';
-    # this array assumes code page 1047
-    my @a2e_1047 = (
-          0,  1,  2,  3, 55, 45, 46, 47, 22,  5, 21, 11, 12, 13, 14, 15,
-         16, 17, 18, 19, 60, 61, 50, 38, 24, 25, 63, 39, 28, 29, 30, 31,
-         64, 90,127,123, 91,108, 80,125, 77, 93, 92, 78,107, 96, 75, 97,
-        240,241,242,243,244,245,246,247,248,249,122, 94, 76,126,110,111,
-        124,193,194,195,196,197,198,199,200,201,209,210,211,212,213,214,
-        215,216,217,226,227,228,229,230,231,232,233,173,224,189, 95,109,
-        121,129,130,131,132,133,134,135,136,137,145,146,147,148,149,150,
-        151,152,153,162,163,164,165,166,167,168,169,192, 79,208,161,  7,
-         32, 33, 34, 35, 36, 37,  6, 23, 40, 41, 42, 43, 44,  9, 10, 27,
-         48, 49, 26, 51, 52, 53, 54,  8, 56, 57, 58, 59,  4, 20, 62,255,
-         65,170, 74,177,159,178,106,181,187,180,154,138,176,202,175,188,
-        144,143,234,250,190,160,182,179,157,218,155,139,183,184,185,171,
-        100,101, 98,102, 99,103,158,104,116,113,114,115,120,117,118,119,
-        172,105,237,238,235,239,236,191,128,253,254,251,252,186,174, 89,
-         68, 69, 66, 70, 67, 71,156, 72, 84, 81, 82, 83, 88, 85, 86, 87,
-        140, 73,205,206,203,207,204,225,112,221,222,219,220,141,142,223
-    );
-    $url =~ s/%([0-9a-fA-F]{2})/pack(&quot;c&quot;,$a2e_1047[hex($1)])/ge;
+    $url =~ s/%([0-9a-fA-F]{2})/
+              pack(&quot;c&quot;,utf8::unicode_to_native(hex($1)))/xge;
 </pre>
 <p>Conversely, here is a partial solution for the task of encoding such
-a URL under the 1047 code page:
+a URL in any EBCDIC code page:
 </p>
 <pre class="verbatim">    $url = 'http://www.pvhp.com/~pvhp/';
-    # this array assumes code page 1047
-    my @e2a_1047 = (
-          0,  1,  2,  3,156,  9,134,127,151,141,142, 11, 12, 13, 14, 15,
-         16, 17, 18, 19,157, 10,  8,135, 24, 25,146,143, 28, 29, 30, 31,
-        128,129,130,131,132,133, 23, 27,136,137,138,139,140,  5,  6,  7,
-        144,145, 22,147,148,149,150,  4,152,153,154,155, 20, 21,158, 26,
-         32,160,226,228,224,225,227,229,231,241,162, 46, 60, 40, 43,124,
-         38,233,234,235,232,237,238,239,236,223, 33, 36, 42, 41, 59, 94,
-         45, 47,194,196,192,193,195,197,199,209,166, 44, 37, 95, 62, 63,
-        248,201,202,203,200,205,206,207,204, 96, 58, 35, 64, 39, 61, 34,
-        216, 97, 98, 99,100,101,102,103,104,105,171,187,240,253,254,177,
-        176,106,107,108,109,110,111,112,113,114,170,186,230,184,198,164,
-        181,126,115,116,117,118,119,120,121,122,161,191,208, 91,222,174,
-        172,163,165,183,169,167,182,188,189,190,221,168,175, 93,180,215,
-        123, 65, 66, 67, 68, 69, 70, 71, 72, 73,173,244,246,242,243,245,
-        125, 74, 75, 76, 77, 78, 79, 80, 81, 82,185,251,252,249,250,255,
-         92,247, 83, 84, 85, 86, 87, 88, 89, 90,178,212,214,210,211,213,
-         48, 49, 50, 51, 52, 53, 54, 55, 56, 57,179,219,220,217,218,159
-    );
     # The following regular expression does not address the
     # mappings for: ('.' =&gt; '%2E', '/' =&gt; '%2F', ':' =&gt; '%3A')
     $url =~ s/([\t &quot;#%&amp;\(\),;&lt;=&gt;address@hidden|}~])/
-                sprintf(&quot;%%%02X&quot;,$e2a_1047[ord($1)])/xge;
+               
sprintf(&quot;%%%02X&quot;,utf8::native_to_unicode(ord($1)))/xge;
 </pre>
 <p>where a more complete solution would split the URL into components
 and apply a full s/// substitution only to the appropriate parts.
 </p>
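+<p>A minimal sketch of that more complete approach, assuming a simple
+<code>scheme://host/path</code> URL with no query string or fragment (the
+variable names are illustrative and not part of the original example),
+could split off the path and encode each segment separately:
+</p>
+<pre class="verbatim">    my $url = 'http://www.pvhp.com/~pvhp/my file';
+    # Assumes the URL matches scheme://host/path.
+    my ($prefix, $path) = $url =~ m{^(\w+://[^/]*)(.*)$};
+    $path = join '/', map {
+        my $seg = $_;
+        # Encode everything except a conservative set of safe characters.
+        $seg =~ s/([^A-Za-z0-9_.-])/
+                  sprintf(&quot;%%%02X&quot;,utf8::native_to_unicode(ord($1)))/xge;
+        $seg;
+    } split m{/}, $path, -1;
+    $url = $prefix . $path;   # http://www.pvhp.com/%7Epvhp/my%20file
+</pre>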
-<p>In the remaining examples a @e2a or @a2e array may be employed
-but the assignment will not be shown explicitly.  For code page 1047
-you could use the @a2e_1047 or @e2a_1047 arrays just shown.
-</p>
 <hr>
 <a name="perlebcdic-uu-encoding-and-decoding"></a>
 <div class="header">
@@ -27017,9 +28117,10 @@
 <a name="uu-encoding-and-decoding"></a>
 <h4 class="subsection">19.12.2 uu encoding and decoding</h4>
 
-<p>The <code>u</code> template to pack() or unpack() will render EBCDIC data 
in EBCDIC
-characters equivalent to their ASCII counterparts.  For example, the
-following will print &quot;Yes indeed\n&quot; on either an ASCII or EBCDIC 
computer:
+<p>The <code>u</code> template to <code>pack()</code> or <code>unpack()</code> 
will render EBCDIC data in
+EBCDIC characters equivalent to their ASCII counterparts.  For example,
+the following will print &quot;Yes indeed\n&quot; on either an ASCII or EBCDIC
+computer:
 </p>
 <pre class="verbatim">    $all_byte_chrs = '';
     for (0..255) { $all_byte_chrs .= chr($_); }
@@ -27040,19 +28141,17 @@
         print &quot;indeed\n&quot;;
     }
 </pre>
-<p>Here is a very spartan uudecoder that will work on EBCDIC provided
-that the @e2a array is filled in appropriately:
+<p>Here is a very spartan uudecoder that will work on EBCDIC:
 </p>
 <pre class="verbatim">    #!/usr/local/bin/perl
-    @e2a = ( # this must be filled in
-           );
     $_ = &lt;&gt; until ($mode,$file) = /^begin\s*(\d*)\s*(\S*)/;
     open(OUT, &quot;&gt; $file&quot;) if $file ne &quot;&quot;;
     while(&lt;&gt;) {
         last if /^end/;
         next if /[a-z]/;
-        next unless int(((($e2a[ord()] - 32 ) &amp; 077) + 2) / 3) ==
-            int(length() / 4);
+        next unless int((((utf8::native_to_unicode(ord()) - 32 ) &amp; 077)
+                                                               + 2) / 3)
+                    == int(length() / 4);
         print OUT unpack(&quot;u&quot;, $_);
     }
     close(OUT);
@@ -27071,39 +28170,41 @@
 the printable set using:
 </p>
 <pre class="verbatim">    # This QP encoder works on ASCII only
-    $qp_string =~ 
s/([=\x00-\x1F\x80-\xFF])/sprintf(&quot;=%02X&quot;,ord($1))/ge;
+    $qp_string =~ s/([=\x00-\x1F\x80-\xFF])/
+                    sprintf(&quot;=%02X&quot;,ord($1))/xge;
 </pre>
-<p>Whereas a QP encoder that works on both ASCII and EBCDIC platforms
-would look somewhat like the following (where the EBCDIC branch @e2a
-array is omitted for brevity):
-</p>
-<pre class="verbatim">    if (ord('A') == 65) {    # ASCII
-        $delete = &quot;\x7F&quot;;    # ASCII
-        @e2a = (0 .. 255)    # ASCII to ASCII identity map
-    }
-    else {                   # EBCDIC
-        $delete = &quot;\x07&quot;;    # EBCDIC
-        @e2a =               # EBCDIC to ASCII map (as shown above)
-    }
+<p>Starting in Perl v5.22, this is trivially changeable to work portably on
+both ASCII and EBCDIC platforms.
+</p>
+<pre class="verbatim">    # This QP encoder works on both ASCII and EBCDIC
+    $qp_string =~ s/([=\N{U+00}-\N{U+1F}\N{U+80}-\N{U+FF}])/
+                    sprintf(&quot;=%02X&quot;,ord($1))/xge;
+</pre>
+<p>For earlier Perls, a QP encoder that works on both ASCII and EBCDIC
+platforms would look somewhat like the following:
+</p>
+<pre class="verbatim">    $delete = 
utf8::unicode_to_native(ord(&quot;\x7F&quot;));
     $qp_string =~
-      s/([^ 
!&quot;\#\$%&amp;'()*+,\-.\/0-9:;&lt;&gt;address@hidden|}~$delete])/
-         sprintf(&quot;=%02X&quot;,$e2a[ord($1)])/xge;
+      s/([^[:print:]$delete])/
+         sprintf(&quot;=%02X&quot;,utf8::native_to_unicode(ord($1)))/xage;
 </pre>
 <p>(although in production code the substitutions might be done
-in the EBCDIC branch with the @e2a array and separately in the
-ASCII branch without the expense of the identity map).
+in the EBCDIC branch with the function call and separately in the
+ASCII branch without the expense of the identity map; in Perl v5.22, the
+identity map is optimized out so there is no expense, but the
+alternative above is simpler and is also available in v5.22).
 </p>
 <p>Such QP strings can be decoded with:
 </p>
 <pre class="verbatim">    # This QP decoder is limited to ASCII only
-    $string =~ s/=([0-9A-Fa-f][0-9A-Fa-f])/chr hex $1/ge;
+    $string =~ s/=([[:xdigit:]][[:xdigit:]])/chr hex $1/ge;
     $string =~ s/=[\n\r]+$//;
 </pre>
 <p>Whereas a QP decoder that works on both ASCII and EBCDIC platforms
-would look somewhat like the following (where the @a2e array is
-omitted for brevity):
+would look somewhat like the following:
 </p>
-<pre class="verbatim">    $string =~ s/=([0-9A-Fa-f][0-9A-Fa-f])/chr $a2e[hex 
$1]/ge;
+<pre class="verbatim">    $string =~ s/=([[:xdigit:]][[:xdigit:]])/
+                                chr utf8::unicode_to_native(hex $1)/xge;
     $string =~ s/=[\n\r]+$//;
 </pre>
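+<p>As a rough round-trip check (a sketch assuming Perl v5.22 or later; the
+variable names are illustrative only), an encoder and decoder that both
+translate through Unicode code points should reproduce the original
+string on either kind of platform:
+</p>
+<pre class="verbatim">    my $orig = &quot;caf\N{U+E9} = ok\n&quot;;
+    my $qp   = $orig;
+    $qp =~ s/([=\N{U+00}-\N{U+1F}\N{U+80}-\N{U+FF}])/
+             sprintf(&quot;=%02X&quot;,utf8::native_to_unicode(ord($1)))/xge;
+    my $back = $qp;
+    $back =~ s/=([[:xdigit:]][[:xdigit:]])/
+               chr utf8::unicode_to_native(hex $1)/xge;
+    print $back eq $orig ? &quot;round trip ok\n&quot; : &quot;mismatch\n&quot;;
+</pre>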
 <hr>
@@ -27146,10 +28247,11 @@
 <a name="Hashing-order-and-checksums"></a>
 <h3 class="section">19.13 Hashing order and checksums</h3>
 
-<p>To the extent that it is possible to write code that depends on
-hashing order there may be differences between hashes as stored
-on an ASCII-based platform and hashes stored on an EBCDIC-based platform.
-XXX
+<p>Perl deliberately randomizes hash order for security purposes on both
+ASCII and EBCDIC platforms.
+</p>
+<p>EBCDIC checksums will differ for the same file translated into ASCII
+and vice versa.
 </p>
 <hr>
 <a name="perlebcdic-I18N-AND-L10N"></a>
@@ -27162,7 +28264,7 @@
 
 <p>Internationalization (I18N) and localization (L10N) are supported at least
 in principle even on EBCDIC platforms.  The details are system-dependent
-and discussed under the <a href="#perlebcdic-OS-ISSUES">perlebcdic OS 
ISSUES</a> section below.
+and discussed under the <a href="#perlebcdic-OS-ISSUES">OS ISSUES</a> section 
below.
 </p>
 <hr>
 <a name="perlebcdic-MULTI_002dOCTET-CHARACTER-SETS"></a>
@@ -27173,9 +28275,8 @@
 <a name="MULTI_002dOCTET-CHARACTER-SETS"></a>
 <h3 class="section">19.15 MULTI-OCTET CHARACTER SETS</h3>
 
-<p>Perl may work with an internal UTF-EBCDIC encoding form for wide characters
-on EBCDIC platforms in a manner analogous to the way that it works with
-the UTF-8 internal encoding form on ASCII based platforms.
+<p>Perl works with UTF-EBCDIC, a multi-byte encoding.  In Perls earlier
+than v5.22, there may be various bugs in this regard.
 </p>
 <p>Legacy multi byte EBCDIC code pages XXX.
 </p>
@@ -27236,7 +28337,12 @@
 <p>Perl runs under Unix Systems Services or USS.
 </p>
 <dl compact="compact">
-<dt>chcp</dt>
+<dt><code>sigaction</code></dt>
+<dd><a name="perlebcdic-sigaction"></a>
+<p><code>SA_SIGINFO</code> can have segmentation faults.
+</p>
+</dd>
+<dt><code>chcp</code></dt>
 <dd><a name="perlebcdic-chcp"></a>
 <p><strong>chcp</strong> is supported as a shell utility for displaying and 
changing
 one&rsquo;s code page.  See also <a 
href="http://man.he.net/man1/chcp">chcp(1)</a>.
@@ -27255,17 +28361,25 @@
 <p>See also the OS390::Stdio module on CPAN.
 </p>
 </dd>
-<dt>OS/390, z/OS iconv</dt>
-<dd><a name="perlebcdic-OS_002f390_002c-z_002fOS-iconv"></a>
+<dt><code>iconv</code></dt>
+<dd><a name="perlebcdic-iconv-1"></a>
 <p><strong>iconv</strong> is supported as both a shell utility and a C RTL 
routine.
-See also the iconv(1) and iconv(3) manual pages.
+See also the <a href="http://man.he.net/man1/iconv">iconv(1)</a> and <a 
href="http://man.he.net/man3/iconv">iconv(3)</a> manual pages.
 </p>
 </dd>
 <dt>locales</dt>
 <dd><a name="perlebcdic-locales"></a>
-<p>On OS/390 or z/OS see <a href="locale.html#Top">(locale)</a> for 
information on locales.  The L10N files
-are in <samp>/usr/nls/locale</samp>.  $Config{d_setlocale} is 
&rsquo;define&rsquo; on OS/390
-or z/OS.
+<p>Locales are supported.  There may be glitches when a locale is another
+EBCDIC code page which has some of the
+<a href="#perlebcdic-The-13-variant-characters">code-page variant 
characters</a> in other
+positions.
+</p>
+<p>There aren&rsquo;t currently any real UTF-8 locales, even though some locale
+names contain the string &quot;UTF-8&quot;.
+</p>
+<p>See <a href="#perllocale-NAME">perllocale NAME</a> for information on 
locales.  The L10N files
+are in <samp>/usr/nls/locale</samp>.  <code>$Config{d_setlocale}</code> is 
<code>'define'</code> on
+OS/390 or z/OS.
 </p>
 </dd>
 </dl>
@@ -27290,18 +28404,47 @@
 <a name="BUGS-2"></a>
 <h3 class="section">19.17 BUGS</h3>
 
-<p>This pod document contains literal Latin 1 characters and may encounter
-translation difficulties.  In particular one popular nroff implementation
-was known to strip accented characters to their unaccented counterparts
-while attempting to view this document through the <strong>pod2man</strong> 
program
-(for example, you may see a plain <code>y</code> rather than one with a 
diaeresis
as in ÿ).  Another nroff truncated the resultant manpage at
-the first occurrence of 8 bit characters.
-</p>
-<p>Not all shells will allow multiple <code>-e</code> string arguments to perl 
to
-be concatenated together properly as recipes 0, 2, 4, 5, and 6 might
+<ul>
+<li> The <code>cmp</code> (and hence <code>sort</code>) operators do not 
necessarily give the
+correct results when both operands are UTF-EBCDIC encoded strings and
+there is a mixture of ASCII and/or control characters, along with other
+characters.
+
+</li><li> Ranges containing <code>\N{...}</code> in the <code>tr///</code> 
(and <code>y///</code>)
+transliteration operators are treated differently than the equivalent
+ranges in regular expression patterns.  They should, but don&rsquo;t, cause
+the values in the ranges to all be treated as Unicode code points, and
+not native ones.  (<a href="#perlre-Version-8-Regular-Expressions">perlre 
Version 8 Regular Expressions</a> gives
+details as to how it should work.)
+
+</li><li> Not all shells will allow multiple <code>-e</code> string arguments 
to perl to
+be concatenated together properly as recipes 0, 2, 4, 5, and 6 in this
+document might
 seem to imply.
+
+</li><li> There are some bugs in the <code>pack</code>/<code>unpack</code> 
<code>&quot;U0&quot;</code> template.
+
+</li><li> There are a significant number of test failures in the CPAN modules
+shipped with Perl v5.22.  These are only in modules not primarily
+maintained by Perl 5 porters.  Some of these are failures in the tests
+only: they don&rsquo;t realize that it is proper to get different results on
+EBCDIC platforms.  And some of the failures are real bugs.  If you
+compile and do a <code>make test</code> on Perl, all tests on the 
<code>/cpan</code>
+directory are skipped.
+
+<p>In particular, the extensions <a 
href="Unicode-Collate.html#Top">(Unicode-Collate)</a> and
+<a href="Unicode-Normalize.html#Top">(Unicode-Normalize)</a> are not supported 
under EBCDIC; likewise for the
+(now deprecated) <a href="encoding.html#Top">(encoding)</a> pragma.
+</p>
+<p><a href="Encode.html#Top">(Encode)</a> partially works.
 </p>
+</li><li> In earlier versions, when byte and character data were concatenated,
+the new string was sometimes created by
+decoding the byte strings as <em>ISO 8859-1 (Latin-1)</em>, even if the
+old Unicode string used EBCDIC.
+
+</li></ul>
+
 <hr>
 <a name="perlebcdic-SEE-ALSO"></a>
 <div class="header">
@@ -27378,6 +28521,8 @@
 registered service marks used in this document are the property of
 their respective owners.
 </p>
+<p>Now maintained by Perl5 Porters.
+</p>
 <hr>
 <a name="perlembed"></a>
 <div class="header">
@@ -28608,7 +29753,7 @@
 When a Perl interpreter normally starts up, it tells the system it wants
 to use the system&rsquo;s default locale.  This is often, but not necessarily,
 the &quot;C&quot; or &quot;POSIX&quot; locale.  Absent a 
<code>&quot;use&nbsp;locale&quot;</code><!-- /@w --> within the perl
-code, this mostly has no effect (but see <a 
href="#perllocale-Not-within-the-scope-of-any-_0022use-locale_0022-variant">perllocale
 <strong>Not within the scope of any <code>&quot;use locale&quot;</code> 
variant</strong></a>).  Also, there is not a problem if the
+code, this mostly has no effect (but see <a 
href="#perllocale-Not-within-the-scope-of-_0022use-locale_0022">perllocale 
<strong>Not within the scope of <code>&quot;use 
locale&quot;</code></strong></a>).  Also, there is not a problem if the
 locale you want to use in your embedded Perl is the same as the system
 default.  However, this doesn&rsquo;t work if you have set up and want to use
 a locale that isn&rsquo;t the system default one.  Starting in Perl v5.20, you
@@ -28843,11 +29988,6 @@
 <code>experimental::regex_sets</code>.
 </p>
 </dd>
-<dt><code>\s</code> in regexp matches vertical tab</dt>
-<dd><a name="perlexperiment-_005cs-in-regexp-matches-vertical-tab"></a>
-<p>Introduced in Perl 5.18
-</p>
-</dd>
 <dt>Subroutine signatures</dt>
 <dd><a name="perlexperiment-Subroutine-signatures"></a>
 <p>Introduced in Perl 5.20.0
@@ -28870,6 +30010,55 @@
 <a href="https://rt.perl.org:443/rt3/Ticket/Display.html?id=120162">[perl 
#120162]</a>.
 </p>
 </dd>
+<dt>Aliasing via reference</dt>
+<dd><a name="perlexperiment-Aliasing-via-reference"></a>
+<p>Introduced in Perl 5.22.0
+</p>
+<p>Using this feature triggers warnings in the category
+<code>experimental::refaliasing</code>.
+</p>
+<p>The ticket for this feature is
+<a href="https://rt.perl.org/rt3/Ticket/Display.html?id=122947">[perl 
#122947]</a>.
+</p>
+<p>See also: <a href="#perlref-Assigning-to-References">perlref Assigning to 
References</a>
+</p>
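+<p>A minimal illustration, assuming Perl v5.22 or later (the variable
+names are arbitrary):
+</p>
+<pre class="verbatim">    use feature 'refaliasing';
+    no warnings 'experimental::refaliasing';
+
+    my @data = (1, 2, 3);
+    \my @alias = \@data;       # @alias is now another name for @data
+    push @alias, 4;
+    print scalar(@data), &quot;\n&quot;;  # prints 4
+</pre>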
+</dd>
+<dt>The &quot;const&quot; attribute</dt>
+<dd><a name="perlexperiment-The-_0022const_0022-attribute"></a>
+<p>Introduced in Perl 5.22.0
+</p>
+<p>Using this feature triggers warnings in the category
+<code>experimental::const_attr</code>.
+</p>
+<p>The ticket for this feature is
+<a href="https://rt.perl.org/rt3/Ticket/Display.html?id=123630">[perl 
#123630]</a>.
+</p>
+<p>See also: <a href="#perlsub-Constant-Functions">perlsub Constant 
Functions</a>
+</p>
+</dd>
+<dt>use re &rsquo;strict&rsquo;;</dt>
+<dd><a name="perlexperiment-use-re-_0027strict_0027_003b"></a>
+<p>Introduced in Perl 5.22.0
+</p>
+<p>Using this feature triggers warnings in the category
+<code>experimental::re_strict</code>.
+</p>
+<p>See <a href="re.html#g_t_0027strict_0027-mode">(re)'strict' mode</a>
+</p>
+</dd>
+<dt>String- and number-specific bitwise operators</dt>
+<dd><a 
name="perlexperiment-String_002d-and-number_002dspecific-bitwise-operators"></a>
+<p>Introduced in Perl 5.22.0
+</p>
+<p>See also: <a href="#perlop-Bitwise-String-Operators">perlop Bitwise String 
Operators</a>
+</p>
+<p>Using this feature triggers warnings in the category
+<code>experimental::bitwise</code>.
+</p>
+<p>The ticket for this feature is
+<a href="https://rt.perl.org/rt3/Ticket/Display.html?id=123707">[perl 
#123707]</a>.
+</p>
+</dd>
 <dt>The &lt;:win32&gt; IO pseudolayer</dt>
 <dd><a name="perlexperiment-The-_003c_003awin32_003e-IO-pseudolayer"></a>
 <p>The ticket for this feature is
@@ -29007,6 +30196,11 @@
 <p>Accepted in Perl 5.20.0
 </p>
 </dd>
+<dt><code>\s</code> in regexp matches vertical tab</dt>
+<dd><a name="perlexperiment-_005cs-in-regexp-matches-vertical-tab"></a>
+<p>Accepted in Perl 5.22.0
+</p>
+</dd>
 </dl>
 
 <hr>
@@ -29153,6 +30347,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlfilter-CONCLUSION">perlfilter 
CONCLUSION</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlfilter-LIMITATIONS">perlfilter 
LIMITATIONS</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlfilter-THINGS-TO-LOOK-OUT-FOR">perlfilter THINGS TO LOOK OUT 
FOR</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlfilter-REQUIREMENTS">perlfilter 
REQUIREMENTS</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
@@ -29734,7 +30930,7 @@
 <a name="perlfilter-CONCLUSION"></a>
 <div class="header">
 <p>
-Next: <a href="#perlfilter-THINGS-TO-LOOK-OUT-FOR" accesskey="n" 
rel="next">perlfilter THINGS TO LOOK OUT FOR</a>, Previous: <a 
href="#perlfilter-USING-CONTEXT_003a-THE-DEBUG-FILTER" accesskey="p" 
rel="prev">perlfilter USING CONTEXT: THE DEBUG FILTER</a>, Up: <a 
href="#perlfilter" accesskey="u" rel="up">perlfilter</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlfilter-LIMITATIONS" accesskey="n" rel="next">perlfilter 
LIMITATIONS</a>, Previous: <a 
href="#perlfilter-USING-CONTEXT_003a-THE-DEBUG-FILTER" accesskey="p" 
rel="prev">perlfilter USING CONTEXT: THE DEBUG FILTER</a>, Up: <a 
href="#perlfilter" accesskey="u" rel="up">perlfilter</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="CONCLUSION"></a>
 <h3 class="section">22.10 CONCLUSION</h3>
@@ -29779,13 +30975,41 @@
 syntax you want your filter to have.
 </p>
 <hr>
+<a name="perlfilter-LIMITATIONS"></a>
+<div class="header">
+<p>
+Next: <a href="#perlfilter-THINGS-TO-LOOK-OUT-FOR" accesskey="n" 
rel="next">perlfilter THINGS TO LOOK OUT FOR</a>, Previous: <a 
href="#perlfilter-CONCLUSION" accesskey="p" rel="prev">perlfilter 
CONCLUSION</a>, Up: <a href="#perlfilter" accesskey="u" rel="up">perlfilter</a> 
&nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+</div>
+<a name="LIMITATIONS"></a>
+<h3 class="section">22.11 LIMITATIONS</h3>
+
+<p>Source filters only work on the string level, and thus are highly limited
+in their ability to change source code on the fly.  They cannot detect
+comments, quoted strings, or heredocs; they are no replacement for a real
+parser.
+The only stable uses for source filters are encryption, compression,
+or the byteloader, to translate binary code back to source code.
+</p>
+<p>See for example the limitations in Switch, which uses source filters
+and thus does not work inside a string eval.  Regexes with embedded
+newlines that are specified with raw /.../ delimiters and don&rsquo;t have
+the //x modifier are indistinguishable from
+code chunks beginning with the division operator /.  As a workaround
+you must use m/.../ or m?...? for such patterns.  Also, the presence of
+regexes specified with raw ?...? delimiters may cause mysterious
+errors.  The workaround is to use m?...? instead.  See
+http://search.cpan.org/perldoc?Switch#LIMITATIONS
+</p>
+<p>Currently internal buffer lengths are limited to 32-bit only.
+</p>
+<hr>
 <a name="perlfilter-THINGS-TO-LOOK-OUT-FOR"></a>
 <div class="header">
 <p>
-Next: <a href="#perlfilter-REQUIREMENTS" accesskey="n" rel="next">perlfilter 
REQUIREMENTS</a>, Previous: <a href="#perlfilter-CONCLUSION" accesskey="p" 
rel="prev">perlfilter CONCLUSION</a>, Up: <a href="#perlfilter" accesskey="u" 
rel="up">perlfilter</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlfilter-REQUIREMENTS" accesskey="n" rel="next">perlfilter 
REQUIREMENTS</a>, Previous: <a href="#perlfilter-LIMITATIONS" accesskey="p" 
rel="prev">perlfilter LIMITATIONS</a>, Up: <a href="#perlfilter" accesskey="u" 
rel="up">perlfilter</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="THINGS-TO-LOOK-OUT-FOR"></a>
-<h3 class="section">22.11 THINGS TO LOOK OUT FOR</h3>
+<h3 class="section">22.12 THINGS TO LOOK OUT FOR</h3>
 
 <dl compact="compact">
 <dt>Some Filters Clobber the <code>DATA</code> Handle</dt>
@@ -29806,7 +31030,7 @@
 Next: <a href="#perlfilter-AUTHOR" accesskey="n" rel="next">perlfilter 
AUTHOR</a>, Previous: <a href="#perlfilter-THINGS-TO-LOOK-OUT-FOR" 
accesskey="p" rel="prev">perlfilter THINGS TO LOOK OUT FOR</a>, Up: <a 
href="#perlfilter" accesskey="u" rel="up">perlfilter</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="REQUIREMENTS"></a>
-<h3 class="section">22.12 REQUIREMENTS</h3>
+<h3 class="section">22.13 REQUIREMENTS</h3>
 
 <p>The Source Filters distribution is available on CPAN, in 
 </p>
@@ -29824,7 +31048,7 @@
 Next: <a href="#perlfilter-Copyrights" accesskey="n" rel="next">perlfilter 
Copyrights</a>, Previous: <a href="#perlfilter-REQUIREMENTS" accesskey="p" 
rel="prev">perlfilter REQUIREMENTS</a>, Up: <a href="#perlfilter" accesskey="u" 
rel="up">perlfilter</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="AUTHOR-9"></a>
-<h3 class="section">22.13 AUTHOR</h3>
+<h3 class="section">22.14 AUTHOR</h3>
 
 <p>Paul Marquess &lt;address@hidden&gt;
 </p>
@@ -29835,7 +31059,7 @@
 Previous: <a href="#perlfilter-AUTHOR" accesskey="p" rel="prev">perlfilter 
AUTHOR</a>, Up: <a href="#perlfilter" accesskey="u" rel="up">perlfilter</a> 
&nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Copyrights"></a>
-<h3 class="section">22.14 Copyrights</h3>
+<h3 class="section">22.15 Copyrights</h3>
 
 <p>This article originally appeared in The Perl Journal #11, and is
 copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and
@@ -31797,7 +33021,7 @@
 number of characters removed from all its arguments.  It&rsquo;s often used to
 remove the newline from the end of an input record when you&rsquo;re worried
 that the final record may be missing its newline.  When in paragraph
-mode (<code>$/ = &quot;&quot;</code>), it removes all trailing newlines from 
the string.
+mode (<code>$/ = ''</code>), it removes all trailing newlines from the string.
 When in slurp mode (<code>$/ = undef</code>) or fixed-length record mode 
(<code>$/</code> is
 a reference to an integer or the like; see <a href="#perlvar-NAME">perlvar 
NAME</a>) chomp() won&rsquo;t
 remove anything.
@@ -32587,7 +33811,7 @@
         print &quot;$key=$value\n&quot;;
     }
 </pre>
-<p>Starting with Perl 5.14, <code>each</code> can take a scalar EXPR, which 
must hold
+<p>Starting with Perl 5.14, <code>each</code> can take a scalar EXPR, which 
must hold a
 reference to an unblessed hash or array.  The argument will be dereferenced
 automatically.  This aspect of <code>each</code> is considered highly 
experimental.
 The exact behaviour may change in a future version of Perl.
@@ -32895,7 +34119,7 @@
 <pre class="verbatim">    exec {'/bin/csh'} '-sh';  # pretend it's a login 
shell
 </pre>
 <p>When the arguments get executed via the system shell, results are
-subject to its quirks and capabilities.  See <a 
href="#perlop-_0060STRING_0060">perlop `STRING`</a>
+subject to its quirks and capabilities.  See <a 
href="#perlop-_0060STRING_0060">perlop <code>`<em>STRING</em>`</code></a>
 for details.
 </p>
 <p>Using an indirect object with <code>exec</code> or <code>system</code> is 
also more
@@ -33167,6 +34391,12 @@
             &quot;not have a real file descriptor\n&quot;;
     }
 </pre>
+<p>The behavior of <code>fileno</code> on a directory handle depends on the 
operating
+system.  On a system with dirfd(3) or similar, <code>fileno</code> on a 
directory
+handle returns the underlying file descriptor associated with the
+handle; on systems with no such support, it returns the undefined value,
+and sets <code>$!</code> (errno).
+</p>
 </dd>
 <dt>flock FILEHANDLE,OPERATION</dt>
 <dd><a name="perlfunc-flock-FILEHANDLE_002cOPERATION"></a>
@@ -33531,7 +34761,8 @@
  $comment,  $gcos,     $dir,       $shell,   $expire ) = getpw*
  # 5        6          7           8         9
 </pre>
-<p>(If the entry doesn&rsquo;t exist you get an empty list.)
+<p>(If the entry doesn&rsquo;t exist, the return value is a single meaningless 
true
+value.)
 </p>
 <p>The exact meaning of the $gcos field varies but it usually contains
 the real name of the user (as opposed to the login name) and other
@@ -34153,9 +35384,9 @@
 to <code>a-z</code> respectively.
 </p>
 </dd>
-<dt>Otherwise, if <code>use locale</code> (but not <code>use locale 
':not_characters'</code>) is in effect:</dt>
-<dd><a 
name="perlfunc-Otherwise_002c-if-use-locale-_0028but-not-use-locale-_0027_003anot_005fcharacters_0027_0029-is-in-effect_003a"></a>
-<p>Respects current LC_CTYPE locale for code points &lt; 256; and uses Unicode
+<dt>Otherwise, if <code>use locale</code> for <code>LC_CTYPE</code> is in 
effect:</dt>
+<dd><a 
name="perlfunc-Otherwise_002c-if-use-locale-for-LC_005fCTYPE-is-in-effect_003a"></a>
+<p>Respects current <code>LC_CTYPE</code> locale for code points &lt; 256; and 
uses Unicode
 rules for the remaining code points (this last can only happen if
 the UTF8 flag is also set).  See <a href="#perllocale-NAME">perllocale 
NAME</a>.
 </p>
@@ -34169,8 +35400,10 @@
 itself, because 0xDF may not be LATIN SMALL LETTER SHARP S in the
 current locale, and Perl has no way of knowing if that character even
 exists in the locale, much less what code point it is.  Perl returns
-the input character unchanged, for all instances (and there aren&rsquo;t
-many) where the 255/256 boundary would otherwise be crossed.
+a result that is above 255 (almost always the input character unchanged),
+for all instances (and there aren&rsquo;t many) where the 255/256 boundary
+would otherwise be crossed; and starting in v5.22, it raises a
+<a 
href="#perldiag-Can_0027t-do-_0025s_0028_0022_0025s_0022_0029-on-non_002dUTF_002d8-locale_003b-resolved-to-_0022_0025s_0022_002e">locale</a>
 warning.
 </p>
 </dd>
 <dt>Otherwise, If EXPR has the UTF8 flag set:</dt>
@@ -35145,7 +36378,8 @@
 </p>
 <p>This means that when <code>use strict 'vars'</code> is in effect, 
<code>our</code> lets you use
 a package variable without qualifying it with the package name, but only within
-the lexical scope of the <code>our</code> declaration. This applies 
immediately&ndash;even
+the lexical scope of the <code>our</code>
+declaration.  This applies immediately&ndash;even
 within the same statement.
 </p>
 <pre class="verbatim">    package Foo;
@@ -35299,7 +36533,8 @@
     D  A float of long-double precision in native format.
          (Long doubles are available only if your system supports
           long double values _and_ if Perl has been compiled to
-          support those.  Raises an exception otherwise.)
+          support those.  Raises an exception otherwise.
+          Note that there are different long double formats.)
 
     p  A pointer to a null-terminated string.
     P  A pointer to a structure (fixed-length string).
@@ -35580,6 +36815,8 @@
 <pre class="verbatim">   0x56 0x78 0x12 0x34
    0x34 0x12 0x78 0x56
 </pre>
+<p>These are called mid-endian, middle-endian, mixed-endian, or just weird.
+</p>
 <p>You can determine your system endianness with this incantation:
 </p>
 <pre class="verbatim">   printf(&quot;%#02x &quot;, $_) for 
unpack(&quot;W*&quot;, pack L=&gt;0x12345678); 
@@ -35595,12 +36832,25 @@
 <pre class="verbatim">    $ perl -V:byteorder
 </pre>
 <p>Byteorders <code>&quot;1234&quot;</code> and 
<code>&quot;12345678&quot;</code> are little-endian; 
<code>&quot;4321&quot;</code>
-and <code>&quot;87654321&quot;</code> are big-endian.
+and <code>&quot;87654321&quot;</code> are big-endian.  Systems with 
multiarchitecture binaries
+will have <code>&quot;ffff&quot;</code>, signifying that static information 
doesn&rsquo;t work
+and that one must use runtime probing.
 </p>
 <p>For portably packed integers, either use the formats <code>n</code>, 
<code>N</code>, <code>v</code>, 
 and <code>V</code> or else use the <code>&gt;</code> and <code>&lt;</code> 
modifiers described
 immediately below.  See also <a href="#perlport-NAME">perlport NAME</a>.
 </p>
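+<p>A rough sketch of both approaches, the static <code>Config</code> value
+and a run-time probe of how a native integer is actually laid out:
+</p>
+<pre class="verbatim">    use Config;
+    print &quot;byteorder from Config: $Config{byteorder}\n&quot;;
+
+    # Run-time probe: look at the first byte of a packed native long.
+    my $first = unpack(&quot;C&quot;, pack(&quot;L&quot;, 1));
+    print $first == 1 ? &quot;little-endian\n&quot;
+                      : &quot;not little-endian\n&quot;;
+</pre>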
+</li><li> Also floating point numbers have endianness.  Usually (but not 
always)
+this agrees with the integer endianness.  Even though most platforms
+these days use the IEEE 754 binary format, there are differences,
+especially if the long doubles are involved.  You can see the
+<code>Config</code> variables <code>doublekind</code> and 
<code>longdblkind</code> (also <code>doublesize</code>,
+<code>longdblsize</code>): the &quot;kind&quot; values are enums, unlike 
<code>byteorder</code>.
+
+<p>Portability-wise the best option is probably to keep to the IEEE 754
+64-bit doubles, and of agreed-upon endianness.  Another possibility
+is the <code>&quot;%a&quot;</code> format of <code>printf</code>.
+</p>
 </li><li> Starting with Perl 5.10.0, integer and floating-point formats, along 
with
 the <code>p</code> and <code>P</code> formats and <code>()</code> groups, may 
all be followed by the 
 <code>&gt;</code> or <code>&lt;</code> endianness modifiers to respectively 
enforce big-
@@ -35652,7 +36902,7 @@
 will not in general equal $foo.
 </p>
 </li><li> Pack and unpack can operate in two modes: character mode 
(<code>C0</code> mode) where
-the packed string is processed per character, and UTF-8 mode (<code>U0</code> 
mode)
+the packed string is processed per character, and UTF-8 byte mode 
(<code>U0</code> mode)
 where the packed string is processed in its UTF-8-encoded Unicode form on
 a byte-by-byte basis.  Character mode is the default
 unless the format string starts with <code>U</code>.  You
@@ -35727,6 +36977,11 @@
 assumes additional <code>&quot;&quot;</code> arguments.  If TEMPLATE requires 
fewer arguments
 than given, extra arguments are ignored.
 
+</li><li> Attempting to pack the special floating point values 
<code>Inf</code> and <code>NaN</code>
+(infinity, also in negative, and not-a-number) into packed integer values
+(like <code>&quot;L&quot;</code>) is a fatal error.  The reason for this is 
that there simply
+isn&rsquo;t any sensible mapping for these special values into integers.
+
 </li></ul>
 
 <p>Examples:
@@ -35988,10 +37243,11 @@
 of the list will be interpreted as the <code>printf</code> format.  This
 means that <code>printf(@_)</code> will use <code>$_[0]</code> as the format.  
See
 <a href="#perlfunc-sprintf-FORMAT_002c-LIST">sprintf</a> for an
-explanation of the format argument.  If <code>use locale</code> (including
-<code>use locale ':not_characters'</code>) is in effect and
+explanation of the format argument.  If <code>use locale</code> for 
<code>LC_NUMERIC</code>
+is in effect and
 POSIX::setlocale() has been called, the character used for the decimal
-separator in formatted floating-point numbers is affected by the LC_NUMERIC
+separator in formatted floating-point numbers is affected by the 
<code>LC_NUMERIC</code>
 locale setting.  See <a href="#perllocale-NAME">perllocale NAME</a> and <a 
href="POSIX.html#Top">(POSIX)</a>.
 </p>
 <p>For historical reasons, if you omit the list, <code>$_</code> is used as 
the format;
@@ -36008,9 +37264,13 @@
 </dd>
 <dt>prototype FUNCTION</dt>
 <dd><a name="perlfunc-prototype-FUNCTION"></a>
+</dd>
+<dt>prototype</dt>
+<dd><a name="perlfunc-prototype"></a>
 <p>Returns the prototype of a function as a string (or <code>undef</code> if 
the
 function has no prototype).  FUNCTION is a reference to, or the name of,
-the function whose prototype you want to retrieve.
+the function whose prototype you want to retrieve.  If FUNCTION is omitted,
+$_ is used.
 </p>
 <p>If FUNCTION is a string starting with <code>CORE::</code>, the rest is 
taken as a
 name for a Perl builtin.  If the builtin&rsquo;s arguments
@@ -37481,17 +38741,29 @@
 <pre class="verbatim">    @result = sort { $a &lt;=&gt; $b } grep { $_ == $_ } 
@input;
 </pre>
 </dd>
-<dt>splice ARRAY or EXPR,OFFSET,LENGTH,LIST</dt>
-<dd><a name="perlfunc-splice-ARRAY-or-EXPR_002cOFFSET_002cLENGTH_002cLIST"></a>
+<dt>splice ARRAY,OFFSET,LENGTH,LIST</dt>
+<dd><a name="perlfunc-splice-ARRAY_002cOFFSET_002cLENGTH_002cLIST"></a>
+</dd>
+<dt>splice ARRAY,OFFSET,LENGTH</dt>
+<dd><a name="perlfunc-splice-ARRAY_002cOFFSET_002cLENGTH"></a>
+</dd>
+<dt>splice ARRAY,OFFSET</dt>
+<dd><a name="perlfunc-splice-ARRAY_002cOFFSET"></a>
 </dd>
-<dt>splice ARRAY or EXPR,OFFSET,LENGTH</dt>
-<dd><a name="perlfunc-splice-ARRAY-or-EXPR_002cOFFSET_002cLENGTH"></a>
+<dt>splice ARRAY</dt>
+<dd><a name="perlfunc-splice-ARRAY"></a>
 </dd>
-<dt>splice ARRAY or EXPR,OFFSET</dt>
-<dd><a name="perlfunc-splice-ARRAY-or-EXPR_002cOFFSET"></a>
+<dt>splice EXPR,OFFSET,LENGTH,LIST</dt>
+<dd><a name="perlfunc-splice-EXPR_002cOFFSET_002cLENGTH_002cLIST"></a>
 </dd>
-<dt>splice ARRAY or EXPR</dt>
-<dd><a name="perlfunc-splice-ARRAY-or-EXPR"></a>
+<dt>splice EXPR,OFFSET,LENGTH</dt>
+<dd><a name="perlfunc-splice-EXPR_002cOFFSET_002cLENGTH"></a>
+</dd>
+<dt>splice EXPR,OFFSET</dt>
+<dd><a name="perlfunc-splice-EXPR_002cOFFSET"></a>
+</dd>
+<dt>splice EXPR</dt>
+<dd><a name="perlfunc-splice-EXPR"></a>
 <p>Removes the elements designated by OFFSET and LENGTH from an array, and
 replaces them with the elements of LIST, if any.  In list context,
 returns the elements removed from the array.  In scalar context,
@@ -37581,7 +38853,8 @@
 list of its component characters.
 </p>
 <p>As a special case for <code>split</code>, the empty pattern given in
-<a href="#perlop-m_002fPATTERN_002fmsixpodualgc">match operator</a> syntax 
(<code>//</code>) specifically matches the empty string, which is contrary to 
its usual
+<a href="#perlop-m_002fPATTERN_002fmsixpodualngc">match operator</a> syntax 
(<code>//</code>)
+specifically matches the empty string, which is contrary to its usual
 interpretation as the last successful match.
 </p>
 <p>If PATTERN is <code>/^/</code>, then it is treated as if it used the
@@ -37753,6 +39026,8 @@
    %p    a pointer (outputs the Perl value's address in hexadecimal)
    %n    special: *stores* the number of characters output so far
          into the next argument in the parameter list
+   %a    hexadecimal floating point
+   %A    like %a, but using upper-case letters
 </pre>
 <p>Finally, for backward (and we do mean &quot;backward&quot;) compatibility, 
Perl
 permits these unnecessary but widely-supported conversions:
@@ -37767,7 +39042,9 @@
 by <code>%e</code>, <code>%E</code>, <code>%g</code> and <code>%G</code> for 
numbers with the modulus of the
 exponent less than 100 is system-dependent: it may be three or less
 (zero-padded as necessary).  In other words, 1.23 times ten to the
-99th may be either &quot;1.23e99&quot; or &quot;1.23e099&quot;.
+99th may be either &quot;1.23e99&quot; or &quot;1.23e099&quot;.  Similarly for 
<code>%a</code> and <code>%A</code>:
+the exponent or the hexadecimal digits may vary; in particular, the
+&quot;long doubles&quot; Perl configuration option may cause surprises.
 </p>
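Because the exact text produced by %a can differ between systems, a portable check is probably better done on the value than on the string. The following is a minimal C sketch of that idea, using the C99 printf and strtod functions, which also understand hexadecimal floating point; sketch_hexfloat_roundtrip is a hypothetical name introduced here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format d with "%a" into buf, then parse the text
 * back with strtod.  For a finite double and no explicit precision the
 * hexadecimal form is exact, so the parsed value should equal d. */
double sketch_hexfloat_roundtrip(double d, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "%a", d);
    return strtod(buf, NULL);
}
```

Comparing the returned double with the original avoids depending on how a given C library normalizes the leading digit or the exponent.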
 <p>Between the <code>%</code> and the format letter, you may specify several
 additional attributes controlling the interpretation of the format.
@@ -38052,7 +39329,7 @@
 <p>If <code>use locale</code> (including <code>use locale 
'not_characters'</code>) is in effect
 and POSIX::setlocale() has been called,
 the character used for the decimal separator in formatted floating-point
-numbers is affected by the LC_NUMERIC locale.  See <a 
href="#perllocale-NAME">perllocale NAME</a>
+numbers is affected by the <code>LC_NUMERIC</code> locale.  See <a 
href="#perllocale-NAME">perllocale NAME</a>
 and <a href="POSIX.html#Top">(POSIX)</a>.
 </p>
 </dd>
@@ -38293,15 +39570,19 @@
 </dd>
 <dt>study</dt>
 <dd><a name="perlfunc-study"></a>
-<p>Takes extra time to study SCALAR (<code>$_</code> if unspecified) in 
anticipation of
-doing many pattern matches on the string before it is next modified.
+<p>May take extra time to study SCALAR (<code>$_</code> if unspecified) in 
anticipation
+of doing many pattern matches on the string before it is next modified.
 This may or may not save time, depending on the nature and number of
 patterns you are searching and the distribution of character
 frequencies in the string to be searched; you probably want to compare
 run times with and without it to see which is faster.  Those loops
 that scan for many short constant strings (including the constant
 parts of more complex patterns) will benefit most.
-(The way <code>study</code> works is this: a linked list of every
+</p>
+<p>Note that since Perl version 5.16 this function has been a no-op, but
+this might change in a future release.
+</p>
+<p>(The way <code>study</code> works is this: a linked list of every
 character in the string to be searched is made, so we know, for
 example, where all the <code>'k'</code> characters are.  From each search 
string,
 the rarest character is selected, based on some static frequency tables
@@ -38662,7 +39943,7 @@
 <code>wait</code> call.  To get the actual exit value, shift right by eight 
(see
 below).  See also &lsquo;exec&rsquo;.  This is <em>not</em> what you want to 
use to capture
 the output from a command; for that you should use merely backticks or
-<code>qx//</code>, as described in <a href="#perlop-_0060STRING_0060">perlop 
`STRING`</a>.  Return value of -1
+<code>qx//</code>, as described in <a href="#perlop-_0060STRING_0060">perlop 
<code>`<em>STRING</em>`</code></a>.  Return value of -1
 indicates a failure to start the program or an error of the wait(2) system
 call (inspect $! for the reason).
 </p>
@@ -38700,7 +39981,7 @@
 </p>
 <p>When <code>system</code>&rsquo;s arguments are executed indirectly by the 
shell, 
 results and return codes are subject to its quirks.
-See <a href="#perlop-_0060STRING_0060">perlop `STRING`</a> and 
&lsquo;exec&rsquo; for details.
+See <a href="#perlop-_0060STRING_0060">perlop 
<code>`<em>STRING</em>`</code></a> and &lsquo;exec&rsquo; for details.
 </p>
 <p>Since <code>system</code> does a <code>fork</code> and <code>wait</code> it 
may affect a <code>SIGCHLD</code>
 handler.  See <a href="#perlipc-NAME">perlipc NAME</a> for details.
@@ -41601,7 +42882,7 @@
 value (UV), a double (NV), a string (PV), and another scalar (SV).
 (&quot;PV&quot; stands for &quot;Pointer Value&quot;.  You might think that it 
is misnamed
 because it is described as pointing only to strings.  However, it is
-possible to have it point to other things  For example, it could point
+possible to have it point to other things.  For example, it could point
 to an array of UVs.  But,
 using it for non-strings requires care, as the underlying assumption of
 much of the internals is that PVs are just for strings.  Often, for
@@ -41892,22 +43173,40 @@
 at <code>SvPVX(sv) - SvIV(sv)</code> in memory and the PV pointer is pointing
 into the middle of this allocated storage.
 </p>
-<p>This is best demonstrated by example:
+<p>This is best demonstrated by example.  Normally copy-on-write will prevent
+the substitution operator from using this hack, but if you can craft a
+string for which copy-on-write is not possible, you can see it in play.  In
+the current implementation, the final byte of a string buffer is used as a
+copy-on-write reference count.  If the buffer is not big enough, then
+copy-on-write is skipped.  First have a look at an empty string:
+</p>
+<pre class="verbatim">  % ./perl -Ilib -MDevel::Peek -le '$a=&quot;&quot;; $a 
.= &quot;&quot;; Dump $a'
+  SV = PV(0x7ffb7c008a70) at 0x7ffb7c030390
+    REFCNT = 1
+    FLAGS = (POK,pPOK)
+    PV = 0x7ffb7bc05b50 &quot;&quot;\0
+    CUR = 0
+    LEN = 10
+</pre>
+<p>Notice here the LEN is 10.  (It may differ on your platform.)  Extend the
+length of the string to one less than 10, and do a substitution:
 </p>
-<pre class="verbatim">  % ./perl -Ilib -MDevel::Peek -le 
'$a=&quot;12345&quot;; $a=~s/.//; Dump($a)'
-  SV = PVIV(0x8128450) at 0x81340f0
+<pre class="verbatim">  % ./perl -Ilib -MDevel::Peek -le '$a=&quot;&quot;; 
$a.=&quot;123456789&quot;; $a=~s/.//; Dump($a)'
+  SV = PV(0x7ffa04008a70) at 0x7ffa04030390
     REFCNT = 1
     FLAGS = (POK,OOK,pPOK)
-    IV = 1  (OFFSET)
-    PV = 0x8135781 ( &quot;1&quot; . ) &quot;2345&quot;\0
-    CUR = 4
-    LEN = 5
+    OFFSET = 1
+    PV = 0x7ffa03c05b61 ( &quot;\1&quot; . ) &quot;23456789&quot;\0
+    CUR = 8
+    LEN = 9
 </pre>
-<p>Here the number of bytes chopped off (1) is put into IV, and
-<code>Devel::Peek::Dump</code> helpfully reminds us that this is an offset.  
The
+<p>Here the number of bytes chopped off (1) is shown next as the OFFSET.  The
 portion of the string between the &quot;real&quot; and the &quot;fake&quot; 
beginnings is
 shown in parentheses, and the values of <code>SvCUR</code> and 
<code>SvLEN</code> reflect
-the fake beginning, not the real one.
+the fake beginning, not the real one.  (The first character of the string
+buffer happens to have changed to &quot;\1&quot; here, not &quot;1&quot;, 
because the current
+implementation stores the offset count in the string buffer.  This is
+subject to change.)
 </p>
 <p>Something similar to the offset hack is performed on AVs to enable
 efficient shifting and splicing off the beginning of the array; while
@@ -42840,14 +44139,15 @@
  --------------------------   ------         -------------
  \0 PERL_MAGIC_sv             vtbl_sv        Special scalar variable
  #  PERL_MAGIC_arylen         vtbl_arylen    Array length ($#ary)
- %  PERL_MAGIC_rhash          (none)         extra data for restricted
+ %  PERL_MAGIC_rhash          (none)         Extra data for restricted
                                              hashes
- &amp;  PERL_MAGIC_proto          (none)         my sub prototype CV
+ *  PERL_MAGIC_debugvar       vtbl_debugvar  $DB::single, signal, trace
+                                             vars
  .  PERL_MAGIC_pos            vtbl_pos       pos() lvalue
- :  PERL_MAGIC_symtab         (none)         extra data for symbol
+ :  PERL_MAGIC_symtab         (none)         Extra data for symbol
                                              tables
- &lt;  PERL_MAGIC_backref        vtbl_backref   for weak ref data
- @  PERL_MAGIC_arylen_p       (none)         to move arylen out of XPVAV
+ &lt;  PERL_MAGIC_backref        vtbl_backref   For weak ref data
+ @  PERL_MAGIC_arylen_p       (none)         To move arylen out of XPVAV
  B  PERL_MAGIC_bm             vtbl_regexp    Boyer-Moore 
                                              (fast string search)
  c  PERL_MAGIC_overload_table vtbl_ovrld     Holds overload table 
@@ -42875,7 +44175,7 @@
  P  PERL_MAGIC_tied           vtbl_pack      Tied array or hash
  p  PERL_MAGIC_tiedelem       vtbl_packelem  Tied array or hash element
  q  PERL_MAGIC_tiedscalar     vtbl_packelem  Tied scalar or handle
- r  PERL_MAGIC_qr             vtbl_regexp    precompiled qr// regex
+ r  PERL_MAGIC_qr             vtbl_regexp    Precompiled qr// regex
  S  PERL_MAGIC_sig            (none)         %SIG hash
  s  PERL_MAGIC_sigelem        vtbl_sigelem   %SIG hash element
  t  PERL_MAGIC_taint          vtbl_taint     Taintedness
@@ -42890,7 +44190,9 @@
  y  PERL_MAGIC_defelem        vtbl_defelem   Shadow &quot;foreach&quot; 
iterator
                                              variable / smart parameter
                                              vivification
- ]  PERL_MAGIC_checkcall      vtbl_checkcall inlining/mutation of call
+ \  PERL_MAGIC_lvref          vtbl_lvref     Lvalue reference
+                                             constructor
+ ]  PERL_MAGIC_checkcall      vtbl_checkcall Inlining/mutation of call
                                              to this CV
  ~  PERL_MAGIC_ext            (none)         Available for use by
                                              extensions
@@ -43030,7 +44332,7 @@
 <p>The perl tie function associates a variable with an object that implements
 the various GET, SET, etc methods.  To perform the equivalent of the perl
 tie function from an XSUB, you must mimic this behaviour.  The code below
-carries out the necessary steps - firstly it creates a new hash, and then
+carries out the necessary steps &ndash; firstly it creates a new hash, and then
 creates a second hash which it blesses into the class which will implement
 the tie methods.  Lastly it ties the two hashes together, and returns a
 reference to the new tied hash.  Note that the code below does NOT call the
@@ -43540,7 +44842,9 @@
 <em>target</em>s have <code>SVs_PADTMP</code> set.  But this has never been 
fully true.
 <code>SVs_PADMY</code> could be set on a variable that no longer resides in 
any pad.
 While <em>target</em>s do have <code>SVs_PADTMP</code> set, it can also be set 
on variables
-that have never resided in a pad, but nonetheless act like <em>target</em>s.
+that have never resided in a pad, but nonetheless act like <em>target</em>s.  
As
+of perl 5.21.5, the <code>SVs_PADMY</code> flag is no longer used and is 
defined as
+0.  <code>SvPADMY()</code> now returns true for anything without 
<code>SVs_PADTMP</code>.
 </p>
 <p>The correspondence between OPs and <em>target</em>s is not 1-to-1.  
Different
 OPs in the compile tree of the unit can use the same target, if this
@@ -43819,15 +45123,37 @@
 op is a <code>LISTOP</code>, which has any number of children.  In this case, 
the
 first child is pointed to by <code>op_first</code> and the last child by
 <code>op_last</code>.  The children in between can be found by iteratively
-following the <code>op_sibling</code> pointer from the first child to the last.
+following the <code>OpSIBLING</code> pointer from the first child to the last 
(but
+see below).
 </p>
-<p>There are also two other op types: a <code>PMOP</code> holds a regular 
expression,
+<p>There are also some other op types: a <code>PMOP</code> holds a regular 
expression,
 and has no children, and a <code>LOOP</code> may or may not have children.  If 
the
 <code>op_children</code> field is non-zero, it behaves like a 
<code>LISTOP</code>.  To
 complicate matters, if a <code>UNOP</code> is actually a <code>null</code> op 
after
 optimization (see <a 
href="#perlguts-Compile-pass-2_003a-context-propagation">Compile pass 2: 
context propagation</a>) it will still
 have children in accordance with its former type.
 </p>
+<p>Finally, there is a <code>LOGOP</code>, or logic op. Like a 
<code>LISTOP</code>, this has one
+or more children, but it doesn&rsquo;t have an <code>op_last</code> field: so 
you have to
+follow <code>op_first</code> and then the <code>OpSIBLING</code> chain itself 
to find the
+last child. Instead it has an <code>op_other</code> field, which is comparable 
to
+the <code>op_next</code> field described below, and represents an alternate
+execution path. Operators like <code>and</code>, <code>or</code> and 
<code>?</code> are <code>LOGOP</code>s. Note
+that in general, <code>op_other</code> may not point to any of the direct 
children
+of the <code>LOGOP</code>.
+</p>
+<p>Starting in version 5.21.2, perls built with the experimental
+define <code>-DPERL_OP_PARENT</code> add an extra boolean flag for each op,
+<code>op_moresib</code>.  When not set, this indicates that this is the last 
op in an
+<code>OpSIBLING</code> chain. This frees up the <code>op_sibling</code> field 
on the last
+sibling to point back to the parent op. Under this build, that field is
+also renamed <code>op_sibparent</code> to reflect its joint role. The macro
+<code>OpSIBLING(o)</code> wraps this special behaviour, and always returns 
NULL on
+the last sibling.  With this build the <code>op_parent(o)</code> function can 
be
+used to find the parent of any op. Thus for forward compatibility, you
+should always use the <code>OpSIBLING(o)</code> macro rather than accessing
+<code>op_sibling</code> directly.
+</p>
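The sibling/parent arrangement described above can be modelled with a small stand-alone C sketch. The struct and functions below are mock-ups invented for illustration (sketch_op, sketch_sibling, sketch_parent, sketch_last_child); they are not Perl's real OP structure or macros, and only mirror the roles the text gives to op_moresib, op_sibparent, OpSIBLING(o) and op_parent(o).

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock op node, for illustration only. */
typedef struct sketch_op {
    struct sketch_op *first;      /* first child, or NULL */
    struct sketch_op *sibparent;  /* next sibling, or the parent on the last sibling */
    bool moresib;                 /* true if sibparent is a sibling */
} sketch_op;

/* Plays the role of OpSIBLING(o): NULL on the last sibling. */
sketch_op *sketch_sibling(const sketch_op *o)
{
    return o->moresib ? o->sibparent : NULL;
}

/* Plays the role of op_parent(o): walk to the last sibling, then step up. */
sketch_op *sketch_parent(sketch_op *o)
{
    while (o->moresib)
        o = o->sibparent;
    return o->sibparent;
}

/* Find the last child without an op_last field, as needed for a LOGOP. */
sketch_op *sketch_last_child(const sketch_op *parent)
{
    sketch_op *kid = parent->first;
    if (kid == NULL)
        return NULL;
    while (kid->moresib)
        kid = kid->sibparent;
    return kid;
}
```

Code that only ever goes through the accessor function never needs to know whether the final pointer holds a sibling or the parent, which is the forward-compatibility argument made above.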
 <p>Another way to examine the tree is to use a compiler back-end module, such
 as <a href="B-Concise.html#Top">(B-Concise)</a>.
 </p>
@@ -44159,11 +45485,18 @@
 to use <code>dVAR</code> in your coding to &quot;declare the global 
variables&quot;
 when you are using them.  dTHX does this for you automatically.
 </p>
-<p>To see whether you have non-const data you can use a BSD-compatible 
<code>nm</code>:
+<p>To see whether you have non-const data you can use a BSD (or GNU)
+compatible <code>nm</code>:
 </p>
 <pre class="verbatim">  nm libperl.a | grep -v ' [TURtr] '
 </pre>
-<p>If this displays any <code>D</code> or <code>d</code> symbols, you have 
non-const data.
+<p>If this displays any <code>D</code> or <code>d</code> symbols (or possibly 
<code>C</code> or <code>c</code>),
+you have non-const data.  The symbols the <code>grep</code> removed are as 
follows:
+<code>T</code> and <code>t</code> are <em>text</em> (code), <code>R</code> and <code>r</code> are 
<em>read-only</em> (const) data,
+and <code>U</code> is <em>undefined</em>: an external symbol that is referred to.
+</p>
+<p>The test <samp>t/porting/libperl.t</samp> does this kind of symbol sanity
+checking on <code>libperl.a</code>.
 </p>
 <p>For backward compatibility reasons defining just PERL_GLOBAL_STRUCT
 doesn&rsquo;t actually hide all symbols inside a big global struct: some
@@ -44623,6 +45956,9 @@
 </pre>
 <p>The IVdf will expand to whatever is the correct format for the IVs.
 </p>
+<p>Note that there are different &quot;long doubles&quot;: Perl will use
+whatever the compiler has.
+</p>
 <p>If you are printing addresses of pointers, use UVxf combined
 with PTR2UV(), do not use %lx or %p.
 </p>
@@ -44703,7 +46039,7 @@
 <h4 class="subsection">28.10.4 Source Documentation</h4>
 
 <p>There&rsquo;s an effort going on to document the internal functions and
-automatically produce reference manuals from them - <a 
href="perlapi.html#Top">(perlapi)</a> is one
+automatically produce reference manuals from them &ndash; <a 
href="perlapi.html#Top">(perlapi)</a> is one
 such manual which details all the functions which are available to XS
 writers.  <a href="perlintern.html#Top">(perlintern)</a> is the autogenerated 
manual for the functions
 which are not part of the API and are supposedly for internal use only.
@@ -44775,7 +46111,9 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlguts-How-do-I-convert-a-string-to-UTF_002d8_003f" 
accesskey="5">perlguts How do I convert a string to 
UTF-8?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlguts-Is-there-anything-else-I-need-to-know_003f" 
accesskey="6">perlguts Is there anything else I need to 
know?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlguts-How-do-I-compare-strings_003f" accesskey="6">perlguts How do I 
compare strings?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlguts-Is-there-anything-else-I-need-to-know_003f" 
accesskey="7">perlguts Is there anything else I need to 
know?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 </table>
 
@@ -44810,6 +46148,12 @@
 a variable number of bytes to represent a character.  You can learn more
 about Unicode and Perl&rsquo;s Unicode model in <a 
href="#perlunicode-NAME">perlunicode NAME</a>.
 </p>
+<p>(On EBCDIC platforms, Perl instead uses UTF-EBCDIC, which is a form of
+UTF-8 adapted for EBCDIC platforms.  Below, we just talk about UTF-8.
+UTF-EBCDIC is like UTF-8, but the details are different.  The macros
+hide the differences from you; just remember that the particular numbers
+and bit patterns presented below will differ in UTF-EBCDIC.)
+</p>
 <hr>
 <a name="perlguts-How-can-I-recognise-a-UTF_002d8-string_003f"></a>
 <div class="header">
@@ -44823,14 +46167,15 @@
 non-UTF-8 data.  The Unicode character 200, (<code>0xC8</code> for you hex 
types)
 capital E with a grave accent, is represented by the two bytes
 <code>v195.136</code>.  Unfortunately, the non-Unicode string 
<code>chr(195).chr(136)</code>
-has that byte sequence as well.  So you can&rsquo;t tell just by looking - this
+has that byte sequence as well.  So you can&rsquo;t tell just by looking 
&ndash; this
 is what makes Unicode input an interesting problem.
 </p>
 <p>In general, you either have to know what you&rsquo;re dealing with, or you
 have to guess.  The API function <code>is_utf8_string</code> can help; 
it&rsquo;ll tell
-you if a string contains only valid UTF-8 characters.  However, it can&rsquo;t
-do the work for you.  On a character-by-character basis,
-<code>is_utf8_char_buf</code>
+you if a string contains only valid UTF-8 characters, and the chances
+of a non-UTF-8 string looking like valid UTF-8 become very small very
+quickly with increasing string length.  On a character-by-character
+basis, <code>isUTF8_CHAR</code>
 will tell you whether the current character in a string is valid UTF-8. 
 </p>
 <hr>
@@ -44847,8 +46192,9 @@
 byte, just like good ol&rsquo; ASCII.  Character 128 is stored as
 <code>v194.128</code>; this continues up to character 191, which is
 <code>v194.191</code>.  Now we&rsquo;ve run out of bits (191 is binary
-<code>10111111</code>) so we move on; 192 is <code>v195.128</code>.  And
+<code>10111111</code>) so we move on; character 192 is <code>v195.128</code>.  
And
 so it goes on, moving to three bytes at character 2048.
+<a href="#perlunicode-Unicode-Encodings">perlunicode Unicode Encodings</a> has 
pictures of how this works.
 </p>
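The byte values quoted above follow directly from the UTF-8 bit layout. As a minimal sketch in plain C (assuming an ASCII platform, not UTF-EBCDIC), the hypothetical helper sketch_utf8_encode below encodes code points under 2048; it is not a Perl API function, and real XS code would use uvchr_to_utf8 instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode cp (which must be below 0x800) as UTF-8.
 * Returns the number of bytes written to out, or 0 if cp would need
 * three or more bytes, which this sketch does not handle. */
size_t sketch_utf8_encode(uint32_t cp, unsigned char out[2])
{
    if (cp < 0x80) {                 /* 0..127: one byte, same as ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* 128..2047: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    return 0;
}
```

With this layout character 128 becomes the bytes 194 and 128, character 191 becomes 194 and 191, and character 192 becomes 195 and 128, matching the sequence described in the text.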
 <p>Assuming you know you&rsquo;re dealing with a UTF-8 string, you can find out
 how long the first character in it is with the <code>UTF8SKIP</code> macro:
@@ -44867,7 +46213,7 @@
 </p>
 <p>All bytes in a multi-byte UTF-8 character will have the high bit set,
 so you can test if you need to do something special with this
-character like this (the UTF8_IS_INVARIANT() is a macro that tests
+character like this (the <code>UTF8_IS_INVARIANT()</code> is a macro that tests
 whether the byte is encoded as a single byte even in UTF-8):
 </p>
 <pre class="verbatim">    U8 *utf;
@@ -44886,7 +46232,7 @@
 value of the character; the inverse function <code>uvchr_to_utf8</code> is 
available
 for putting a UV into UTF-8:
 </p>
-<pre class="verbatim">    if (!UTF8_IS_INVARIANT(uv))
+<pre class="verbatim">    if (!UVCHR_IS_INVARIANT(uv))
         /* Must treat this as UTF8 */
         utf8 = uvchr_to_utf8(utf8, uv);
     else
@@ -44901,6 +46247,11 @@
 that character, you can never match a <code>chr(200)</code> in a non-UTF-8 
string.
 So don&rsquo;t do that!
 </p>
+<p>(Note that we don&rsquo;t have to test for invariant characters in the
+examples above.  The functions work on any well-formed UTF-8 input.
+It&rsquo;s just that it&rsquo;s faster to avoid the function overhead when it&rsquo;s 
not
+needed.)
+</p>
 <hr>
 <a name="perlguts-How-does-Perl-store-UTF_002d8-strings_003f"></a>
 <div class="header">
@@ -44910,14 +46261,12 @@
 <a name="How-does-Perl-store-UTF_002d8-strings_003f"></a>
 <h4 class="subsection">28.11.4 How does Perl store UTF-8 strings?</h4>
 
-<p>Currently, Perl deals with Unicode strings and non-Unicode strings
+<p>Currently, Perl deals with UTF-8 strings and non-UTF-8 strings
 slightly differently.  A flag in the SV, <code>SVf_UTF8</code>, indicates that 
the
 string is internally encoded as UTF-8.  Without it, the byte value is the
-codepoint number and vice versa (in other words, the string is encoded
-as iso-8859-1, but <code>use feature 'unicode_strings'</code> is needed to get 
iso-8859-1
-semantics).  This flag is only meaningful if the SV is <code>SvPOK</code>
-or immediately after stringification via <code>SvPV</code> or a similar
-macro.  You can check and manipulate this flag with the
+codepoint number and vice versa.  This flag is only meaningful if the SV
+is <code>SvPOK</code> or immediately after stringification via 
<code>SvPV</code> or a
+similar macro.  You can check and manipulate this flag with the
 following macros:
 </p>
 <pre class="verbatim">    SvUTF8(sv)
@@ -44925,16 +46274,16 @@
     SvUTF8_off(sv)
 </pre>
 <p>This flag has an important effect on Perl&rsquo;s treatment of the string: 
if
-Unicode data is not properly distinguished, regular expressions,
+UTF-8 data is not properly distinguished, regular expressions,
 <code>length</code>, <code>substr</code> and other string handling operations 
will have
-undesirable results.
+undesirable (wrong) results.
 </p>
 <p>The problem comes when you have, for instance, a string that isn&rsquo;t
-flagged as UTF-8, and contains a byte sequence that could be UTF-8 -
+flagged as UTF-8, and contains a byte sequence that could be UTF-8 &ndash;
 especially when combining non-UTF-8 and UTF-8 strings.
 </p>
-<p>Never forget that the <code>SVf_UTF8</code> flag is separate to the PV 
value; you
-need be sure you don&rsquo;t accidentally knock it off while you&rsquo;re
+<p>Never forget that the <code>SVf_UTF8</code> flag is separate from the PV 
value; you
+need to be sure you don&rsquo;t accidentally knock it off while you&rsquo;re
 manipulating SVs.  More specifically, you cannot expect to do this:
 </p>
 <pre class="verbatim">    SV *sv;
@@ -44952,30 +46301,51 @@
 accordingly:
 </p>
 <pre class="verbatim">    p = SvPV(sv, len);
-    frobnicate(p);
+    is_utf8 = SvUTF8(sv);
+    frobnicate(p, is_utf8);
     nsv = newSVpvn(p, len);
-    if (SvUTF8(sv))
+    if (is_utf8)
         SvUTF8_on(nsv);
 </pre>
-<p>In fact, your <code>frobnicate</code> function should be made aware of 
whether or
-not it&rsquo;s dealing with UTF-8 data, so that it can handle the string
-appropriately.
+<p>In the above, your <code>frobnicate</code> function has been changed to be
+aware of whether or not it&rsquo;s dealing with UTF-8 data, so that it can
+handle the string appropriately.
 </p>
 <p>Since just passing an SV to an XS function and copying the data of
 the SV is not enough to copy the UTF8 flags, even less right is just
-passing a <code>char *</code> to an XS function.
+passing a <code>char&nbsp;*</code><!-- /@w --> to an XS function.
+</p>
+<p>For full generality, use the <a 
href="perlapi.html#DO_005fUTF8">(perlapi)DO_UTF8</a> macro to see if the
+string in an SV is to be <em>treated</em> as UTF-8.  This takes into account
+if the call to the XS function is being made from within the scope of
+<a href="bytes.html#Top">(bytes)<code>use&nbsp;bytes</code><!-- /@w --></a>.  
If so, the underlying bytes that comprise the
+UTF-8 string are to be exposed, rather than the character they
+represent.  But this pragma should only really be used for debugging and
+perhaps low-level testing at the byte level.  Hence most XS code need
+not concern itself with this, but various areas of the perl core do need
+to support it.
+</p>
+<p>And this isn&rsquo;t the whole story.  Starting in Perl v5.12, strings that
+aren&rsquo;t encoded in UTF-8 may also be treated as Unicode under various
+conditions (see <a 
href="#perlunicode-ASCII-Rules-versus-Unicode-Rules">perlunicode ASCII Rules 
versus Unicode Rules</a>).
+This is only really a problem for characters whose ordinals are between
+128 and 255, and whose behavior varies under ASCII versus Unicode rules
+in ways that your code cares about (see <a 
href="#perlunicode-The-_0022Unicode-Bug_0022">perlunicode The &quot;Unicode 
Bug&quot;</a>).
+There is no published API for dealing with this, as it is subject to
+change, but you can look at the code for <code>pp_lc</code> in 
<samp>pp.c</samp> for an
+example as to how it&rsquo;s currently done.
 </p>
 <hr>
 <a name="perlguts-How-do-I-convert-a-string-to-UTF_002d8_003f"></a>
 <div class="header">
 <p>
-Next: <a href="#perlguts-Is-there-anything-else-I-need-to-know_003f" 
accesskey="n" rel="next">perlguts Is there anything else I need to know?</a>, 
Previous: <a href="#perlguts-How-does-Perl-store-UTF_002d8-strings_003f" 
accesskey="p" rel="prev">perlguts How does Perl store UTF-8 strings?</a>, Up: 
<a href="#perlguts-Unicode-Support" accesskey="u" rel="up">perlguts Unicode 
Support</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlguts-How-do-I-compare-strings_003f" accesskey="n" 
rel="next">perlguts How do I compare strings?</a>, Previous: <a 
href="#perlguts-How-does-Perl-store-UTF_002d8-strings_003f" accesskey="p" 
rel="prev">perlguts How does Perl store UTF-8 strings?</a>, Up: <a 
href="#perlguts-Unicode-Support" accesskey="u" rel="up">perlguts Unicode 
Support</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="How-do-I-convert-a-string-to-UTF_002d8_003f"></a>
 <h4 class="subsection">28.11.5 How do I convert a string to UTF-8?</h4>
 
 <p>If you&rsquo;re mixing UTF-8 and non-UTF-8 strings, it is necessary to 
upgrade
-one of the strings to UTF-8.  If you&rsquo;ve got an SV, the easiest way to do
+the non-UTF-8 strings to UTF-8.  If you&rsquo;ve got an SV, the easiest way to 
do
 this is:
 </p>
 <pre class="verbatim">    sv_utf8_upgrade(sv);
@@ -44997,28 +46367,54 @@
 in a single byte.
 </p>
 <hr>
+<a name="perlguts-How-do-I-compare-strings_003f"></a>
+<div class="header">
+<p>
+Next: <a href="#perlguts-Is-there-anything-else-I-need-to-know_003f" 
accesskey="n" rel="next">perlguts Is there anything else I need to know?</a>, 
Previous: <a href="#perlguts-How-do-I-convert-a-string-to-UTF_002d8_003f" 
accesskey="p" rel="prev">perlguts How do I convert a string to UTF-8?</a>, Up: 
<a href="#perlguts-Unicode-Support" accesskey="u" rel="up">perlguts Unicode 
Support</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+</div>
+<a name="How-do-I-compare-strings_003f"></a>
+<h4 class="subsection">28.11.6 How do I compare strings?</h4>
+
+<p><a href="perlapi.html#sv_005fcmp">(perlapi)sv_cmp</a> and <a 
href="perlapi.html#sv_005fcmp_005fflags">(perlapi)sv_cmp_flags</a> do a 
lexicographic
+comparison of two SVs, and handle UTF-8ness properly.  Note, however,
+that Unicode specifies a much fancier mechanism for collation, available
+via the <a href="Unicode-Collate.html#Top">(Unicode-Collate)</a> module.
+</p>
+<p>To just compare two strings for equality/non-equality, you can just use
+<a href="perlapi.html#memEQ">(perlapi)<code>memEQ()</code></a> and <a 
href="perlapi.html#memEQ">(perlapi)<code>memNE()</code></a> as usual,
+except that both strings must be UTF-8 encoded or both must be non-UTF-8.
+</p>
+<p>To compare two strings case-insensitively, use
+<a href="perlapi.html#foldEQ_005futf8">(perlapi)<code>foldEQ_utf8()</code></a> 
(the strings don&rsquo;t have to have
+the same UTF-8ness).
+</p>
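A minimal standalone sketch of why the byte-wise comparison needs matching UTF-8ness. It does not use the Perl API: latin1_to_utf8() and same_bytes() are hypothetical helpers written for this illustration, the first standing in for what an upgrade such as sv_utf8_upgrade achieves, and the sketch assumes an ASCII (not EBCDIC) platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte-wise equality, the kind of test memEQ performs. */
int same_bytes(const unsigned char *a, size_t alen,
               const unsigned char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}

/* Hypothetical helper: convert Latin-1 bytes to UTF-8 (ASCII platforms only).
 * 'out' must have room for 2 * len bytes.  Returns the bytes written. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            out[n++] = in[i];                  /* invariant: same byte */
        } else {
            out[n++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[n++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    return n;
}
```

With the Latin-1 bytes 'c' 'a' 'f' 0xE9 and the UTF-8 bytes 'c' 'a' 'f' 0xC3 0xA9, same_bytes() reports a mismatch even though both spell the same word; after passing the first through latin1_to_utf8() the two buffers compare equal.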
+<hr>
 <a name="perlguts-Is-there-anything-else-I-need-to-know_003f"></a>
 <div class="header">
 <p>
-Previous: <a href="#perlguts-How-do-I-convert-a-string-to-UTF_002d8_003f" 
accesskey="p" rel="prev">perlguts How do I convert a string to UTF-8?</a>, Up: 
<a href="#perlguts-Unicode-Support" accesskey="u" rel="up">perlguts Unicode 
Support</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Previous: <a href="#perlguts-How-do-I-compare-strings_003f" accesskey="p" 
rel="prev">perlguts How do I compare strings?</a>, Up: <a 
href="#perlguts-Unicode-Support" accesskey="u" rel="up">perlguts Unicode 
Support</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Is-there-anything-else-I-need-to-know_003f"></a>
-<h4 class="subsection">28.11.6 Is there anything else I need to know?</h4>
+<h4 class="subsection">28.11.7 Is there anything else I need to know?</h4>
 
 <p>Not really.  Just remember these things:
 </p>
 <ul>
-<li> There&rsquo;s no way to tell if a string is UTF-8 or not.  You can tell 
if an SV
-is UTF-8 by looking at its <code>SvUTF8</code> flag after stringifying it
-with <code>SvPV</code> or a similar macro.  Don&rsquo;t forget to set the flag 
if
-something should be UTF-8.  Treat the flag as part of the PV, even though
-it&rsquo;s not - if you pass on the PV to somewhere, pass on the flag too.
+<li> There&rsquo;s no way to tell if a <code>char&nbsp;*</code><!-- /@w --> or 
<code>U8&nbsp;*</code><!-- /@w --> string is UTF-8
+or not.  But you can tell if an SV is to be treated as UTF-8 by calling
+<code>DO_UTF8</code> on it, after stringifying it with <code>SvPV</code> or a 
similar
+macro.  And you can tell if an SV is actually UTF-8 (even if it is not to
+be treated as such) by looking at its <code>SvUTF8</code> flag (again after
+stringifying it).  Don&rsquo;t forget to set the flag if something should be
+UTF-8.
+Treat the flag as part of the PV, even though it&rsquo;s not &ndash; if you 
pass on
+the PV to somewhere, pass on the flag too.
 
 </li><li> If a string is UTF-8, <strong>always</strong> use 
<code>utf8_to_uvchr_buf</code> to get at the value,
 unless <code>UTF8_IS_INVARIANT(*s)</code> in which case you can use 
<code>*s</code>.
 
-</li><li> When writing a character <code>uv</code> to a UTF-8 string, 
<strong>always</strong> use
-<code>uvchr_to_utf8</code>, unless <code>UTF8_IS_INVARIANT(uv))</code> in 
which case
+</li><li> When writing a character UV to a UTF-8 string, 
<strong>always</strong> use
+<code>uvchr_to_utf8</code>, unless <code>UVCHR_IS_INVARIANT(uv)</code> in 
which case
 you can use <code>*s = uv</code>.
 
 </li><li> Mixing UTF-8 and non-UTF-8 strings is
@@ -45046,8 +46442,8 @@
 <p>This feature is implemented as a new op type, <code>OP_CUSTOM</code>.  The 
Perl
 core does not &quot;know&quot; anything special about this op type, and so it 
will
 not be involved in any optimizations.  This also means that you can
-define your custom ops to be any op structure - unary, binary, list and
-so on - you like.
+define your custom ops to be any op structure &ndash; unary, binary, list and
+so on &ndash; you like.
 </p>
 <p>It&rsquo;s important to know what custom operators won&rsquo;t do for you.  
They
 won&rsquo;t let you add new syntax to Perl, directly.  They won&rsquo;t even 
let you
@@ -45275,7 +46671,10 @@
 </pre>
 </li><li> Make your change
 
-<p>Hack, hack, hack.
+<p>Hack, hack, hack.  Keep in mind that Perl runs on many different
+platforms, with different operating systems that have different
+capabilities, different filesystem organizations, and even different
+character sets.  <a href="#perlhacktips-NAME">perlhacktips NAME</a> gives 
advice on this.
 </p>
 </li><li> Test your change
 
@@ -45759,7 +47158,7 @@
 </li><li> Opening brace lines up with &quot;if&quot; when conditional spans 
multiple lines;
 should be at end-of-line otherwise
 
-</li><li> In function definitions, name starts in column 0 (return value is on
+</li><li> In function definitions, name starts in column 0 (return value-type 
is on
 previous line)
 
 </li><li> Single space after keywords that are followed by parens, no space
@@ -46222,7 +47621,7 @@
 <ul>
 <li> <samp>t/base</samp>, <samp>t/comp</samp> and <samp>t/opbasic</samp>
 
-<p>Since we don&rsquo;t know if require works, or even subroutines, use ad hoc
+<p>Since we don&rsquo;t know if <code>require</code> works, or even 
subroutines, use ad hoc
 tests for these three.  Step carefully to avoid using the feature being
 tested.  Tests in <samp>t/opbasic</samp>, for instance, have been placed there
 rather than in <samp>t/op</samp> because they test functionality which
@@ -46250,8 +47649,60 @@
 <samp>lib/</samp>, so here&rsquo;s some opportunity for some patching.
 </p>
 <p>You must be triply conscious of cross-platform concerns.  This usually
-boils down to using <a href="File-Spec.html#Top">(File-Spec)</a> and avoiding 
things like <code>fork()</code>
-and <code>system()</code> unless absolutely necessary.
+boils down to using <a href="File-Spec.html#Top">(File-Spec)</a>, avoiding 
things like <code>fork()</code>
+and <code>system()</code> unless absolutely necessary, and not assuming that a
+given character has a particular ordinal value (code point) or that its
+UTF-8 representation is composed of particular bytes.
+</p>
+<p>There are several functions available to specify characters and code
+points portably in tests.  The always-preloaded functions
+<code>utf8::unicode_to_native()</code> and its inverse
+<code>utf8::native_to_unicode()</code> take code points and translate
+appropriately.  The file <samp>t/charset_tools.pl</samp> has several functions
+that can be useful.  It has versions of the previous two functions
+that take strings as inputs &ndash; not single numeric code points:
+<code>uni_to_native()</code> and <code>native_to_uni()</code>.  If you must 
look at the
+individual bytes comprising a UTF-8 encoded string,
+<code>byte_utf8a_to_utf8n()</code> takes as input a string of those bytes 
encoded
+for an ASCII platform, and returns the equivalent string in the native
+platform.  For example, <code>byte_utf8a_to_utf8n(&quot;\xC2\xA0&quot;)</code> 
returns the
+byte sequence on the current platform that forms the UTF-8 for 
<code>U+00A0</code>,
+since <code>&quot;\xC2\xA0&quot;</code> are the UTF-8 bytes on an ASCII 
platform for that
+code point.  This function returns <code>&quot;\xC2\xA0&quot;</code> on an 
ASCII platform, and
+<code>&quot;\x80\x41&quot;</code> on an EBCDIC 1047 one.
+</p>
+<p>But easiest is, if the character is specifiable as a literal, like
+<code>&quot;A&quot;</code> or <code>&quot;%&quot;</code>, to use that; if not 
so specifiable, you can use
+<code>\N{}</code>, if the side effects aren&rsquo;t troublesome.  Simply 
specify all
+your characters in hex, using <code>\N{U+ZZ}</code> instead of 
<code>\xZZ</code>.  <code>\N{}</code>
+is the Unicode name, and so it
+always gives you the Unicode character.  <code>\N{U+41}</code> is the character
+whose Unicode code point is <code>0x41</code>, hence is <code>'A'</code> on 
all platforms.
+The side effects are:
+</p>
+<dl compact="compact">
+<dt>1)</dt>
+<dd><a name="perlhack-1_0029"></a>
+<p>These select Unicode rules.  That means that in double-quotish strings,
+the string is always converted to UTF-8 to force a Unicode
+interpretation (you can <code>utf8::downgrade()</code> afterwards to convert 
back
+to non-UTF8, if possible).  In regular expression patterns, the
+conversion isn&rsquo;t done, but if the character set modifier would
+otherwise be <code>/d</code>, it is changed to <code>/u</code>.
+</p>
+</dd>
+<dt>2)</dt>
+<dd><a name="perlhack-2_0029"></a>
+<p>If you use the form <code>\N{<em>character name</em>}</code>, the <a 
href="charnames.html#Top">(charnames)</a> module
+gets automatically loaded.  This may not be suitable for the test level
+you are doing.
+</p>
+</dd>
+</dl>
+
+<p>If you are testing locales (see <a href="#perllocale-NAME">perllocale 
NAME</a>), there are helper
+functions in <samp>t/loc_tools.pl</samp> to enable you to see what locales 
there
+are on the current platform.
 </p>
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a 
href="#perlhack-Special-make-test-targets" accesskey="1">perlhack Special 
<code>make test</code> targets</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
@@ -46262,6 +47713,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlhack-Using-t_002fharness-for-testing" accesskey="4">perlhack Using 
<samp>t/harness</samp> for testing</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlhack-Performance-testing" accesskey="5">perlhack Performance 
testing</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 </table>
 
 <hr>
@@ -46370,7 +47823,7 @@
 <a name="perlhack-Using-t_002fharness-for-testing"></a>
 <div class="header">
 <p>
-Previous: <a href="#perlhack-Running-tests-by-hand" accesskey="p" 
rel="prev">perlhack Running tests by hand</a>, Up: <a href="#perlhack-TESTING" 
accesskey="u" rel="up">perlhack TESTING</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlhack-Performance-testing" accesskey="n" 
rel="next">perlhack Performance testing</a>, Previous: <a 
href="#perlhack-Running-tests-by-hand" accesskey="p" rel="prev">perlhack 
Running tests by hand</a>, Up: <a href="#perlhack-TESTING" accesskey="u" 
rel="up">perlhack TESTING</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Using-t_002fharness-for-testing"></a>
 <h4 class="subsection">29.8.4 Using <samp>t/harness</samp> for testing</h4>
@@ -46485,6 +47938,34 @@
 more environment variables that affect testing.
 </p>
 <hr>
+<a name="perlhack-Performance-testing"></a>
+<div class="header">
+<p>
+Previous: <a href="#perlhack-Using-t_002fharness-for-testing" accesskey="p" 
rel="prev">perlhack Using <samp>t/harness</samp> for testing</a>, Up: <a 
href="#perlhack-TESTING" accesskey="u" rel="up">perlhack TESTING</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+</div>
+<a name="Performance-testing"></a>
+<h4 class="subsection">29.8.5 Performance testing</h4>
+
+<p>The file <samp>t/perf/benchmarks</samp> contains snippets of perl code 
which are
+intended to be benchmarked across a range of perls by the
+<samp>Porting/bench.pl</samp> tool. If you fix or enhance a performance issue, 
you
+may want to add a representative code sample to the file, then run
+<samp>bench.pl</samp> against the previous and current perls to see what 
difference
+it has made, and whether anything else has slowed down as a consequence.
+</p>
+<p>The file <samp>t/perf/opcount.t</samp> is designed to test whether a 
particular
+code snippet has been compiled into an optree containing specified
+numbers of particular op types. This is good for testing whether
+optimisations which alter ops, such as converting an <code>aelem</code> op 
into an
+<code>aelemfast</code> op, are really doing that.
+</p>
+<p>The files <samp>t/perf/speed.t</samp> and <samp>t/re/speed.t</samp> are 
designed to test
+things that run thousands of times slower if a particular optimisation
+is broken (for example, the utf8 length cache on long utf8 strings).
+Add a test that will take a fraction of a second normally, and minutes
+otherwise, causing the test file to time out on failure.
+</p>
+<hr>
 <a name="perlhack-MORE-READING-FOR-GUTS-HACKERS"></a>
 <div class="header">
 <p>
@@ -46901,7 +48382,7 @@
 selection of <code>-W</code> flags (see cflags.SH).
 </p>
 <p>Also study <a href="#perlport-NAME">perlport NAME</a> carefully to avoid 
any bad assumptions about the
-operating system, filesystems, and so forth.
+operating system, filesystems, character set, and so forth.
 </p>
 <p>You may once in a while try a &quot;make microperl&quot; to see whether we 
can
 still compile Perl with just the bare minimum of interfaces.  (See
@@ -46926,7 +48407,7 @@
 does it right.  (Likewise, there are PTR2UV(), PTR2NV(), INT2PTR(), and
 NUM2PTR().)
 </p>
-</li><li> Casting between data function pointers and data pointers
+</li><li> Casting between function pointers and data pointers
 
 <p>Technically speaking casting between function pointers and data
 pointers is unportable and undefined, but practically speaking it seems
@@ -46999,16 +48480,32 @@
 <p>Perl can compile and run under EBCDIC platforms.  See <a 
href="#perlebcdic-NAME">perlebcdic NAME</a>.
 This is transparent for the most part, but because the character sets
 differ, you shouldn&rsquo;t use numeric (decimal, octal, nor hex) constants
-to refer to characters.  You can safely say &rsquo;A&rsquo;, but not 0x41.  
You can
-safely say &rsquo;\n&rsquo;, but not \012.  If a character doesn&rsquo;t have 
a trivial
-input form, you should add it to the list in
-<samp>regen/unicode_constants.pl</samp>, and have Perl create #defines for you,
+to refer to characters.  You can safely say <code>'A'</code>, but not 
<code>0x41</code>.
+You can safely say <code>'\n'</code>, but not <code>\012</code>.  However, you 
can use
+macros defined in <samp>utf8.h</samp> to specify any code point portably.
+<code>LATIN1_TO_NATIVE(0xDF)</code> is going to be the code point that means
+LATIN SMALL LETTER SHARP S on whatever platform you are running on (on
+ASCII platforms it compiles without adding any extra code, so there is
+zero performance hit on those).  The acceptable inputs to
+<code>LATIN1_TO_NATIVE</code> are from <code>0x00</code> through 
<code>0xFF</code>.  If your input
+isn&rsquo;t guaranteed to be in that range, use <code>UNICODE_TO_NATIVE</code> 
instead.
+<code>NATIVE_TO_LATIN1</code> and <code>NATIVE_TO_UNICODE</code> translate the 
opposite
+direction.
+</p>
+<p>If you need the string representation of a character that doesn&rsquo;t 
have a
+mnemonic name in C, you should add it to the list in
+<samp>regen/unicode_constants.pl</samp>, and have Perl create 
<code>#define</code>&rsquo;s for you,
 based on the current platform.
 </p>
+<p>Note that the <code>is<em>FOO</em></code> and <code>to<em>FOO</em></code> 
macros in <samp>handy.h</samp> work
+properly on native code points and strings.
+</p>
 <p>Also, the range &rsquo;A&rsquo; - &rsquo;Z&rsquo; in ASCII is an unbroken 
sequence of 26 upper
 case alphabetic characters.  That is not true in EBCDIC.  Nor for 
&rsquo;a&rsquo; to
 &rsquo;z&rsquo;.  But &rsquo;0&rsquo; - &rsquo;9&rsquo; is an unbroken range 
in both systems.  Don&rsquo;t assume
-anything about other ranges.
+anything about other ranges.  (Note that special handling of ranges in
+regular expression patterns makes it appear to Perl
+code that the aforementioned ranges are all unbroken.)
 </p>
 <p>Many of the comments in the existing code ignore the possibility of
 EBCDIC, and may be wrong therefore, even if the code works.  This is
@@ -47017,11 +48514,11 @@
 </p>
 <p>UTF-8 and UTF-EBCDIC are two different encodings used to represent
 Unicode code points as sequences of bytes.  Macros  with the same names
-(but different definitions) in <code>utf8.h</code> and 
<code>utfebcdic.h</code> are used to
+(but different definitions) in <samp>utf8.h</samp> and 
<samp>utfebcdic.h</samp> are used to
 allow the calling code to think that there is only one such encoding.
 This is almost always referred to as <code>utf8</code>, but it means the EBCDIC
 version as well.  Again, comments in the code may well be wrong even if
-the code itself is right.  For example, the concept of <code>invariant
+the code itself is right.  For example, the concept of UTF-8 <code>invariant
 characters</code> differs between ASCII and EBCDIC.  On ASCII platforms, only
 characters that do not have the high-order bit set (i.e.  whose ordinals
 are strict ASCII, 0 - 127) are invariant, and the documentation and
@@ -47031,14 +48528,39 @@
 <code>NATIVE_IS_INVARIANT()</code> macro appropriately, it works, even if the
 comments are wrong.
 </p>
+<p>As noted in <a href="#perlhack-TESTING">perlhack TESTING</a>, when writing 
test scripts, the file
+<samp>t/charset_tools.pl</samp> contains some helpful functions for writing 
tests
+valid on both ASCII and EBCDIC platforms.  Sometimes, though, a test
+can&rsquo;t use a function and it&rsquo;s inconvenient to have different test
+versions depending on the platform.  There are 20 code points that are
+the same in all 4 character sets currently recognized by Perl (the 3
+EBCDIC code pages plus ISO 8859-1 (ASCII/Latin1)).  These can be used in
+such tests, though there is a small possibility that Perl will become
+available in yet another character set, breaking your test.  All but one
+of these code points are C0 control characters.  The most significant
+controls that are the same are <code>\0</code>, <code>\r</code>, and 
<code>\N{VT}</code> (also
+specifiable as <code>\cK</code>, <code>\x0B</code>, <code>\N{U+0B}</code>, or 
<code>\013</code>).  The single
+non-control is U+00B6 PILCROW SIGN.  The controls that are the same have
+the same bit pattern in all 4 character sets, regardless of the UTF8ness
+of the string containing them.  The bit pattern for U+B6 is the same in
+all 4 for non-UTF8 strings, but differs in each when its containing
+string is UTF-8 encoded.  The only other code points that have some sort
+of sameness across all 4 character sets are the pair 0xDC and 0xFC.
+Together these represent upper- and lowercase LATIN LETTER U WITH
+DIAERESIS, but which is upper and which is lower may be reversed: 0xDC
+is the capital in Latin1 and 0xFC is the small letter, while 0xFC is the
+capital in EBCDIC and 0xDC is the small one.  This factoid may be
+exploited in writing case insensitive tests that are the same across all
+4 character sets.
+</p>
 </li><li> Assuming the character set is just ASCII
 
 <p>ASCII is a 7 bit encoding, but bytes have 8 bits in them.  The 128 extra
 characters have different meanings depending on the locale.  Absent a
 locale, currently these extra characters are generally considered to be
-unassigned, and this has presented some problems.  This is being changed
-starting in 5.12 so that these characters will be considered to be
-Latin-1 (ISO-8859-1).
+unassigned, and this has presented some problems.  This has been
+changed, starting in 5.12, so that these characters can be considered to
+be Latin-1 (ISO-8859-1).
 </p>
 </li><li> Mixing #define and #ifdef
 
@@ -47254,6 +48776,37 @@
 <p>But in any case, try to keep the features and operating systems
 separate.
 </p>
+</li><li> Assuming the contents of static memory pointed to by the return 
values
+of Perl wrappers for C library functions doesn&rsquo;t change.  Many C library
+functions return pointers to static storage that can be overwritten by
+subsequent calls to the same or related functions.  Perl has
+light-weight wrappers for some of these functions, which don&rsquo;t make
+copies of the static memory.  A good example is the interface to the
+environment variables that are in effect for the program.  Perl has
+<code>PerlEnv_getenv</code> to get values from the environment.  But the 
return is
+a pointer to static memory in the C library.  If you are using the value
+to immediately test for something, that&rsquo;s fine, but if you save the
+value and expect it to be unchanged by later processing, you would be
+wrong, but perhaps you wouldn&rsquo;t know it because different C library
+implementations behave differently, and the one on the platform you&rsquo;re
+testing on might work for your situation.  But on some platforms, a
+subsequent call to <code>PerlEnv_getenv</code> or related function WILL 
overwrite
+the memory that your first call points to.  This has led to some
+hard-to-debug problems.  Do a <a 
href="perlapi.html#savepv">(perlapi)savepv</a> to make a copy, thus
+avoiding these problems.  You will have to free the copy when you&rsquo;re
+done to avoid memory leaks.  If you don&rsquo;t have control over when it gets
+freed, you&rsquo;ll need to make the copy in a mortal scalar, like so:
+
+<pre class="verbatim"> if ((s = PerlEnv_getenv(&quot;foo&quot;)) == NULL) {
+    ... /* handle NULL case */
+ }
+ else {
+     s = SvPVX(sv_2mortal(newSVpv(s, 0)));
+ }
+</pre>
+<p>The above example works only if <code>&quot;s&quot;</code> is 
<code>NUL</code>-terminated; otherwise
+you have to pass its length to <code>newSVpv</code>.
+</p>
 </li></ul>
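The static-memory pitfall described for PerlEnv_getenv can be sketched in plain C. This is only an illustration under stated assumptions: env_copy() is a hypothetical helper, and it uses the C library's getenv() and malloc() where Perl code would use PerlEnv_getenv and savepv.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of an environment value so that
 * later getenv()/setenv() calls cannot change what the caller holds.
 * Returns NULL if the variable is unset or memory is exhausted.
 * The caller must free() the result. */
char *env_copy(const char *name)
{
    const char *val;
    char *copy;
    size_t len;

    assert(name != NULL);
    val = getenv(name);        /* may point into storage the library reuses */
    if (val == NULL)
        return NULL;

    len = strlen(val) + 1;     /* include the terminating NUL */
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, val, len);
    return copy;
}
```

A caller that saves the result of env_copy("foo") keeps the value it read, even if the environment is modified afterwards; the cost is that the copy must be freed explicitly.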
 
 <hr>
@@ -47317,6 +48870,18 @@
 simply skipped without any notice.
 <a 
href="https://sourceware.org/bugzilla/show_bug.cgi?id=6530">https://sourceware.org/bugzilla/show_bug.cgi?id=6530</a>.
 </p>
+</li><li> Do not use atoi()
+
+<p>Use grok_atoUV() instead.  atoi() has ill-defined behavior on overflows,
+and cannot be used for incremental parsing.  It is also affected by locale,
+which is bad.
+</p>
+</li><li> Do not use strtol() or strtoul()
+
+<p>Use grok_atoUV() instead.  strtol() or strtoul() (or their IV/UV-friendly
+macro disguises, Strtol() and Strtoul(), or Atol() and Atoul()) are
+affected by locale, which is bad.
+</p>
 </li></ul>
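To illustrate the two properties the items above ask for (defined behavior on overflow and no dependence on locale), here is a small standalone sketch. parse_decimal_ul() is a hypothetical function written for this example; it is not grok_atoUV() and makes no claim about that function's exact rules.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in, NOT Perl's grok_atoUV(): parse the leading run of
 * decimal digits in 's'.  No whitespace or sign is accepted, and no locale
 * is consulted.  On success stores the value in *out, sets *end (if
 * non-NULL) to the first unparsed character, and returns true.  Returns
 * false if there are no digits or the value does not fit. */
bool parse_decimal_ul(const char *s, unsigned long *out, const char **end)
{
    unsigned long val = 0;
    const char *p = s;

    assert(s != NULL && out != NULL);
    if (*p < '0' || *p > '9')
        return false;                       /* no digits at all */

    for (; *p >= '0' && *p <= '9'; p++) {
        unsigned long digit = (unsigned long)(*p - '0');

        if (val > (ULONG_MAX - digit) / 10)
            return false;                   /* next step would overflow */
        val = val * 10 + digit;
    }

    *out = val;
    if (end != NULL)
        *end = p;                           /* allows incremental parsing */
    return true;
}
```

The range test on '0' through '9' is safe on both ASCII and EBCDIC platforms, since the digits form an unbroken range in both. The end pointer lets the caller continue parsing where the number stopped, which atoi() cannot do.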
 
 <hr>
@@ -47466,7 +49031,7 @@
 <pre class="verbatim">  (gdb) ptype PL_op
   type = struct op {
       OP *op_next;
-      OP *op_sibling;
+      OP *op_sibparent;
       OP *(*op_ppaddr)(void);
       PADOFFSET op_targ;
       unsigned int op_type : 9;
@@ -47896,8 +49461,24 @@
 on x86, x86-64 and PowerPC and Darwin (OS X) on x86 and x86-64).  The
 special &quot;test.valgrind&quot; target can be used to run the tests under
 valgrind.  Found errors and memory leaks are logged in files named
-<samp>testfile.valgrind</samp>.
+<samp>testfile.valgrind</samp> and by default output is displayed inline.
+</p>
+<p>Example usage:
+</p>
+<pre class="verbatim">    make test.valgrind
+</pre>
+<p>Since valgrind adds significant overhead, tests will take much longer to
+run.  The valgrind tests support being run in parallel to help with this:
+</p>
+<pre class="verbatim">    TEST_JOBS=9 make test.valgrind
+</pre>
+<p>Note that the above two invocations will be very verbose as reachable
+memory and leak-checking are enabled by default.  If you want to just see
+pure errors, try:
 </p>
+<pre class="verbatim">    VG_OPTS='-q --leak-check=no --show-reachable=no' 
TEST_JOBS=9 \
+        make test.valgrind
+</pre>
 <p>Valgrind also provides a cachegrind tool, invoked on perl as:
 </p>
 <pre class="verbatim">    VG_OPTS=--tool=cachegrind make test.valgrind
@@ -48139,13 +49720,15 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a href="#perlhacktips-DDD-over-gdb" 
accesskey="3">perlhacktips DDD over gdb</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a href="#perlhacktips-Poison" 
accesskey="4">perlhacktips Poison</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a href="#perlhacktips-C-backtrace" 
accesskey="4">perlhacktips C backtrace</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
+</td></tr>
+<tr><td align="left" valign="top">&bull; <a href="#perlhacktips-Poison" 
accesskey="5">perlhacktips Poison</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-Read_002donly-optrees" accesskey="5">perlhacktips Read-only 
optrees</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-Read_002donly-optrees" accesskey="6">perlhacktips Read-only 
optrees</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-When-is-a-bool-not-a-bool_003f" accesskey="6">perlhacktips 
When is a bool not a bool?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-When-is-a-bool-not-a-bool_003f" accesskey="7">perlhacktips 
When is a bool not a bool?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-The-_002ei-Targets" accesskey="7">perlhacktips The .i 
Targets</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlhacktips-The-_002ei-Targets" accesskey="8">perlhacktips The .i 
Targets</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 </table>
 
@@ -48244,7 +49827,7 @@
 <a name="perlhacktips-DDD-over-gdb"></a>
 <div class="header">
 <p>
-Next: <a href="#perlhacktips-Poison" accesskey="n" rel="next">perlhacktips 
Poison</a>, Previous: <a href="#perlhacktips-PERL_005fMEM_005fLOG" 
accesskey="p" rel="prev">perlhacktips PERL_MEM_LOG</a>, Up: <a 
href="#perlhacktips-MISCELLANEOUS-TRICKS" accesskey="u" rel="up">perlhacktips 
MISCELLANEOUS TRICKS</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlhacktips-C-backtrace" accesskey="n" 
rel="next">perlhacktips C backtrace</a>, Previous: <a 
href="#perlhacktips-PERL_005fMEM_005fLOG" accesskey="p" rel="prev">perlhacktips 
PERL_MEM_LOG</a>, Up: <a href="#perlhacktips-MISCELLANEOUS-TRICKS" 
accesskey="u" rel="up">perlhacktips MISCELLANEOUS TRICKS</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="DDD-over-gdb"></a>
 <h4 class="subsection">30.8.3 DDD over gdb</h4>
@@ -48282,13 +49865,105 @@
 <p>Note: you can define up to 20 conversion shortcuts in the gdb section.
 </p>
 <hr>
+<a name="perlhacktips-C-backtrace"></a>
+<div class="header">
+<p>
+Next: <a href="#perlhacktips-Poison" accesskey="n" rel="next">perlhacktips 
Poison</a>, Previous: <a href="#perlhacktips-DDD-over-gdb" accesskey="p" 
rel="prev">perlhacktips DDD over gdb</a>, Up: <a 
href="#perlhacktips-MISCELLANEOUS-TRICKS" accesskey="u" rel="up">perlhacktips 
MISCELLANEOUS TRICKS</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
+</div>
+<a name="C-backtrace"></a>
+<h4 class="subsection">30.8.4 C backtrace</h4>
+
+<p>On some platforms Perl supports retrieving the C level backtrace
+(similar to what symbolic debuggers like gdb do).
+</p>
+<p>The backtrace returns the stack trace of the C call frames,
+with the symbol names (function names), the object names (like 
&quot;perl&quot;),
+and if it can, also the source code locations (file:line).
+</p>
+<p>The supported platforms are Linux and OS X (some *BSD might
+work at least partly, but they have not yet been tested).
+</p>
+<p>This feature hasn&rsquo;t been tested with multiple threads, but it will
+only show the backtrace of the thread doing the backtracing.
+</p>
+<p>The feature needs to be enabled with <code>Configure -Dusecbacktrace</code>.
+</p>
+<p>The <code>-Dusecbacktrace</code> also enables keeping the debug information 
when
+compiling/linking (often: <code>-g</code>).  Many compilers/linkers support
+combining optimization with keeping the debug information.  The debug
+information is needed for the symbol names and the source locations.
+</p>
+<p>Static functions might not be visible for the backtrace.
+</p>
+<p>Source code locations, even if available, can often be missing or
+misleading if the compiler has e.g. inlined code.  The optimizer can
+make matching the source code and the object code quite challenging.
+</p>
+<dl compact="compact">
+<dt>Linux</dt>
+<dd><a name="perlhacktips-Linux"></a>
+<p>You <strong>must</strong> have the BFD (-lbfd) library installed, otherwise 
<code>perl</code> will
+fail to link.  The BFD is usually distributed as part of the GNU binutils.
+</p>
+<p>Summary: <code>Configure ... -Dusecbacktrace</code>
+and you need <code>-lbfd</code>.
+</p>
+</dd>
+<dt>OS X</dt>
+<dd><a name="perlhacktips-OS-X"></a>
+<p>The source code locations are supported <strong>only</strong> if you have
+the Developer Tools installed.  (BFD is <strong>not</strong> needed.)
+</p>
+<p>Summary: <code>Configure ... -Dusecbacktrace</code>
+and installing the Developer Tools would be good.
+</p>
+</dd>
+</dl>
+
+<p>Optionally, for trying out the feature, you may want to enable
+automatic dumping of the backtrace just before a warning or croak (die)
+message is emitted, by adding <code>-Accflags=-DUSE_C_BACKTRACE_ON_ERROR</code>
+for Configure.
+</p>
+<p>Unless the above additional feature is enabled, nothing about the
+backtrace functionality is visible, except for the Perl/XS level.
+</p>
+<p>Furthermore, even if you have enabled this feature to be compiled,
+you need to enable it at runtime with an environment variable:
+<code>PERL_C_BACKTRACE_ON_ERROR=10</code>.  It must be an integer greater
+than zero, giving the desired frame count.
+</p>
+<p>Retrieving the backtrace from Perl level (using for example an XS
+extension) would be much less exciting than one would hope: normally
+you would see <code>runops</code>, <code>entersub</code>, and not much else.  
This API is
+intended to be called <strong>from within</strong> the Perl implementation, 
not from
+Perl level execution.
+</p>
+<p>The C API for the backtrace is as follows:
+</p>
+<dl compact="compact">
+<dt>get_c_backtrace</dt>
+<dd><a name="perlhacktips-get_005fc_005fbacktrace"></a>
+</dd>
+<dt>free_c_backtrace</dt>
+<dd><a name="perlhacktips-free_005fc_005fbacktrace"></a>
+</dd>
+<dt>get_c_backtrace_dump</dt>
+<dd><a name="perlhacktips-get_005fc_005fbacktrace_005fdump"></a>
+</dd>
+<dt>dump_c_backtrace</dt>
+<dd><a name="perlhacktips-dump_005fc_005fbacktrace"></a>
+</dd>
+</dl>
+
+<hr>
 <a name="perlhacktips-Poison"></a>
 <div class="header">
 <p>
-Next: <a href="#perlhacktips-Read_002donly-optrees" accesskey="n" 
rel="next">perlhacktips Read-only optrees</a>, Previous: <a 
href="#perlhacktips-DDD-over-gdb" accesskey="p" rel="prev">perlhacktips DDD 
over gdb</a>, Up: <a href="#perlhacktips-MISCELLANEOUS-TRICKS" accesskey="u" 
rel="up">perlhacktips MISCELLANEOUS TRICKS</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlhacktips-Read_002donly-optrees" accesskey="n" 
rel="next">perlhacktips Read-only optrees</a>, Previous: <a 
href="#perlhacktips-C-backtrace" accesskey="p" rel="prev">perlhacktips C 
backtrace</a>, Up: <a href="#perlhacktips-MISCELLANEOUS-TRICKS" accesskey="u" 
rel="up">perlhacktips MISCELLANEOUS TRICKS</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Poison"></a>
-<h4 class="subsection">30.8.4 Poison</h4>
+<h4 class="subsection">30.8.5 Poison</h4>
 
 <p>If you see in a debugger a memory area mysteriously full of 0xABABABAB
 or 0xEFEFEFEF, you may be seeing the effect of the Poison() macros, see
@@ -48301,7 +49976,7 @@
 Next: <a href="#perlhacktips-When-is-a-bool-not-a-bool_003f" accesskey="n" 
rel="next">perlhacktips When is a bool not a bool?</a>, Previous: <a 
href="#perlhacktips-Poison" accesskey="p" rel="prev">perlhacktips Poison</a>, 
Up: <a href="#perlhacktips-MISCELLANEOUS-TRICKS" accesskey="u" 
rel="up">perlhacktips MISCELLANEOUS TRICKS</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Read_002donly-optrees"></a>
-<h4 class="subsection">30.8.5 Read-only optrees</h4>
+<h4 class="subsection">30.8.6 Read-only optrees</h4>
 
 <p>Under ithreads the optree is read only.  If you want to enforce this, to
 check for write accesses from buggy code, compile with
@@ -48325,7 +50000,7 @@
 Next: <a href="#perlhacktips-The-_002ei-Targets" accesskey="n" 
rel="next">perlhacktips The .i Targets</a>, Previous: <a 
href="#perlhacktips-Read_002donly-optrees" accesskey="p" 
rel="prev">perlhacktips Read-only optrees</a>, Up: <a 
href="#perlhacktips-MISCELLANEOUS-TRICKS" accesskey="u" rel="up">perlhacktips 
MISCELLANEOUS TRICKS</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="When-is-a-bool-not-a-bool_003f"></a>
-<h4 class="subsection">30.8.6 When is a bool not a bool?</h4>
+<h4 class="subsection">30.8.7 When is a bool not a bool?</h4>
 
 <p>On pre-C99 compilers, <code>bool</code> is defined as equivalent to 
<code>char</code>.
 Consequently assignment of any larger type to a <code>bool</code> is unsafe 
and may
@@ -48348,7 +50023,7 @@
 Previous: <a href="#perlhacktips-When-is-a-bool-not-a-bool_003f" accesskey="p" 
rel="prev">perlhacktips When is a bool not a bool?</a>, Up: <a 
href="#perlhacktips-MISCELLANEOUS-TRICKS" accesskey="u" rel="up">perlhacktips 
MISCELLANEOUS TRICKS</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="The-_002ei-Targets"></a>
-<h4 class="subsection">30.8.7 The .i Targets</h4>
+<h4 class="subsection">30.8.8 The .i Targets</h4>
 
 <p>You can expand the macros in a <samp>foo.c</samp> file by saying
 </p>
@@ -48723,8 +50398,8 @@
 Matt S Trout, David Golden, Florian Ragwitz, Tatsuhiko Miyagawa,
 Chris <code>BinGOs</code> Williams, Zefram, Ævar Arnfjörð Bjarmason, 
Stevan
 Little, Dave Rolsky, Max Maischein, Abigail, Jesse Luehrs, Tony Cook,
-Dominic Hargreaves, Aaron Crane, Aristotle Pagaltzis, Matthew Horsfall
-and Peter Martini.
+Dominic Hargreaves, Aaron Crane, Aristotle Pagaltzis, Matthew Horsfall,
+Peter Martini, and Sawyer X.
 </p>
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a href="#perlhist-PUMPKIN_003f" 
accesskey="1">perlhist PUMPKIN?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
@@ -49258,6 +50933,13 @@
  BinGOs    5.21.6       2014-Nov-20
  Max M     5.21.7       2014-Dec-20
  Matthew H 5.21.8       2015-Jan-20
+ Sawyer X  5.21.9       2015-Feb-20
+ Steve     5.21.10      2015-Mar-20
+ Steve     5.21.11      2015-Apr-20
+
+ Ricardo   5.22.0-RC1   2015-May-19     The 5.22 maintenance track
+ Ricardo   5.22.0-RC2   2015-May-21
+ Ricardo   5.22.0       2015-Jun-01
 </pre>
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a 
href="#perlhist-SELECTED-RELEASE-SIZES" accesskey="1">perlhist SELECTED RELEASE 
SIZES</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
@@ -49353,6 +51035,7 @@
  5.16.0         5562 109   1077  80  20504 2702   8750 2375   4815 152
  5.18.0         5892 113   1088  79  20077 2760   9365 2439   4943 154
  5.20.0         6243 115   1187  75  19499 2701   9620 2457   5145 159
+ 5.22.0         7819 115   1284  77  19121 2635   9772 2434   5615 176
 </pre>
 <p>The &quot;core&quot;...&quot;doc&quot; mean the following files from the 
Perl source code
 distribution.  The glob notation ** means recursively, (.) means
@@ -49657,25 +51340,25 @@
 
  ======================================================================
 
-                  5.20.0
+                  5.20.0           5.22.0
 
- Configure    552      1
- Cross        118     15
- NetWare      467     61
- Porting     1204     68
- djgpp         18      7
- h2pl          13     15
- hints        355     90
- mad          174      8
- os2          510     70
- plan9        316     17
- qnx            1      4
- symbian      290     54
- utils        241     27
- vms          538     12
- vos            8      7
- win32       1183     64
- x2p          341     19
+ Configure    552      1       570      1
+ Cross        118     15       118     15
+ djgpp         18      7        17      7
+ h2pl          13     15        13     15
+ hints        355     90       356     87
+ mad          174      8         -      -
+ NetWare      467     61       466     61
+ os2          510     70       510     70
+ plan9        316     17       317     17
+ Porting     1204     68      1393     71
+ qnx            1      4         1      4
+ symbian      290     54       291     54
+ utils        241     27       242     27
+ vms          538     12       532     12
+ vos            8      7         8      7
+ win32       1183     64      1201     64
+ x2p          341     19         -      -
 </pre>
 <hr>
 <a name="perlhist-SELECTED-PATCH-SIZES"></a>
@@ -51313,7 +52996,7 @@
 <pre class="verbatim"> =   assignment
  .   string concatenation
  x   string multiplication
- ..  range operator (creates a list of numbers)
+ ..  range operator (creates a list of numbers or strings)
 </pre>
 </dd>
 </dl>
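A minimal sketch of the two kinds of list the range operator can build, as the revised wording describes. The variable names are illustrative only and do not come from the manual.

```perl
use strict;
use warnings;

# Numeric range: every integer from 1 through 5.
my @numbers = (1 .. 5);

# String range: relies on Perl's magical string increment, so the list
# runs 'aa', 'ab', 'ac', 'ad'.
my @strings = ('aa' .. 'ad');

print "@numbers\n";
print "@strings\n";
```

The string form only applies when both operands look like strings rather than numbers; otherwise the numeric form is used.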
@@ -55053,9 +56736,10 @@
 design deficiencies, and nowadays, there is a series of &quot;UTF-8
 locales&quot;, based on Unicode.  These are locales whose character set is
 Unicode, encoded in UTF-8.  Starting in v5.20, Perl fully supports
-UTF-8 locales, except for sorting and string comparisions.  (Use
+UTF-8 locales, except for sorting and string comparisons.  (Use
 <a href="Unicode-Collate.html#Top">(Unicode-Collate)</a> for these.)  Perl 
continues to support the old
-non UTF-8 locales as well.
+non UTF-8 locales as well.  There are currently no UTF-8 locales for
+EBCDIC platforms.
 </p>
 <p>(Unicode is also creating <code>CLDR</code>, the &quot;Common Locale Data 
Repository&quot;,
 <a href="http://cldr.unicode.org/">http://cldr.unicode.org/</a> which includes 
more types of information than
@@ -55118,7 +56802,7 @@
 <p>Some platforms have other categories, dealing with such things as
 measurement units and paper sizes.  None of these are used directly by
 Perl, but outside operations that Perl interacts with may use
-these.  See <a 
href="#perllocale-Not-within-the-scope-of-any-_0022use-locale_0022-variant">Not 
within the scope of any &quot;use locale&quot; variant</a> below.
+these.  See <a 
href="#perllocale-Not-within-the-scope-of-_0022use-locale_0022">Not within the 
scope of &quot;use locale&quot;</a> below.
 </p>
 </dd>
 </dl>
@@ -55139,7 +56823,8 @@
 <a name="PREPARING-TO-USE-LOCALES"></a>
 <h3 class="section">38.4 PREPARING TO USE LOCALES</h3>
 
-<p>Perl itself will not use locales unless specifically requested to (but
+<p>Perl itself (outside the <a href="POSIX.html#Top">(POSIX)</a> module) will 
not use locales unless
+specifically requested to (but
 again note that Perl may interact with code that does use them).  Even
 if there is such a request, <strong>all</strong> of the following must be true
 for it to work properly:
@@ -55168,7 +56853,7 @@
 
 <p>If you want a Perl application to process and present your data
 according to a particular locale, the application code should include
-the <code>use&nbsp;locale</code><!-- /@w --> pragma (see <a 
href="#perllocale-The-use-locale-pragma">The use locale pragma</a>) where
+the <code>use&nbsp;locale</code><!-- /@w --> pragma (see <a 
href="#perllocale-The-_0022use-locale_0022-pragma">The &quot;use locale&quot; 
pragma</a>) where
 appropriate, and <strong>at least one</strong> of the following must be true:
 </p>
 <ol>
@@ -55191,7 +56876,7 @@
 <h3 class="section">38.5 USING LOCALES</h3>
 
 <table class="menu" border="0" cellspacing="0">
-<tr><td align="left" valign="top">&bull; <a 
href="#perllocale-The-use-locale-pragma" accesskey="1">perllocale The use 
locale pragma</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perllocale-The-_0022use-locale_0022-pragma" accesskey="1">perllocale The 
<code>&quot;use locale&quot;</code> pragma</a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perllocale-The-setlocale-function" accesskey="2">perllocale The 
setlocale function</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -55216,30 +56901,19 @@
 </table>
 
 <hr>
-<a name="perllocale-The-use-locale-pragma"></a>
+<a name="perllocale-The-_0022use-locale_0022-pragma"></a>
 <div class="header">
 <p>
 Next: <a href="#perllocale-The-setlocale-function" accesskey="n" 
rel="next">perllocale The setlocale function</a>, Up: <a 
href="#perllocale-USING-LOCALES" accesskey="u" rel="up">perllocale USING 
LOCALES</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
-<a name="The-use-locale-pragma"></a>
-<h4 class="subsection">38.5.1 The use locale pragma</h4>
+<a name="The-_0022use-locale_0022-pragma"></a>
+<h4 class="subsection">38.5.1 The <code>&quot;use locale&quot;</code> 
pragma</h4>
 
-<p>By default, Perl itself ignores the current locale.  The 
<code>use&nbsp;locale</code><!-- /@w -->
+<p>By default, Perl itself (outside the <a href="POSIX.html#Top">(POSIX)</a> 
module)
+ignores the current locale.  The <code>use&nbsp;locale</code><!-- /@w -->
 pragma tells Perl to use the current locale for some operations.
-Starting in v5.16, there is an optional parameter to this pragma:
-</p>
-<pre class="verbatim">    use locale ':not_characters';
-</pre>
-<p>This parameter allows better mixing of locales and Unicode (less useful
-in v5.20 and later), and is
-described fully in <a href="#perllocale-Unicode-and-UTF_002d8">Unicode and 
UTF-8</a>, but briefly, it tells Perl to
-not use the character portions of the locale definition, that is
-the <code>LC_CTYPE</code> and <code>LC_COLLATE</code> categories.  Instead it 
will use the
-native character set (extended by Unicode).  When using this parameter,
-you are responsible for getting the external character set translated
-into the native/Unicode one (which it already will be if it is one of
-the increasingly popular UTF-8 locales).  There are convenient ways of
-doing this, as described in <a 
href="#perllocale-Unicode-and-UTF_002d8">Unicode and UTF-8</a>.
+Starting in v5.16, there are optional parameters to this pragma,
+described below, which restrict which operations are affected by it.
 </p>
 <p>The current locale is set at execution time by
 <a href="#perllocale-The-setlocale-function">setlocale()</a> described below.  
If that function
@@ -55255,18 +56929,13 @@
 <p>The operations that are affected by locale are:
 </p>
 <dl compact="compact">
-<dt><strong>Not within the scope of any <code>&quot;use locale&quot;</code> 
variant</strong></dt>
-<dd><a 
name="perllocale-Not-within-the-scope-of-any-_0022use-locale_0022-variant"></a>
-<p>Only operations originating outside Perl should be affected, as follows:
+<dt><strong>Not within the scope of <code>&quot;use 
locale&quot;</code></strong></dt>
+<dd><a name="perllocale-Not-within-the-scope-of-_0022use-locale_0022"></a>
+<p>Only certain operations originating outside Perl should be affected, as
+follows:
 </p>
 <ul>
-<li> The variables <a href="#perlvar-_0024ERRNO">$!</a> (and its synonyms 
<code>$ERRNO</code> and
-<code>$OS_ERROR</code>) and <a 
href="#perlvar-_0024EXTENDED_005fOS_005fERROR">$^E</a> (and its synonym
-<code>$EXTENDED_OS_ERROR</code>) when used as strings always are in terms of 
the
-current locale and as if within the scope of <a 
href="bytes.html#Top">(bytes)&quot;use bytes&quot;</a>.  This is
-likely to change in Perl v5.22.
-
-</li><li> The current locale is also used when going outside of Perl with
+<li> The current locale is used when going outside of Perl with
 operations like <a href="#perlfunc-system-LIST">system()</a> or
 <a href="#perlop-qx_002fSTRING_002f">qx//</a>, if those operations are
 locale-sensitive.
@@ -55286,24 +56955,32 @@
 
 </li></ul>
 
+<p>Note that all C programs (including the perl interpreter, which is
+written in C) always have an underlying locale.  That locale is the 
&quot;C&quot;
+locale unless changed by a call to <a 
href="#perllocale-The-setlocale-function">setlocale()</a>.  When Perl starts 
up, it changes the underlying locale to the
+one which is indicated by the <a 
href="#perllocale-ENVIRONMENT">ENVIRONMENT</a>.  When using the <a 
href="POSIX.html#Top">(POSIX)</a>
+module or writing XS code, it is important to keep in mind that the
+underlying locale may be something other than &quot;C&quot;, even if the 
program
+hasn&rsquo;t explicitly changed it.
+</p>
 <p>&nbsp;
 </p>
 </dd>
 <dt><strong>Lingering effects of <code>use&nbsp;locale<!-- /@w 
--></code></strong></dt>
 <dd><a name="perllocale-Lingering-effects-of-use-locale"></a>
 <p>Certain Perl operations that are set-up within the scope of a
-<code>use locale</code> variant retain that effect even outside the scope.
+<code>use locale</code> retain that effect even outside the scope.
 These include:
 </p>
 <ul>
 <li> The output format of a <a href="#perlfunc-write">write()</a> is 
determined by an
 earlier format declaration (<a href="#perlfunc-format">perlfunc format</a>), 
so whether or not the
 output is affected by locale is determined by if the <code>format()</code> is
-within the scope of a <code>use locale</code> variant, not whether the 
<code>write()</code>
+within the scope of a <code>use locale</code>, not whether the 
<code>write()</code>
 is.
 
 </li><li> Regular expression patterns can be compiled using
-<a href="#perlop-qr_002fSTRING_002fmsixpodual">qr//</a> with actual
+<a href="#perlop-qr_002fSTRING_002fmsixpodualn">qr//</a> with actual
 matching deferred to later.  Again, it is whether or not the compilation
 was done within the scope of <code>use locale</code> that determines the match
 behavior, not if the matches are done within such a scope or not.
@@ -55313,10 +56990,10 @@
 <p>&nbsp;
 </p>
 </dd>
-<dt><strong>Under <code>&quot;use locale 
':not_characters';&quot;</code></strong></dt>
-<dd><a 
name="perllocale-Under-_0022use-locale-_0027_003anot_005fcharacters_0027_003b_0022"></a>
+<dt><strong>Under <code>&quot;use locale&quot;;</code></strong></dt>
+<dd><a name="perllocale-Under-_0022use-locale_0022_003b"></a>
 <ul>
-<li> All the non-Perl operations.
+<li> All the above operations
 
 </li><li> <strong>Format declarations</strong> (<a 
href="#perlfunc-format">perlfunc format</a>) and hence any subsequent
 <code>write()</code>s use <code>LC_NUMERIC</code>.
@@ -55329,16 +57006,6 @@
 and
 <code>sprintf()</code>.
 
-</li></ul>
-
-<p>&nbsp;
-</p>
-</dd>
-<dt><strong>Under just plain <code>&quot;use locale&quot;;</code></strong></dt>
-<dd><a name="perllocale-Under-just-plain-_0022use-locale_0022_003b"></a>
-<ul>
-<li> All the above operations
-
 </li><li> <strong>The comparison operators</strong> (<code>lt</code>, 
<code>le</code>, <code>cmp</code>, <code>ge</code>, and <code>gt</code>) use
 <code>LC_COLLATE</code>.  <code>sort()</code> is also affected if used without 
an
 explicit comparison function, because it uses <code>cmp</code> by default.
@@ -55356,6 +57023,10 @@
 </li><li> <strong>Regular expressions and case-modification functions</strong> 
(<code>uc()</code>, <code>lc()</code>,
 <code>ucfirst()</code>, and <code>lcfirst()</code>) use <code>LC_CTYPE</code>
 
+</li><li> <strong>The variables <a href="#perlvar-_0024ERRNO">$!</a></strong> 
(and its synonyms <code>$ERRNO</code> and
+<code>$OS_ERROR</code>) <strong>and <a 
href="#perlvar-_0024EXTENDED_005fOS_005fERROR">$^E</a></strong> (and its synonym
+<code>$EXTENDED_OS_ERROR</code>) when used as strings use 
<code>LC_MESSAGES</code>.
+
 </li></ul>
 
 </dd>
@@ -55363,7 +57034,7 @@
 
 <p>The default behavior is restored with the <code>no&nbsp;locale</code><!-- 
/@w --> pragma, or
 upon reaching the end of the block enclosing <code>use locale</code>.
-Note that <code>use locale</code> and <code>use locale 
':not_characters'</code> may be
+Note that <code>use locale</code> calls may be
 nested, and that what is in effect within an inner scope will revert to
 the outer scope&rsquo;s rules at the end of the inner scope.
 </p>
@@ -55371,11 +57042,71 @@
 information is tainted, as it is possible for a locale to be
 untrustworthy.  See <a href="#perllocale-SECURITY">SECURITY</a>.
 </p>
+<p>Starting in Perl v5.16 in a very limited way, and more generally in
+v5.22, you can restrict which category or categories are enabled by this
+particular instance of the pragma by adding parameters to it.  For
+example,
+</p>
+<pre class="verbatim"> use locale qw(:ctype :numeric);
+</pre>
+<p>enables locale awareness within its scope of only those operations
+(listed above) that are affected by <code>LC_CTYPE</code> and 
<code>LC_NUMERIC</code>.
+</p>
+<p>The possible categories are: <code>:collate</code>, <code>:ctype</code>, 
<code>:messages</code>,
+<code>:monetary</code>, <code>:numeric</code>, <code>:time</code>, and the 
pseudo category
+<code>:characters</code> (described below).
+</p>
+<p>Thus you can say
+</p>
+<pre class="verbatim"> use locale ':messages';
+</pre>
+<p>and only <a href="#perlvar-_0024ERRNO">$!</a> and <a 
href="#perlvar-_0024EXTENDED_005fOS_005fERROR">$^E</a>
+will be locale aware.  Everything else is unaffected.
+</p>
+<p>Since Perl doesn&rsquo;t currently do anything with the 
<code>LC_MONETARY</code>
+category, specifying <code>:monetary</code> does effectively nothing.  Some
+systems have other categories, such as <code>LC_PAPER_SIZE</code>, but Perl
+also doesn&rsquo;t know anything about them, and there is no way to specify
+them in this pragma&rsquo;s arguments.
+</p>
+<p>You can also easily say to use all categories but one, by either, for
+example,
+</p>
+<pre class="verbatim"> use locale ':!ctype';
+ use locale ':not_ctype';
+</pre>
+<p>both of which mean to enable locale awareness of all categories but
+<code>LC_CTYPE</code>.  Only one category argument may be specified in a
+<code>use&nbsp;locale</code><!-- /@w --> if it is of the negated form.
+</p>
+<p>Prior to v5.22 only one form of the pragma with arguments is available:
+</p>
+<pre class="verbatim"> use locale ':not_characters';
+</pre>
+<p>(and you have to say <code>not_</code>; you can&rsquo;t use the bang 
<code>!</code> form).  This
+pseudo category is a shorthand for specifying both <code>:collate</code> and
+<code>:ctype</code>.  Hence, in the negated form, it is nearly the same thing 
as
+saying
+</p>
+<pre class="verbatim"> use locale qw(:messages :monetary :numeric :time);
+</pre>
+<p>We use the term &quot;nearly&quot;, because <code>:not_characters</code> 
also turns on
+<code>use&nbsp;feature&nbsp;<span 
class="nolinebreak">'unicode_strings'</span></code><!-- /@w --> within its 
scope.  This form is
+less useful in v5.20 and later, and is described fully in
+<a href="#perllocale-Unicode-and-UTF_002d8">Unicode and UTF-8</a>, but 
briefly, it tells Perl to not use the
+character portions of the locale definition, that is the <code>LC_CTYPE</code> 
and
+<code>LC_COLLATE</code> categories.  Instead it will use the native character 
set
+(extended by Unicode).  When using this parameter, you are responsible
+for getting the external character set translated into the
+native/Unicode one (which it already will be if it is one of the
+increasingly popular UTF-8 locales).  There are convenient ways of doing
+this, as described in <a href="#perllocale-Unicode-and-UTF_002d8">Unicode and 
UTF-8</a>.
+</p>
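The lexical scoping of the pragma and the effect of the <code>:not_characters</code> form can be sketched as follows. This is an illustration under assumptions, not the manual's own example: the helper names <code>lc_default</code> and <code>lc_not_characters</code> are hypothetical, the script assumes Perl v5.16 or later on an ASCII platform, and it deliberately avoids <code>use v5.12</code> or later, which would enable <code>unicode_strings</code> everywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase with no locale pragma in scope
# (and no "use feature 'unicode_strings'" either).
sub lc_default {
    my ($s) = @_;
    return lc $s;
}

# Hypothetical helper: lowercase under the :not_characters form, the only
# parameterized form available before v5.22.  The pragma is lexically
# scoped, so it applies from here to the end of this sub body only.
sub lc_not_characters {
    my ($s) = @_;
    use locale ':not_characters';
    return lc $s;
}

my $upper = "\xC9";   # LATIN CAPITAL LETTER E WITH ACUTE on ASCII platforms

# %vX prints the code points of the string in hexadecimal.
printf "default:         %vX\n", lc_default($upper);
printf ":not_characters: %vX\n", lc_not_characters($upper);
```

Because <code>:not_characters</code> leaves <code>LC_CTYPE</code> out and turns on <code>unicode_strings</code>, the second call follows Unicode case rules regardless of the current locale; the first call is unaffected by the pragma because it lies outside its scope.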
 <hr>
 <a name="perllocale-The-setlocale-function"></a>
 <div class="header">
 <p>
-Next: <a href="#perllocale-Finding-locales" accesskey="n" 
rel="next">perllocale Finding locales</a>, Previous: <a 
href="#perllocale-The-use-locale-pragma" accesskey="p" rel="prev">perllocale 
The use locale pragma</a>, Up: <a href="#perllocale-USING-LOCALES" 
accesskey="u" rel="up">perllocale USING LOCALES</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perllocale-Finding-locales" accesskey="n" 
rel="next">perllocale Finding locales</a>, Previous: <a 
href="#perllocale-The-_0022use-locale_0022-pragma" accesskey="p" 
rel="prev">perllocale The <code>&quot;use locale&quot;</code> pragma</a>, Up: 
<a href="#perllocale-USING-LOCALES" accesskey="u" rel="up">perllocale USING 
LOCALES</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="The-setlocale-function"></a>
 <h4 class="subsection">38.5.2 The setlocale function</h4>
@@ -55448,8 +57179,8 @@
 to the environment made by the application after startup may or may not
 be noticed, depending on your system&rsquo;s C library.
 </p>
-<p>Note that Perl ignores the current <code>LC_CTYPE</code> and 
<code>LC_COLLATE</code> locales
-within the scope of a <code>use locale ':not_characters'</code>.
+<p>Note that when a form of <code>use locale</code> that doesn&rsquo;t include 
all
+categories is specified, Perl ignores the excluded categories.
 </p>
 <p>If <code>setlocale()</code> fails for some reason (for example, an attempt 
to set
 to a locale unknown to the system), the locale for the category is not
@@ -55585,7 +57316,7 @@
 locale inconsistencies or to run Perl under the default locale &quot;C&quot;.
 </p>
 <p>Perl&rsquo;s moaning about locale problems can be silenced by setting the
-environment variable <code>PERL_BADLANG</code> to a zero value, for example 
&quot;0&quot;.
+environment variable <code>PERL_BADLANG</code> to &quot;0&quot; or 
&quot;&quot;.
 This method really just sweeps the problem under the carpet: you tell
 Perl to shut up even when Perl sees that something is wrong.  Do not
 be surprised if later something locale-dependent misbehaves.
@@ -55703,7 +57434,9 @@
 
 <p>The <code>POSIX::localeconv()</code> function allows you to get particulars 
of the
 locale-dependent numeric formatting information specified by the current
-<code>LC_NUMERIC</code> and <code>LC_MONETARY</code> locales.  (If you just 
want the name of
+underlying <code>LC_NUMERIC</code> and <code>LC_MONETARY</code> locales 
(regardless of
+whether called from within the scope of <code>use&nbsp;locale<!-- /@w 
--></code> or not).  (If
+you just want the name of
 the current locale for a particular category, use 
<code>POSIX::setlocale()</code>
 with a single parameter&ndash;see <a 
href="#perllocale-The-setlocale-function">The setlocale function</a>.)
 </p>
@@ -55764,6 +57497,10 @@
     }
     print &quot;\n&quot;;
 </pre>
+<p>Note that if the platform doesn&rsquo;t have <code>LC_NUMERIC</code> and/or
+<code>LC_MONETARY</code> available or enabled, the corresponding elements of 
the
+hash will be missing.
+</p>
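Since elements of the hash may be missing, code that reads it is safer with a fallback. A minimal sketch, assuming a hypothetical helper named <code>locale_field</code> (not part of POSIX or of the manual) and caller-chosen defaults:

```perl
use strict;
use warnings;
use POSIX qw(localeconv);

# Hypothetical helper: return one localeconv() field, or the supplied
# default when the platform or locale did not provide that field.
sub locale_field {
    my ($name, $default) = @_;
    my $conv = localeconv();   # reference to a hash of formatting fields
    if (exists($conv->{$name}) && length($conv->{$name})) {
        return $conv->{$name};
    }
    return $default;
}

my $radix    = locale_field('decimal_point',   '.');
my $currency = locale_field('currency_symbol', '');
print "radix: '$radix', currency symbol: '$currency'\n";
```

The values printed depend on the underlying locale, so no particular output is implied here.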
 <hr>
 <a name="perllocale-I18N_003a_003aLanginfo"></a>
 <div class="header">
@@ -55834,8 +57571,8 @@
 <a name="Category-LC_005fCOLLATE_003a-Collation"></a>
 <h4 class="subsection">38.6.1 Category <code>LC_COLLATE</code>: Collation</h4>
 
-<p>In the scope of <code>use&nbsp;locale</code><!-- /@w --> (but not a
-<code>use locale ':not_characters'</code>), Perl looks to the 
<code>LC_COLLATE</code>
+<p>In the scope of a <code>use&nbsp;locale</code><!-- /@w --> form that 
includes collation, Perl
+looks to the <code>LC_COLLATE</code>
 environment variable to determine the application&rsquo;s notions on collation
 (ordering) of characters.  For example, &quot;b&quot; follows &quot;a&quot; in 
Latin
 alphabets, but where do &quot;á&quot; and &quot;å&quot; belong?  And 
while
@@ -55927,8 +57664,8 @@
 <a name="Category-LC_005fCTYPE_003a-Character-Types"></a>
 <h4 class="subsection">38.6.2 Category <code>LC_CTYPE</code>: Character 
Types</h4>
 
-<p>In the scope of <code>use&nbsp;locale</code><!-- /@w --> (but not a
-<code>use locale ':not_characters'</code>), Perl obeys the 
<code>LC_CTYPE</code> locale
+<p>In the scope of a <code>use&nbsp;locale</code><!-- /@w --> form that 
includes <code>LC_CTYPE</code>, Perl
+obeys the <code>LC_CTYPE</code> locale
 setting.  This controls the application&rsquo;s notion of which characters are
 alphabetic, numeric, punctuation, <em>etc</em>.  This affects Perl&rsquo;s 
<code>\w</code>
 regular expression metanotation,
@@ -55956,15 +57693,21 @@
 you may find&ndash;possibly to your surprise&ndash;that 
<code>&quot;|&quot;</code> moves from the
 <code>POSIX::ispunct()</code> class to <code>POSIX::isalpha()</code>.
 Unfortunately, this creates big problems for regular expressions. 
&quot;|&quot; still
-means alternation even though it matches <code>\w</code>.
+means alternation even though it matches <code>\w</code>.  Starting in v5.22, a
+warning will be raised when such a locale is switched into.  More
+details are given several paragraphs further down.
 </p>
 <p>Starting in v5.20, Perl supports UTF-8 locales for <code>LC_CTYPE</code>, 
but
 otherwise Perl only supports single-byte locales, such as the ISO 8859
 series.  This means that wide character locales, for example for Asian
-languages, are not well-supported.  The UTF-8 locale support is actually a
+languages, are not well-supported.  (If the platform has the capability
+for Perl to detect such a locale, starting in Perl v5.22,
+<a href="warnings.html#Category-Hierarchy">(warnings)Perl will warn, default 
enabled</a>,
+using the <code>locale</code> warning category, whenever such a locale is 
switched
+into.)  The UTF-8 locale support is actually a
 superset of POSIX locales, because it is really full Unicode behavior
-as if no locale were in effect at all (except for tainting; see
-<a href="#perllocale-SECURITY">SECURITY</a>).  POSIX locales, even UTF-8 ones,
+as if no <code>LC_CTYPE</code> locale were in effect at all (except for 
tainting;
+see <a href="#perllocale-SECURITY">SECURITY</a>).  POSIX locales, even UTF-8 
ones,
 are lacking certain concepts in Unicode, such as the idea that changing
 the case of a character could expand to be more than one character.
 Perl in a UTF-8 locale, will give you that expansion.  Prior to v5.20,
@@ -55983,6 +57726,19 @@
 for example, that <code>\N</code> in regular expressions (every character
 but new-line) works on the platform character set.
 </p>
+<p>Starting in v5.22, Perl will by default warn when switching into a
+locale that redefines any ASCII printable character (plus <code>\t</code> and
+<code>\n</code>) into a different class than expected.  This is likely to
+happen on modern locales only on EBCDIC platforms, where, for example,
+a CCSID 0037 locale on a CCSID 1047 machine moves <code>&quot;[&quot;</code>, 
but it can
+happen on ASCII platforms with the ISO 646 and other
+7-bit locales that are essentially obsolete.  Things may still work,
+depending on what features of Perl are used by the program.  For
+example, in the example from above where <code>&quot;|&quot;</code> becomes a 
<code>\w</code>, and
+there are no regular expressions where this matters, the program may
+still work properly.  The warning lists all the characters that
+it can determine could be adversely affected.
+</p>
 <p><strong>Note:</strong> A broken or malicious <code>LC_CTYPE</code> locale 
definition may result
 in clearly ineligible characters being considered to be alphanumeric by
 your application.  For strict matching of (mundane) ASCII letters and
@@ -55998,10 +57754,10 @@
 <a name="Category-LC_005fNUMERIC_003a-Numeric-Formatting"></a>
 <h4 class="subsection">38.6.3 Category <code>LC_NUMERIC</code>: Numeric 
Formatting</h4>
 
-<p>After a proper <code>POSIX::setlocale()</code> call, and within the scope 
of one
-of the <code>use locale</code> variants, Perl obeys the <code>LC_NUMERIC</code>
-locale information, which controls an application&rsquo;s idea of how numbers
-should be formatted for human readability.
+<p>After a proper <code>POSIX::setlocale()</code> call, and within the scope of
+a <code>use locale</code> form that includes numerics, Perl obeys the
+<code>LC_NUMERIC</code> locale information, which controls an 
application&rsquo;s idea
+of how numbers should be formatted for human readability.
 In most implementations the only effect is to
 change the character used for the decimal point&ndash;perhaps from 
&quot;.&quot;  to &quot;,&quot;.
 The functions aren&rsquo;t aware of such niceties as thousands separation and
@@ -56155,16 +57911,16 @@
 </p>
 </li><li> <strong>Case-mapping interpolation</strong> (with <code>\l</code>, 
<code>\L</code>, <code>\u</code>, <code>\U</code>, or <code>\F</code>)
 
-<p>Result string containing interpolated material is tainted if
-<code>use locale</code> (but not <code>use&nbsp;locale&nbsp;<span 
class="nolinebreak">':not_characters'</span></code><!-- /@w -->) is in effect.
+<p>The result string containing interpolated material is tainted if
+a <code>use locale</code> form that includes <code>LC_CTYPE</code> is in 
effect.
 </p>
 </li><li> <strong>Matching operator</strong> (<code>m//</code>):
 
 <p>Scalar true/false result never tainted.
 </p>
 <p>All subpatterns, either delivered as a list-context result or as 
<code>$1</code>
-<em>etc</em>., are tainted if <code>use locale</code> (but not
-<code>use&nbsp;locale&nbsp;<span 
class="nolinebreak">':not_characters'</span></code><!-- /@w -->) is in effect, 
and the subpattern
+<em>etc</em>., are tainted if a <code>use locale</code> form that includes
+<code>LC_CTYPE</code> is in effect, and the subpattern
 regular expression contains a locale-dependent construct.  These
 constructs include <code>\w</code> (to match an alphanumeric character), 
<code>\W</code>
 (non-alphanumeric character), <code>\b</code> and <code>\B</code> 
(word-boundary and
@@ -56186,8 +57942,8 @@
 </li><li> <strong>Substitution operator</strong> (<code>s///</code>):
 
 <p>Has the same behavior as the match operator.  Also, the left
-operand of <code>=~</code> becomes tainted when <code>use locale</code>
-(but not <code>use&nbsp;locale&nbsp;<span 
class="nolinebreak">':not_characters'</span></code><!-- /@w -->) is in effect 
if modified as
+operand of <code>=~</code> becomes tainted when a <code>use locale</code>
+form that includes <code>LC_CTYPE</code> is in effect, if modified as
 a result of a substitution based on a regular
 expression match involving any of the things mentioned in the previous
 item, or of case-mapping, such as <code>\l</code>, 
<code>\L</code>, <code>\u</code>, <code>\U</code>, or <code>\F</code>.
@@ -56200,8 +57956,8 @@
 </p>
 </li><li> <strong>Case-mapping functions</strong> (<code>lc()</code>, 
<code>lcfirst()</code>, <code>uc()</code>, <code>ucfirst()</code>):
 
-<p>Results are tainted if <code>use locale</code> (but not
-<code>use&nbsp;locale&nbsp;<span 
class="nolinebreak">':not_characters'</span></code><!-- /@w -->) is in effect.
+<p>Results are tainted if a <code>use locale</code> form that includes 
<code>LC_CTYPE</code> is
+in effect.
 </p>
 </li><li> <strong>POSIX locale-dependent functions</strong> 
(<code>localeconv()</code>, <code>strcoll()</code>,
 <code>strftime()</code>, <code>strxfrm()</code>):
@@ -56272,8 +58028,8 @@
 <dl compact="compact">
 <dt>PERL_SKIP_LOCALE_INIT</dt>
 <dd><a name="perllocale-PERL_005fSKIP_005fLOCALE_005fINIT"></a>
-<p>This environment variable, available starting in Perl v5.20, and if it
-evaluates to a TRUE value, tells Perl to not use the rest of the
+<p>This environment variable, available starting in Perl v5.20, if set
+(to any value), tells Perl to not use the rest of the
 environment variables to initialize with.  Instead, Perl uses whatever
 the current locale settings are.  This is particularly useful in
 embedded environments, see
@@ -56286,9 +58042,8 @@
 at startup.  Failure can occur if the locale support in the operating
 system is lacking (broken) in some way&ndash;or if you mistyped the name of
 a locale when you set up your environment.  If this environment
-variable is absent, or has a value that does not evaluate to integer
-zero&ndash;that is, &quot;0&quot; or &quot;&quot;&ndash; Perl will complain 
about locale setting
-failures.
+variable is absent, or has a value other than &quot;0&quot; or &quot;&quot;, 
Perl will
+complain about locale setting failures.
 </p>
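The revised wording for the two environment variables can be modelled in a few lines. This is only a sketch of the rules as stated in the text, not Perl's actual startup code; both function names are hypothetical, and each takes a hash reference standing in for <code>%ENV</code>.

```perl
use strict;
use warnings;

# Hypothetical model of PERL_BADLANG: complaints about locale setting
# failures are suppressed only when the variable is set to "0" or "".
sub badlang_silences_warnings {
    my ($env) = @_;
    return 0 unless exists $env->{PERL_BADLANG};
    my $value = $env->{PERL_BADLANG};
    return ($value eq '0' || $value eq '') ? 1 : 0;
}

# Hypothetical model of PERL_SKIP_LOCALE_INIT: being set at all,
# to any value, is what counts.
sub skips_locale_init {
    my ($env) = @_;
    return exists $env->{PERL_SKIP_LOCALE_INIT} ? 1 : 0;
}

for my $env ({},
             { PERL_BADLANG => '0' },
             { PERL_BADLANG => '' },
             { PERL_BADLANG => '1' },
             { PERL_SKIP_LOCALE_INIT => '0' }) {
    printf "silenced=%d skip_init=%d\n",
        badlang_silences_warnings($env), skips_locale_init($env);
}
```

Note the asymmetry: a value of "0" silences <code>PERL_BADLANG</code> complaints, while for <code>PERL_SKIP_LOCALE_INIT</code> the same value still counts as set.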
 <p><strong>NOTE</strong>: <code>PERL_BADLANG</code> only gives you a way to 
hide the warning message.
 The message tells about some problem in your system&rsquo;s locale support,
@@ -56330,8 +58085,8 @@
 See the GNU <code>gettext</code> library documentation for more information.
 </p>
 </dd>
-<dt><code>LC_CTYPE</code>.</dt>
-<dd><a name="perllocale-LC_005fCTYPE_002e"></a>
+<dt><code>LC_CTYPE</code></dt>
+<dd><a name="perllocale-LC_005fCTYPE"></a>
 <p>In the absence of <code>LC_ALL</code>, <code>LC_CTYPE</code> chooses the 
character type
 locale.  In the absence of both <code>LC_ALL</code> and <code>LC_CTYPE</code>, 
<code>LANG</code>
 chooses the character type locale.
@@ -56369,7 +58124,7 @@
 <dd><a name="perllocale-LANG"></a>
 <p><code>LANG</code> is the &quot;catch-all&quot; locale environment variable. 
If it is set, it
 is used as the last resort after the overall <code>LC_ALL</code> and the
-category-specific <code>LC_<em>foo</em></code>
+category-specific <code>LC_<em>foo</em></code>.
 </p>
 </dd>
 </dl>
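
Note on the hunks above: the lookup order for the character type locale (LC_ALL, then LC_CTYPE, then LANG) can be sketched in a few lines of Perl. This is only an illustration: `ctype_locale_source` is a hypothetical helper, not part of Perl, and treating an empty value as unset is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): report which environment
# variable would choose the character type locale, following the order
# described in perllocale: LC_ALL, then LC_CTYPE, then LANG.
sub ctype_locale_source {
    my ($env) = @_;
    for my $name (qw(LC_ALL LC_CTYPE LANG)) {
        # Assumption: an empty value is treated the same as an unset one.
        return $name if defined $env->{$name} && length $env->{$name};
    }
    return;    # nothing set; the system default applies
}

my $source = ctype_locale_source(
    { LANG => 'en_US.UTF-8', LC_CTYPE => 'el_GR.ISO-8859-7' }
);
print "$source\n";    # LC_CTYPE
```

Passing a hash reference instead of reading `%ENV` directly keeps the sketch easy to try with different settings.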
@@ -56464,6 +58219,10 @@
 </pre>
 <p>This prints <code>2.7</code>.
 </p>
+<p>You could also exclude <code>LC_NUMERIC</code>, if you don&rsquo;t need it, 
by
+</p>
+<pre class="verbatim"> use locale ':!numeric';
+</pre>
 <hr>
 <a name="perllocale-Backward-compatibility"></a>
 <div class="header">
@@ -56479,7 +58238,7 @@
 (see <a href="#perllocale-The-setlocale-function">The setlocale function</a>). 
 By default, Perl still behaves this
 way for backward compatibility.  If you want a Perl application to pay
 attention to locale information, you <strong>must</strong> use the 
<code>use&nbsp;locale</code><!-- /@w -->
-pragma (see <a href="#perllocale-The-use-locale-pragma">The use locale 
pragma</a>) or, in the unlikely event
+pragma (see <a href="#perllocale-The-_0022use-locale_0022-pragma">The 
&quot;use locale&quot; pragma</a>) or, in the unlikely event
 that you want to do so for just pattern matching, the
 <code>/l</code> regular expression modifier (see <a 
href="#perlre-Character-set-modifiers">perlre Character set modifiers</a>) to 
instruct it to do so.
 </p>
@@ -56654,7 +58413,9 @@
 only working under the newer wide library functions like 
<code>iswalnum()</code>,
 which Perl does not use.
 These multi-byte locales are treated like single-byte locales, and will
-have the restrictions described below.
+have the restrictions described below.  Starting in Perl v5.22 a warning
+message is raised when Perl detects a multi-byte locale that it doesn&rsquo;t
+fully support.
 </p>
 <p>For single-byte locales,
 Perl generally takes the tack to use locale rules on code points that can fit
@@ -56702,6 +58463,11 @@
 points meaning the same character.  Thus in a Greek locale, both U+03A7
 and U+00D7 are GREEK CAPITAL LETTER CHI.
 </p>
+<p>Because of all these problems, starting in v5.22, Perl will raise a
+warning if a multi-byte (hence Unicode) code point is used when a
+single-byte locale is in effect.  (Although it doesn&rsquo;t check for this if
+doing so would unreasonably slow execution down.)
+</p>
 <p>Vendor locales are notoriously buggy, and it is difficult for Perl to test
 its locale-handling code because this interacts with code that Perl has no
 control over; therefore the locale-handling code in Perl may be buggy as
@@ -58606,6 +60372,8 @@
 
 </li><li> Choose an appropriate name
 
+</li><li> Get feedback before publishing
+
 </li></ul>
 
 <hr>
@@ -58717,6 +60485,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlmodstyle-What_0027s-in-a-name_003f" accesskey="3">perlmodstyle 
What's in a name?</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlmodstyle-Get-feedback-before-publishing" accesskey="4">perlmodstyle 
Get feedback before publishing</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
 </table>
 
 <hr>
@@ -58777,7 +60547,7 @@
 <a name="perlmodstyle-What_0027s-in-a-name_003f"></a>
 <div class="header">
 <p>
-Previous: <a href="#perlmodstyle-Do-one-thing-and-do-it-well" accesskey="p" 
rel="prev">perlmodstyle Do one thing and do it well</a>, Up: <a 
href="#perlmodstyle-BEFORE-YOU-START-WRITING-A-MODULE" accesskey="u" 
rel="up">perlmodstyle BEFORE YOU START WRITING A MODULE</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlmodstyle-Get-feedback-before-publishing" accesskey="n" 
rel="next">perlmodstyle Get feedback before publishing</a>, Previous: <a 
href="#perlmodstyle-Do-one-thing-and-do-it-well" accesskey="p" 
rel="prev">perlmodstyle Do one thing and do it well</a>, Up: <a 
href="#perlmodstyle-BEFORE-YOU-START-WRITING-A-MODULE" accesskey="u" 
rel="up">perlmodstyle BEFORE YOU START WRITING A MODULE</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="What_0027s-in-a-name_003f"></a>
 <h4 class="subsection">42.4.3 What&rsquo;s in a name?</h4>
@@ -58800,11 +60570,25 @@
 
 </li></ul>
 
-<p>You should contact address@hidden to ask them about your module name
-before publishing your module.  You should also try to ask people who 
-are already familiar with the module&rsquo;s application domain and the CPAN
-naming system.  Authors of similar modules, or modules with similar
-names, may be a good place to start.
+<hr>
+<a name="perlmodstyle-Get-feedback-before-publishing"></a>
+<div class="header">
+<p>
+Previous: <a href="#perlmodstyle-What_0027s-in-a-name_003f" accesskey="p" 
rel="prev">perlmodstyle What's in a name?</a>, Up: <a 
href="#perlmodstyle-BEFORE-YOU-START-WRITING-A-MODULE" accesskey="u" 
rel="up">perlmodstyle BEFORE YOU START WRITING A MODULE</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+</div>
+<a name="Get-feedback-before-publishing"></a>
+<h4 class="subsection">42.4.4 Get feedback before publishing</h4>
+
+<p>If you have never uploaded a module to CPAN before (and even if you have),
+you are strongly encouraged to get feedback on <a 
href="http://prepan.org">PrePAN</a>.
+PrePAN is a site dedicated to discussing ideas for CPAN modules with other
+Perl developers and is a great resource for new (and experienced) Perl
+developers.
+</p>
+<p>You should also try to get feedback from people who are already familiar
+with the module&rsquo;s application domain and the CPAN naming system.  Authors
+of similar modules, or modules with similar names, may be a good place to
+start, as are community sites like <a href="http://www.perlmonks.org">Perl 
Monks</a>.
 </p>
 <hr>
 <a name="perlmodstyle-DESIGNING-AND-WRITING-YOUR-MODULE"></a>
@@ -61566,7 +63350,7 @@
 </p>
 <p>If your <code>DESTROY</code> method issues a warning during global 
destruction,
 the Perl interpreter will append the string &quot; during global
-destruction&quot; the warning.
+destruction&quot; to the warning.
 </p>
 <p>During global destruction, Perl will always garbage collect objects
 before unblessed references. See <a 
href="#perlhacktips-PERL_005fDESTRUCT_005fLEVEL">perlhacktips 
PERL_DESTRUCT_LEVEL</a>
@@ -62709,19 +64493,19 @@
 <h3 class="section">48.2 DESCRIPTION</h3>
 
 <p>In Perl, the operator determines what operation is performed,
-independent of the type of the operands.  For example <code>$x + $y</code>
+independent of the type of the operands.  For example 
<code>$x&nbsp;+&nbsp;$y</code><!-- /@w -->
 is always a numeric addition, and if <code>$x</code> or <code>$y</code> do not 
contain
 numbers, an attempt is made to convert them to numbers first.
 </p>
 <p>This is in contrast to many other dynamic languages, where the
 operation is determined by the type of the first argument.  It also
 means that Perl has two versions of some operators, one for numeric
-and one for string comparison.  For example <code>$x == $y</code> compares
-two numbers for equality, and <code>$x eq $y</code> compares two strings.
+and one for string comparison.  For example 
<code>$x&nbsp;==&nbsp;$y</code><!-- /@w --> compares
+two numbers for equality, and <code>$x&nbsp;eq&nbsp;$y</code><!-- /@w --> 
compares two strings.
 </p>
 <p>There are a few exceptions though: <code>x</code> can be either string
 repetition or list repetition, depending on the type of the left
-operand, and <code>&amp;</code>, <code>|</code> and <code>^</code> can be 
either string or numeric bit
+operand, and <code>&amp;</code>, <code>|</code>, <code>^</code> and 
<code>~</code> can be either string or numeric bit
 operations.
 </p>
 <table class="menu" border="0" cellspacing="0">
@@ -62818,16 +64602,15 @@
 they do in mathematics.
 </p>
 <p><em>Operator precedence</em> means some operators are evaluated before
-others.  For example, in <code>2 + 4 * 5</code>, the multiplication has higher
-precedence so <code>4 * 5</code> is evaluated first yielding <code>2 + 20 ==
-22</code> and not <code>6 * 5 == 30</code>.
+others.  For example, in <code>2&nbsp;+&nbsp;4&nbsp;*&nbsp;5</code><!-- /@w 
-->, the multiplication has higher
+precedence so <code>4&nbsp;*&nbsp;5</code><!-- /@w --> is evaluated first 
yielding <code>2&nbsp;+&nbsp;20&nbsp;==&nbsp;22</code><!-- /@w --> and not 
<code>6&nbsp;*&nbsp;5&nbsp;==&nbsp;30</code><!-- /@w -->.
 </p>
 <p><em>Operator associativity</em> defines what happens if a sequence of the
 same operators is used one after another: whether the evaluator will
-evaluate the left operations first or the right.  For example, in <code>8
-- 4 - 2</code>, subtraction is left associative so Perl evaluates the
-expression left to right.  <code>8 - 4</code> is evaluated first making the
-expression <code>4 - 2 == 2</code> and not <code>8 - 2 == 6</code>.
+evaluate the left operations first, or the right first.  For example, in
+<code>8&nbsp;<span class="nolinebreak">-</span>&nbsp;4&nbsp;<span 
class="nolinebreak">-</span>&nbsp;2</code><!-- /@w -->, subtraction is left 
associative so Perl evaluates the
+expression left to right.  <code>8&nbsp;<span 
class="nolinebreak">-</span>&nbsp;4</code><!-- /@w --> is evaluated first 
making the
+expression <code>4&nbsp;<span 
class="nolinebreak">-</span>&nbsp;2&nbsp;==&nbsp;2</code><!-- /@w --> and not 
<code>8&nbsp;<span 
class="nolinebreak">-</span>&nbsp;2&nbsp;==&nbsp;6</code><!-- /@w -->.
 </p>
 <p>Perl operators have the following associativity and precedence,
 listed from highest precedence to lowest.  Operators borrowed from
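
Note on the hunk above: the two worked examples in the text (precedence and left associativity) can be checked directly. A minimal sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $product_first = 2 + 4 * 5;      # parsed as 2 + (4 * 5)
my $forced_sum    = (2 + 4) * 5;    # parentheses override precedence
my $left_assoc    = 8 - 4 - 2;      # parsed as (8 - 4) - 2
my $right_grouped = 8 - (4 - 2);    # what right associativity would give

print "$product_first $forced_sum $left_assoc $right_grouped\n";    # 22 30 2 6
```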
@@ -62881,7 +64664,7 @@
 operators behaving as functions because you put parentheses around
 the arguments.  These are all documented in <a href="#perlfunc-NAME">perlfunc 
NAME</a>.
 </p>
-<p>If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
+<p>If any list operator (<code>print()</code>, etc.) or any unary operator 
(<code>chdir()</code>, etc.)
 is followed by a left parenthesis as the next token, the operator and
 arguments within parentheses are taken to be of highest precedence,
 just like a normal function call.
@@ -62894,7 +64677,7 @@
 <pre class="verbatim">    @ary = (1, 3, sort 4, 2);
     print @ary;         # prints 1324
 </pre>
-<p>the commas on the right of the sort are evaluated before the sort,
+<p>the commas on the right of the <code>sort</code> are evaluated before the 
<code>sort</code>,
 but the commas on the left are evaluated after.  In other words,
 list operators tend to gobble up all arguments that follow, and
 then act like a simple TERM with regard to the preceding expression.
@@ -62915,7 +64698,7 @@
 </pre>
 <p>probably doesn&rsquo;t do what you expect at first glance.  The parentheses
 enclose the argument list for <code>print</code> which is evaluated (printing
-the result of <code>$foo &amp; 255</code>).  Then one is added to the return 
value
+the result of <code>$foo&nbsp;&amp;&nbsp;255</code><!-- /@w -->).  Then one is 
added to the return value
 of <code>print</code> (usually 1).  The result is something like this:
 </p>
 <pre class="verbatim">    1 + 1, &quot;\n&quot;;    # Obviously not what you 
meant.
@@ -62926,7 +64709,7 @@
 </pre>
 <p>See <a href="#perlop-Named-Unary-Operators">Named Unary Operators</a> for 
more discussion of this.
 </p>
-<p>Also parsed as terms are the <code>do {}</code> and <code>eval {}</code> 
constructs, as
+<p>Also parsed as terms are the <code>do&nbsp;{}</code><!-- /@w --> and 
<code>eval&nbsp;{}</code><!-- /@w --> constructs, as
 well as subroutine and method calls, and the anonymous
 constructors <code>[]</code> and <code>{}</code>.
 </p>
@@ -62968,7 +64751,7 @@
 <a name="Auto_002dincrement-and-Auto_002ddecrement"></a>
 <h4 class="subsection">48.2.4 Auto-increment and Auto-decrement</h4>
 
-<p>&quot;++&quot; and &quot;&ndash;&quot; work as in C.  That is, if placed 
before a variable,
+<p><code>&quot;++&quot;</code> and <code>&quot;--&quot;</code> work as in C.  
That is, if placed before a variable,
 they increment or decrement the variable by one before returning the
 value, and if placed after, increment or decrement after returning the
 value.
@@ -63016,11 +64799,17 @@
 <a name="Exponentiation"></a>
 <h4 class="subsection">48.2.5 Exponentiation</h4>
 
-<p>Binary &quot;**&quot; is the exponentiation operator.  It binds even more
-tightly than unary minus, so -2**4 is -(2**4), not (-2)**4.  (This is
-implemented using C&rsquo;s pow(3) function, which actually works on doubles
+<p>Binary <code>&quot;**&quot;</code> is the exponentiation operator.  It 
binds even more
+tightly than unary minus, so <code>-2**4</code> is <code>-(2**4)</code>, not 
<code>(-2)**4</code>.
+(This is
+implemented using C&rsquo;s <code>pow(3)</code> function, which actually works 
on doubles
 internally.)
 </p>
+<p>Note that certain exponentiation expressions are ill-defined:
+these include <code>0**0</code>, <code>1**Inf</code>, and <code>Inf**0</code>. 
 Do not expect
+any particular results from these special cases; the results
+are platform-dependent.
+</p>
 <hr>
 <a name="perlop-Symbolic-Unary-Operators"></a>
 <div class="header">
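
Note on the hunk above: the binding of `**` relative to unary minus is easy to confirm. A minimal sketch with illustrative variable names; the ill-defined cases (`0**0` and friends) are deliberately left out because their results are platform-dependent.

```perl
use strict;
use warnings;

my $negated_power = -2**4;      # parsed as -(2**4)
my $power_of_neg  = (-2)**4;    # parentheses force (-2) to be the base

print "$negated_power $power_of_neg\n";    # -16 16
```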
@@ -63030,27 +64819,28 @@
 <a name="Symbolic-Unary-Operators"></a>
 <h4 class="subsection">48.2.6 Symbolic Unary Operators</h4>
 
-<p>Unary &quot;!&quot; performs logical negation, that is, &quot;not&quot;.  
See also <code>not</code> for a lower
+<p>Unary <code>&quot;!&quot;</code> performs logical negation, that is, 
&quot;not&quot;.  See also <code>not</code> for a lower
 precedence version of this.
 </p>
-<p>Unary &quot;-&quot; performs arithmetic negation if the operand is numeric,
+<p>Unary <code>&quot;-&quot;</code> performs arithmetic negation if the 
operand is numeric,
 including any string that looks like a number.  If the operand is
 an identifier, a string consisting of a minus sign concatenated
 with the identifier is returned.  Otherwise, if the string starts
 with a plus or minus, a string starting with the opposite sign is
-returned.  One effect of these rules is that -bareword is equivalent
-to the string &quot;-bareword&quot;.  If, however, the string begins with a
-non-alphabetic character (excluding &quot;+&quot; or &quot;-&quot;), Perl will 
attempt to convert
-the string to a numeric and the arithmetic negation is performed.  If the
+returned.  One effect of these rules is that <code>-bareword</code> is 
equivalent
+to the string <code>&quot;-bareword&quot;</code>.  If, however, the string 
begins with a
+non-alphabetic character (excluding <code>&quot;+&quot;</code> or 
<code>&quot;-&quot;</code>), Perl will attempt
+to convert
+the string to a numeric, and the arithmetic negation is performed.  If the
 string cannot be cleanly converted to a numeric, Perl will give the warning
 <strong>Argument &quot;the string&quot; isn&rsquo;t numeric in negation (-) at 
...</strong>.
 </p>
-<p>Unary &quot;~&quot; performs bitwise negation, that is, 1&rsquo;s 
complement.  For
-example, <code>0666 &amp; ~027</code> is 0640.  (See also <a 
href="#perlop-Integer-Arithmetic">Integer Arithmetic</a> and
+<p>Unary <code>&quot;~&quot;</code> performs bitwise negation, that is, 
1&rsquo;s complement.  For
+example, <code>0666&nbsp;&amp;&nbsp;~027</code><!-- /@w --> is 0640.  (See 
also <a href="#perlop-Integer-Arithmetic">Integer Arithmetic</a> and
 <a href="#perlop-Bitwise-String-Operators">Bitwise String Operators</a>.)  
Note that the width of the result is
-platform-dependent: ~0 is 32 bits wide on a 32-bit platform, but 64
+platform-dependent: <code>~0</code> is 32 bits wide on a 32-bit platform, but 
64
 bits wide on a 64-bit platform, so if you are expecting a certain bit
-width, remember to use the &quot;&amp;&quot; operator to mask off the excess 
bits.
+width, remember to use the <code>&quot;&amp;&quot;</code> operator to mask off 
the excess bits.
 </p>
 <p>When complementing strings, if all characters have ordinal values under
 256, then their complements will, also.  But if they do not, all
@@ -63058,12 +64848,18 @@
 architecture.  So for example, <code>~&quot;\x{3B1}&quot;</code> is 
<code>&quot;\x{FFFF_FC4E}&quot;</code> on
 32-bit machines and <code>&quot;\x{FFFF_FFFF_FFFF_FC4E}&quot;</code> on 64-bit 
machines.
 </p>
-<p>Unary &quot;+&quot; has no effect whatsoever, even on strings.  It is useful
+<p>If the experimental &quot;bitwise&quot; feature is enabled via 
<code>use&nbsp;feature&nbsp;'bitwise'</code><!-- /@w -->, then unary 
<code>&quot;~&quot;</code> always treats its argument as a number, and an
+alternate form of the operator, <code>&quot;~.&quot;</code>, always treats its 
argument as a
+string.  So <code>~0</code> and <code>~&quot;0&quot;</code> will both give 
2**32-1 on 32-bit platforms,
+whereas <code>~.0</code> and <code>~.&quot;0&quot;</code> will both yield 
<code>&quot;\xff&quot;</code>.  This feature
+produces a warning unless you use 
<code>no&nbsp;warnings&nbsp;'experimental::bitwise'</code><!-- /@w -->.
+</p>
+<p>Unary <code>&quot;+&quot;</code> has no effect whatsoever, even on strings. 
 It is useful
 syntactically for separating a function name from a parenthesized expression
 that would otherwise be interpreted as the complete list of function
 arguments.  (See examples above under Terms and List Operators (Leftward).)
 </p>
-<p>Unary &quot;\&quot; creates a reference to whatever follows it.  See <a 
href="#perlreftut-NAME">perlreftut NAME</a>
+<p>Unary <code>&quot;\&quot;</code> creates a reference to whatever follows 
it.  See <a href="#perlreftut-NAME">perlreftut NAME</a>
 and <a href="#perlref-NAME">perlref NAME</a>.  Do not confuse this behavior 
with the behavior of
 backslash within a string, although both forms do convey the notion
 of protecting the next thing from interpolation.
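
Note on the hunk above: a short sketch of the unary `-` and `~` rules described in the text, assuming the experimental "bitwise" feature is not enabled. Variable names are illustrative.

```perl
use strict;
use warnings;

my $neg_word = -"foo";     # identifier-like string: minus sign is prepended
my $flipped  = -"-foo";    # leading minus is replaced by a plus
my $neg_num  = -"10";      # string looks like a number: arithmetic negation

my $masked = 0666 & ~027;          # clear the bits set in 027
my $low32  = ~0 & 0xFFFF_FFFF;     # mask so the width no longer depends on the platform

printf "%s %s %s %04o %x\n", $neg_word, $flipped, $neg_num, $masked, $low32;
# -foo +foo -10 0640 ffffffff
```

Masking with `&` is the approach the text recommends when a fixed bit width is expected.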
@@ -63077,14 +64873,14 @@
 <a name="Binding-Operators"></a>
 <h4 class="subsection">48.2.7 Binding Operators</h4>
 
-<p>Binary &quot;=~&quot; binds a scalar expression to a pattern match.  
Certain operations
-search or modify the string $_ by default.  This operator makes that kind
+<p>Binary <code>&quot;=~&quot;</code> binds a scalar expression to a pattern 
match.  Certain operations
+search or modify the string <code>$_</code> by default.  This operator makes 
that kind
 of operation work on some other string.  The right argument is a search
 pattern, substitution, or transliteration.  The left argument is what is
 supposed to be searched, substituted, or transliterated instead of the default
-$_.  When used in scalar context, the return value generally indicates the
-success of the operation.  The exceptions are substitution (s///)
-and transliteration (y///) with the <code>/r</code> (non-destructive) option,
+<code>$_</code>.  When used in scalar context, the return value generally 
indicates the
+success of the operation.  The exceptions are substitution (<code>s///</code>)
+and transliteration (<code>y///</code>) with the <code>/r</code> 
(non-destructive) option,
 which cause the <strong>r</strong>eturn value to be the result of the 
substitution.
 Behavior in list context depends on the particular operator.
 See <a href="#perlop-Regexp-Quote_002dLike-Operators">Regexp Quote-Like 
Operators</a> for details and <a href="#perlretut-NAME">perlretut NAME</a> for
@@ -63100,11 +64896,11 @@
 <p>is not ok, as the regex engine will end up trying to compile the
 pattern <code>\</code>, which it will consider a syntax error.
 </p>
-<p>Binary &quot;!~&quot; is just like &quot;=~&quot; except the return value 
is negated in
+<p>Binary <code>&quot;!~&quot;</code> is just like <code>&quot;=~&quot;</code> 
except the return value is negated in
 the logical sense.
 </p>
-<p>Binary &quot;!~&quot; with a non-destructive substitution (s///r) or 
transliteration
-(y///r) is a syntax error.
+<p>Binary <code>&quot;!~&quot;</code> with a non-destructive substitution 
(<code>s///r</code>) or transliteration
+(<code>y///r</code>) is a syntax error.
 </p>
 <hr>
 <a name="perlop-Multiplicative-Operators"></a>
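
Note on the hunk above: a minimal sketch of `=~`, `!~`, and the non-destructive `/r` option. It assumes a Perl recent enough to support `/r` (5.14 or later); the sample path is made up.

```perl
use strict;
use warnings;

my $path = "/usr/local/bin";

my $matched  = ($path =~ m{^/usr}) ? 1 : 0;    # pattern matches
my $no_match = ($path !~ m{^/opt}) ? 1 : 0;    # negated match

# With /r the operator returns the new string and leaves $path untouched.
my $shorter = $path =~ s{/local}{}r;
my $upper   = $path =~ tr/a-z/A-Z/r;

print "$matched $no_match $shorter $upper $path\n";
# 1 1 /usr/bin /USR/LOCAL/BIN /usr/local/bin
```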
@@ -63115,39 +64911,40 @@
 <a name="Multiplicative-Operators"></a>
 <h4 class="subsection">48.2.8 Multiplicative Operators</h4>
 
-<p>Binary &quot;*&quot; multiplies two numbers.
+<p>Binary <code>&quot;*&quot;</code> multiplies two numbers.
 </p>
-<p>Binary &quot;/&quot; divides two numbers.
+<p>Binary <code>&quot;/&quot;</code> divides two numbers.
 </p>
-<p>Binary &quot;%&quot; is the modulo operator, which computes the division
+<p>Binary <code>&quot;%&quot;</code> is the modulo operator, which computes 
the division
 remainder of its first argument with respect to its second argument.
 Given integer
-operands <code>$m</code> and <code>$n</code>: If <code>$n</code> is positive, 
then <code>$m % $n</code> is
+operands <code>$m</code> and <code>$n</code>: If <code>$n</code> is positive, 
then <code>$m&nbsp;%&nbsp;$n</code><!-- /@w --> is
 <code>$m</code> minus the largest multiple of <code>$n</code> less than or 
equal to
-<code>$m</code>.  If <code>$n</code> is negative, then <code>$m % $n</code> is 
<code>$m</code> minus the
+<code>$m</code>.  If <code>$n</code> is negative, then 
<code>$m&nbsp;%&nbsp;$n</code><!-- /@w --> is <code>$m</code> minus the
 smallest multiple of <code>$n</code> that is not less than <code>$m</code> 
(that is, the
 result will be less than or equal to zero).  If the operands
 <code>$m</code> and <code>$n</code> are floating point values and the absolute 
value of
-<code>$n</code> (that is <code>abs($n)</code>) is less than <code>(UV_MAX + 
1)</code>, only
+<code>$n</code> (that is <code>abs($n)</code>) is less than <code><span 
class="nolinebreak">(UV_MAX</span>&nbsp;+&nbsp;1)</code><!-- /@w -->, only
 the integer portion of <code>$m</code> and <code>$n</code> will be used in the 
operation
 (Note: here <code>UV_MAX</code> means the maximum of the unsigned integer 
type).
 If the absolute value of the right operand (<code>abs($n)</code>) is greater 
than
-or equal to <code>(UV_MAX + 1)</code>, &quot;%&quot; computes the 
floating-point remainder
-<code>$r</code> in the equation <code>($r = $m - $i*$n)</code> where 
<code>$i</code> is a certain
+or equal to <code><span 
class="nolinebreak">(UV_MAX</span>&nbsp;+&nbsp;1)</code><!-- /@w -->, 
<code>&quot;%&quot;</code> computes the floating-point remainder
+<code>$r</code> in the equation <code>($r&nbsp;=&nbsp;$m&nbsp;<span 
class="nolinebreak">-</span>&nbsp;$i*$n)</code><!-- /@w --> where 
<code>$i</code> is a certain
 integer that makes <code>$r</code> have the same sign as the right operand
 <code>$n</code> (<strong>not</strong> as the left operand <code>$m</code> like 
C function <code>fmod()</code>)
 and the absolute value less than that of <code>$n</code>.
-Note that when <code>use integer</code> is in scope, &quot;%&quot; gives you 
direct access
+Note that when <code>use&nbsp;integer</code><!-- /@w --> is in scope, 
<code>&quot;%&quot;</code> gives you direct access
 to the modulo operator as implemented by your C compiler.  This
 operator is not as well defined for negative operands, but it will
 execute faster.
 </p>
-<p>Binary &quot;x&quot; is the repetition operator.  In scalar context or if 
the left
+<p>Binary <code>&quot;x&quot;</code> is the repetition operator.  In scalar 
context or if the left
 operand is not enclosed in parentheses, it returns a string consisting
 of the left operand repeated the number of times specified by the right
 operand.  In list context, if the left operand is enclosed in
-parentheses or is a list formed by <code>qw/STRING/</code>, it repeats the 
list.
-If the right operand is zero or negative, it returns an empty string
+parentheses or is a list formed by <code>qw/<em>STRING</em>/</code>, it 
repeats the list.
+If the right operand is zero or negative (raising a warning on
+negative), it returns an empty string
 or an empty list, depending on the context.
 </p>
 <pre class="verbatim">    print '-' x 80;             # print row of dashes
@@ -63166,11 +64963,11 @@
 <a name="Additive-Operators"></a>
 <h4 class="subsection">48.2.9 Additive Operators</h4>
 
-<p>Binary <code>+</code> returns the sum of two numbers.
+<p>Binary <code>&quot;+&quot;</code> returns the sum of two numbers.
 </p>
-<p>Binary <code>-</code> returns the difference of two numbers.
+<p>Binary <code>&quot;-&quot;</code> returns the difference of two numbers.
 </p>
-<p>Binary <code>.</code> concatenates two strings.
+<p>Binary <code>&quot;.&quot;</code> concatenates two strings.
 </p>
 <hr>
 <a name="perlop-Shift-Operators-_003e-_003e_003e_003e"></a>
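
Note on the two hunks above: the sign rule for `%`, the two modes of `x`, and string concatenation can be sketched as follows. This assumes `use integer` is not in scope, since the text says `%` then follows the C compiler's behaviour instead.

```perl
use strict;
use warnings;

# The result of % takes the sign of the right operand.
my @remainders = (7 % 3, -7 % 3, 7 % -3);

my $rule     = '-' x 10;         # string repetition
my @repeated = (1, 2) x 2;       # list repetition: left operand in parentheses
my $joined   = "ab" . "cd";      # concatenation

print "@remainders\n";           # 1 2 -2
print "$rule @repeated $joined\n";
```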
@@ -63181,16 +64978,16 @@
 <a name="Shift-Operators-_003e-_003e_003e_003e"></a>
 <h4 class="subsection">48.2.10 Shift Operators</h4>
 
-<p>Binary <code>&lt;&lt;</code> returns the value of its left argument shifted 
left by the
+<p>Binary <code>&quot;&lt;&lt;&quot;</code> returns the value of its left 
argument shifted left by the
 number of bits specified by the right argument.  Arguments should be
 integers.  (See also <a href="#perlop-Integer-Arithmetic">Integer 
Arithmetic</a>.)
 </p>
-<p>Binary <code>&gt;&gt;</code> returns the value of its left argument shifted 
right by
+<p>Binary <code>&quot;&gt;&gt;&quot;</code> returns the value of its left 
argument shifted right by
 the number of bits specified by the right argument.  Arguments should
 be integers.  (See also <a href="#perlop-Integer-Arithmetic">Integer 
Arithmetic</a>.)
 </p>
 <p>Note that both <code>&lt;&lt;</code> and <code>&gt;&gt;</code> in Perl are 
implemented directly using
-<code>&lt;&lt;</code> and <code>&gt;&gt;</code>  in C.  If <code>use 
integer</code> (see <a href="#perlop-Integer-Arithmetic">Integer 
Arithmetic</a>) is
+<code>&lt;&lt;</code> and <code>&gt;&gt;</code>  in C.  If 
<code>use&nbsp;integer</code><!-- /@w --> (see <a 
href="#perlop-Integer-Arithmetic">Integer Arithmetic</a>) is
 in force then signed C integers are used, else unsigned C integers are
 used.  Either way, the implementation isn&rsquo;t going to generate results
 larger than the size of the integer type Perl was built with (32 bits
@@ -63198,11 +64995,11 @@
 </p>
 <p>The result of overflowing the range of the integers is undefined
 because it is undefined also in C.  In other words, using 32-bit
-integers, <code>1 &lt;&lt; 32</code> is undefined.  Shifting by a negative 
number
+integers, <code>1&nbsp;&lt;&lt;&nbsp;32</code><!-- /@w --> is undefined.  
Shifting by a negative number
 of bits is also undefined.
 </p>
 <p>If you get tired of being subject to your platform&rsquo;s native integers,
-the <code>use bigint</code> pragma neatly sidesteps the issue altogether:
+the <code>use&nbsp;bigint</code><!-- /@w --> pragma neatly sidesteps the issue 
altogether:
 </p>
 <pre class="verbatim">    print 20 &lt;&lt; 20;  # 20971520
     print 20 &lt;&lt; 40;  # 5120 on 32-bit machines, 
@@ -63222,7 +65019,7 @@
 <p>The various named unary operators are treated as functions with one
 argument, with optional parentheses.
 </p>
-<p>If any list operator (print(), etc.) or any unary operator (chdir(), etc.)
+<p>If any list operator (<code>print()</code>, etc.) or any unary operator 
(<code>chdir()</code>, etc.)
 is followed by a left parenthesis as the next token, the operator and
 arguments within parentheses are taken to be of highest precedence,
 just like a normal function call.  For example,
@@ -63233,7 +65030,7 @@
     chdir ($foo)  || die;       # (chdir $foo) || die
     chdir +($foo) || die;       # (chdir $foo) || die
 </pre>
-<p>but, because * is higher precedence than named operators:
+<p>but, because <code>&quot;*&quot;</code> is higher precedence than named 
operators:
 </p>
 <pre class="verbatim">    chdir $foo * 20;    # chdir ($foo * 20)
     chdir($foo) * 20;   # (chdir $foo) * 20
@@ -63248,7 +65045,7 @@
 <p>Regarding precedence, the filetest operators, like <code>-f</code>, 
<code>-M</code>, etc. are
 treated like named unary operators, but they don&rsquo;t follow this functional
 parenthesis rule.  That means, for example, that 
<code>-f($file).&quot;.bak&quot;</code> is
-equivalent to <code>-f &quot;$file.bak&quot;</code>.
+equivalent to <code><span 
class="nolinebreak">-f</span>&nbsp;&quot;$file.bak&quot;</code><!-- /@w -->.
 </p>
 <p>See also <a 
href="#perlop-Terms-and-List-Operators-_0028Leftward_0029">Terms and List 
Operators (Leftward)</a>.
 </p>
@@ -63266,32 +65063,32 @@
 operators in this section and the equality operators in the next
 one return <code>1</code> for true and a special version of the defined empty
 string, <code>&quot;&quot;</code>, which counts as a zero but is exempt from 
warnings
-about improper numeric conversions, just as <code>&quot;0 but 
true&quot;</code> is.
+about improper numeric conversions, just as 
<code>&quot;0&nbsp;but&nbsp;true&quot;</code><!-- /@w --> is.
 </p>
-<p>Binary &quot;&lt;&quot; returns true if the left argument is numerically 
less than
+<p>Binary <code>&quot;&lt;&quot;</code> returns true if the left argument is 
numerically less than
 the right argument.
 </p>
-<p>Binary &quot;&gt;&quot; returns true if the left argument is numerically 
greater
+<p>Binary <code>&quot;&gt;&quot;</code> returns true if the left argument is 
numerically greater
 than the right argument.
 </p>
-<p>Binary &quot;&lt;=&quot; returns true if the left argument is numerically 
less than
+<p>Binary <code>&quot;&lt;=&quot;</code> returns true if the left argument is 
numerically less than
 or equal to the right argument.
 </p>
-<p>Binary &quot;&gt;=&quot; returns true if the left argument is numerically 
greater
+<p>Binary <code>&quot;&gt;=&quot;</code> returns true if the left argument is 
numerically greater
 than or equal to the right argument.
 </p>
-<p>Binary &quot;lt&quot; returns true if the left argument is stringwise less 
than
+<p>Binary <code>&quot;lt&quot;</code> returns true if the left argument is 
stringwise less than
 the right argument.
 </p>
-<p>Binary &quot;gt&quot; returns true if the left argument is stringwise 
greater
+<p>Binary <code>&quot;gt&quot;</code> returns true if the left argument is 
stringwise greater
 than the right argument.
 </p>
-<p>Binary &quot;le&quot; returns true if the left argument is stringwise less 
than
+<p>Binary <code>&quot;le&quot;</code> returns true if the left argument is 
stringwise less than
 or equal to the right argument.
 </p>
-<p>Binary &quot;ge&quot; returns true if the left argument is stringwise 
greater
+<p>Binary <code>&quot;ge&quot;</code> returns true if the left argument is 
stringwise greater
 than or equal to the right argument.
 </p>
 <hr>
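
Note on the hunk above: a small sketch contrasting numeric and stringwise comparison, and showing the special false value. It assumes `use locale` is not in effect; variable names are illustrative.

```perl
use strict;
use warnings;

my $num_lt = 10 < 9;         # numeric comparison: false
my $str_lt = "10" lt "9";    # string comparison: true, "1" sorts before "9"

# The false value is an empty string that counts as zero without warning.
my $false_as_num = $num_lt + 0;

printf "[%s] [%s] [%d]\n", $num_lt, $str_lt, $false_as_num;    # [] [1] [0]
```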
@@ -63303,48 +65100,54 @@
 <a name="Equality-Operators"></a>
 <h4 class="subsection">48.2.13 Equality Operators</h4>
 
-<p>Binary &quot;==&quot; returns true if the left argument is numerically 
equal to
+<p>Binary <code>&quot;==&quot;</code> returns true if the left argument is 
numerically equal to
 the right argument.
 </p>
-<p>Binary &quot;!=&quot; returns true if the left argument is numerically not 
equal
+<p>Binary <code>&quot;!=&quot;</code> returns true if the left argument is 
numerically not equal
 to the right argument.
 </p>
-<p>Binary &quot;&lt;=&gt;&quot; returns -1, 0, or 1 depending on whether the 
left
+<p>Binary <code>&quot;&lt;=&gt;&quot;</code> returns -1, 0, or 1 depending on 
whether the left
 argument is numerically less than, equal to, or greater than the right
-argument.  If your platform supports NaNs (not-a-numbers) as numeric
-values, using them with &quot;&lt;=&gt;&quot; returns undef.  NaN is not 
&quot;&lt;&quot;, &quot;==&quot;, &quot;&gt;&quot;,
-&quot;&lt;=&quot; or &quot;&gt;=&quot; anything (even NaN), so those 5 return 
false.  NaN != NaN
-returns true, as does NaN != anything else.  If your platform doesn&rsquo;t
-support NaNs then NaN is just a string with numeric value 0.
+argument.  If your platform supports <code>NaN</code>&rsquo;s (not-a-numbers) 
as numeric
+values, using them with <code>&quot;&lt;=&gt;&quot;</code> returns undef.  
<code>NaN</code> is not
+<code>&quot;&lt;&quot;</code>, <code>&quot;==&quot;</code>, 
<code>&quot;&gt;&quot;</code>, <code>&quot;&lt;=&quot;</code> or 
<code>&quot;&gt;=&quot;</code> anything
+(even <code>NaN</code>), so those 5 return false.  
<code>NaN&nbsp;!=&nbsp;NaN</code><!-- /@w --> returns
+true, as does <code>NaN&nbsp;!=</code>&nbsp;<em>anything&nbsp;else</em><!-- 
/@w -->.  If your platform doesn&rsquo;t
+support <code>NaN</code>&rsquo;s then <code>NaN</code> is just a string with 
numeric value 0.
  &gt;&gt; 
 </p>
 <pre class="verbatim">    $ perl -le '$x = &quot;NaN&quot;; print &quot;No NaN 
support here&quot; if $x == $x'
     $ perl -le '$x = &quot;NaN&quot;; print &quot;NaN support here&quot; if $x 
!= $x'
 </pre>
 <p>(Note that the <a href="bigint.html#Top">(bigint)</a>, <a 
href="bigrat.html#Top">(bigrat)</a>, and <a href="bignum.html#Top">(bignum)</a> 
pragmas all
-support &quot;NaN&quot;.)
+support <code>&quot;NaN&quot;</code>.)
 </p>
-<p>Binary &quot;eq&quot; returns true if the left argument is stringwise equal 
to
+<p>Binary <code>&quot;eq&quot;</code> returns true if the left argument is 
stringwise equal to
 the right argument.
 </p>
-<p>Binary &quot;ne&quot; returns true if the left argument is stringwise not 
equal
+<p>Binary <code>&quot;ne&quot;</code> returns true if the left argument is 
stringwise not equal
 to the right argument.
 </p>
-<p>Binary &quot;cmp&quot; returns -1, 0, or 1 depending on whether the left
+<p>Binary <code>&quot;cmp&quot;</code> returns -1, 0, or 1 depending on 
whether the left
 argument is stringwise less than, equal to, or greater than the right
 argument.
 </p>
-<p>Binary &quot;~~&quot; does a smartmatch between its arguments.  Smart 
matching
+<p>Binary <code>&quot;~~&quot;</code> does a smartmatch between its arguments. 
 Smart matching
 is described in the next section.
 </p>
-<p>&quot;lt&quot;, &quot;le&quot;, &quot;ge&quot;, &quot;gt&quot; and 
&quot;cmp&quot; use the collation (sort) order specified
-by the current locale if a legacy <code>use locale</code> (but not
-<code>use locale ':not_characters'</code>) is in effect.  See
-<a href="#perllocale-NAME">perllocale NAME</a>.  Do not mix these with 
Unicode, only with legacy binary
-encodings.  The standard <a 
href="Unicode-Collate.html#Top">(Unicode-Collate)</a> and
-<a href="Unicode-Collate-Locale.html#Top">(Unicode-Collate-Locale)</a> modules 
offer much more powerful solutions to
-collation issues.
+<p><code>&quot;lt&quot;</code>, <code>&quot;le&quot;</code>, 
<code>&quot;ge&quot;</code>, <code>&quot;gt&quot;</code> and 
<code>&quot;cmp&quot;</code> use the collation (sort)
+order specified by the current <code>LC_COLLATE</code> locale if a 
<code>use&nbsp;locale</code><!-- /@w --> form that includes collation is in 
effect.  See <a href="#perllocale-NAME">perllocale NAME</a>.
+Do not mix these with Unicode;
+only use them with legacy 8-bit locale encodings.
+The standard <code><a 
href="Unicode-Collate.html#Top">(Unicode-Collate)</a></code> and
+<code><a 
href="Unicode-Collate-Locale.html#Top">(Unicode-Collate-Locale)</a></code> 
modules offer much more powerful
+solutions to collation issues.
 </p>
+<p>For case-insensitive comparisons, look at the <a href="#perlfunc-fc">perlfunc fc</a> case-folding
+function, available in Perl v5.16 or later:
+</p>
+<pre class="verbatim">    if ( fc($x) eq fc($y) ) { ... }
+</pre>
 <hr>
 <a name="perlop-Smartmatch-Operator"></a>
 <div class="header">
@@ -63556,9 +65359,9 @@
         say &quot;a and b don't smartmatch each other at all&quot;;
     } 
 </pre>
-<p>If you were to set <code>$b[3] = 4</code>, then instead of reporting that 
&quot;a and b
-are deep copies of each other&quot;, it now reports that &quot;b smartmatches 
in a&quot;.
-That because the corresponding position in <code>@a</code> contains an array 
that
+<p>If you were to set <code>$b[3]&nbsp;=&nbsp;4</code><!-- /@w -->, then 
instead of reporting that &quot;a and b
+are deep copies of each other&quot;, it now reports that <code>&quot;b 
smartmatches in a&quot;</code>.
+That&rsquo;s because the corresponding position in <code>@a</code> contains an 
array that
 (eventually) has a 4 in it.
 </p>
 <p>Smartmatching one hash against another reports whether both contain the
@@ -63637,7 +65440,7 @@
 <p>does <em>not</em> invoke the overload method with <code><em>X</em></code> 
as an argument.
 Instead the above table is consulted as normal, and based on the type of
 <code><em>X</em></code>, overloading may or may not be invoked.  For simple 
strings or
-numbers, in becomes equivalent to this:
+numbers, &quot;in&quot; becomes equivalent to this:
 </p>
 <pre class="verbatim">    $object ~~ $number          ref($object) == $number
     $object ~~ $string          ref($object) eq $string 
@@ -63663,16 +65466,19 @@
 <a name="Bitwise-And"></a>
 <h4 class="subsection">48.2.15 Bitwise And</h4>
 
-<p>Binary &quot;&amp;&quot; returns its operands ANDed together bit by bit.  
Although no
+<p>Binary <code>&quot;&amp;&quot;</code> returns its operands ANDed together 
bit by bit.  Although no
 warning is currently raised, the result is not well defined when this operation
 is performed on operands that aren&rsquo;t either numbers (see
-<a href="#perlop-Integer-Arithmetic">Integer Arithmetic</a>) or bitstrings 
(see <a href="#perlop-Bitwise-String-Operators">Bitwise String Operators</a>).
+<a href="#perlop-Integer-Arithmetic">Integer Arithmetic</a>) nor bitstrings 
(see <a href="#perlop-Bitwise-String-Operators">Bitwise String Operators</a>).
 </p>
-<p>Note that &quot;&amp;&quot; has lower priority than relational operators, 
so for example
+<p>Note that <code>&quot;&amp;&quot;</code> has lower priority than relational 
operators, so for example
 the parentheses are essential in a test like
 </p>
 <pre class="verbatim">    print &quot;Even\n&quot; if ($x &amp; 1) == 0;
 </pre>
+<p>If the experimental &quot;bitwise&quot; feature is enabled via <code>use&nbsp;feature&nbsp;'bitwise'</code><!-- /@w -->, then this operator always treats its operands as numbers.  This
+feature produces a warning unless you also use 
<code>no&nbsp;warnings&nbsp;'experimental::bitwise'<!-- /@w --></code>.
+</p>
 <hr>
 <a name="perlop-Bitwise-Or-and-Exclusive-Or"></a>
 <div class="header">
@@ -63682,20 +65488,23 @@
 <a name="Bitwise-Or-and-Exclusive-Or"></a>
 <h4 class="subsection">48.2.16 Bitwise Or and Exclusive Or</h4>
 
-<p>Binary &quot;|&quot; returns its operands ORed together bit by bit.
+<p>Binary <code>&quot;|&quot;</code> returns its operands ORed together bit by 
bit.
 </p>
-<p>Binary &quot;^&quot; returns its operands XORed together bit by bit.
+<p>Binary <code>&quot;^&quot;</code> returns its operands XORed together bit 
by bit.
 </p>
 <p>Although no warning is currently raised, the results are not well
 defined when these operations are performed on operands that aren&rsquo;t 
either
-numbers (see <a href="#perlop-Integer-Arithmetic">Integer Arithmetic</a>) or 
bitstrings (see <a href="#perlop-Bitwise-String-Operators">Bitwise String
+numbers (see <a href="#perlop-Integer-Arithmetic">Integer Arithmetic</a>) nor 
bitstrings (see <a href="#perlop-Bitwise-String-Operators">Bitwise String
 Operators</a>).
 </p>
-<p>Note that &quot;|&quot; and &quot;^&quot; have lower priority than 
relational operators, so
-for example the brackets are essential in a test like
+<p>Note that <code>&quot;|&quot;</code> and <code>&quot;^&quot;</code> have 
lower priority than relational operators, so
+for example the parentheses are essential in a test like
 </p>
 <pre class="verbatim">    print &quot;false\n&quot; if (8 | 2) != 10;
 </pre>
+<p>If the experimental &quot;bitwise&quot; feature is enabled via <code>use&nbsp;feature&nbsp;'bitwise'</code><!-- /@w -->, then these operators always treat their operands as numbers.  This
+feature produces a warning unless you also use 
<code>no&nbsp;warnings&nbsp;'experimental::bitwise'</code><!-- /@w -->.
+</p>
 <hr>
 <a name="perlop-C_002dstyle-Logical-And"></a>
 <div class="header">
@@ -63705,7 +65514,7 @@
 <a name="C_002dstyle-Logical-And"></a>
 <h4 class="subsection">48.2.17 C-style Logical And</h4>
 
-<p>Binary &quot;&amp;&amp;&quot; performs a short-circuit logical AND 
operation.  That is,
+<p>Binary <code>&quot;&amp;&amp;&quot;</code> performs a short-circuit logical 
AND operation.  That is,
 if the left operand is false, the right operand is not even evaluated.
 Scalar or list context propagates down to the right operand if it
 is evaluated.
@@ -63719,7 +65528,7 @@
 <a name="C_002dstyle-Logical-Or"></a>
 <h4 class="subsection">48.2.18 C-style Logical Or</h4>
 
-<p>Binary &quot;||&quot; performs a short-circuit logical OR operation.  That 
is,
+<p>Binary <code>&quot;||&quot;</code> performs a short-circuit logical OR 
operation.  That is,
 if the left operand is true, the right operand is not even evaluated.
 Scalar or list context propagates down to the right operand if it
 is evaluated.
@@ -63734,17 +65543,17 @@
 <h4 class="subsection">48.2.19 Logical Defined-Or</h4>
 
 <p>Although it has no direct equivalent in C, Perl&rsquo;s <code>//</code> 
operator is related
-to its C-style or.  In fact, it&rsquo;s exactly the same as <code>||</code>, 
except that it
+to its C-style &quot;or&quot;.  In fact, it&rsquo;s exactly the same as 
<code>||</code>, except that it
 tests the left hand side&rsquo;s definedness instead of its truth.  Thus,
-<code>EXPR1 // EXPR2</code> returns the value of <code>EXPR1</code> if 
it&rsquo;s defined,
+<code>EXPR1&nbsp;//&nbsp;EXPR2</code><!-- /@w --> returns the value of 
<code>EXPR1</code> if it&rsquo;s defined,
 otherwise, the value of <code>EXPR2</code> is returned.
 (<code>EXPR1</code> is evaluated in scalar context, <code>EXPR2</code>
 in the context of <code>//</code> itself).  Usually,
-this is the same result as <code>defined(EXPR1) ? EXPR1 : EXPR2</code> (except 
that
-the ternary-operator form can be used as a lvalue, while <code>EXPR1 // 
EXPR2</code>
+this is the same result as 
<code>defined(EXPR1)&nbsp;?&nbsp;EXPR1&nbsp;:&nbsp;EXPR2</code><!-- /@w --> 
(except that
+the ternary-operator form can be used as a lvalue, while 
<code>EXPR1&nbsp;//&nbsp;EXPR2</code><!-- /@w -->
 cannot).  This is very useful for
 providing default values for variables.  If you actually want to test if
-at least one of <code>$x</code> and <code>$y</code> is defined, use 
<code>defined($x // $y)</code>.
+at least one of <code>$x</code> and <code>$y</code> is defined, use 
<code>defined($x&nbsp;//&nbsp;$y)</code><!-- /@w -->.
 </p>
 <p>The <code>||</code>, <code>//</code> and <code>&amp;&amp;</code> operators 
return the last value evaluated
 (unlike C&rsquo;s <code>||</code> and <code>&amp;&amp;</code>, which return 0 
or 1).  Thus, a reasonably
@@ -63764,8 +65573,8 @@
 </pre>
 <p>As alternatives to <code>&amp;&amp;</code> and <code>||</code> when used for
 control flow, Perl provides the <code>and</code> and <code>or</code> operators 
(see below).
-The short-circuit behavior is identical.  The precedence of &quot;and&quot;
-and &quot;or&quot; is much lower, however, so that you can safely use them 
after a
+The short-circuit behavior is identical.  The precedence of 
<code>&quot;and&quot;</code>
+and <code>&quot;or&quot;</code> is much lower, however, so that you can safely 
use them after a
 list operator without the need for parentheses:
 </p>
 <pre class="verbatim">    unlink &quot;alpha&quot;, &quot;beta&quot;, 
&quot;gamma&quot;
@@ -63783,7 +65592,7 @@
         next LINE;
     } 
 </pre>
-<p>Using &quot;or&quot; for assignment is unlikely to do what you want; see 
below.
+<p>Using <code>&quot;or&quot;</code> for assignment is unlikely to do what you 
want; see below.
 </p>
 <hr>
 <a name="perlop-Range-Operators"></a>
@@ -63794,12 +65603,12 @@
 <a name="Range-Operators"></a>
 <h4 class="subsection">48.2.20 Range Operators</h4>
 
-<p>Binary &quot;..&quot; is the range operator, which is really two different
+<p>Binary <code>&quot;..&quot;</code> is the range operator, which is really 
two different
 operators depending on the context.  In list context, it returns a
 list of values counting (up by ones) from the left value to the right
 value.  If the left value is greater than the right value then it
 returns the empty list.  The range operator is useful for writing
-<code>foreach (1..10)</code> loops and for doing slice operations on arrays.  
In
+<code>foreach&nbsp;(1..10)</code><!-- /@w --> loops and for doing slice 
operations on arrays.  In
 the current implementation, no temporary array is created when the
 range operator is used as the expression in <code>foreach</code> loops, but 
older
 versions of Perl might burn a lot of memory when you write something
@@ -63812,9 +65621,9 @@
 <p>The range operator also works on strings, using the magical
 auto-increment, see below.
 </p>
-<p>In scalar context, &quot;..&quot; returns a boolean value.  The operator is
+<p>In scalar context, <code>&quot;..&quot;</code> returns a boolean value.  
The operator is
 bistable, like a flip-flop, and emulates the line-range (comma)
-operator of <strong>sed</strong>, <strong>awk</strong>, and various editors.  
Each &quot;..&quot; operator
+operator of <strong>sed</strong>, <strong>awk</strong>, and various editors.  
Each <code>&quot;..&quot;</code> operator
 maintains its own boolean state, even across calls to a subroutine
 that contains it.  It is false as long as its left operand is false.
 Once the left operand is true, the range operator stays true until the
@@ -63823,8 +65632,8 @@
 is evaluated.  It can test the right operand and become false on the
 same evaluation it became true (as in <strong>awk</strong>), but it still 
returns
 true once.  If you don&rsquo;t want it to test the right operand until the
-next evaluation, as in <strong>sed</strong>, just use three dots 
(&quot;...&quot;) instead of
-two.  In all other regards, &quot;...&quot; behaves just like &quot;..&quot; 
does.
+next evaluation, as in <strong>sed</strong>, just use three dots 
(<code>&quot;...&quot;</code>) instead of
+two.  In all other regards, <code>&quot;...&quot;</code> behaves just like 
<code>&quot;..&quot;</code> does.
 </p>
 <p>The right operand is not evaluated while the operator is in the
 &quot;false&quot; state, and the left operand is not evaluated while the
@@ -63832,21 +65641,21 @@
 than || and &amp;&amp;.  The value returned is either the empty string for
 false, or a sequence number (beginning with 1) for true.  The sequence
 number is reset for each range encountered.  The final sequence number
-in a range has the string &quot;E0&quot; appended to it, which doesn&rsquo;t 
affect
+in a range has the string <code>&quot;E0&quot;</code> appended to it, which 
doesn&rsquo;t affect
 its numeric value, but gives you something to search for if you want
 to exclude the endpoint.  You can exclude the beginning point by
 waiting for the sequence number to be greater than 1.
 </p>
-<p>If either operand of scalar &quot;..&quot; is a constant expression,
+<p>If either operand of scalar <code>&quot;..&quot;</code> is a constant 
expression,
 that operand is considered true if it is equal (<code>==</code>) to the current
 input line number (the <code>$.</code> variable).
 </p>
-<p>To be pedantic, the comparison is actually <code>int(EXPR) == 
int(EXPR)</code>,
+<p>To be pedantic, the comparison is actually 
<code>int(EXPR)&nbsp;==&nbsp;int(EXPR)</code><!-- /@w -->,
 but that is only an issue if you use a floating point expression; when
 implicitly using <code>$.</code> as described in the previous paragraph, the
-comparison is <code>int(EXPR) == int($.)</code> which is only an issue when 
<code>$.</code>
+comparison is <code>int(EXPR)&nbsp;==&nbsp;int($.)</code><!-- /@w --> which is 
only an issue when <code>$.</code>
 is set to a floating point value and you are not reading from a file.
-Furthermore, <code>&quot;span&quot; .. &quot;spat&quot;</code> or <code>2.18 
.. 3.14</code> will not do what
+Furthermore, <code>&quot;span&quot;&nbsp;..&nbsp;&quot;spat&quot;</code><!-- 
/@w --> or <code>2.18&nbsp;..&nbsp;3.14</code><!-- /@w --> will not do what
 you want in scalar context because each of the operands is evaluated
 using their integer representation.
 </p>
@@ -63943,7 +65752,7 @@
 you could use the pattern <code>/(?:(?=\p{Greek})\p{Lower})+/</code> (or the
 <a href="#perlrecharclass-Extended-Bracketed-Character-Classes">experimental 
feature</a> <code>/(?[&nbsp;\p{Greek}&nbsp;&amp;&nbsp;\p{Lower}&nbsp;])+/<!-- 
/@w --></code>).
 </p>
-<p>Because each operand is evaluated in integer form, <code>2.18 .. 
3.14</code> will
+<p>Because each operand is evaluated in integer form, 
<code>2.18&nbsp;..&nbsp;3.14</code><!-- /@w --> will
 return two elements in list context.
 </p>
 <pre class="verbatim">    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);
@@ -63957,10 +65766,10 @@
 <a name="Conditional-Operator"></a>
 <h4 class="subsection">48.2.21 Conditional Operator</h4>
 
-<p>Ternary &quot;?:&quot; is the conditional operator, just as in C.  It works 
much
-like an if-then-else.  If the argument before the ? is true, the
-argument before the : is returned, otherwise the argument after the :
-is returned.  For example:
+<p>Ternary <code>&quot;?:&quot;</code> is the conditional operator, just as in 
C.  It works much
+like an if-then-else.  If the argument before the <code>?</code> is true, the
+argument before the <code>:</code> is returned, otherwise the argument after 
the
+<code>:</code> is returned.  For example:
 </p>
 <pre class="verbatim">    printf &quot;I have %d dog%s.\n&quot;, $n,
             ($n == 1) ? &quot;&quot; : &quot;s&quot;;
@@ -64005,7 +65814,7 @@
 
 <p>&gt;     = &gt;&gt;&gt;   
 </p>
-<p>&quot;=&quot; is the ordinary assignment operator.
+<p><code>&quot;=&quot;</code> is the ordinary assignment operator.
 </p>
 <p>Assignment operators work as in C.  That is,
 </p>
@@ -64016,16 +65825,19 @@
 <pre class="verbatim">    $x = $x + 2;
 </pre>
 <p>although without duplicating any side effects that dereferencing the lvalue
-might trigger, such as from tie().  Other assignment operators work similarly.
+might trigger, such as from <code>tie()</code>.  Other assignment operators 
work similarly.
 The following are recognized:
 </p>
-<pre class="verbatim">    **=    +=    *=    &amp;=    &lt;&lt;=    &amp;&amp;=
-           -=    /=    |=    &gt;&gt;=    ||=
-           .=    %=    ^=           //=
+<pre class="verbatim">    **=    +=    *=    &amp;=    &amp;.=    &lt;&lt;=    
&amp;&amp;=
+           -=    /=    |=    |.=    &gt;&gt;=    ||=
+           .=    %=    ^=    ^.=           //=
                  x=
 </pre>
 <p>Although these are grouped by family, they all have the precedence
-of assignment.
+of assignment.  These combined assignment operators can only operate on
+scalars, whereas the ordinary assignment operator can assign to arrays,
+hashes, lists and even references.  (See <a 
href="#perldata-Context">&quot;Context&quot;</a>
+and <a href="#perldata-List-value-constructors">perldata List value 
constructors</a>, and <a href="#perlref-Assigning-to-References">perlref 
Assigning to References</a>.)
 </p>
 <p>Unlike in C, the scalar assignment operator produces a valid lvalue.
 Modifying an assignment is equivalent to doing the assignment and
@@ -64053,6 +65865,9 @@
 the number of elements produced by the expression on the right hand
 side of the assignment.
 </p>
+<p>The three dotted bitwise assignment operators (<code>&amp;.=</code> 
<code>|.=</code> <code>^.=</code>) are new in
+Perl 5.22 and experimental.  See <a 
href="#perlop-Bitwise-String-Operators">Bitwise String Operators</a>.
+</p>
 <hr>
 <a name="perlop-Comma-Operator"></a>
 <div class="header">
@@ -64062,7 +65877,7 @@
 <a name="Comma-Operator"></a>
 <h4 class="subsection">48.2.23 Comma Operator</h4>
 
-<p>Binary &quot;,&quot; is the comma operator.  In scalar context it evaluates
+<p>Binary <code>&quot;,&quot;</code> is the comma operator.  In scalar context 
it evaluates
 its left argument, throws that value away, then evaluates its right
 argument and returns that value.  This is just like C&rsquo;s comma operator.
 </p>
@@ -64070,7 +65885,8 @@
 both its arguments into the list.  These arguments are also evaluated
 from left to right.
 </p>
-<p>The <code>=&gt;</code> operator is a synonym for the comma except that it 
causes a
+<p>The <code>=&gt;</code> operator (sometimes pronounced &quot;fat 
comma&quot;) is a synonym
+for the comma except that it causes a
 word on its left to be interpreted as a string if it begins with a letter
 or underscore and is composed only of letters, digits and underscores.
 This includes operands that might otherwise be interpreted as operators,
@@ -64105,7 +65921,7 @@
 </p>
 <pre class="verbatim">    print time.shift =&gt; &quot;bbb&quot;;
 </pre>
-<p>That example prints something like &quot;1314363215shiftbbb&quot;, because 
the
+<p>That example prints something like 
<code>&quot;1314363215shiftbbb&quot;</code>, because the
 <code>=&gt;</code> implicitly quotes the <code>shift</code> immediately on its 
left, ignoring
 the fact that <code>time.shift</code> is the entire left operand.
 </p>
@@ -64121,7 +65937,7 @@
 <p>On the right side of a list operator, the comma has very low precedence,
 such that it controls all comma-separated expressions found there.
 The only operators with lower precedence are the logical operators
-&quot;and&quot;, &quot;or&quot;, and &quot;not&quot;, which may be used to 
evaluate calls to list
+<code>&quot;and&quot;</code>, <code>&quot;or&quot;</code>, and 
<code>&quot;not&quot;</code>, which may be used to evaluate calls to list
 operators without the need for parentheses:
 </p>
 <pre class="verbatim">    open HANDLE, &quot;&lt; :utf8&quot;, 
&quot;filename&quot; or die &quot;Can't open: $!\n&quot;;
@@ -64131,7 +65947,7 @@
 </p>
 <pre class="verbatim">    open(HANDLE, &quot;&lt; :utf8&quot;, 
&quot;filename&quot;) or die &quot;Can't open: $!\n&quot;;
 </pre>
-<p>in which case you might as well just use the more customary &quot;||&quot; 
operator:
+<p>in which case you might as well just use the more customary 
<code>&quot;||&quot;</code> operator:
 </p>
 <pre class="verbatim">    open(HANDLE, &quot;&lt; :utf8&quot;, 
&quot;filename&quot;) || die &quot;Can't open: $!\n&quot;;
 </pre>
@@ -64146,8 +65962,8 @@
 <a name="Logical-Not"></a>
 <h4 class="subsection">48.2.25 Logical Not</h4>
 
-<p>Unary &quot;not&quot; returns the logical negation of the expression to its 
right.
-It&rsquo;s the equivalent of &quot;!&quot; except for the very low precedence.
+<p>Unary <code>&quot;not&quot;</code> returns the logical negation of the 
expression to its right.
+It&rsquo;s the equivalent of <code>&quot;!&quot;</code> except for the very 
low precedence.
 </p>
 <hr>
 <a name="perlop-Logical-And"></a>
@@ -64158,7 +65974,7 @@
 <a name="Logical-And"></a>
 <h4 class="subsection">48.2.26 Logical And</h4>
 
-<p>Binary &quot;and&quot; returns the logical conjunction of the two 
surrounding
+<p>Binary <code>&quot;and&quot;</code> returns the logical conjunction of the 
two surrounding
 expressions.  It&rsquo;s equivalent to <code>&amp;&amp;</code> except for the 
very low
 precedence.  This means that it short-circuits: the right
 expression is evaluated only if the left expression is true.
@@ -64172,7 +65988,7 @@
 <a name="Logical-or-and-Exclusive-Or"></a>
 <h4 class="subsection">48.2.27 Logical or and Exclusive Or</h4>
 
-<p>Binary &quot;or&quot; returns the logical disjunction of the two surrounding
+<p>Binary <code>&quot;or&quot;</code> returns the logical disjunction of the 
two surrounding
 expressions.  It&rsquo;s equivalent to <code>||</code> except for the very low 
precedence.
 This makes it useful for control flow:
 </p>
@@ -64188,7 +66004,7 @@
     $x = $y || $z;              # better written this way
 </pre>
 <p>However, when it&rsquo;s a list-context assignment and you&rsquo;re trying 
to use
-<code>||</code> for control flow, you probably need &quot;or&quot; so that the 
assignment
+<code>||</code> for control flow, you probably need 
<code>&quot;or&quot;</code> so that the assignment
 takes higher precedence.
 </p>
 <pre class="verbatim">    @info = stat($file) || die;     # oops, scalar sense 
of stat!
@@ -64196,7 +66012,7 @@
 </pre>
 <p>Then again, you could always use parentheses.
 </p>
-<p>Binary <code>xor</code> returns the exclusive-OR of the two surrounding 
expressions.
+<p>Binary <code>&quot;xor&quot;</code> returns the exclusive-OR of the two 
surrounding expressions.
 It cannot short-circuit (of course).
 </p>
 <p>There is no low precedence operator for defined-OR.
@@ -64215,13 +66031,13 @@
 <dl compact="compact">
 <dt>unary &amp;</dt>
 <dd><a name="perlop-unary-_0026"></a>
-<p>Address-of operator.  (But see the &quot;\&quot; operator for taking a 
reference.)
+<p>Address-of operator.  (But see the <code>&quot;\&quot;</code> operator for 
taking a reference.)
 </p>
 </dd>
 <dt>unary *</dt>
 <dd><a name="perlop-unary-_002a"></a>
 <p>Dereference-address operator.  (Perl&rsquo;s prefix dereferencing
-operators are typed: $, @, %, and &amp;.)
+operators are typed: <code>$</code>, <code>@</code>, <code>%</code>, and 
<code>&amp;</code>.)
 </p>
 </dd>
 <dt>(TYPE)</dt>
@@ -64278,12 +66094,12 @@
 </p>
 <pre class="verbatim">    $s = q{ if($x eq &quot;}&quot;) ... }; # WRONG
 </pre>
-<p>is a syntax error.  The <code>Text::Balanced</code> module (standard as of 
v5.8,
+<p>is a syntax error.  The <code><a 
href="Text-Balanced.html#Top">(Text-Balanced)</a></code> module (standard as of 
v5.8,
 and from CPAN before then) is able to do this properly.
 </p>
 <p>There can be whitespace between the operator and the quoting
 characters, except when <code>#</code> is being used as the quoting character.
-<code>q#foo#</code> is parsed as the string <code>foo</code>, while <code>q 
#foo#</code> is the
+<code>q#foo#</code> is parsed as the string <code>foo</code>, while 
<code>q&nbsp;#foo#</code><!-- /@w --> is the
 operator <code>q</code> followed by a comment.  Its argument will be taken
 from the next line.  This allows you to write:
 </p>
@@ -64332,7 +66148,7 @@
 </p>
 <p>Only hexadecimal digits are valid following <code>\x</code>.  When 
<code>\x</code> is followed
 by fewer than two valid digits, any valid digits will be zero-padded.  This
-means that <code>\x7</code> will be interpreted as <code>\x07</code>, and a 
lone &lt;\x&gt; will be
+means that <code>\x7</code> will be interpreted as <code>\x07</code>, and a 
lone <code>&quot;\x&quot;</code> will be
 interpreted as <code>\x00</code>.  Except at the end of a string, having fewer 
than
 two valid digits will result in a warning.  Note that although the warning
 says the illegal character is ignored, it is only ignored as part of the
@@ -64354,7 +66170,7 @@
 </dd>
 <dt>[4]</dt>
 <dd><a name="perlop-_005b4_005d"></a>
-<p><code>\N{U+<em>hexadecimal number</em>}</code> means the Unicode character 
whose Unicode code
+<p><code>\N{U+<em>hexadecimal&nbsp;number</em>}</code><!-- /@w --> means the 
Unicode character whose Unicode code
 point is <em>hexadecimal number</em>.
 </p>
 </dd>
@@ -64373,17 +66189,19 @@
    \cZ      chr(26)
    \cz      chr(26)
    \c[      chr(27)
+                     # See below for chr(28)
    \c]      chr(29)
    \c^      chr(30)
    \c_      chr(31)
-   \c?      chr(127) # (on ASCII platforms)
+   \c?      chr(127) # (on ASCII platforms; see below for link to
+                     #  EBCDIC discussion)
 </pre>
 <p>In other words, it&rsquo;s the character whose code point is 64 xor&rsquo;d with
 that of the uppercased character.  <code>\c?</code> is DELETE on ASCII platforms because
 <code>ord(&quot;?&quot;)&nbsp;^&nbsp;64</code><!-- /@w --> is 127, and
-<code>\c@</code> is NULL because the ord of &quot;@&quot; is 64, so 
xor&rsquo;ing 64 itself produces 0.
+<code>\c@</code> is NULL because the ord of <code>&quot;@&quot;</code> is 64, 
so xor&rsquo;ing 64 itself produces 0.
 </p>
-<p>Also, <code>\c\<em>X</em></code> yields <code> chr(28) . 
&quot;<em>X</em>&quot;</code> for any <em>X</em>, but cannot come at the
+<p>Also, <code>\c\<em>X</em></code> yields 
<code>&nbsp;chr(28)&nbsp;.&nbsp;&quot;<em>X</em>&quot;</code><!-- /@w --> for 
any <em>X</em>, but cannot come at the
 end of a string, because the backslash would be parsed as escaping the end
 quote.
 </p>
@@ -64393,10 +66211,11 @@
 differences between these for ASCII versus EBCDIC platforms.
 </p>
 <p>Use of any other character following the <code>&quot;c&quot;</code> besides 
those listed above is
-discouraged, and some are deprecated with the intention of removing
-those in a later Perl version.  What happens for any of these
-other characters currently though, is that the value is derived by 
xor&rsquo;ing
-with the seventh bit, which is 64.
+discouraged, and as of Perl v5.20, the only characters actually allowed
+are the printable ASCII ones, minus the left brace <code>&quot;{&quot;</code>. 
 What happens
+for any of the allowed other characters is that the value is derived by
+xor&rsquo;ing with the seventh bit, which is 64, and a warning is raised if
+enabled.  Using the non-allowed characters generates a fatal error.
 </p>
 <p>To get platform independent controls, you can use <code>\N{...}</code>.
 </p>
@@ -64425,8 +66244,9 @@
 use <code>\o{}</code> instead, which avoids all these problems.  Otherwise, it 
is best to
 use this construct only for ordinals <code>\077</code> and below, remembering 
to pad to
 the left with zeros to make three digits.  For larger ordinals, either use
-<code>\o{}</code>, or convert to something else, such as to hex and use 
<code>\x{}</code>
-instead.
+<code>\o{}</code>, or convert to something else, such as to hex and use 
<code>\N{U+}</code>
+(which is portable between platforms with different character sets) or
+<code>\x{}</code> instead.
 </p>
 </dd>
 <dt>[8]</dt>
@@ -64442,15 +66262,15 @@
 character.  For example <code>\x{50}</code> and <code>\o{120}</code> both are 
the number 80 in
 decimal, which is less than 256, so the number is interpreted in the native
 character set encoding.  In ASCII the character in the 80th position (indexed
-from 0) is the letter &quot;P&quot;, and in EBCDIC it is the ampersand symbol 
&quot;&amp;&quot;.
+from 0) is the letter <code>&quot;P&quot;</code>, and in EBCDIC it is the 
ampersand symbol <code>&quot;&amp;&quot;</code>.
 <code>\x{100}</code> and <code>\o{400}</code> are both 256 in decimal, so the 
number is interpreted
 as a Unicode code point no matter what the native encoding is.  The name of the
 character in the 256th position (indexed from 0) in Unicode is
 <code>LATIN CAPITAL LETTER A WITH MACRON</code>.
 </p>
 <p>There are a couple of exceptions to the above rule.  
<code>\N{U+<em>hex&nbsp;number</em>}</code><!-- /@w --> is
-always interpreted as a Unicode code point, so that <code>\N{U+0050}</code> is 
&quot;P&quot; even
-on EBCDIC platforms.  And if <a 
href="encoding.html#Top">(encoding)<code>use&nbsp;encoding<!-- /@w 
--></code></a> is in effect, the
+always interpreted as a Unicode code point, so that <code>\N{U+0050}</code> is 
<code>&quot;P&quot;</code> even
+on EBCDIC platforms.  And if <code><a 
href="encoding.html#Top">(encoding)use&nbsp;encoding</a><!-- /@w --></code> is 
in effect, the
 number is considered to be in that encoding, and is translated from that into
 the platform&rsquo;s native encoding if there is a corresponding native 
character;
 otherwise to Unicode.
@@ -64460,8 +66280,7 @@
 
 <p><strong>NOTE</strong>: Unlike C and other languages, Perl has no 
<code>\v</code> escape sequence for
 the vertical tab (VT, which is 11 in both ASCII and EBCDIC), but you may
-use <code>\ck</code> or
-<code>\x0b</code>.  (<code>\v</code>
+use <code>\N{VT}</code>, <code>\ck</code>, <code>\N{U+0b}</code>, or 
<code>\x0b</code>.  (<code>\v</code>
 does have meaning in regular expression patterns in Perl, see <a 
href="#perlre-NAME">perlre NAME</a>.)
 </p>
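
As a rough illustration of the note above, the following sketch compares three of the listed spellings of the vertical tab and shows the separate meaning of \v inside a pattern. It assumes an ASCII platform; the variable names are hypothetical.

```perl
use strict;
use warnings;

# Three spellings of the vertical tab that work in double-quoted strings.
my @vt       = ("\ck", "\x0b", "\N{U+0b}");
my @ordinals = map { ord } @vt;          # each element should be 11

# Inside a regex, \v is a character class for vertical whitespace,
# which includes the vertical tab itself.
my $matches_class = ("\x0b" =~ /\A\v\z/) ? 1 : 0;

print "@ordinals $matches_class\n";
```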
 <p>The following escape sequences are available in constructs that interpolate,
@@ -64486,14 +66305,14 @@
 <pre class="verbatim"> say&quot;This \Qquoting \ubusiness \Uhere isn't quite\E 
done yet,\E is it?&quot;;
  This quoting\ Business\ HERE\ ISN\'T\ QUITE\ done\ yet\, is it?
 </pre>
-<p>If <code>use locale</code> is in effect (but not <code>use locale 
':not_characters'</code>),
-the case map used by <code>\l</code>, <code>\L</code>,
-<code>\u</code>, and <code>\U</code> is taken from the current locale.  See <a 
href="#perllocale-NAME">perllocale NAME</a>.
-If Unicode (for example, <code>\N{}</code> or code points of 0x100 or
-beyond) is being used, the case map used by <code>\l</code>, <code>\L</code>, 
<code>\u</code>, and
-<code>\U</code> is as defined by Unicode.  That means that case-mapping
-a single character can sometimes produce several characters.
-Under <code>use locale</code>, <code>\F</code> produces the same results as 
<code>\L</code>
+<p>If a <code>use&nbsp;locale</code><!-- /@w --> form that includes 
<code>LC_CTYPE</code> is in effect (see
+<a href="#perllocale-NAME">perllocale NAME</a>), the case map used by 
<code>\l</code>, <code>\L</code>, <code>\u</code>, and <code>\U</code> is
+taken from the current locale.  If Unicode (for example, <code>\N{}</code> or 
code
+points of 0x100 or beyond) is being used, the case map used by <code>\l</code>,
+<code>\L</code>, <code>\u</code>, and <code>\U</code> is as defined by 
Unicode.  That means that
+case-mapping a single character can sometimes produce a sequence of
+several characters.
+Under <code>use&nbsp;locale</code><!-- /@w -->, <code>\F</code> produces the 
same results as <code>\L</code>
 for all locales but a UTF-8 one, where it instead uses the Unicode
 definition.
 </p>
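
The remark that case-mapping one character can yield several is easiest to see with a code point above 0xFF, where Unicode rules apply regardless of locale. This is a minimal sketch, with no "use locale" in effect; the choice of U+FB00 is an illustrative assumption, not an example from the manual.

```perl
use strict;
use warnings;

# U+FB00 LATIN SMALL LIGATURE FF is above 0xFF, so Unicode rules apply.
my $lig   = "\x{FB00}";
my $upper = "\U$lig\E";        # same mapping that uc($lig) performs
my $len   = length $upper;     # one input character, two output characters

print "$upper $len\n";
```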
@@ -64503,7 +66322,7 @@
 device drivers, C libraries, and Perl all conspire to preserve.  Not all
 systems read <code>&quot;\r&quot;</code> as ASCII CR and 
<code>&quot;\n&quot;</code> as ASCII LF.  For example,
 on the ancient Macs (pre-MacOS X) of yesteryear, these used to be reversed,
-and on systems without line terminator,
+and on systems without a line terminator,
 printing <code>&quot;\n&quot;</code> might emit no actual data.  In general, 
use <code>&quot;\n&quot;</code> when
 you mean a &quot;newline&quot; for your system, but use the literal ASCII when 
you
 need an exact character.  For example, most networking protocols expect
@@ -64519,7 +66338,7 @@
 </p>
 <p>Interpolating an array or slice interpolates the elements in order,
 separated by the value of <code>$&quot;</code>, so is equivalent to 
interpolating
-<code>join $&quot;, @array</code>.  &quot;Punctuation&quot; arrays such as 
<code>@*</code> are usually
+<code>join&nbsp;$&quot;,&nbsp;@array</code><!-- /@w -->.  
&quot;Punctuation&quot; arrays such as <code>@*</code> are usually
 interpolated only if the name is enclosed in braces <code>@{*}</code>, but the
 arrays <code>@_</code>, <code>@+</code>, and <code>@-</code> are interpolated 
even without braces.
 </p>
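
A small sketch of the array interpolation rule described above; the array contents and variable names are made up for illustration.

```perl
use strict;
use warnings;

my @array   = qw(a b c);
my $default = "@array";              # $" defaults to a single space

my ($dashed, $joined);
{
    local $" = "-";                  # change the list separator temporarily
    $dashed = "@array";
    $joined = join $", @array;       # the documented equivalent
}

my $slice = "@array[0, 1]";          # slices interpolate the same way

print "$default | $dashed | $joined | $slice\n";
```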
@@ -64567,13 +66386,13 @@
 matching and related activities.
 </p>
 <dl compact="compact">
-<dt>qr/STRING/msixpodual</dt>
-<dd><a name="perlop-qr_002fSTRING_002fmsixpodual"></a>
+<dt><code>qr/<em>STRING</em>/msixpodualn</code></dt>
+<dd><a name="perlop-qr_002fSTRING_002fmsixpodualn"></a>
 <p>This operator quotes (and possibly compiles) its <em>STRING</em> as a 
regular
 expression.  <em>STRING</em> is interpolated the same way as <em>PATTERN</em>
-in <code>m/PATTERN/</code>.  If &quot;&rsquo;&quot; is used as the delimiter, 
no interpolation
+in <code>m/<em>PATTERN</em>/</code>.  If <code>&quot;'&quot;</code> is used as 
the delimiter, no interpolation
 is done.  Returns a Perl value which may be used instead of the
-corresponding <code>/STRING/msixpodual</code> expression.  The returned value 
is a
+corresponding <code>/<em>STRING</em>/msixpodualn</code> expression.  The 
returned value is a
 normalized version of the original pattern.  It magically differs from
 a string containing the same characters: <code>ref(qr/x/)</code> returns 
&quot;Regexp&quot;;
 however, dereferencing it is not well defined (you currently get the 
@@ -64597,9 +66416,9 @@
     $string =~ $re;             # or used standalone
     $string =~ /$re/;           # or this way
 </pre>
-<p>Since Perl may compile the pattern at the moment of execution of the qr()
-operator, using qr() may have speed advantages in some situations,
-notably if the result of qr() is used standalone:
+<p>Since Perl may compile the pattern at the moment of execution of the 
<code>qr()</code>
+operator, using <code>qr()</code> may have speed advantages in some situations,
+notably if the result of <code>qr()</code> is used standalone:
 </p>
 <pre class="verbatim">    sub match {
         my $patterns = shift;
@@ -64614,10 +66433,10 @@
     }
 </pre>
 <p>Precompilation of the pattern into an internal representation at
-the moment of qr() avoids a need to recompile the pattern every
+the moment of <code>qr()</code> avoids the need to recompile the pattern every
 time a match <code>/$pat/</code> is attempted.  (Perl has many other internal
 optimizations, but none would be triggered in the above example if
-we did not use qr() operator.)
+we did not use the <code>qr()</code> operator.)
 </p>
 <p>Options (specified by the following modifiers) are:
 </p>
@@ -64627,18 +66446,20 @@
     x   Use extended regular expressions.
     p   When matching preserve a copy of the matched string so
         that ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be
-        defined.
+        defined (ignored starting in v5.20) as these are always
+        defined starting in that release
     o   Compile pattern only once.
     a   ASCII-restrict: Use ASCII for \d, \s, \w; specifying two
-        a's further restricts /i matching so that no ASCII
-        character will match a non-ASCII one.
-    l   Use the locale.
+        a's further restricts things so that no ASCII
+        character will match a non-ASCII one under /i.
+    l   Use the current run-time locale's rules.
     u   Use Unicode rules.
     d   Use Unicode or native charset, as in 5.12 and earlier.
+    n   Non-capture mode. Don't let () fill in $1, $2, etc...
 </pre>
 <p>If a precompiled pattern is embedded in a larger pattern then the effect
-of &quot;msixpluad&quot; will be propagated appropriately.  The effect the 
&quot;o&quot;
-modifier has is not propagated, being restricted to those patterns
+of <code>&quot;msixpluadn&quot;</code> will be propagated appropriately.  The 
effect that the
+<code>/o</code> modifier has is not propagated, being restricted to those 
patterns
 explicitly using it.
 </p>
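
The behaviour of a precompiled pattern and of the new /n modifier might be exercised as follows. This is only a sketch: it assumes Perl 5.22 or later (where /n is available), and the pattern and variable names are invented for the example.

```perl
use strict;
use warnings;

# Compile once; reuse standalone or embedded in a larger pattern.
my $date = qr/(\d{4})-(\d{2})/n;     # /n: plain () groups do not capture
my $kind = ref $date;                # "Regexp"

my $standalone = ("2015-06" =~ $date) ? 1 : 0;
my $captured   = defined $1 ? 1 : 0; # stays 0 because /n suppressed capture

# The /n setting travels with the precompiled pattern when it is embedded.
my $embedded = ("rev 2015-06 ok" =~ /rev $date ok/) ? 1 : 0;

# Named groups still capture under /n.
my $year;
$year = $+{year} if "2015-06" =~ /(?<year>\d{4})-\d{2}/n;

print "$kind $standalone $captured $embedded $year\n";
```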
 <p>The last four modifiers listed above, added in Perl 5.14,
@@ -64646,20 +66467,20 @@
 to want to specify explicitly; the other three are selected
 automatically by various pragmas.
 </p>
-<p>See <a href="#perlre-NAME">perlre NAME</a> for additional information on 
valid syntax for STRING, and
+<p>See <a href="#perlre-NAME">perlre NAME</a> for additional information on 
valid syntax for <em>STRING</em>, and
 for a detailed look at the semantics of regular expressions.  In
 particular, all modifiers except the largely obsolete <code>/o</code> are 
further
 explained in <a href="#perlre-Modifiers">perlre Modifiers</a>.  
<code>/o</code> is described in the next section.
 </p>
 </dd>
-<dt>m/PATTERN/msixpodualgc</dt>
-<dd><a name="perlop-m_002fPATTERN_002fmsixpodualgc"></a>
+<dt><code>m/<em>PATTERN</em>/msixpodualngc</code></dt>
+<dd><a name="perlop-m_002fPATTERN_002fmsixpodualngc"></a>
 </dd>
-<dt>/PATTERN/msixpodualgc</dt>
-<dd><a name="perlop-_002fPATTERN_002fmsixpodualgc"></a>
+<dt><code>/<em>PATTERN</em>/msixpodualngc</code></dt>
+<dd><a name="perlop-_002fPATTERN_002fmsixpodualngc"></a>
 <p>Searches a string for a pattern match, and in scalar context returns
 true if it succeeds, false if it fails.  If no string is specified
-via the <code>=~</code> or <code>!~</code> operator, the $_ string is 
searched.  (The
+via the <code>=~</code> or <code>!~</code> operator, the <code>$_</code> 
string is searched.  (The
 string specified with <code>=~</code> need not be an lvalue&ndash;it may be the
 result of an expression evaluation, but remember the <code>=~</code> binds
 rather tightly.)  See also <a href="#perlre-NAME">perlre NAME</a>.
@@ -64671,17 +66492,17 @@
  c  Do not reset search position on a failed match when /g is
     in effect.
 </pre>
-<p>If &quot;/&quot; is the delimiter then the initial <code>m</code> is 
optional.  With the <code>m</code>
+<p>If <code>&quot;/&quot;</code> is the delimiter then the initial 
<code>m</code> is optional.  With the <code>m</code>
 you can use any pair of non-whitespace (ASCII) characters
 as delimiters.  This is particularly useful for matching path names
-that contain &quot;/&quot;, to avoid LTS (leaning toothpick syndrome).  If 
&quot;?&quot; is
+that contain <code>&quot;/&quot;</code>, to avoid LTS (leaning toothpick 
syndrome).  If <code>&quot;?&quot;</code> is
 the delimiter, then a match-only-once rule applies,
-described in <code>m?PATTERN?</code> below.  If &quot;&rsquo;&quot; (single 
quote) is the delimiter,
-no interpolation is performed on the PATTERN.
-When using a character valid in an identifier, whitespace is required
+described in <code>m?<em>PATTERN</em>?</code> below.  If 
<code>&quot;'&quot;</code> (single quote) is the delimiter,
+no interpolation is performed on the <em>PATTERN</em>.
+When using a delimiter character valid in an identifier, whitespace is required
 after the <code>m</code>.
 </p>
-<p>PATTERN may contain variables, which will be interpolated
+<p><em>PATTERN</em> may contain variables, which will be interpolated
 every time the pattern search is evaluated, except
 for when the delimiter is a single quote.  (Note that <code>$(</code>, 
<code>$)</code>, and
 <code>$|</code> are not interpolated because they look like end-of-string 
tests.)
@@ -64720,9 +66541,9 @@
 <p>The bottom line is that using <code>/o</code> is almost never a good idea.
 </p>
 </dd>
-<dt>The empty pattern //</dt>
+<dt>The empty pattern <code>//</code></dt>
 <dd><a name="perlop-The-empty-pattern-_002f_002f"></a>
-<p>If the PATTERN evaluates to the empty string, the last
+<p>If the <em>PATTERN</em> evaluates to the empty string, the last
 <em>successfully</em> matched regular expression is used instead.  In this
 case, only the <code>g</code> and <code>c</code> flags on the empty pattern 
are honored;
 the other flags are taken from the original pattern.  If no match has
@@ -64732,8 +66553,8 @@
 <p>Note that it&rsquo;s possible to confuse Perl into thinking <code>//</code> 
(the empty
 regex) is really <code>//</code> (the defined-or operator).  Perl is usually 
pretty
 good about this, but some pathological cases might trigger this, such as
-<code>$x///</code> (is that <code>($x) / (//)</code> or <code>$x // /</code>?) 
and <code>print $fh //</code>
-(<code>print $fh(//</code> or <code>print($fh //</code>?).  In all of these 
examples, Perl
+<code>$x///</code> (is that <code>($x)&nbsp;/&nbsp;(//)</code><!-- /@w --> or 
<code>$x&nbsp;//&nbsp;/</code><!-- /@w -->?) and 
<code>print&nbsp;$fh&nbsp;//</code><!-- /@w -->
+(<code>print&nbsp;$fh(//</code><!-- /@w --> or 
<code>print($fh&nbsp;//</code><!-- /@w -->?).  In all of these examples, Perl
 will assume you meant defined-or.  If you meant the empty regex, just
 use parentheses or spaces to disambiguate, or even prefix the empty
 regex with an <code>m</code> (so <code>//</code> becomes <code>m//</code>).
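
A hedged sketch of the empty-pattern rule: after a successful match, an empty pattern reuses that last successful regular expression. The strings and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# This successful match becomes the "last successfully matched" pattern.
my $seen_digits = ("abc123" =~ /\d+/) ? 1 : 0;

# The empty pattern reuses /\d+/, so it fails on a digit-free string.
my $reuse_fail = ("no digits" =~ //) ? 1 : 0;

# m// is the unambiguous spelling of the empty regex.
my $reuse_ok = ("room 101" =~ m//) ? 1 : 0;

print "$seen_digits $reuse_fail $reuse_ok\n";
```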
@@ -64767,9 +66588,9 @@
 
  if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))
 </pre>
-<p>This last example splits $foo into the first two words and the
-remainder of the line, and assigns those three fields to $F1, $F2, and
-$Etc.  The conditional is true if any variables were assigned; that is,
+<p>This last example splits <code>$foo</code> into the first two words and the
+remainder of the line, and assigns those three fields to <code>$F1</code>, 
<code>$F2</code>, and
+<code>$Etc</code>.  The conditional is true if any variables were assigned; 
that is,
 if the pattern matched.
 </p>
 <p>The <code>/g</code> modifier specifies global pattern matching&ndash;that 
is,
@@ -64789,7 +66610,7 @@
 string also resets the search position.
 </p>
 </dd>
-<dt>\G assertion</dt>
+<dt><code>\G <em>assertion</em></code></dt>
 <dd><a name="perlop-_005cG-assertion"></a>
 <p>You can intermix <code>m//g</code> matches with <code>m/\G.../g</code>, 
where <code>\G</code> is a
 zero-width assertion that matches the exact position where the
@@ -64905,13 +66726,13 @@
  lowercase line-noise MiXeD line-noise. That's all!
 </pre>
 </dd>
-<dt>m?PATTERN?msixpodualgc</dt>
-<dd><a name="perlop-m_003fPATTERN_003fmsixpodualgc"></a>
+<dt><code>m?<em>PATTERN</em>?msixpodualngc</code></dt>
+<dd><a name="perlop-m_003fPATTERN_003fmsixpodualngc"></a>
 </dd>
-<dt>?PATTERN?msixpodualgc</dt>
-<dd><a name="perlop-_003fPATTERN_003fmsixpodualgc"></a>
-<p>This is just like the <code>m/PATTERN/</code> search, except that it matches
-only once between calls to the reset() operator.  This is a useful
+<dt><code>?<em>PATTERN</em>?msixpodualngc</code></dt>
+<dd><a name="perlop-_003fPATTERN_003fmsixpodualngc"></a>
+<p>This is just like the <code>m/<em>PATTERN</em>/</code> search, except that 
it matches
+only once between calls to the <code>reset()</code> operator.  This is a useful
 optimization when you want to see only the first occurrence of
 something in each file of a set of files, for instance.  Only <code>m??</code>
 patterns local to the current package are reset.
@@ -64932,14 +66753,14 @@
 <p>The match-once behavior is controlled by the match delimiter being
 <code>?</code>; with any other delimiter this is the normal <code>m//</code> 
operator.  
 </p>
-<p>For historical reasons, the leading <code>m</code> in 
<code>m?PATTERN?</code> is optional,
-but the resulting <code>?PATTERN?</code> syntax is deprecated, will warn on
-usage and might be removed from a future stable release of Perl (without
-further notice!).
+<p>In the past, the leading <code>m</code> in <code>m?<em>PATTERN</em>?</code> 
was optional, but omitting it
+would produce a deprecation warning.  As of v5.22.0, omitting it produces a
+syntax error.  If you encounter this construct in older code, you can just add
+<code>m</code>.
 </p>
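
The match-once rule can be sketched like this, using the m?...? spelling that remains valid in v5.22. The word list and variable names are hypothetical.

```perl
use strict;
use warnings;

my @first;
for my $round (1, 2) {
    for my $word (qw(alpha apple avocado)) {
        # Matches only once until reset() is called.
        push @first, "$round:$word" if $word =~ m?^a?;
    }
    reset;   # re-arm the m?? patterns in the current package
}

print "@first\n";
```

Without the call to reset, only the first word of the first round would be recorded.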
 </dd>
-<dt>s/PATTERN/REPLACEMENT/msixpodualgcer</dt>
-<dd><a name="perlop-s_002fPATTERN_002fREPLACEMENT_002fmsixpodualgcer"></a>
+<dt><code>s/<em>PATTERN</em>/<em>REPLACEMENT</em>/msixpodualngcer</code></dt>
+<dd><a name="perlop-s_002fPATTERN_002fREPLACEMENT_002fmsixpodualngcer"></a>
 <p>Searches a string for a pattern, and if found, replaces that pattern
 with the replacement text and returns the number of substitutions
 made.  Otherwise it returns false (specifically, the empty string).
@@ -64958,15 +66779,15 @@
 scalar lvalue.
 </p>
 <p>If the delimiter chosen is a single quote, no interpolation is
-done on either the PATTERN or the REPLACEMENT.  Otherwise, if the
-PATTERN contains a $ that looks like a variable rather than an
+done on either the <em>PATTERN</em> or the <em>REPLACEMENT</em>.  Otherwise, 
if the
+<em>PATTERN</em> contains a <code>$</code> that looks like a variable rather 
than an
 end-of-string test, the variable will be interpolated into the pattern
 at run-time.  If you want the pattern compiled only once the first time
 the variable is interpolated, use the <code>/o</code> option.  If the pattern
 evaluates to the empty string, the last successfully executed regular
 expression is used instead.  See <a href="#perlre-NAME">perlre NAME</a> for 
further explanation on these.
 </p>
-<p>Options are as with m// with the addition of the following replacement
+<p>Options are as with <code>m//</code> with the addition of the following 
replacement
 specific options:
 </p>
 <pre class="verbatim">    e   Evaluate the right side as an expression.
@@ -64980,7 +66801,7 @@
 are used, no interpretation is done on the replacement string (the 
<code>/e</code>
 modifier overrides this, however).  Note that Perl treats backticks
 as normal delimiters; the replacement text is not evaluated as a command.
-If the PATTERN is delimited by bracketing quotes, the REPLACEMENT has
+If the <em>PATTERN</em> is delimited by bracketing quotes, the 
<em>REPLACEMENT</em> has
 its own pair of quotes, which may or may not be bracketing quotes, for example,
 <code>s(foo)(bar)</code> or <code>s&lt;foo&gt;/bar/</code>.  A <code>/e</code> 
will cause the
 replacement portion to be treated as a full-fledged Perl expression
@@ -65055,8 +66876,8 @@
 
     s/([^ ]*) *([^ ]*)/$2 $1/;  # reverse 1st two fields
 </pre>
-<p>Note the use of $ instead of \ in the last example.  Unlike
-<strong>sed</strong>, we use the \&lt;<em>digit</em>&gt; form in only the left 
hand side.
+<p>Note the use of <code>$</code> instead of <code>\</code> in the last 
example.  Unlike
+<strong>sed</strong>, we use the \&lt;<em>digit</em>&gt; form only in the left 
hand side.
 Anywhere else it&rsquo;s $&lt;<em>digit</em>&gt;.
 </p>
 <p>Occasionally, you can&rsquo;t use just a <code>/g</code> to get all the 
changes
@@ -65081,10 +66902,10 @@
 <h4 class="subsection">48.2.31 Quote-Like Operators</h4>
 
 <dl compact="compact">
-<dt>q/STRING/</dt>
+<dt><code>q/<em>STRING</em>/</code></dt>
 <dd><a name="perlop-q_002fSTRING_002f"></a>
 </dd>
-<dt>&rsquo;STRING&rsquo;</dt>
+<dt><code>'<em>STRING</em>'</code></dt>
 <dd><a name="perlop-_0027STRING_0027"></a>
 <p>A single-quoted, literal string.  A backslash represents a backslash
 unless followed by the delimiter or another backslash, in which case
@@ -65095,10 +66916,10 @@
     $baz = '\n';                # a two-character string
 </pre>
 </dd>
-<dt>qq/STRING/</dt>
+<dt><code>qq/<em>STRING</em>/</code></dt>
 <dd><a name="perlop-qq_002fSTRING_002f"></a>
 </dd>
-<dt>&quot;STRING&quot;</dt>
+<dt>&quot;<em>STRING</em>&quot;</dt>
 <dd><a name="perlop-_0022STRING_0022"></a>
 <p>A double-quoted, interpolated string.
 </p>
@@ -65108,19 +66929,19 @@
     $baz = &quot;\n&quot;;                # a one-character string
 </pre>
 </dd>
-<dt>qx/STRING/</dt>
+<dt><code>qx/<em>STRING</em>/</code></dt>
 <dd><a name="perlop-qx_002fSTRING_002f"></a>
 </dd>
-<dt>&lsquo;STRING&lsquo;</dt>
+<dt><code>`<em>STRING</em>`</code></dt>
 <dd><a name="perlop-_0060STRING_0060"></a>
 <p>A string which is (possibly) interpolated and then executed as a
 system command with <samp>/bin/sh</samp> or its equivalent.  Shell wildcards,
 pipes, and redirections will be honored.  The collected standard
 output of the command is returned; standard error is unaffected.  In
 scalar context, it comes back as a single (potentially multi-line)
-string, or undef if the command failed.  In list context, returns a
-list of lines (however you&rsquo;ve defined lines with $/ or
-$INPUT_RECORD_SEPARATOR), or an empty list if the command failed.
+string, or <code>undef</code> if the command failed.  In list context, returns 
a
+list of lines (however you&rsquo;ve defined lines with <code>$/</code> or
+<code>$INPUT_RECORD_SEPARATOR</code>), or an empty list if the command failed.
 </p>
 <p>Because backticks do not affect standard error, use shell file descriptor
 syntax (assuming the shell supports this) if you care to address this.
@@ -65167,7 +66988,7 @@
 interpreter on your system.  On most platforms, you will have to protect
 shell metacharacters if you want them treated literally.  This is in
 practice difficult to do, as it&rsquo;s unclear how to escape which characters.
-See <a href="#perlsec-NAME">perlsec NAME</a> for a clean and safe example of a 
manual fork() and exec()
+See <a href="#perlsec-NAME">perlsec NAME</a> for a clean and safe example of a 
manual <code>fork()</code> and <code>exec()</code>
 to emulate backticks safely.
 </p>
 <p>On some platforms (notably DOS-like ones), the shell may not be
@@ -65180,8 +67001,8 @@
 <p>Perl will attempt to flush all files opened for
 output before starting the child process, but this may not be supported
 on some platforms (see <a href="#perlport-NAME">perlport NAME</a>).  To be 
safe, you may need to set
-<code>$|</code> ($AUTOFLUSH in English) or call the <code>autoflush()</code> 
method of
-<code>IO::Handle</code> on any open handles.
+<code>$|</code> (<code>$AUTOFLUSH</code> in <code><a 
href="English.html#Top">(English)</a></code>) or call the 
<code>autoflush()</code> method of
+<code><a href="IO-Handle.html#Top">(IO-Handle)</a></code> on any open handles.
 </p>
 <p>Beware that some command shells may place restrictions on the length
 of the command line.  You must ensure your strings don&rsquo;t exceed this
@@ -65200,9 +67021,9 @@
 <p>See <a href="#perlop-I_002fO-Operators">I/O Operators</a> for more 
discussion.
 </p>
 </dd>
-<dt>qw/STRING/</dt>
+<dt><code>qw/<em>STRING</em>/</code></dt>
 <dd><a name="perlop-qw_002fSTRING_002f"></a>
-<p>Evaluates to a list of the words extracted out of STRING, using embedded
+<p>Evaluates to a list of the words extracted out of <em>STRING</em>, using 
embedded
 whitespace as the word delimiters.  It can be understood as being roughly
 equivalent to:
 </p>
@@ -65223,21 +67044,21 @@
 <pre class="verbatim">    use POSIX qw( setlocale localeconv )
     @EXPORT = qw( foo bar baz );
 </pre>
-<p>A common mistake is to try to separate the words with comma or to
+<p>A common mistake is to try to separate the words with commas or to
 put comments into a multi-line <code>qw</code>-string.  For this reason, the
-<code>use warnings</code> pragma and the <strong>-w</strong> switch (that is, 
the <code>$^W</code> variable)
-produces warnings if the STRING contains the &quot;,&quot; or the 
&quot;#&quot; character.
+<code>use&nbsp;warnings</code><!-- /@w --> pragma and the <strong>-w</strong> 
switch (that is, the <code>$^W</code> variable)
+produces warnings if the <em>STRING</em> contains the 
<code>&quot;,&quot;</code> or the <code>&quot;#&quot;</code> character.
 </p>
 </dd>
-<dt>tr/SEARCHLIST/REPLACEMENTLIST/cdsr</dt>
+<dt><code>tr/<em>SEARCHLIST</em>/<em>REPLACEMENTLIST</em>/cdsr</code></dt>
 <dd><a name="perlop-tr_002fSEARCHLIST_002fREPLACEMENTLIST_002fcdsr"></a>
 </dd>
-<dt>y/SEARCHLIST/REPLACEMENTLIST/cdsr</dt>
+<dt><code>y/<em>SEARCHLIST</em>/<em>REPLACEMENTLIST</em>/cdsr</code></dt>
 <dd><a name="perlop-y_002fSEARCHLIST_002fREPLACEMENTLIST_002fcdsr"></a>
 <p>Transliterates all occurrences of the characters found in the search list
 with the corresponding character in the replacement list.  It returns
 the number of characters replaced or deleted.  If no string is
-specified via the <code>=~</code> or <code>!~</code> operator, the $_ string 
is transliterated.
+specified via the <code>=~</code> or <code>!~</code> operator, the 
<code>$_</code> string is transliterated.
 </p>
 <p>If the <code>/r</code> (non-destructive) option is present, a new copy of 
the string
 is made and its characters transliterated, and this copy is returned no
@@ -65252,12 +67073,22 @@
 <p>A character range may be specified with a hyphen, so 
<code>tr/A-J/0-9/</code>
 does the same replacement as <code>tr/ACEGIBDFHJ/0246813579/</code>.
 For <strong>sed</strong> devotees, <code>y</code> is provided as a synonym for 
<code>tr</code>.  If the
-SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST has
+<em>SEARCHLIST</em> is delimited by bracketing quotes, the 
<em>REPLACEMENTLIST</em> has
 its own pair of quotes, which may or may not be bracketing quotes;
 for example, <code>tr[aeiouy][yuoiea]</code> or <code>tr(+\-*/)/ABCD/</code>.
 </p>
+<p>Characters may be literals or any of the escape sequences accepted in
+double-quoted strings.  But there is no interpolation, so 
<code>&quot;$&quot;</code> and
+<code>&quot;@&quot;</code> are treated as literals.  A hyphen at the beginning 
or end, or
+preceded by a backslash is considered a literal.  Escape sequence
+details are in <a href="#perlop-Quote-and-Quote_002dlike-Operators">the table 
near the beginning of this section</a>.  It is a bug in Perl v5.22 that 
something like
+</p>
+<pre class="verbatim"> tr/\N{U+20}-\N{U+7E}foobar//
+</pre>
+<p>does not treat that range as fully Unicode.
+</p>
 <p>Note that <code>tr</code> does <strong>not</strong> do regular expression 
character classes such as
-<code>\d</code> or <code>\pL</code>.  The <code>tr</code> operator is not 
equivalent to the tr(1)
+<code>\d</code> or <code>\pL</code>.  The <code>tr</code> operator is not 
equivalent to the <code><a href="http://man.he.net/man1/tr">tr(1)</a></code>
 utility.  If you want to map strings between lower/upper cases, see
 <a href="#perlfunc-lc">perlfunc lc</a> and <a href="#perlfunc-uc">perlfunc 
uc</a>, and in general consider using the <code>s</code>
 operator if you need regular expressions.  The <code>\U</code>, 
<code>\u</code>, <code>\L</code>, and
@@ -65280,19 +67111,19 @@
     r   Return the modified string and leave the original string
         untouched.
 </pre>
-<p>If the <code>/c</code> modifier is specified, the SEARCHLIST character set
+<p>If the <code>/c</code> modifier is specified, the <em>SEARCHLIST</em> 
character set
 is complemented.  If the <code>/d</code> modifier is specified, any characters
-specified by SEARCHLIST not found in REPLACEMENTLIST are deleted.
+specified by <em>SEARCHLIST</em> not found in <em>REPLACEMENTLIST</em> are 
deleted.
 (Note that this is slightly more flexible than the behavior of some
-<strong>tr</strong> programs, which delete anything they find in the 
SEARCHLIST,
+<strong>tr</strong> programs, which delete anything they find in the 
<em>SEARCHLIST</em>,
 period.)  If the <code>/s</code> modifier is specified, sequences of characters
 that were transliterated to the same character are squashed down
 to a single instance of the character.
 </p>
-<p>If the <code>/d</code> modifier is used, the REPLACEMENTLIST is always 
interpreted
-exactly as specified.  Otherwise, if the REPLACEMENTLIST is shorter
-than the SEARCHLIST, the final character is replicated till it is long
-enough.  If the REPLACEMENTLIST is empty, the SEARCHLIST is replicated.
+<p>If the <code>/d</code> modifier is used, the <em>REPLACEMENTLIST</em> is 
always interpreted
+exactly as specified.  Otherwise, if the <em>REPLACEMENTLIST</em> is shorter
+than the <em>SEARCHLIST</em>, the final character is replicated till it is long
+enough.  If the <em>REPLACEMENTLIST</em> is empty, the <em>SEARCHLIST</em> is 
replicated.
 This latter is useful for counting characters in a class or for
 squashing character sequences in a class.
 </p>
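
The range, counting, /d, /s and /r behaviours described above can be tried with a short sketch. It assumes Perl 5.14 or later for /r; all strings and variable names are illustrative.

```perl
use strict;
use warnings;

# A range maps A..J onto 0..9, character by character.
(my $digits = "ACEGI") =~ tr/A-J/0-9/;

# Empty REPLACEMENTLIST without /d: the string is unchanged, so the
# return value simply counts the characters in the class.
my $text  = "hello world";
my $count = ($text =~ tr/a-z//);

# /d deletes characters found in SEARCHLIST with no replacement.
(my $letters = "a1b2c3") =~ tr/0-9//d;

# /s squashes runs of the same result character; /r returns a copy.
my $squashed = ("aabbccdd" =~ tr/a-z//sr);

print "$digits $count $letters $squashed\n";
```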
@@ -65330,9 +67161,9 @@
 <p>will transliterate any A to X.
 </p>
 <p>Because the transliteration table is built at compile time, neither
-the SEARCHLIST nor the REPLACEMENTLIST are subjected to double quote
+the <em>SEARCHLIST</em> nor the <em>REPLACEMENTLIST</em> are subjected to 
double quote
 interpolation.  That means that if you want to use variables, you
-must use an eval():
+must use an <code>eval()</code>:
 </p>
 <pre class="verbatim">    eval &quot;tr/$oldlist/$newlist/&quot;;
     die $@ if $@;
@@ -65340,7 +67171,7 @@
     eval &quot;tr/$oldlist/$newlist/, 1&quot; or die $@;
 </pre>
 </dd>
-<dt>&lt;&lt;EOF    &gt;</dt>
+<dt><code>&lt;&lt;<em>EOF</em></code>    &gt;</dt>
 <dd><a name="perlop-_003c_003cEOF-_003e"></a>
 <p>A line-oriented form of quoting is based on the shell 
&quot;here-document&quot;
 syntax.  Following a <code>&lt;&lt;</code> you specify a string to terminate
@@ -65522,10 +67353,10 @@
 <dl compact="compact">
 <dt>Finding the end</dt>
 <dd><a name="perlop-Finding-the-end"></a>
-<p>The first pass is finding the end of the quoted construct, where
-the information about the delimiters is used in parsing.
-During this search, text between the starting and ending delimiters
-is copied to a safe location.  The text copied gets delimiter-independent.
+<p>The first pass is finding the end of the quoted construct.  This results
+in saving to a safe location a copy of the text (between the starting
+and ending delimiters), normalized as necessary to avoid needing to know
+what the original delimiters were.
 </p>
 <p>If the construct is a here-doc, the ending delimiter is a line
 that has a terminating string as the content.  Therefore 
<code>&lt;&lt;EOF</code> is
@@ -65540,7 +67371,7 @@
 (that is <code>(</code>, <code>[</code>, <code>{</code>, or 
<code>&lt;</code>), the ending delimiter is the
 corresponding closing punctuation (that is <code>)</code>, <code>]</code>, 
<code>}</code>, or <code>&gt;</code>).
 If the starting delimiter is an unpaired character like <code>/</code> or a 
closing
-punctuation, the ending delimiter is same as the starting delimiter.
+punctuation, the ending delimiter is the same as the starting delimiter.
 Therefore a <code>/</code> terminates a <code>qq//</code> construct, while a 
<code>]</code> terminates
 both <code>qq[]</code> and <code>qq]]</code> constructs.
 </p>
@@ -65548,7 +67379,7 @@
 and <code>\\</code> are skipped.  For example, while searching for terminating 
<code>/</code>,
 combinations of <code>\\</code> and <code>\/</code> are skipped.  If the 
delimiters are
 bracketing, nested pairs are also skipped.  For example, while searching
-for closing <code>]</code> paired with the opening <code>[</code>, 
combinations of <code>\\</code>, <code>\]</code>,
+for a closing <code>]</code> paired with the opening <code>[</code>, 
combinations of <code>\\</code>, <code>\]</code>,
 and <code>\[</code> are all skipped, and nested <code>[</code> and 
<code>]</code> are skipped as well.
 However, when backslashes are used as the delimiters (like <code>qq\\</code> 
and
 <code>tr\\\</code>), nothing is skipped.
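
How the end of a quoted construct is found may be clearer from a few literal cases. This is a minimal sketch with made-up strings, not an excerpt from the manual.

```perl
use strict;
use warnings;

my $nested  = q[a[b]c];     # nested bracket pairs are skipped
my $escaped = q/a\/b/;      # \/ is skipped while looking for the closing /
my $closing = qq]x y];      # a closing bracket used as an unpaired delimiter

print "$nested $escaped $closing\n";
```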
@@ -65565,7 +67396,7 @@
 If the left part is delimited by bracketing punctuation (that is 
<code>()</code>,
 <code>[]</code>, <code>{}</code>, or <code>&lt;&gt;</code>), the right part 
needs another pair of
 delimiters such as <code>s(){}</code> and <code>tr[]//</code>.  In these 
cases, whitespace
-and comments are allowed between the two parts, though the comment must follow
+and comments are allowed between the two parts, although the comment must 
follow
 at least one whitespace character; otherwise a character expected as the 
 start of the comment may be regarded as the starting delimiter of the right 
part.
 </p>
@@ -65615,7 +67446,7 @@
 <dt><code>''</code>, <code>q//</code>, <code>tr'''</code>, <code>y'''</code>, 
the replacement of <code>s'''</code></dt>
 <dd><a 
name="perlop-_0027_0027_002c-q_002f_002f_002c-tr_0027_0027_0027_002c-y_0027_0027_0027_002c-the-replacement-of-s_0027_0027_0027"></a>
 <p>The only interpolation is removal of <code>\</code> from pairs of 
<code>\\</code>.
-Therefore <code>-</code> in <code>tr'''</code> and <code>y'''</code> is 
treated literally
+Therefore <code>&quot;-&quot;</code> in <code>tr'''</code> and 
<code>y'''</code> is treated literally
 as a hyphen and no character range is available.
 <code>\1</code> in the replacement of <code>s'''</code> does not work as 
<code>$1</code>.
 </p>
@@ -65626,15 +67457,15 @@
 case and quoting such as <code>\Q</code>, <code>\U</code>, and <code>\E</code> 
are not recognized.
 The other escape sequences such as <code>\200</code> and <code>\t</code> and 
backslashed
 characters such as <code>\\</code> and <code>\-</code> are converted to 
appropriate literals.
-The character <code>-</code> is treated specially and therefore 
<code>\-</code> is treated
-as a literal <code>-</code>.
+The character <code>&quot;-&quot;</code> is treated specially and therefore 
<code>\-</code> is treated
+as a literal <code>&quot;-&quot;</code>.
 </p>
 </dd>
 <dt><code>&quot;&quot;</code>, <code>``</code>, <code>qq//</code>, 
<code>qx//</code>, <code>&lt;file*glob&gt;</code>, 
<code>&lt;&lt;&quot;EOF&quot;</code></dt>
 <dd><a 
name="perlop-_0022_0022_002c-_0060_0060_002c-qq_002f_002f_002c-qx_002f_002f_002c-_003cfile_002aglob_003e_002c-_003c_003c_0022EOF_0022"></a>
 <p><code>\Q</code>, <code>\U</code>, <code>\u</code>, <code>\L</code>, 
<code>\l</code>, <code>\F</code> (possibly paired with <code>\E</code>) are
 converted to corresponding Perl constructs.  Thus, 
<code>&quot;$foo\Qbaz$bar&quot;</code>
-is converted to <code>$foo . (quotemeta(&quot;baz&quot; . $bar))</code> 
internally.
+is converted to 
<code>$foo&nbsp;.&nbsp;(quotemeta(&quot;baz&quot;&nbsp;.&nbsp;$bar))</code><!-- 
/@w --> internally.
 The other escape sequences such as <code>\200</code> and <code>\t</code> and 
backslashed
 characters such as <code>\\</code> and <code>\-</code> are replaced with 
appropriate
 expansions.
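The \Q conversion stated above can be checked directly. The following is a small sketch; the values assigned to $foo and $bar are assumptions picked for the example.

```perl
use strict;
use warnings;

# Hypothetical values; $bar contains regex metacharacters on purpose.
my $foo = "id=";
my $bar = "a.b*c";

# \Q with no closing \E applies to the rest of the string.
my $interpolated = "$foo\Qbaz$bar";

# The explicit form the documentation says Perl uses internally.
my $explicit = $foo . quotemeta("baz" . $bar);

print $interpolated eq $explicit ? "same\n" : "different\n";   # same
print "$interpolated\n";                                        # id=baza\.b\*c
```

Note that $foo stays unquoted because it appears before the \Q.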
@@ -65653,21 +67484,21 @@
 <p>may be closer to the conjectural <em>intention</em> of the writer of 
<code>&quot;\Q\t\E&quot;</code>.
 </p>
 <p>Interpolated scalars and arrays are converted internally to the 
<code>join</code> and
-<code>.</code> catenation operations.  Thus, <code>&quot;$foo XXX 
'@arr'&quot;</code> becomes:
+<code>&quot;.&quot;</code> catenation operations.  Thus, 
<code>&quot;$foo&nbsp;XXX&nbsp;'@arr'&quot;</code><!-- /@w --> becomes:
 </p>
 <pre class="verbatim">  $foo . &quot; XXX '&quot; . (join $&quot;, @arr) . 
&quot;'&quot;;
 </pre>
 <p>All operations above are performed simultaneously, left to right.
 </p>
-<p>Because the result of <code>&quot;\Q STRING \E&quot;</code> has all 
metacharacters
+<p>Because the result of 
<code>&quot;\Q&nbsp;<em>STRING</em>&nbsp;\E&quot;</code><!-- /@w --> has all 
metacharacters
 quoted, there is no way to insert a literal <code>$</code> or <code>@</code> 
inside a
-<code>\Q\E</code> pair.  If protected by <code>\</code>, <code>$</code> will 
be quoted to became
+<code>\Q\E</code> pair.  If protected by <code>\</code>, <code>$</code> will 
be quoted to become
 <code>&quot;\\\$&quot;</code>; if not, it is interpreted as the start of an 
interpolated
 scalar.
 </p>
 <p>Note also that the interpolation code needs to make a decision on
 where the interpolated scalar ends.  For instance, whether
-<code>&quot;a $x -&gt; {c}&quot;</code> really means:
+<code>&quot;a&nbsp;$x&nbsp;<span 
class="nolinebreak">-&gt;</span>&nbsp;{c}&quot;</code><!-- /@w --> really means:
 </p>
 <pre class="verbatim">  &quot;a &quot; . $x . &quot; -&gt; {c}&quot;;
 </pre>
@@ -65690,7 +67521,7 @@
 <p>It is at this step that <code>\1</code> is begrudgingly converted to 
<code>$1</code> in
 the replacement text of <code>s///</code>, in order to correct the incorrigible
 <em>sed</em> hackers who haven&rsquo;t picked up the saner idiom yet.  A 
warning
-is emitted if the <code>use warnings</code> pragma or the <strong>-w</strong> 
command-line flag
+is emitted if the <code>use&nbsp;warnings</code><!-- /@w --> pragma or the 
<strong>-w</strong> command-line flag
 (that is, the <code>$^W</code> variable) was set.
 </p>
 </dd>
@@ -65715,10 +67546,10 @@
 back to the perl parser, in a similar way that an interpolated array
 subscript expression such as 
<code>&quot;foo$array[1+f(&quot;[xyz&quot;)]bar&quot;</code> would be.
 </p>
-<p>Moreover, inside <code>(?{BLOCK})</code>, <code>(?# comment )</code>, and
-a <code>#</code>-comment in a <code>//x</code>-regular expression, no 
processing is
+<p>Moreover, inside <code>(?{BLOCK})</code>, 
<code>(?#&nbsp;comment&nbsp;)</code><!-- /@w -->, and
+a <code>#</code>-comment in a <code>/x</code>-regular expression, no 
processing is
 performed whatsoever.  This is the first step at which the presence
-of the <code>//x</code> modifier is relevant.
+of the <code>/x</code> modifier is relevant.
 </p>
 <p>Interpolation in patterns has several quirks: <code>$|</code>, 
<code>$(</code>, <code>$)</code>, <code>@+</code>
 and <code>@-</code> are not interpolated, and constructs 
<code>$var[SOMETHING]</code> are
@@ -65744,7 +67575,7 @@
 </pre>
 <p>In the RE above, which is intentionally obfuscated for illustration, the
 delimiter is <code>m</code>, the modifier is <code>mx</code>, and after 
delimiter-removal the
-RE is the same as for <code>m/ ^ a \s* b /mx</code>.  There&rsquo;s more than 
one
+RE is the same as for 
<code>m/&nbsp;^&nbsp;a&nbsp;\s*&nbsp;b&nbsp;/mx</code><!-- /@w -->.  
There&rsquo;s more than one
 reason you&rsquo;re encouraged to restrict your delimiters to non-alphanumeric,
 non-whitespace choices.
 </p>
@@ -65767,9 +67598,9 @@
 <p>Whatever happens in the RE engine might be better discussed in <a 
href="#perlre-NAME">perlre NAME</a>,
 but for the sake of continuity, we shall do so here.
 </p>
-<p>This is another step where the presence of the <code>//x</code> modifier is
+<p>This is another step where the presence of the <code>/x</code> modifier is
 relevant.  The RE engine scans the string from left to right and
-converts it to a finite automaton.
+converts it into a finite automaton.
 </p>
 <p>Backslashed characters are either replaced with corresponding
 literal strings (as with <code>\{</code>), or else they generate special nodes
@@ -65777,7 +67608,7 @@
 RE engine (such as <code>|</code>) generate corresponding nodes or groups of
 nodes.  <code>(?#...)</code> comments are ignored.  All the rest is either
 converted to literal strings to match, or else is ignored (as is
-whitespace and <code>#</code>-style comments if <code>//x</code> is present).
+whitespace and <code>#</code>-style comments if <code>/x</code> is present).
 </p>
 <p>Parsing of the bracketed character class construct, <code>[...]</code>, is
 rather different than the rule used for the rest of the pattern.
@@ -65792,7 +67623,7 @@
 </p>
 <p>It is possible to inspect both the string given to RE engine and the
 resulting finite automaton.  See the arguments 
<code>debug</code>/<code>debugcolor</code>
-in the <code>use <a href="re.html#Top">(re)</a></code> pragma, as well as 
Perl&rsquo;s <strong>-Dr</strong> command-line
+in the <code>use&nbsp;<a href="re.html#Top">(re)</a></code><!-- /@w --> 
pragma, as well as Perl&rsquo;s <strong>-Dr</strong> command-line
 switch documented in <a href="#perlrun-Command-Switches">perlrun Command 
Switches</a>.
 </p>
 </dd>
@@ -65818,7 +67649,7 @@
 <a name="I_002fO-Operators"></a>
 <h4 class="subsection">48.2.33 I/O Operators</h4>
 
-<pre class="verbatim"> &gt;&gt; 
+<pre class="verbatim"> &gt;&gt;  &gt;&gt; 
 </pre>
 <p>There are several I/O operators you should know about.
 </p>
@@ -65849,11 +67680,11 @@
 there is one situation where an automatic assignment happens.  If
 and only if the input symbol is the only thing inside the conditional
 of a <code>while</code> statement (even if disguised as a <code>for(;;)</code> 
loop),
-the value is automatically assigned to the global variable $_,
+the value is automatically assigned to the global variable <code>$_</code>,
 destroying whatever was there previously.  (This may seem like an
 odd thing to you, but you&rsquo;ll use the construct in almost every Perl
-script you write.)  The $_ variable is not implicitly localized.
-You&rsquo;ll have to put a <code>local $_;</code> before the loop if you want 
that
+script you write.)  The <code>$_</code> variable is not implicitly localized.
+You&rsquo;ll have to put a <code>local&nbsp;<span 
class="nolinebreak">$_;</span></code><!-- /@w --> before the loop if you want 
that
 to happen.
 </p>
 <p>The following lines are equivalent:
@@ -65875,40 +67706,40 @@
 is automatic or explicit) is then tested to see whether it is
 defined.  The defined test avoids problems where the line has a string
 value that would be treated as false by Perl; for example a &quot;&quot; or
-a &quot;0&quot; with no trailing newline.  If you really mean for such values
+a <code>&quot;0&quot;</code> with no trailing newline.  If you really mean for 
such values
 to terminate the loop, they should be tested for explicitly:
 </p>
 <pre class="verbatim">    while (($_ = &lt;STDIN&gt;) ne '0') { ... }
     while (&lt;STDIN&gt;) { last unless $_; ... }
 </pre>
-<p>In other boolean contexts, <code>&lt;FILEHANDLE&gt;</code> without an
+<p>In other boolean contexts, <code>&lt;<em>FILEHANDLE</em>&gt;</code> without 
an
 explicit <code>defined</code> test or comparison elicits a warning if the
-<code>use warnings</code> pragma or the <strong>-w</strong>
+<code>use&nbsp;warnings</code><!-- /@w --> pragma or the <strong>-w</strong>
 command-line switch (the <code>$^W</code> variable) is in effect.
 </p>
 <p>The filehandles STDIN, STDOUT, and STDERR are predefined.  (The
 filehandles <code>stdin</code>, <code>stdout</code>, and <code>stderr</code> 
will also work except
 in packages, where they would be interpreted as local identifiers
 rather than global.)  Additional filehandles may be created with
-the open() function, amongst others.  See <a 
href="#perlopentut-NAME">perlopentut NAME</a> and
+the <code>open()</code> function, amongst others.  See <a 
href="#perlopentut-NAME">perlopentut NAME</a> and
 &lsquo;perlfunc open&rsquo; for details on this.
 </p>
-<p>If a &lt;FILEHANDLE&gt; is used in a context that is looking for
+<p>If a <code>&lt;<em>FILEHANDLE</em>&gt;</code> is used in a context that is 
looking for
 a list, a list comprising all input lines is returned, one line per
 list element.  It&rsquo;s easy to grow to a rather large data space this
 way, so use with care.
 </p>
-<p>&lt;FILEHANDLE&gt; may also be spelled <code>readline(*FILEHANDLE)</code>.
+<p><code>&lt;<em>FILEHANDLE</em>&gt;</code>  may also be spelled 
<code>readline(*<em>FILEHANDLE</em>)</code>.
 See <a href="#perlfunc-readline">perlfunc readline</a>.
 </p>
-<p>The null filehandle &lt;&gt; is special: it can be used to emulate the
+<p>The null filehandle <code>&lt;&gt;</code> is special: it can be used to 
emulate the
 behavior of <strong>sed</strong> and <strong>awk</strong>, and any other Unix 
filter program
 that takes a list of filenames, doing the same to each line
-of input from all of them.  Input from &lt;&gt; comes either from
+of input from all of them.  Input from <code>&lt;&gt;</code> comes either from
 standard input, or from each file listed on the command line.  Here&rsquo;s
-how it works: the first time &lt;&gt; is evaluated, the @ARGV array is
-checked, and if it is empty, <code>$ARGV[0]</code> is set to &quot;-&quot;, 
which when opened
-gives you standard input.  The @ARGV array is then processed as a list
+how it works: the first time <code>&lt;&gt;</code> is evaluated, the 
<code>@ARGV</code> array is
+checked, and if it is empty, <code>$ARGV[0]</code> is set to 
<code>&quot;-&quot;</code>, which when opened
+gives you standard input.  The <code>@ARGV</code> array is then processed as a 
list
 of filenames.  The loop
 </p>
 <pre class="verbatim">    while (&lt;&gt;) {
@@ -65926,11 +67757,11 @@
     }
 </pre>
 <p>except that it isn&rsquo;t so cumbersome to say, and will actually work.
-It really does shift the @ARGV array and put the current filename
-into the $ARGV variable.  It also uses filehandle <em>ARGV</em>
-internally.  &lt;&gt; is just a synonym for &lt;ARGV&gt;, which
+It really does shift the <code>@ARGV</code> array and put the current filename
+into the <code>$ARGV</code> variable.  It also uses filehandle <em>ARGV</em>
+internally.  <code>&lt;&gt;</code> is just a synonym for 
<code>&lt;ARGV&gt;</code>, which
 is magical.  (The pseudo code above doesn&rsquo;t work because it treats
-&lt;ARGV&gt; as non-magical.)
+<code>&lt;ARGV&gt;</code> as non-magical.)
 </p>
 <p>Since the null filehandle uses the two argument form of &lsquo;perlfunc 
open&rsquo;
 it interprets special characters, so if you have a script like this:
@@ -65939,18 +67770,28 @@
         print;
     }
 </pre>
-<p>and call it with <code>perl dangerous.pl 'rm -rfv *|'</code>, it actually 
opens a
+<p>and call it with <code>perl&nbsp;dangerous.pl&nbsp;'rm&nbsp;<span 
class="nolinebreak">-rfv</span>&nbsp;*|'</code><!-- /@w -->, it actually opens a
 pipe, executes the <code>rm</code> command and reads <code>rm</code>&rsquo;s 
output from that pipe.
 If you want all items in <code>@ARGV</code> to be interpreted as file names, 
you
-can use the module <code>ARGV::readonly</code> from CPAN.
+can use the module <code>ARGV::readonly</code> from CPAN, or use the double 
bracket:
 </p>
-<p>You can modify @ARGV before the first &lt;&gt; as long as the array ends up
+<pre class="verbatim">    while (&lt;&lt;&gt;&gt;) {
+        print;
+    }
+</pre>
+<p>Using double angle brackets inside of a while causes the open to use the
+three argument form (with the second argument being <code>&lt;</code>), so all
+arguments in <code>ARGV</code> are treated as literal filenames (including 
<code>&quot;-&quot;</code>).
+(Note that for convenience, if you use <code>&lt;&lt;&gt;&gt;</code> and if 
<code>@ARGV</code> is
+empty, it will still read from the standard input.)
+</p>
+<p>You can modify <code>@ARGV</code> before the first <code>&lt;&gt;</code> as 
long as the array ends up
 containing the list of filenames you really want.  Line numbers 
(<code>$.</code>)
 continue as though the input were one big happy file.  See the example
 in <a href="#perlfunc-eof">perlfunc eof</a> for how to reset line numbers on 
each file.
 </p>
-<p>If you want to set @ARGV to your own list of files, go right ahead.
-This sets @ARGV to all plain text files if no @ARGV was given:
+<p>If you want to set <code>@ARGV</code> to your own list of files, go right 
ahead.
+This sets <code>@ARGV</code> to all plain text files if no <code>@ARGV</code> 
was given:
 </p>
 <pre class="verbatim">    @ARGV = grep { -f &amp;&amp; -T } glob('*') unless 
@ARGV;
 </pre>
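The difference the double angle brackets make can be seen with a file whose name is literally "-", which plain &lt;&gt; would treat as standard input. The sketch below assumes Perl 5.22 or later; the helper name read_file_named_dash and the scratch directory are inventions for this example only.

```perl
use strict;
use warnings;
use 5.022;                        # <<>> is new in Perl 5.22
use Cwd        qw(getcwd);
use File::Temp qw(tempdir);

# Hypothetical helper: create a file literally named "-" in a scratch
# directory, then read it back through <<>>.
sub read_file_named_dash {
    my $start = getcwd();
    my $dir   = tempdir(CLEANUP => 1);
    chdir $dir or die "chdir $dir: $!";

    # Three-argument open takes "-" as a plain filename.
    open my $out, '>', '-' or die "create '-': $!";
    print {$out} "line one\nline two\n";
    close $out or die "close: $!";

    local @ARGV = ('-');          # with <>, this would mean standard input
    my @lines;
    while (defined(my $line = <<>>)) {
        push @lines, $line;
    }

    chdir $start or die "chdir $start: $!";
    return @lines;
}

print read_file_named_dash();
```

The same reasoning applies to arguments such as 'rm -rfv *|': under &lt;&lt;&gt;&gt; they are looked up as filenames and never run as commands.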
@@ -65960,7 +67801,7 @@
 <pre class="verbatim">    @ARGV = map { /\.(gz|Z)$/ ? &quot;gzip -dc &lt; $_ 
|&quot; : $_ } @ARGV;
 </pre>
 <p>If you want to pass switches into your script, you can use one of the
-Getopts modules or put a loop on the front like this:
+<code>Getopts</code> modules or put a loop on the front like this:
 </p>
 <pre class="verbatim">    while ($_ = $ARGV[0], /^-/) {
         shift;
@@ -65974,12 +67815,12 @@
         # ...           # code for each line
     }
 </pre>
-<p>The &lt;&gt; symbol will return <code>undef</code> for end-of-file only 
once.
+<p>The <code>&lt;&gt;</code> symbol will return <code>undef</code> for 
end-of-file only once.
 If you call it again after this, it will assume you are processing another
-@ARGV list, and if you haven&rsquo;t set @ARGV, will read input from 
STDIN.
+<code>@ARGV</code> list, and if you haven&rsquo;t set <code>@ARGV</code>, will 
read input from STDIN.
 </p>
 <p>If what the angle brackets contain is a simple scalar variable (for example,
-&lt;$foo&gt;), then that variable contains the name of the
+<code>$foo</code>), then that variable contains the name of the
 filehandle to input from, or its typeglob, or a reference to the
 same.  For example:
 </p>
@@ -65991,9 +67832,9 @@
 reference, it is interpreted as a filename pattern to be globbed, and
 either a list of filenames or the next filename in the list is returned,
 depending on context.  This distinction is determined on syntactic
-grounds alone.  That means <code>&lt;$x&gt;</code> is always a readline() from
-an indirect handle, but <code>&lt;$hash{key}&gt;</code> is always a glob().
-That&rsquo;s because $x is a simple scalar variable, but 
<code>$hash{key}</code> is
+grounds alone.  That means <code>&lt;$x&gt;</code> is always a 
<code>readline()</code> from
+an indirect handle, but <code>&lt;$hash{key}&gt;</code> is always a 
<code>glob()</code>.
+That&rsquo;s because <code>$x</code> is a simple scalar variable, but 
<code>$hash{key}</code> is
 not&ndash;it&rsquo;s a hash element.  Even <code>&lt;$x &gt;</code> (note the 
extra space)
 is treated as <code>glob(&quot;$x &quot;)</code>, not 
<code>readline($x)</code>.
 </p>
@@ -66018,7 +67859,7 @@
     }
 </pre>
 <p>except that the globbing is actually done internally using the standard
-<code>File::Glob</code> extension.  Of course, the shortest way to do the 
above is:
+<code><a href="File-Glob.html#Top">(File-Glob)</a></code> extension.  Of 
course, the shortest way to do the above is:
 </p>
 <pre class="verbatim">    chmod 0644, &lt;*.c&gt;;
 </pre>
@@ -66045,7 +67886,7 @@
 returning false.
 </p>
 <p>If you&rsquo;re trying to do variable interpolation, it&rsquo;s definitely 
better
-to use the glob() function, because the older notation can cause people
+to use the <code>glob()</code> function, because the older notation can cause 
people
 to become confused with the indirect filehandle notation.
 </p>
 <pre class="verbatim">    @files = glob(&quot;$dir/*.[ch]&quot;);
@@ -66134,6 +67975,34 @@
     $baz = 0+$foo &amp; 0+$bar;     # both ops explicitly numeric
     $biz = &quot;$foo&quot; ^ &quot;$bar&quot;;     # both ops explicitly 
stringy
 </pre>
+<p>This somewhat unpredictable behavior can be avoided with the experimental
+&quot;bitwise&quot; feature, new in Perl 5.22.  You can enable it via 
<code>use&nbsp;feature&nbsp;'bitwise'</code><!-- /@w -->.  By default, it will 
warn unless the <code>&quot;experimental::bitwise&quot;</code>
+warnings category has been disabled.  
(<code>use&nbsp;experimental&nbsp;'bitwise'</code><!-- /@w --> will
+enable the feature and disable the warning.)  Under this feature, the four
+standard bitwise operators (<code>~ | &amp; ^</code>) are always numeric.  
Adding a dot
+after each operator (<code>~. |. &amp;. ^.</code>) forces it to treat its 
operands as
+strings:
+</p>
+<pre class="verbatim">    use experimental &quot;bitwise&quot;;
+    $foo =  150  |  105;        # yields 255  (0x96 | 0x69 is 0xFF)
+    $foo = '150' |  105;        # yields 255
+    $foo =  150  | '105';       # yields 255
+    $foo = '150' | '105';       # yields 255
+    $foo =  150  |. 105;        # yields string '155'
+    $foo = '150' |. 105;        # yields string '155'
+    $foo =  150  |.'105';       # yields string '155'
+    $foo = '150' |.'105';       # yields string '155'
+
+    $baz = $foo &amp;  $bar;        # both operands numeric
+    $biz = $foo ^. $bar;        # both operands stringy
+</pre>
+<p>The assignment variants of these operators (<code>&amp;= |= ^= &amp;.= |.= 
^.=</code>)
+behave likewise under the feature.
+</p>
+<p>The behavior of these operators is problematic (and subject to change)
+if either or both of the strings are encoded in UTF-8 (see
+<a href="#perlunicode-Byte-and-Character-Semantics">perlunicode Byte and 
Character Semantics</a>).
+</p>
 <p>See &lsquo;perlfunc vec&rsquo; for information on how to manipulate 
individual bits
 in a bit vector.
 </p>
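A short sketch contrasting the classic operand-dependent behavior with the "bitwise" feature added in this hunk. It assumes Perl 5.22 or later and relies on the experimental pragma named in the text; the variable names are illustrative.

```perl
use strict;
use warnings;
use 5.022;    # the "bitwise" feature first appeared in Perl 5.22

# Classic behavior: the operand types pick the operation.
my $numeric_or = 150 | 105;        # numeric OR
my $string_or  = '150' | '105';    # string OR, character by character

my ($always_numeric, $always_string);
{
    # Enables the feature and silences its "experimental" warning,
    # for this block only.
    use experimental 'bitwise';
    $always_numeric = '150' |  '105';   # numeric despite string operands
    $always_string  =  150  |.  105;    # string despite numeric operands
}

print "$numeric_or $string_or $always_numeric $always_string\n";
# 255 155 255 155
```

Keeping the pragma inside a block limits the changed semantics to the code that expects them.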
@@ -66160,16 +68029,16 @@
 <p>which lasts until the end of that BLOCK.  Note that this doesn&rsquo;t
 mean everything is an integer, merely that Perl will use integer
 operations for arithmetic, comparison, and bitwise operators.  For
-example, even under <code>use integer</code>, if you take the 
<code>sqrt(2)</code>, you&rsquo;ll
+example, even under <code>use&nbsp;integer</code><!-- /@w -->, if you take the 
<code>sqrt(2)</code>, you&rsquo;ll
 still get <code>1.4142135623731</code> or so.
 </p>
-<p>Used on numbers, the bitwise operators (&quot;&amp;&quot;, &quot;|&quot;, 
&quot;^&quot;, &quot;~&quot;, &quot;&lt;&lt;&quot;,
-and &quot;&gt;&gt;&quot;) always produce integral results.  (But see also
-<a href="#perlop-Bitwise-String-Operators">Bitwise String Operators</a>.)  
However, <code>use integer</code> still has meaning for
+<p>Used on numbers, the bitwise operators (<code>&amp;</code> <code>|</code> 
<code>^</code> <code>~</code> <code>&lt;&lt;</code>
+<code>&gt;&gt;</code>) always produce integral results.  (But see also
+<a href="#perlop-Bitwise-String-Operators">Bitwise String Operators</a>.)  
However, <code>use&nbsp;integer</code><!-- /@w --> still has meaning for
 them.  By default, their results are interpreted as unsigned integers, but
-if <code>use integer</code> is in effect, their results are interpreted
+if <code>use&nbsp;integer</code><!-- /@w --> is in effect, their results are 
interpreted
 as signed integers.  For example, <code>~0</code> usually evaluates to a large
-integral value.  However, <code>use integer; ~0</code> is <code>-1</code> on 
two&rsquo;s-complement
+integral value.  However, <code>use&nbsp;integer;&nbsp;~0</code><!-- /@w --> 
is <code>-1</code> on two&rsquo;s-complement
 machines.
 </p>
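The points made about use integer can be tried out as below. This is only a sketch; the do-blocks are a convenience chosen here to keep the pragma's scope small.

```perl
use strict;
use warnings;

my $unsigned = ~0;                            # all bits set, read as unsigned
my $signed   = do { use integer; ~0 };        # same bits, read as signed
my $quotient = do { use integer; 7 / 2 };     # integer division
my $root     = do { use integer; sqrt(2) };   # sqrt is not an integer operation

print "$unsigned $signed $quotient\n";
printf "%.4f\n", $root;                       # 1.4142
```

The value printed for $unsigned depends on the platform's integer width; $signed is -1 on two's-complement machines.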
 <hr>
@@ -66181,10 +68050,10 @@
 <a name="Floating_002dpoint-Arithmetic"></a>
 <h4 class="subsection">48.2.38 Floating-point Arithmetic</h4>
 
-<p>While <code>use integer</code> provides integer-only arithmetic, there is no
+<p>While <code>use&nbsp;integer</code><!-- /@w --> provides integer-only 
arithmetic, there is no
 analogous mechanism to provide automatic rounding or truncation to a
 certain number of decimal places.  For rounding to a certain number
-of digits, sprintf() or printf() is usually the easiest route.
+of digits, <code>sprintf()</code> or <code>printf()</code> is usually the 
easiest route.
 See <a href="perlfaq4.html#Top">(perlfaq4)</a>.
 </p>
 <p>Floating-point numbers are only approximations to what a mathematician
@@ -66209,10 +68078,10 @@
     }
 </pre>
 <p>The POSIX module (part of the standard perl distribution) implements
-ceil(), floor(), and other mathematical and trigonometric functions.
-The Math::Complex module (part of the standard perl distribution)
+<code>ceil()</code>, <code>floor()</code>, and other mathematical and 
trigonometric functions.
+The <code><a href="Math-Complex.html#Top">(Math-Complex)</a></code> module 
(part of the standard perl distribution)
 defines mathematical functions that work on both the reals and the
-imaginary numbers.  Math::Complex not as efficient as POSIX, but
+imaginary numbers.  <code>Math::Complex</code> is not as efficient as POSIX, 
but
 POSIX can&rsquo;t work with complex numbers.
 </p>
 <p>Rounding in financial applications can have serious implications, and
@@ -66230,7 +68099,8 @@
 <a name="Bigger-Numbers"></a>
 <h4 class="subsection">48.2.39 Bigger Numbers</h4>
 
-<p>The standard <code>Math::BigInt</code>, <code>Math::BigRat</code>, and 
<code>Math::BigFloat</code> modules,
+<p>The standard <code><a href="Math-BigInt.html#Top">(Math-BigInt)</a></code>, 
<code><a href="Math-BigRat.html#Top">(Math-BigRat)</a></code>, and
+<code><a href="Math-BigFloat.html#Top">(Math-BigFloat)</a></code> modules,
 along with the <code>bignum</code>, <code>bigint</code>, and 
<code>bigrat</code> pragmas, provide
 variable-precision arithmetic and overloaded operators, although
 they&rsquo;re currently pretty slow.  At the cost of some space and
@@ -66254,8 +68124,8 @@
         x/y is 9/44
         x*y is 1/11
 </pre>
-<p>Several modules let you calculate with (bound only by memory and CPU time)
-unlimited or fixed precision.  There
+<p>Several modules let you calculate with unlimited or fixed precision
+(bound only by memory and CPU time).  There
 are also some non-standard modules that
 provide faster implementations via external C libraries.
 </p>
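As a minimal illustration of the variable-precision arithmetic mentioned above, the following compares a native power with the same computation in Math::BigInt. The choice of 2**100 is arbitrary.

```perl
use strict;
use warnings;
use Math::BigInt;

# Native arithmetic falls back to floating point for large results.
my $native = 2 ** 100;

# Math::BigInt keeps every digit, at some cost in speed.
my $exact = Math::BigInt->new(2)->bpow(100);

print "$native\n";
print "$exact\n";    # 1267650600228229401496703205376
```

The native value prints in rounded exponential notation, while the Math::BigInt object stringifies to the full integer.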
@@ -67248,14 +69118,19 @@
 pack codes <code>f</code>, <code>d</code>, <code>F</code> and <code>D</code>. 
<code>f</code> and <code>d</code> pack into (or unpack
 from) single-precision or double-precision representation as it is provided
 by your system. If your system supports it, <code>D</code> can be used to 
pack and
-unpack extended-precision floating point values (<code>long double</code>), 
which
-can offer even more resolution than <code>f</code> or <code>d</code>. 
<code>F</code> packs an <code>NV</code>,
-which is the floating point type used by Perl internally. (There
-is no such thing as a network representation for reals, so if you want
-to send your real numbers across computer boundaries, you&rsquo;d better stick
-to ASCII representation, unless you&rsquo;re absolutely sure what&rsquo;s on 
the other
-end of the line. For the even more adventuresome, you can use the byte-order
-modifiers from the previous section also on floating point codes.)
+unpack (<code>long double</code>) values, which can offer even more resolution
+than <code>f</code> or <code>d</code>.  <strong>Note that there are different 
long double formats.</strong>
+</p>
+<p><code>F</code> packs an <code>NV</code>, which is the floating point type 
used by Perl
+internally.
+</p>
+<p>There is no such thing as a network representation for reals, so if
+you want to send your real numbers across computer boundaries, you&rsquo;d
+better stick to text representation, possibly using the hexadecimal
+float format (avoiding the decimal conversion loss), unless you&rsquo;re
+absolutely sure what&rsquo;s on the other end of the line. For the even more
+adventuresome, you can use the byte-order modifiers from the previous
+section also on floating point codes.
 </p>
 <hr>
 <a name="perlpacktut-Exotic-Templates"></a>
@@ -68352,7 +70227,7 @@
 to compare, instead of using multiple sort keys, which makes it possible to use
 the standard, written in <code>c</code> and fast, perl <code>sort()</code> 
function on the output,
 and is the basis of the <code>GRT</code> (Guttman Rossler Transform).  Some 
string
-combinations can slow the <code>GRT</code> down, by just being too plain 
complex for it&rsquo;s
+combinations can slow the <code>GRT</code> down, by just being too plain 
complex for its
 own good.
 </p>
 <p>For applications using database backends, the standard <code>DBIx</code> 
namespace has
@@ -69313,7 +71188,7 @@
 </pre>
 <p>BTW. Beware too of pressure from managers who see you speed a program up by 
50%
 of the runtime once, only to get a request one month later to do the same again
-(true story) - you&rsquo;ll just have to point out your only human, even if 
you are a
+(true story) - you&rsquo;ll just have to point out you&rsquo;re only human, 
even if you are a
 Perl programmer, and you&rsquo;ll see what you can do...
 </p>
 <hr>
@@ -69339,7 +71214,7 @@
 debug level set in the logging configuration file is zero.  Once the debug()
 subroutine has been entered, and the internal <code>$debug</code> variable 
confirmed to
 be zero, for example, the message which has been sent in will be discarded and
-the program will continue.  In the example given though, the \%INC hash will
+the program will continue.  In the example given though, the 
<code>\%INC</code> hash will
 already have been dumped, and the message string constructed, all of which work
 could be bypassed by a debug variable at the statement level, like this:
 </p>
@@ -69955,13 +71830,18 @@
 <dt><code>=encoding <em>encodingname</em></code></dt>
 <dd><a name="perlpod-_003dencoding-encodingname"></a>
 <p>This command is used for declaring the encoding of a document.  Most
-users won&rsquo;t need this; but if your encoding isn&rsquo;t US-ASCII or 
Latin-1,
-then put a <code>=encoding <em>encodingname</em></code> command early in the 
document so
+users won&rsquo;t need this; but if your encoding isn&rsquo;t US-ASCII,
+then put a <code>=encoding <em>encodingname</em></code> command very early in 
the document so
 that pod formatters will know how to decode the document.  For
 <em>encodingname</em>, use a name recognized by the <a 
href="Encode-Supported.html#Top">(Encode-Supported)</a>
-module.  Examples:
+module.  Some pod formatters may try to guess between a Latin-1 or
+CP-1252 versus
+UTF-8 encoding, but they may guess wrong.  It&rsquo;s best to be explicit if
+you use anything besides strict ASCII.  Examples:
 </p>
-<pre class="verbatim">  =encoding utf8
+<pre class="verbatim">  =encoding latin1
+
+  =encoding utf8
 
   =encoding koi8-r
 
@@ -70143,7 +72023,7 @@
 <p>Note that older Pod formatters might not recognize octal or
 hex numeric escapes, and that many formatters cannot reliably
 render characters above 255.  (Some formatters may even have
-to use compromised renderings of Latin-1 characters, like
+to use compromised renderings of Latin-1/CP-1252 characters, like
 rendering <code>E&lt;eacute&gt;</code> as just a plain &quot;e&quot;.)
 </p>
 </li></ul>
@@ -70515,7 +72395,7 @@
 etc.).
 </p>
 <p>Pod content is contained in <strong>Pod blocks</strong>.  A Pod block 
starts with a
-line that matches &lt;m/\A=[a-zA-Z]/&gt;, and continues up to the next line
+line that matches <code>m/\A=[a-zA-Z]/</code>, and continues up to the next 
line
 that matches <code>m/\A=cut/</code> or up to the end of the file if there is
 no <code>m/\A=cut/</code> line.
 </p>
@@ -70845,7 +72725,7 @@
     B&lt;&lt; $foo-&gt;bar(); &gt;&gt;
 </pre>
 <p>With this syntax, the whitespace character(s) after the 
&quot;C&lt;&lt;&lt;&quot;
-and before the &quot;&gt;&gt;&quot; (or whatever letter) are <em>not</em> 
renderable. They
+and before the &quot;&gt;&gt;&gt;&quot; (or whatever letter) are <em>not</em> 
renderable. They
 do not signify whitespace, but are merely part of the formatting codes
 themselves.  That is, these are all synonymous:
 </p>
@@ -71039,7 +72919,8 @@
 big-endian or little-endian) or UTF-8, Pod parsers should do the
 same.  Otherwise, the character encoding should be understood as
 being UTF-8 if the first highbit byte sequence in the file seems
-valid as a UTF-8 sequence, or otherwise as Latin-1.
+valid as a UTF-8 sequence, or otherwise as CP-1252 (earlier versions of
+this specification used Latin-1 instead of CP-1252).
 
 <p>Future versions of this specification may specify
 how Pod can accept other encodings.  Presumably treatment of other
@@ -71051,18 +72932,31 @@
 file begins with the two literal byte values 0xFE 0xFF, this is
 the BOM for big-endian UTF-16.  If the file begins with the two
 literal byte values 0xFF 0xFE, this is the BOM for little-endian
-UTF-16.  If the file begins with the three literal byte values
+UTF-16.  On an ASCII platform, if the file begins with the three literal
+byte values
 0xEF 0xBB 0xBF, this is the BOM for UTF-8.
+A mechanism portable to EBCDIC platforms is to:
 
-</li><li> A naive but sufficient heuristic for testing the first highbit
+<pre class="verbatim">  my $utf8_bom = &quot;\x{FEFF}&quot;;
+  utf8::encode($utf8_bom);
+</pre>
+</li><li> A naive, but often sufficient heuristic on ASCII platforms, for 
testing
+the first highbit
 byte-sequence in a BOM-less file (whether in code or in Pod!), to see
 whether that sequence is valid as UTF-8 (RFC 2279) is to check whether
+the first byte in the sequence is in the range 0xC2 - 0xFD
+that the first byte in the sequence is in the range 0xC2 - 0xFD
 <em>and</em> whether the next byte is in the range
 0x80 - 0xBF.  If so, the parser may conclude that this file is in
 UTF-8, and all highbit sequences in the file should be assumed to
 be UTF-8.  Otherwise the parser should treat the file as being
-in Latin-1.  In the unlikely circumstance that the first highbit
+in CP-1252.  (A better check, and which works on EBCDIC platforms as
+well, is to pass a copy of the sequence to
+<a href="utf8.html#Top">(utf8)utf8::decode()</a> which performs a full 
validity check on the
+sequence and returns TRUE if it is valid UTF-8, FALSE otherwise.  This
+function is always pre-loaded, is fast because it is written in C, and
+will only get called at most once, so you don&rsquo;t need to avoid it out of
+performance concerns.)
+In the unlikely circumstance that the first highbit
 sequence in a truly non-UTF-8 file happens to appear to be UTF-8, one
 can cater to our heuristic (as well as any more intelligent heuristic)
 by prefacing that line with a comment line containing a highbit
@@ -71070,10 +72964,6 @@
 of simply &quot;#&quot;, an e-acute, and any non-highbit byte,
 is sufficient to establish this file&rsquo;s encoding.
 
-</li><li> This document&rsquo;s requirements and suggestions about encodings
-do not apply to Pod processors running on non-ASCII platforms,
-notably EBCDIC platforms.
-
 </li><li> Pod processors must treat a &quot;=for [label] [content...]&quot; paragraph as
 meaning the same thing as a &quot;=begin [label]&quot; paragraph, content, and
 an &quot;=end [label]&quot; paragraph.  (The parser may conflate these two
@@ -71209,17 +73099,20 @@
 
 </li><li> Characters in Pod documents may be conveyed either as literals, or by
 number in E&lt;n&gt; codes, or by an equivalent mnemonic, as in
-E&lt;eacute&gt; which is exactly equivalent to E&lt;233&gt;.
+E&lt;eacute&gt; which is exactly equivalent to E&lt;233&gt;.  The numbers
+are the Latin1/Unicode values, even on EBCDIC platforms.
 
-<p>Characters in the range 32-126 refer to those well known US-ASCII
-characters (also defined there by Unicode, with the same meaning),
-which all Pod formatters must render faithfully.  Characters
-in the ranges 0-31 and 127-159 should not be used (neither as
-literals, nor as E&lt;number&gt; codes), except for the
-literal byte-sequences for newline (13, 13 10, or 10), and tab (9).
+<p>When referring to characters by using a E&lt;n&gt; numeric code, numbers
+in the range 32-126 refer to those well known US-ASCII characters (also
+defined there by Unicode, with the same meaning), which all Pod
+formatters must render faithfully.  Characters whose E&lt;&gt; numbers
+are in the ranges 0-31 and 127-159 should not be used (neither as
+literals,
+nor as E&lt;number&gt; codes), except for the literal byte-sequences for
+newline (ASCII 13, ASCII 13 10, or ASCII 10), and tab (ASCII 9).
 </p>
-<p>Characters in the range 160-255 refer to Latin-1 characters (also
-defined there by Unicode, with the same meaning).  Characters above
+<p>Numbers in the range 160-255 refer to Latin-1 characters (also
+defined there by Unicode, with the same meaning).  Numbers above
 255 should be understood to refer to Unicode characters.
 </p>
 </li><li> Be warned
@@ -71260,14 +73153,14 @@
 </li><li> Note that in all cases of &quot;E&lt;whatever&gt;&quot;, <em>whatever</em> (whether
 an htmlname, or a number in any base) must consist only of
 alphanumeric characters &ndash; that is, <em>whatever</em> must match
-<code>m/\A\w+\z/</code>.  So &quot;E&lt; 0 1 2 3 &gt;&quot; is invalid, because
+<code>m/\A\w+\z/</code>.  So &quot;E&lt;&nbsp;0&nbsp;1&nbsp;2&nbsp;3&nbsp;&gt;&quot;<!-- /@w --> is invalid, because
 it contains spaces, which aren&rsquo;t alphanumeric characters.  This
 presumably does not <em>need</em> special treatment by a Pod processor;
-&quot; 0 1 2 3 &quot; doesn&rsquo;t look like a number in any base, so it would
+&quot;&nbsp;0&nbsp;1&nbsp;2&nbsp;3&nbsp;&quot;<!-- /@w --> doesn&rsquo;t look like a number in any base, so it would
 presumably be looked up in the table of HTML-like names.  Since
-there isn&rsquo;t (and cannot be) an HTML-like entity called &quot; 0 1 2 3 &quot;,
+there isn&rsquo;t (and cannot be) an HTML-like entity called &quot;&nbsp;0&nbsp;1&nbsp;2&nbsp;3&nbsp;&quot;<!-- /@w -->,
 this will be treated as an error.  However, Pod processors may
-treat &quot;E&lt; 0 1 2 3 &gt;&quot; or &quot;E&lt;e-acute&gt;&quot; as <em>syntactically</em>
+treat &quot;E&lt;&nbsp;0&nbsp;1&nbsp;2&nbsp;3&nbsp;&gt;&quot;<!-- /@w --> or &quot;E&lt;e-acute&gt;&quot; as <em>syntactically</em>
 invalid, potentially earning a different error message than the
 error message (or warning, or event) generated by a merely unknown
 (but theoretically valid) htmlname, as in &quot;E&lt;qacute&gt;&quot;
@@ -71472,7 +73365,7 @@
 <dl compact="compact">
 <dt>First:</dt>
 <dd><a name="perlpodspec-First_003a"></a>
-<p>The link-text.  If there is none, this must be undef.  (E.g., in
+<p>The link-text.  If there is none, this must be <code>undef</code>.  (E.g., in
 &quot;L&lt;Perl Functions|perlfunc&gt;&quot;, the link-text is &quot;Perl Functions&quot;.
 In &quot;L&lt;Time::HiRes&gt;&quot; and even &quot;L&lt;|Time::HiRes&gt;&quot;, there is no
 link text.  Note that link text may contain formatting.)
@@ -71487,14 +73380,14 @@
 </dd>
 <dt>Third:</dt>
 <dd><a name="perlpodspec-Third_003a"></a>
-<p>The name or URL, or undef if none.  (E.g., in &quot;L&lt;Perl
+<p>The name or URL, or <code>undef</code> if none.  (E.g., in &quot;L&lt;Perl
 Functions|perlfunc&gt;&quot;, the name (also sometimes called the page)
-is &quot;perlfunc&quot;.  In &quot;L&lt;/CAVEATS&gt;&quot;, the name is undef.)
+is &quot;perlfunc&quot;.  In &quot;L&lt;/CAVEATS&gt;&quot;, the name is <code>undef</code>.)
 </p>
 </dd>
 <dt>Fourth:</dt>
 <dd><a name="perlpodspec-Fourth_003a"></a>
-<p>The section (AKA &quot;item&quot; in older perlpods), or undef if none.  E.g.,
+<p>The section (AKA &quot;item&quot; in older perlpods), or <code>undef</code> if none.  E.g.,
 in &quot;L&lt;Getopt::Std/DESCRIPTION&gt;&quot;, &quot;DESCRIPTION&quot; is the section.  (Note
 that this is not the same as a manpage section like the &quot;5&quot; in &quot;man 5
 crontab&quot;.  &quot;Section Foo&quot; in the Pod sense means the part of the text
@@ -72762,9 +74655,9 @@
 the Perl community should expect from Perl&rsquo;s developers:
 </p>
 <ul>
-<li> We &quot;officially&quot; support the two most recent stable release series.  5.14.x
-and earlier are now out of support.  As of the release of 5.20.0, we will
-&quot;officially&quot; end support for Perl 5.16.x, other than providing security
+<li> We &quot;officially&quot; support the two most recent stable release series.  5.16.x
+and earlier are now out of support.  As of the release of 5.22.0, we will
+&quot;officially&quot; end support for Perl 5.18.x, other than providing security
 updates as described below.
 
 </li><li> To the best of our ability, we will attempt to fix critical issues
@@ -72830,23 +74723,20 @@
 years and decades, but not at the expense of our user community.
 </p>
 <p>Existing syntax and semantics should only be marked for destruction in
-very limited circumstances.  If a given language feature&rsquo;s continued
-inclusion in the language will cause significant harm to the language
-or prevent us from making needed changes to the runtime, then it may
-be considered for deprecation.
-</p>
-<p>Any language change which breaks backward-compatibility should be able to
-be enabled or disabled lexically.  Unless code at a given scope declares
-that it wants the new behavior, that new behavior should be disabled.
-Which backward-incompatible changes are controlled implicitly by a
-&rsquo;use v5.x.y&rsquo; is a decision which should be made by the pumpking in
-consultation with the community.
-</p>
-<p>When a backward-incompatible change can&rsquo;t be toggled lexically, the decision
-to change the language must be considered very, very carefully.  If it&rsquo;s
-possible to move the old syntax or semantics out of the core language
-and into XS-land, that XS module should be enabled by default unless
-the user declares that they want a newer revision of Perl.
+very limited circumstances.  If they are believed to be very rarely used,
+stand in the way of actual improvement to the Perl language or perl
+interpreter, and if affected code can be easily updated to continue
+working, they may be considered for removal.  When in doubt, caution
+dictates that we will favor backward compatibility.  When a feature is
+deprecated, a statement of reasoning describing the decision process
+will be posted, and a link to it will be provided in the relevant
+perldelta documents.
+</p>
+<p>Using a lexical pragma to enable or disable legacy behavior should be
+considered when appropriate, and in the absence of any pragma legacy
+behavior should be enabled.  Which backward-incompatible changes are
+controlled implicitly by a &rsquo;use v5.x.y&rsquo; is a decision which should be
+made by the pumpking in consultation with the community.
 </p>
 <p>Historically, we&rsquo;ve held ourselves to a far higher standard than
 backward-compatibility &ndash; bugward-compatibility.  Any accident of
@@ -72861,7 +74751,9 @@
 </p>
 <p>New syntax and semantics which don&rsquo;t break existing language constructs
 and syntax have a much lower bar.  They merely need to prove themselves
-to be useful, elegant, well designed, and well tested.
+to be useful, elegant, well designed, and well tested.  In most cases,
+these additions will be marked as <em>experimental</em> for some time.  See
+below for more on that.
 </p>
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a href="#perlpolicy-Terminology" accesskey="1">perlpolicy Terminology</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
@@ -72948,53 +74840,77 @@
 <a name="MAINTENANCE-BRANCHES"></a>
 <h3 class="section">55.6 MAINTENANCE BRANCHES</h3>
 
+<p>New releases of maintenance branches should only contain changes that fall into
+one of the &quot;acceptable&quot; categories set out below, but must not contain any
+changes that fall into one of the &quot;unacceptable&quot; categories.  (For example, a
+fix for a crashing bug must not be included if it breaks binary compatibility.)
+</p>
+<p>It is not necessary to include every change meeting these criteria, and in
+general the focus should be on addressing security issues, crashing bugs,
+regressions and serious installation issues.  The temptation to include a
+plethora of minor changes that don&rsquo;t affect the installation or execution of
+perl (e.g. spelling corrections in documentation) should be resisted in order
+to reduce the overall risk of overlooking something.  The intention is to
+create maintenance releases which are both worthwhile and which users can have
+full confidence in the stability of.  (A secondary concern is to avoid burning
+out the maint-pumpking or overwhelming other committers voting on changes to be
+included (see <a href="#perlpolicy-Getting-changes-into-a-maint-branch">Getting changes into a maint branch</a> below).)
+</p>
+<p>The following types of change may be considered acceptable, as long as they do
+not also fall into any of the &quot;unacceptable&quot; categories set out below:
+</p>
 <ul>
-<li> New releases of maint should contain as few changes as possible.
-If there is any question about whether a given patch might merit
-inclusion in a maint release, then it almost certainly should not
-be included.
+<li> Patches that fix CVEs or security issues.  These changes should
+be run through the address@hidden mailing list
+rather than applied directly.
 
-</li><li> Portability fixes, such as changes to Configure and the files in
-hints/ are acceptable. Ports of Perl to a new platform, architecture
-or OS release that involve changes to the implementation are NOT
-acceptable.
-
-</li><li> Acceptable documentation updates are those that correct factual errors,
-explain significant bugs or deficiencies in the current implementation,
-or fix broken markup.
+</li><li> Patches that fix crashing bugs, assertion failures and
+memory corruption but which do not otherwise change perl&rsquo;s
+functionality or negatively impact performance.
 
-</li><li> Patches that add new warnings or errors or deprecate features
-are not acceptable.
+</li><li> Patches that fix regressions in perl&rsquo;s behavior relative to previous
+releases, no matter how old the regression, since some people may
+upgrade from very old versions of perl to the latest version.
 
-</li><li> Patches that fix crashing bugs, assertion failures and
-memory corruption that do not otherwise change Perl&rsquo;s
-functionality or negatively impact performance are acceptable.
+</li><li> Patches that fix anything which prevents or seriously impacts the build
+or installation of perl.
 
-</li><li> Patches that fix CVEs or security issues are acceptable, but should
-be run through the address@hidden mailing list
-rather than applied directly.
+</li><li> Portability fixes, such as changes to Configure and the files in
+the hints/ folder.
 
-</li><li> Patches that fix regressions in perl&rsquo;s behavior relative to previous
-releases are acceptable.
+</li><li> Minimal patches that fix platform-specific test failures.
+
+</li><li> Documentation updates that correct factual errors, explain significant
+bugs or deficiencies in the current implementation, or fix broken markup.
 
 </li><li> Updates to dual-life modules should consist of minimal patches to
-fix crashing or security issues (as above).
+fix crashing bugs or security issues (as above).  Any changes made to
+dual-life modules for which CPAN is canonical should be coordinated with
+the upstream author.
 
-</li><li> Minimal patches that fix platform-specific test failures or build or
-installation issues are acceptable. When these changes are made
-to dual-life modules for which CPAN is canonical, any changes
-should be coordinated with the upstream author.
+</li></ul>
 
-</li><li> New versions of dual-life modules should NOT be imported into maint.
-Those belong in the next stable series.
+<p>The following types of change are NOT acceptable:
+</p>
+<ul>
+<li> Patches that break binary compatibility.  (Please talk to a pumpking.)
+
+</li><li> Patches that add or remove features.
 
-</li><li> Patches that add or remove features are not acceptable.
+</li><li> Patches that add new warnings or errors or deprecate features.
 
-</li><li> Patches that break binary compatibility are not acceptable.  (Please
-talk to a pumpking.)
+</li><li> Ports of Perl to a new platform, architecture or OS release that
+involve changes to the implementation.
+
+</li><li> New versions of dual-life modules should NOT be imported into maint.
+Those belong in the next stable series.
 
 </li></ul>
 
+<p>If there is any question about whether a given patch might merit
+inclusion in a maint release, then it almost certainly should not
+be included.
+</p>
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a href="#perlpolicy-Getting-changes-into-a-maint-branch" accesskey="1">perlpolicy Getting changes into a maint branch</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -73021,6 +74937,17 @@
 other committers respond to the list giving their assent. (This policy
 applies to current and former pumpkings, as well as other committers.)
 </p>
+<p>Other voting mechanisms may be used instead, as long as the same number of
+votes is gathered in a transparent manner.  Specifically, proposals of
+which changes to cherry-pick must be visible to everyone on perl5-porters
+so that the views of everyone interested may be heard.
+</p>
+<p>It is not necessary for voting to be held on cherry-picking perldelta
+entries associated with changes that have already been cherry-picked, nor
+for the maint-pumpking to obtain votes on changes required by the
+<samp>Porting/release_managers_guide.pod</samp> where such changes can be applied by
+the means of cherry-picking from blead.
+</p>
 <hr>
 <a name="perlpolicy-CONTRIBUTED-MODULES"></a>
 <div class="header">
@@ -73310,7 +75237,7 @@
 particular task.  Thus, when you begin attacking a problem, it is
 important to consider under which part of the tradeoff curve you
 want to operate.  Specifically, you must decide whether it is
-important that the task that you are coding have the full generality
+important that the task that you are coding has the full generality
 of being portable, or whether to just get the job done right now.
 This is the hardest choice to be made.  The rest is easy, because
 Perl provides many choices, whichever way you want to approach your
@@ -73360,7 +75287,7 @@
 </p>
 <p>The material below is separated into three main sections: main issues of
 portability (<a href="#perlport-ISSUES">ISSUES</a>), platform-specific issues (<a href="#perlport-PLATFORMS">PLATFORMS</a>), and
-built-in perl functions that behave differently on various ports
+built-in Perl functions that behave differently on various ports
 (<a href="#perlport-FUNCTION-IMPLEMENTATIONS">FUNCTION IMPLEMENTATIONS</a>).
 </p>
 <p>This information should not be considered complete; it includes possibly
@@ -73423,23 +75350,24 @@
 <p>In most operating systems, lines in files are terminated by newlines.
 Just what is used as a newline may vary from OS to OS.  Unix
 traditionally uses <code>\012</code>, one type of DOSish I/O uses <code>\015\012</code>,
-and Mac&nbsp;OS<!-- /@w --> uses <code>\015</code>.
+Mac&nbsp;OS<!-- /@w --> uses <code>\015</code>, and z/OS uses <code>\025</code>.
 </p>
 <p>Perl uses <code>\n</code> to represent the &quot;logical&quot; newline, where what is
 logical may depend on the platform in use.  In MacPerl, <code>\n</code> always
-means <code>\015</code>.  In DOSish perls, <code>\n</code> usually means <code>\012</code>, but when
+means <code>\015</code>.  On EBCDIC platforms, <code>\n</code> could be <code>\025</code> or <code>\045</code>.
+In DOSish perls, <code>\n</code> usually means <code>\012</code>, but when
 accessing a file in &quot;text&quot; mode, perl uses the <code>:crlf</code> layer that
 translates it to (or from) <code>\015\012</code>, depending on whether you&rsquo;re
 reading or writing. Unix does the same thing on ttys in canonical
 mode.  <code>\015\012</code> is commonly referred to as CRLF.
 </p>
-<p>To trim trailing newlines from text lines use chomp().  With default 
+<p>To trim trailing newlines from text lines use <code>chomp()</code>.  With default
 settings that function looks for a trailing <code>\n</code> character and thus 
 trims in a portable way.
 </p>
 <p>When dealing with binary files (or text files in binary mode) be sure
 to explicitly set $/ to the appropriate value for your file format
-before using chomp().
+before using <code>chomp()</code>.
 </p>
 <p>Because of the &quot;text&quot; mode translation, DOSish perls have limitations
 in using <code>seek</code> and <code>tell</code> on a file accessed in &quot;text&quot; mode.
@@ -73447,9 +75375,9 @@
 others), and you are usually free to use <code>seek</code> and <code>tell</code> even
 in &quot;text&quot; mode.  Using <code>seek</code> or <code>tell</code> or other file operations
 may be non-portable.  If you use <code>binmode</code> on a file, however, you
-can usually <code>seek</code> and <code>tell</code> with arbitrary values in safety.
+can usually <code>seek</code> and <code>tell</code> with arbitrary values safely.
 </p>
-<p>A common misconception in socket programming is that <code>\n</code> eq <code>\012</code>
+<p>A common misconception in socket programming is that <code>\n&nbsp;eq&nbsp;\012</code><!-- /@w -->
 everywhere.  When using protocols such as common Internet protocols,
 <code>\012</code> and <code>\015</code> are called for specifically, and the values of
 the logical <code>\n</code> and <code>\r</code> (carriage return) are not reliable.
@@ -73459,7 +75387,7 @@
 </pre>
 <p>However, using <code>\015\012</code> (or <code>\cM\cJ</code>, or <code>\x0D\x0A</code>) can be tedious
 and unsightly, as well as confusing to those maintaining the code.  As
-such, the Socket module supplies the Right Thing for those who want it.
+such, the <code>Socket</code> module supplies the Right Thing for those who want it.
 </p>
 <pre class="verbatim">    use Socket qw(:DEFAULT :crlf);
     print SOCKET &quot;Hi there, client!$CRLF&quot;;     # RIGHT
@@ -73468,7 +75396,7 @@
 separator <code>$/</code> is <code>\n</code>, but robust socket code will recognize
 either <code>\012</code> or <code>\015\012</code> as end of line:
 </p>
-<pre class="verbatim">    while (&lt;SOCKET&gt;) {
+<pre class="verbatim">    while (&lt;SOCKET&gt;) {  # NOT ADVISABLE!
         # ...
     }
 </pre>
@@ -73549,7 +75477,7 @@
 usually either &quot;live&quot; via network connection, or by storing the
 numbers to secondary storage such as a disk file or tape.
 </p>
-<p>Conflicting storage orders make utter mess out of the numbers.  If a
+<p>Conflicting storage orders make an utter mess out of the numbers.  If a
 little-endian host (Intel, VAX) stores 0x12345678 (305419896 in
 decimal), a big-endian host (Motorola, Sparc, PA) reads it as
 0x78563412 (2018915346 in decimal).  Alpha and MIPS can be either:
@@ -73558,7 +75486,7 @@
 connections use the <code>pack</code> and <code>unpack</code> formats <code>n</code> and <code>N</code>, the
 &quot;network&quot; orders.  These are guaranteed to be portable.
 </p>
-<p>As of perl 5.10.0, you can also use the <code>&gt;</code> and <code>&lt;</code> modifiers
+<p>As of Perl 5.10.0, you can also use the <code>&gt;</code> and <code>&lt;</code> modifiers
 to force big- or little-endian byte-order.  This is useful if you want
 to store signed integers or 64-bit integers, for example.
 </p>
@@ -73582,11 +75510,12 @@
 </p>
 <p>One can circumnavigate both these problems in two ways.  Either
 transfer and store numbers always in text format, instead of raw
-binary, or else consider using modules like Data::Dumper and Storable
-(included as of perl 5.8).  Keeping all data as text significantly
+binary, or else consider using modules like <code>Data::Dumper</code> and
+<code>Storable</code>
+(included as of Perl 5.8).  Keeping all data as text significantly
 simplifies matters.
 </p>
-<p>The v-strings are portable only up to v2147483647 (0x7FFFFFFF), that&rsquo;s
+<p>The v-strings are portable only up to v2147483647 (0x7FFF_FFFF), that&rsquo;s
 how far EBCDIC, or more precisely UTF-EBCDIC will go.
 </p>
 <hr>
@@ -73636,13 +75565,13 @@
 </p>
 <p>Don&rsquo;t assume Unix filesystem access semantics: that read, write,
 and execute are all the permissions there are, and even if they exist,
-that their semantics (for example what do r, w, and x mean on
+that their semantics (for example what do <code>&quot;r&quot;</code>, <code>&quot;w&quot;</code>, and <code>&quot;x&quot;</code> mean on
 a directory) are the Unix ones.  The various Unix/POSIX compatibility
-layers usually try to make interfaces like chmod() work, but sometimes
+layers usually try to make interfaces like <code>chmod()</code> work, but sometimes
 there simply is no good mapping.
 </p>
 <p>If all this is intimidating, have no (well, maybe only a little)
-fear.  There are modules that can help.  The File::Spec modules
+fear.  There are modules that can help.  The <code>File::Spec</code> modules
 provide methods to do the Right Thing on whatever platform happens
 to be running the program.
 </p>
@@ -73653,11 +75582,11 @@
     # on Mac OS Classic, ':temp:file.txt'
     # on VMS, '[.temp]file.txt'
 </pre>
-<p>File::Spec is available in the standard distribution as of version
-5.004_05.  File::Spec::Functions is only in File::Spec 0.7 and later,
-and some versions of perl come with version 0.6.  If File::Spec
+<p><code>File::Spec</code> is available in the standard distribution as of version
+5.004_05.  <code>File::Spec::Functions</code> is only in <code>File::Spec</code> 0.7 and later,
+and some versions of Perl come with version 0.6.  If <code>File::Spec</code>
 is not updated to 0.7 or later, you must use the object-oriented
-interface from File::Spec (or upgrade File::Spec).
+interface from <code>File::Spec</code> (or upgrade <code>File::Spec</code>).
 </p>
 <p>In general, production code should not have file paths hardcoded.
 Making them user-supplied or read from a configuration file is
@@ -73667,7 +75596,7 @@
 <p>This is especially noticeable in scripts like Makefiles and test suites,
 which often assume <code>/</code> as a path separator for subdirectories.
 </p>
-<p>Also of use is File::Basename from the standard distribution, which
+<p>Also of use is <code>File::Basename</code> from the standard distribution, which
 splits a pathname into pieces (base filename, full path to directory,
 and file suffix).
 </p>
@@ -73692,7 +75621,7 @@
 keep them to the 8.3 convention, for maximum portability, onerous a
 burden though this may appear.
 </p>
-<p>Likewise, when using the AutoSplit module, try to keep your functions to
+<p>Likewise, when using the <code>AutoSplit</code> module, try to keep your functions to
 8.3 naming and case-insensitive conventions; or, at the least,
 make it so the resulting files have a unique (case-insensitively)
 first 8 characters.
@@ -73706,7 +75635,7 @@
 </p>
 <p>Don&rsquo;t assume <code>&gt;</code> won&rsquo;t be the first character of a filename.
 Always use <code>&lt;</code> explicitly to open a file for reading, or even
-better, use the three-arg version of open, unless you want the user to
+better, use the three-arg version of <code>open</code>, unless you want the user to
 be able to specify a pipe open.
 </p>
 <pre class="verbatim">    open(my $fh, '&lt;', $existing_file) or die $!;
@@ -73726,7 +75655,7 @@
 </p>
 <p>Don&rsquo;t assume that in pathnames you can collapse two leading slashes
 <code>//</code> into one: some networking and clustering filesystems have special
-semantics for that.  Let the operating system to sort it out.
+semantics for that.  Let the operating system sort it out.
 </p>
 <p>The <em>portable filename characters</em> as defined by ANSI C are
 </p>
@@ -73735,7 +75664,7 @@
  0 1 2 3 4 5 6 7 8 9
  . _ -
 </pre>
-<p>and the &quot;-&quot; shouldn&rsquo;t be the first character.  If you want to be
+<p>and the <code>&quot;-&quot;</code> shouldn&rsquo;t be the first character.  If you want to be
 hypercorrect, stay case-insensitive and within the 8.3 naming
 convention (all the files and directories have to be unique within one
 directory if their names are lowercased and truncated to eight
@@ -73776,7 +75705,7 @@
 </p>
 <p>Don&rsquo;t assume that a single <code>unlink</code> completely gets rid of the file:
 some filesystems (most notably the ones in VMS) have versioned
-filesystems, and unlink() removes only the most recent one (it doesn&rsquo;t
+filesystems, and <code>unlink()</code> removes only the most recent one (it doesn&rsquo;t
 remove all the versions because by default the native tools on those
 platforms remove just the most recent version, too).  The portable
 idiom to remove all the versions of a file is
@@ -73788,20 +75717,21 @@
 </p>
 <p>Don&rsquo;t count on a specific environment variable existing in <code>%ENV</code>.
 Don&rsquo;t count on <code>%ENV</code> entries being case-sensitive, or even
-case-preserving.  Don&rsquo;t try to clear %ENV by saying <code>%ENV = ();</code>, or,
+case-preserving.  Don&rsquo;t try to clear <code>%ENV</code> by saying <code>%ENV = ();</code>, or,
 if you really have to, make it conditional on <code>$^O ne 'VMS'</code> since in
 VMS the <code>%ENV</code> table is much more than a per-process key-value string
 table.
 </p>
-<p>On VMS, some entries in the %ENV hash are dynamically created when
+<p>On VMS, some entries in the <code>%ENV</code> hash are dynamically created when
 their key is used on a read if they did not previously exist.  The
-values for <code>$ENV{HOME}</code>, <code>$ENV{TERM}</code>, <code>$ENV{HOME}</code>, and <code>$ENV{USER}</code>,
+values for <code>$ENV{HOME}</code>, <code>$ENV{TERM}</code>, <code>$ENV{PATH}</code>, and <code>$ENV{USER}</code>,
 are known to be dynamically generated.  The specific names that are
 dynamically generated may vary with the version of the C library on VMS,
-and more may exist than is documented.
+and more may exist than are documented.
 </p>
-<p>On VMS by default, changes to the %ENV hash are persistent after the process
-exits.  This can cause unintended issues.
+<p>On VMS by default, changes to the %ENV hash persist after perl exits.
+Subsequent invocations of perl in the same process can inadvertently
+inherit environment settings that were meant to be temporary.
 </p>
 <p>Don&rsquo;t count on signals or <code>%SIG</code> for anything.
 </p>
@@ -73812,10 +75742,10 @@
 directories.
 </p>
 <p>Don&rsquo;t count on specific values of <code>$!</code>, neither numeric nor
-especially the strings values. Users may switch their locales causing
+especially the string values. Users may switch their locales causing
 error messages to be translated into their languages.  If you can
 trust a POSIXish environment, you can portably use the symbols defined
-by the Errno module, like ENOENT.  And don&rsquo;t trust on the values of <code>$!</code>
+by the <code>Errno</code> module, like <code>ENOENT</code>.  And don&rsquo;t trust on the values of <code>$!</code>
 at all except immediately after a failed system call.
 </p>
 <hr>
@@ -73835,16 +75765,16 @@
 corresponding file.  Second, some operating systems (e.g., Cygwin,
 DJGPP, OS/2, and VOS) have required suffixes for executable files;
 these suffixes are generally permitted on the command name but are not
-required.  Thus, a command like &quot;perl&quot; might exist in a file named
-&quot;perl&quot;, &quot;perl.exe&quot;, or &quot;perl.pm&quot;, depending on the operating system.
-The variable &quot;_exe&quot; in the Config module holds the executable suffix,
-if any.  Third, the VMS port carefully sets up $^X and
-$Config{perlpath} so that no further processing is required.  This is
+required.  Thus, a command like <samp>&quot;perl&quot;</samp> might exist in a file named
+<samp>&quot;perl&quot;</samp>, <samp>&quot;perl.exe&quot;</samp>, or <samp>&quot;perl.pm&quot;</samp>, depending on the operating system.
+The variable <code>&quot;_exe&quot;</code> in the <code>Config</code> module holds the executable suffix,
+if any.  Third, the VMS port carefully sets up <code>$^X</code> and
+<code>$Config{perlpath}</code> so that no further processing is required.  This is
 just as well, because the matching regular expression used below would
 then have to deal with a possible trailing version number in the VMS
 file name.
 </p>
-<p>To convert $^X to a file pathname, taking account of the requirements
+<p>To convert <code>$^X</code> to a file pathname, taking account of the requirements
 of the various operating system possibilities, say:
 </p>
 <pre class="verbatim"> use Config;
@@ -73852,7 +75782,7 @@
  if ($^O ne 'VMS')
     {$thisperl .= $Config{_exe} unless $thisperl =~ m/$Config{_exe}$/i;}
 </pre>
-<p>To convert $Config{perlpath} to a file pathname, say:
+<p>To convert <code>$Config{perlpath}</code> to a file pathname, say:
 </p>
 <pre class="verbatim"> use Config;
  my $thisperl = $Config{perlpath};
@@ -73886,13 +75816,13 @@
 </p>
 <p>Don&rsquo;t assume a particular network device name.
 </p>
-<p>Don&rsquo;t assume a particular set of ioctl()s will work.
+<p>Don&rsquo;t assume a particular set of <code>ioctl()</code>s will work.
 </p>
 <p>Don&rsquo;t assume that you can ping hosts and get replies.
 </p>
 <p>Don&rsquo;t assume that any particular port (service) will respond.
 </p>
-<p>Don&rsquo;t assume that Sys::Hostname (or any other API or command) returns
+<p>Don&rsquo;t assume that <code>Sys::Hostname</code> (or any other API or command) returns
 either a fully qualified hostname or a non-qualified hostname: it all
 depends on how the system had been configured.  Also remember that for
 things such as DHCP and NAT, the hostname you get back might not be
@@ -73914,7 +75844,7 @@
 <p>In general, don&rsquo;t directly access the system in code meant to be
 portable.  That means, no <code>system</code>, <code>exec</code>, <code>fork</code>, <code>pipe</code>,
 <code>``</code>, <code>qx//</code>, <code>open</code> with a <code>|</code>, nor any of the other things
-that makes being a perl hacker worth being.
+that makes being a Perl hacker worth being.
 </p>
 <p>Commands that launch external processes are generally supported on
 most platforms (though many of them do not support any type of
@@ -73935,10 +75865,10 @@
 available.  But it is not fine for many non-Unix systems, and even
 some Unix systems that may not have sendmail installed.  If a portable
 solution is needed, see the various distributions on CPAN that deal
-with it.  Mail::Mailer and Mail::Send in the MailTools distribution are
-commonly used, and provide several mailing methods, including mail,
-sendmail, and direct SMTP (via Net::SMTP) if a mail transfer agent is
-not available.  Mail::Sendmail is a standalone module that provides
+with it.  <code>Mail::Mailer</code> and <code>Mail::Send</code> in the 
<code>MailTools</code> distribution are
+commonly used, and provide several mailing methods, including 
<code>mail</code>,
+<code>sendmail</code>, and direct SMTP (via <code>Net::SMTP</code>) if a mail 
transfer agent is
+not available.  <code>Mail::Sendmail</code> is a standalone module that 
provides
 simple, platform-independent mailing.
 </p>
 <p>The Unix System V IPC (<code>msg*(), sem*(), shm*()</code>) is not available
@@ -73949,12 +75879,12 @@
 both forms just pack the four bytes into network order.  That this
 would be equal to the C language <code>in_addr</code> struct (which is what the
 socket code internally uses) is not guaranteed.  To be portable use
-the routines of the Socket extension, such as <code>inet_aton()</code>,
+the routines of the <code>Socket</code> extension, such as 
<code>inet_aton()</code>,
 <code>inet_ntoa()</code>, and <code>sockaddr_in()</code>.
 </p>
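+<p>As a minimal sketch of the advice above (the loopback address and port
+80 are arbitrary illustration values, not anything the text prescribes),
+the <code>Socket</code> routines can pack and unpack an address without
+assuming any particular byte layout:
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+    use Socket qw(inet_aton inet_ntoa sockaddr_in);
+
+    # Pack a dotted-quad address and a port into a sockaddr_in structure.
+    my $packed_ip = inet_aton('127.0.0.1');
+    my $sockaddr  = sockaddr_in(80, $packed_ip);
+
+    # In list context, sockaddr_in() unpacks the structure again.
+    my ($port, $addr) = sockaddr_in($sockaddr);
+    print inet_ntoa($addr), &quot;:$port\n&quot;;    # prints 127.0.0.1:80
+</pre>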
 <p>The rule of thumb for portable code is: Do it all in portable Perl, or
 use a module (that may internally implement it with platform-specific
-code, but expose a common interface).
+code, but exposes a common interface).
 </p>
 <hr>
 <a name="perlport-External-Subroutines-_0028XS_0029"></a>
@@ -73987,17 +75917,17 @@
 <h4 class="subsection">56.3.9 Standard Modules</h4>
 
 <p>In general, the standard modules work across platforms.  Notable
-exceptions are the CPAN module (which currently makes connections to external
+exceptions are the <code>CPAN</code> module (which currently makes connections 
to external
 programs that may not be available), platform-specific modules (like
-ExtUtils::MM_VMS), and DBM modules.
+<code>ExtUtils::MM_VMS</code>), and DBM modules.
 </p>
 <p>There is no one DBM module available on all platforms.
-SDBM_File and the others are generally available on all Unix and DOSish
-ports, but not in MacPerl, where only NBDM_File and DB_File are
+<code>SDBM_File</code> and the others are generally available on all Unix and 
DOSish
+ports, but not in MacPerl, where only <code>NDBM_File</code> and 
<code>DB_File</code> are
 available.
 </p>
 <p>The good news is that at least some DBM module should be available, and
-AnyDBM_File will use whichever module it can find.  Of course, then
+<code>AnyDBM_File</code> will use whichever module it can find.  Of course, 
then
 the code needs to be fairly strict, dropping to the greatest common
 factor (e.g., not exceeding 1K for each record), so that it will
 work with any DBM module.  See <a 
href="AnyDBM_File.html#Top">(AnyDBM_File)</a> for more details.
@@ -74029,9 +75959,9 @@
 Please do use the ISO 8601 instead of making us guess what
 date 02/03/04 might be.  ISO 8601 even sorts nicely as-is.
 A text representation (like &quot;1987-12-18&quot;) can be easily converted
-into an OS-specific value using a module like Date::Parse.
+into an OS-specific value using a module like <code>Date::Parse</code>.
 An array of values, such as those returned by <code>localtime</code>, can be
-converted to an OS-specific representation using Time::Local.
+converted to an OS-specific representation using <code>Time::Local</code>.
 </p>
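+<p>The conversion described above might look like the following sketch.
+It uses only the core <code>Time::Local</code> module; the regular expression
+and the variable names are illustrative assumptions, and a module such as
+<code>Date::Parse</code> would accept many more input formats.
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+    use Time::Local qw(timegm);
+
+    # Split an ISO 8601 date into its numeric parts.
+    my ($year, $month, $day) = '1987-12-18' =~ /^(\d{4})-(\d{2})-(\d{2})$/
+        or die &quot;not an ISO 8601 date\n&quot;;
+
+    # timegm() takes a zero-based month, like the list gmtime() returns.
+    my $epoch = timegm(0, 0, 0, $day, $month - 1, $year);
+
+    # Round-trip through gmtime() to confirm the conversion.
+    my @t = gmtime($epoch);
+    printf &quot;%04d-%02d-%02d\n&quot;, $t[5] + 1900, $t[4] + 1, $t[3];
+</pre>
+<p>The epoch value itself is deliberately not printed, since it is the
+OS-specific part; only the round-tripped date is portable.
+</p>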
 <p>When calculating specific times, such as for tests in time or date modules,
 it may be appropriate to calculate an offset for the epoch.
@@ -74055,17 +75985,25 @@
 <p>Assume very little about character sets.
 </p>
 <p>Assume nothing about numerical values (<code>ord</code>, <code>chr</code>) 
of characters.
-Do not use explicit code point ranges (like \xHH-\xHH); use for
-example symbolic character classes like <code>[:print:]</code>.
+Do not use explicit code point ranges (like <code>\xHH-\xHH</code>).  However,
+starting in Perl v5.22, regular expression pattern bracketed character
+class ranges specified like <code>qr/[\N{U+HH}-\N{U+HH}]/</code> are portable.
+You can portably use symbolic character classes like <code>[:print:]</code>.
 </p>
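+<p>A short sketch of the two portable forms mentioned above; the test
+strings are arbitrary examples:
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+
+    # A-Z written as Unicode code points rather than native \xHH values.
+    my $upper = qr/^[\N{U+41}-\N{U+5A}]+$/;
+
+    # A symbolic class makes no assumption about code points at all.
+    my $printable = qr/^[[:print:]]+$/;
+
+    print 'PERL' =~ $upper ? &quot;upper\n&quot; : &quot;not upper\n&quot;;
+    print 'Perl' =~ $upper ? &quot;upper\n&quot; : &quot;not upper\n&quot;;
+
+    # A tab is a control character, so it is not in [:print:].
+    print &quot;a\tb&quot; =~ $printable ? &quot;printable\n&quot; : &quot;not printable\n&quot;;
+</pre>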
 <p>Do not assume that the alphabetic characters are encoded contiguously
-(in the numeric sense).  There may be gaps.
+(in the numeric sense).  There may be gaps.  Special coding in Perl,
+however, guarantees that all subsets of <code>qr/[A-Z]/</code>, 
<code>qr/[a-z]/</code>, and
+<code>qr/[0-9]/</code> behave as expected.  <code>tr///</code> behaves the 
same for these
+ranges.  In patterns, any ranges specified with end points using the
+<code>\N{...}</code> notation ensure character set portability, but it is a bug
+in Perl v5.22 that this isn&rsquo;t true of <code>tr///</code>.
 </p>
 <p>Do not assume anything about the ordering of the characters.
 The lowercase letters may come before or after the uppercase letters;
 the lowercase and uppercase may be interlaced so that both &quot;a&quot; and 
&quot;A&quot;
 come before &quot;b&quot;; the accented and other international characters may
 be interlaced so that ä comes before &quot;b&quot;.
+<a href="Unicode-Collate.html#Top">(Unicode-Collate)</a> can be used to sort 
this all out.
 </p>
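+<p>As a hedged illustration of sorting without relying on code point
+order, the sketch below uses the default collator from the core
+<code>Unicode::Collate</code> module; the word list is an arbitrary example.
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+    use Unicode::Collate;
+
+    # Mixed-case letters plus a-umlaut, given by code point.
+    my @words  = ('b', 'B', 'a', 'A', &quot;\N{U+E4}&quot;);
+    my @sorted = Unicode::Collate-&gt;new-&gt;sort(@words);
+
+    # Variants of the same base letter end up next to each other,
+    # regardless of case or accent.
+    binmode(STDOUT, ':encoding(UTF-8)');
+    print join(' ', @sorted), &quot;\n&quot;;
+</pre>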
 <hr>
 <a name="perlport-Internationalisation"></a>
@@ -74093,10 +76031,11 @@
 illegal (&quot;Malformed UTF-8 ...&quot;).  This means that, for example, 
embedding
 ISO 8859-1 bytes beyond 0x7f into your strings might cause trouble
 later.  If the bytes are native 8-bit bytes, you can use the <code>bytes</code>
-pragma.  If the bytes are in a string (regular expression being a
-curious string), you can often also use the <code>\xHH</code> notation instead
+pragma.  If the bytes are in a string (regular expressions being
+curious strings), you can often also use the <code>\xHH</code> or more 
portably,
+the <code>\N{U+HH}</code> notations instead
 of embedding the bytes as-is.  If you want to write your code in UTF-8,
-you can use the <code>utf8</code>.
+you can use <a href="utf8.html#Top">(utf8)</a>.
 </p>
 <hr>
 <a name="perlport-System-Resources"></a>
@@ -74119,7 +76058,7 @@
 <p>The last two constructs may appear unintuitive to most people.  The
 first repeatedly grows a string, whereas the second allocates a
 large chunk of memory in one go.  On some systems, the second is
-more efficient that the first.
+more efficient than the first.
 </p>
 <hr>
 <a name="perlport-Security"></a>
@@ -74141,17 +76080,17 @@
 </p>
 <p>Don&rsquo;t assume the Unix filesystem access semantics: the operating
 system or the filesystem may be using some ACL systems, which are
-richer languages than the usual rwx.  Even if the rwx exist,
+richer languages than the usual <code>rwx</code>.  Even if the 
<code>rwx</code> exist,
 their semantics might be different.
 </p>
-<p>(From security viewpoint testing for permissions before attempting to
+<p>(From the security viewpoint, testing for permissions before attempting to
 do something is silly anyway: if one tries this, there is potential
 for race conditions. Someone or something might change the
 permissions between the permissions check and the actual operation.
 Just try the operation.)
 </p>
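+<p>A minimal sketch of &quot;just try the operation&quot;: instead of testing
+<code>-r</code> first and opening afterwards, attempt the open and inspect the
+result.  The file name is a hypothetical placeholder.
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+
+    my $path = 'example.txt';    # hypothetical file name
+
+    # No separate permission check, so no window for a race.
+    if (open(my $fh, '&lt;', $path)) {
+        my $first_line = &lt;$fh&gt;;
+        close($fh);
+        print defined $first_line ? $first_line : &quot;(empty file)\n&quot;;
+    }
+    else {
+        warn &quot;Cannot open $path: $!\n&quot;;
+    }
+</pre>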
 <p>Don&rsquo;t assume the Unix user and group semantics: especially, 
don&rsquo;t
-expect the <code>$&lt;</code> and <code>$&gt;</code> (or the <code>$(</code> 
and <code>$)</code>) to work
+expect <code>$&lt;</code> and <code>$&gt;</code> (or <code>$(</code> and 
<code>$)</code>) to work
 for switching identities (or memberships).
 </p>
 <p>Don&rsquo;t assume set-uid and set-gid semantics. (And even if you do,
@@ -74168,7 +76107,7 @@
 
 <p>For those times when it is necessary to have platform-specific code,
 consider keeping the platform-specific code in one place, making porting
-to other platforms easier.  Use the Config module and the special
+to other platforms easier.  Use the <code>Config</code> module and the special
 variable <code>$^O</code> to differentiate platforms, as described in
 <a href="#perlport-PLATFORMS">PLATFORMS</a>.
 </p>
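+<p>One possible way to keep platform-specific details in one place is a
+small lookup keyed on <code>$^O</code>.  The helper name and the commands in
+the table are illustrative assumptions, not part of any module:
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+    use Config;
+
+    # The only place in the program that knows per-platform details.
+    my %command_for = (
+        MSWin32 =&gt; 'cls',
+    );
+
+    # Hypothetical helper: fall back to the Unix command elsewhere.
+    sub clear_screen_command {
+        return $command_for{$^O} // 'clear';
+    }
+
+    print &quot;Running on $^O (built for $Config{osname}); &quot;,
+          &quot;would run: &quot;, clear_screen_command(), &quot;\n&quot;;
+</pre>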
@@ -74179,7 +76118,7 @@
 assume certain things about the filesystem and paths.  Be careful not
 to depend on a specific output style for errors, such as when checking
 <code>$!</code> after a failed system call.  Using <code>$!</code> for 
anything other than
-displaying it as output is doubtful (though see the Errno module for
+displaying it as output is doubtful (though see the <code>Errno</code> module 
for
 testing reasonably portably for error value). Some platforms expect
 a certain output format, and Perl on those platforms may have been
 adjusted accordingly.  Most specifically, don&rsquo;t anchor a regex when
@@ -74383,16 +76322,16 @@
 </pre>
 <p>The various MSWin32 Perls can distinguish the OS they are running on
 via the value of the fifth element of the list returned from
-Win32::GetOSVersion().  For example:
+<code>Win32::GetOSVersion()</code>.  For example:
 </p>
 <pre class="verbatim">    if ($^O eq 'MSWin32') {
         my @os_version_info = Win32::GetOSVersion();
         print +('3.1','95','NT')[$os_version_info[4]],&quot;\n&quot;;
     }
 </pre>
-<p>There are also Win32::IsWinNT() and Win32::IsWin95(), try <code>perldoc 
Win32</code>,
+<p>There are also <code>Win32::IsWinNT()</code> and 
<code>Win32::IsWin95()</code>; try <code>perldoc Win32</code>,
 and as of libwin32 0.19 (not part of the core Perl distribution)
-Win32::GetOSName().  The very portable POSIX::uname() will work too:
+<code>Win32::GetOSName()</code>.  The very portable 
<code>POSIX::uname()</code> will work too:
 </p>
 <pre class="verbatim">    c:\&gt; perl -MPOSIX -we &quot;print join '|', 
uname&quot;
     Windows NT|moonru|5.0|Build 2195 (Service Pack 2)|x86
@@ -74432,21 +76371,10 @@
 <a name="VMS"></a>
 <h4 class="subsection">56.5.3 VMS</h4>
 
-<p>Perl on VMS is discussed in <a href="#perlvms-NAME">perlvms NAME</a> in the 
perl distribution.
+<p>Perl on VMS is discussed in <a href="#perlvms-NAME">perlvms NAME</a> in the 
Perl distribution.
 </p>
 <p>The official name of VMS as of this writing is OpenVMS.
 </p>
-<p>Perl on VMS can accept either VMS- or Unix-style file
-specifications as in either of the following:
-</p>
-<pre class="verbatim">    $ perl -ne &quot;print if /perl_setup/i&quot; 
SYS$LOGIN:LOGIN.COM
-    $ perl -ne &quot;print if /perl_setup/i&quot; /sys$login/login.com
-</pre>
-<p>but not a mixture of both as in:
-</p>
-<pre class="verbatim">    $ perl -ne &quot;print if /perl_setup/i&quot; 
sys$login:/login.com
-    Can't open sys$login:/login.com: file specification syntax error
-</pre>
 <p>Interacting with Perl from the Digital Command Language (DCL) shell
 often requires a different set of quotation marks than Unix shells do.
 For example:
@@ -74454,7 +76382,7 @@
 <pre class="verbatim">    $ perl -e &quot;print &quot;&quot;Hello, 
world.\n&quot;&quot;&quot;
     Hello, world.
 </pre>
-<p>There are several ways to wrap your perl scripts in DCL <samp>.COM</samp> 
files, if
+<p>There are several ways to wrap your Perl scripts in DCL <samp>.COM</samp> 
files, if
 you are so inclined.  For example:
 </p>
 <pre class="verbatim">    $ write sys$output &quot;Hello from DCL!&quot;
@@ -74470,132 +76398,50 @@
     $ endif
 </pre>
 <p>Do take care with <code>$ ASSIGN/nolog/user SYS$COMMAND: SYS$INPUT</code> 
if your
-perl-in-DCL script expects to do things like <code>$read = 
&lt;STDIN&gt;;</code>.
+Perl-in-DCL script expects to do things like <code>$read = 
&lt;STDIN&gt;;</code>.
 </p>
-<p>The VMS operating system has two filesystems, known as ODS-2 and ODS-5.
+<p>The VMS operating system has two filesystems, designated by their
+on-disk structure (ODS) level: ODS-2 and its successor ODS-5.  The
+initial port of Perl to VMS pre-dates ODS-5, but all current testing and
+development assumes ODS-5 and its capabilities, including case
+preservation, extended characters in filespecs, and names up to 8192
+bytes long.
 </p>
-<p>For ODS-2, filenames are in the format &quot;name.extension;version&quot;.  
The
-maximum length for filenames is 39 characters, and the maximum length for
-extensions is also 39 characters.  Version is a number from 1 to
-32767.  Valid characters are <code>/[A-Z0-9$_-]/</code>.
-</p>
-<p>The ODS-2 filesystem is case-insensitive and does not preserve case.
-Perl simulates this by converting all filenames to lowercase internally.
-</p>
-<p>For ODS-5, filenames may have almost any character in them and can include
-Unicode characters.  Characters that could be misinterpreted by the DCL
-shell or file parsing utilities need to be prefixed with the <code>^</code>
-character, or replaced with hexadecimal characters prefixed with the
-<code>^</code> character.  Such prefixing is only needed with the pathnames are
-in VMS format in applications.  Programs that can accept the Unix format
-of pathnames do not need the escape characters.  The maximum length for
-filenames is 255 characters.  The ODS-5 file system can handle both
-a case preserved and a case sensitive mode.
-</p>
-<p>ODS-5 is only available on the OpenVMS for 64 bit platforms.
-</p>
-<p>Support for the extended file specifications is being done as optional
-settings to preserve backward compatibility with Perl scripts that
-assume the previous VMS limitations.
-</p>
-<p>In general routines on VMS that get a Unix format file specification
-should return it in a Unix format, and when they get a VMS format
-specification they should return a VMS format unless they are documented
-to do a conversion.
-</p>
-<p>For routines that generate return a file specification, VMS allows setting
-if the C library which Perl is built on if it will be returned in VMS
-format or in Unix format.
-</p>
-<p>With the ODS-2 file system, there is not much difference in syntax of
-filenames without paths for VMS or Unix.  With the extended character
-set available with ODS-5 there can be a significant difference.
-</p>
-<p>Because of this, existing Perl scripts written for VMS were sometimes
-treating VMS and Unix filenames interchangeably.  Without the extended
-character set enabled, this behavior will mostly be maintained for
-backwards compatibility.
-</p>
-<p>When extended characters are enabled with ODS-5, the handling of
-Unix formatted file specifications is to that of a Unix system.
-</p>
-<p>VMS file specifications without extensions have a trailing dot.  An
-equivalent Unix file specification should not show the trailing dot.
-</p>
-<p>The result of all of this, is that for VMS, for portable scripts, you
-can not depend on Perl to present the filenames in lowercase, to be
-case sensitive, and that the filenames could be returned in either
-Unix or VMS format.
-</p>
-<p>And if a routine returns a file specification, unless it is intended to
-convert it, it should return it in the same format as it found it.
-</p>
-<p><code>readdir</code> by default has traditionally returned lowercased 
filenames.
-When the ODS-5 support is enabled, it will return the exact case of the
-filename on the disk.
-</p>
-<p>Files without extensions have a trailing period on them, so doing a
-<code>readdir</code> in the default mode with a file named <samp>A.;5</samp> 
will
-return <samp>a.</samp> when VMS is (though that file could be opened with
-<code>open(FH, 'A')</code>).
-</p>
-<p>With support for extended file specifications and if <code>opendir</code> 
was
-given a Unix format directory, a file named <samp>A.;5</samp> will return 
<samp>a</samp>
-and optionally in the exact case on the disk.  When <code>opendir</code> is 
given
-a VMS format directory, then <code>readdir</code> should return 
<samp>a.</samp>, and
-again with the optionally the exact case.
-</p>
-<p>RMS had an eight level limit on directory depths from any rooted logical
-(allowing 16 levels overall) prior to VMS 7.2, and even with versions of
-VMS on VAX up through 7.3.  Hence <code>PERL_ROOT:[LIB.2.3.4.5.6.7.8]</code> 
is a
-valid directory specification but <code>PERL_ROOT:[LIB.2.3.4.5.6.7.8.9]</code> 
is
-not.  <samp>Makefile.PL</samp> authors might have to take this into account, 
but at
-least they can refer to the former as 
<code>/PERL_ROOT/lib/2/3/4/5/6/7/8/</code>.
-</p>
-<p>Pumpkings and module integrators can easily see whether files with too many
-directory levels have snuck into the core by running the following in the
-top-level source directory:
-</p>
-<pre class="verbatim"> $ perl -ne &quot;$_=~s/\s+.*//; print if scalar(split 
/\//) &gt; 8;&quot; &lt; MANIFEST
-</pre>
-<p>The VMS::Filespec module, which gets installed as part of the build
-process on VMS, is a pure Perl module that can easily be installed on
-non-VMS platforms and can be helpful for conversions to and from RMS
-native formats.  It is also now the only way that you should check to
-see if VMS is in a case sensitive mode.
+<p>Perl on VMS can accept either VMS- or Unix-style file
+specifications as in either of the following:
+</p>
+<pre class="verbatim">    $ perl -ne &quot;print if /perl_setup/i&quot; 
SYS$LOGIN:LOGIN.COM
+    $ perl -ne &quot;print if /perl_setup/i&quot; /sys$login/login.com
+</pre>
+<p>but not a mixture of both as in:
+</p>
+<pre class="verbatim">    $ perl -ne &quot;print if /perl_setup/i&quot; 
sys$login:/login.com
+    Can't open sys$login:/login.com: file specification syntax error
+</pre>
+<p>In general, the easiest path to portability is always to specify
+filenames in Unix format unless they will need to be processed by native
+commands or utilities.  Because of this latter consideration, the
+File::Spec module by default returns native format specifications
+regardless of input format.  This default may be reversed so that
+filenames are always reported in Unix format by specifying the
+<code>DECC$FILENAME_UNIX_REPORT</code> feature logical in the environment.
+</p>
+<p>The file type, or extension, is always present in a VMS-format file
+specification even if it&rsquo;s zero-length.  This means that, by default,
+<code>readdir</code> will return a trailing dot on a file with no extension, so
+where you would see <code>&quot;a&quot;</code> on Unix you&rsquo;ll see 
<code>&quot;a.&quot;</code> on VMS.  However,
+the trailing dot may be suppressed by enabling the
+<code>DECC$READDIR_DROPDOTNOTYPE</code> feature in the environment (see the 
CRTL
+documentation on feature logical names).
 </p>
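+<p>A script that must cope with the default behavior might normalize the
+names itself.  The following is only a sketch under that assumption; it
+reads the current directory and strips a trailing dot when running on VMS.
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+
+    opendir(my $dh, '.') or die &quot;Cannot open directory: $!\n&quot;;
+    my @names = readdir($dh);
+    closedir($dh);
+
+    # On VMS a file with no type is reported as &quot;a.&quot; by default;
+    # drop the dot so names compare equal across platforms.
+    if ($^O eq 'VMS') {
+        s/(?&lt;=.)\.\z// for @names;
+    }
+
+    print &quot;$_\n&quot; for sort @names;
+</pre>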
 <p>What <code>\n</code> represents depends on the type of file opened.  It 
usually
 represents <code>\012</code> but it could also be <code>\015</code>, 
<code>\012</code>, <code>\015\012</code>,
 <code>\000</code>, <code>\040</code>, or nothing depending on the file 
organization and
-record format.  The VMS::Stdio module provides access to the
-special fopen() requirements of files with unusual attributes on VMS.
-</p>
-<p>TCP/IP stacks are optional on VMS, so socket routines might not be
-implemented.  UDP sockets may not be supported.
-</p>
-<p>The TCP/IP library support for all current versions of VMS is dynamically
-loaded if present, so even if the routines are configured, they may
-return a status indicating that they are not implemented.
+record format.  The <code>VMS::Stdio</code> module provides access to the
+special <code>fopen()</code> requirements of files with unusual attributes on 
VMS.
 </p>
 <p>The value of <code>$^O</code> on OpenVMS is &quot;VMS&quot;.  To determine 
the architecture
-that you are running on without resorting to loading all of 
<code>%Config</code>
-you can examine the content of the <code>@INC</code> array like so:
-</p>
-<pre class="verbatim">    if (grep(/VMS_AXP/, @INC)) {
-        print &quot;I'm on Alpha!\n&quot;;
-
-    } elsif (grep(/VMS_VAX/, @INC)) {
-        print &quot;I'm on VAX!\n&quot;;
-
-    } elsif (grep(/VMS_IA64/, @INC)) {
-        print &quot;I'm on IA64!\n&quot;;
-
-    } else {
-        print &quot;I'm not so sure about where $^O is...\n&quot;;
-    }
-</pre>
-<p>In general, the significant differences should only be if Perl is running
-on VMS_VAX or one of the 64 bit OpenVMS platforms.
+that you are running on, refer to <code>$Config{'archname'}</code>.
 </p>
 <p>On VMS, perl determines the UTC offset from the 
<code>SYS$TIMEZONE_DIFFERENTIAL</code>
 logical name.  Although the VMS epoch began at 17-NOV-1858 00:00:00.00,
@@ -74611,6 +76457,8 @@
 
 </li><li> vmsperl on the web, <a 
href="http://www.sidhe.org/vmsperl/index.html">http://www.sidhe.org/vmsperl/index.html</a>
 
+</li><li> VMS Software Inc. web site, <a 
href="http://www.vmssoftware.com">http://www.vmssoftware.com</a>
+
 </li></ul>
 
 <hr>
@@ -74623,7 +76471,7 @@
 <h4 class="subsection">56.5.4 VOS</h4>
 
 <p>Perl on VOS (also known as OpenVOS) is discussed in <samp>README.vos</samp>
-in the perl distribution (installed as <a 
href="perlvos.html#Top">(perlvos)</a>).  Perl on VOS
+in the Perl distribution (installed as <a 
href="perlvos.html#Top">(perlvos)</a>).  Perl on VOS
 can accept either VOS- or Unix-style file specifications as in
 either of the following:
 </p>
@@ -74663,7 +76511,7 @@
 </p>
 <p>The value of <code>$^O</code> on VOS is &quot;vos&quot;.  To determine the
 architecture that you are running on without resorting to loading
-all of <code>%Config</code> you can examine the content of the @INC array
+all of <code>%Config</code> you can examine the content of the 
<code>@INC</code> array
 like so:
 </p>
 <pre class="verbatim">    if ($^O =~ /vos/) {
@@ -74700,20 +76548,27 @@
 <a name="EBCDIC-Platforms"></a>
 <h4 class="subsection">56.5.5 EBCDIC Platforms</h4>
 
-<p>Recent versions of Perl have been ported to platforms such as OS/400 on
-AS/400 minicomputers as well as OS/390, VM/ESA, and BS2000 for S/390
-Mainframes.  Such computers use EBCDIC character sets internally (usually
+<p>v5.22 core Perl runs on z/OS (formerly OS/390).  Theoretically it could
+run on the successors of OS/400 on AS/400 minicomputers as well as
+VM/ESA, and BS2000 for S/390 Mainframes.  Such computers use EBCDIC
+character sets internally (usually
 Character Code Set ID 0037 for OS/400 and either 1047 or POSIX-BC for S/390
-systems).  On the mainframe perl currently works under the &quot;Unix system
+systems).
+</p>
+<p>The rest of this section may need updating, but we don&rsquo;t know what it
+should say.  Please email comments to
+<a href="mailto:address@hidden">address@hidden</a>.
+</p>
+<p>On the mainframe Perl currently works under the &quot;Unix system
 services for OS/390&quot; (formerly known as OpenEdition), VM/ESA OpenEdition, 
or
-the BS200 POSIX-BC system (BS2000 is supported in perl 5.6 and greater).
+the BS2000 POSIX-BC system (BS2000 is supported in Perl 5.6 and greater).
 See <a href="perlos390.html#Top">(perlos390)</a> for details.  Note that for 
OS/400 there is also a port of
 Perl 5.8.1/5.10.0 or later to the PASE which is ASCII-based (as opposed to
 ILE which is EBCDIC-based), see <a href="perlos400.html#Top">(perlos400)</a>.
 </p>
 <p>As of R2.5 of USS for OS/390 and Version 2.3 of VM/ESA these Unix
 sub-systems do not support the <code>#!</code> shebang trick for script 
invocation.
-Hence, on OS/390 and VM/ESA perl scripts can be executed with a header
+Hence, on OS/390 and VM/ESA Perl scripts can be executed with a header
 similar to the following simple script:
 </p>
 <pre class="verbatim">    : # use perl
@@ -74728,18 +76583,18 @@
 S/390 systems.
 </p>
 <p>On the AS/400, if PERL5 is in your library list, you may need
-to wrap your perl scripts in a CL procedure to invoke them like so:
+to wrap your Perl scripts in a CL procedure to invoke them like so:
 </p>
 <pre class="verbatim">    BEGIN
       CALL PGM(PERL5/PERL) PARM('/QOpenSys/hello.pl')
     ENDPGM
 </pre>
-<p>This will invoke the perl script <samp>hello.pl</samp> in the root of the
+<p>This will invoke the Perl script <samp>hello.pl</samp> in the root of the
 QOpenSys file system.  On the AS/400 calls to <code>system</code> or backticks
 must use CL syntax.
 </p>
 <p>On these platforms, bear in mind that the EBCDIC character set may have
-an effect on what happens with some perl functions (such as <code>chr</code>,
+an effect on what happens with some Perl functions (such as <code>chr</code>,
 <code>pack</code>, <code>print</code>, <code>printf</code>, <code>ord</code>, 
<code>sort</code>, <code>sprintf</code>, <code>unpack</code>), as
 well as bit-fiddling with ASCII constants using operators like <code>^</code>, 
<code>&amp;</code>
 and <code>|</code>, not to mention dealing with socket interfaces to ASCII 
computers
@@ -74747,7 +76602,7 @@
 </p>
 <p>Fortunately, most web servers for the mainframe will correctly
 translate the <code>\n</code> in the following statement to its ASCII 
equivalent
-(<code>\r</code> is the same under both Unix and OS/390):
+(<code>\r</code> is the same under both Unix and z/OS):
 </p>
 <pre class="verbatim">    print &quot;Content-type: text/html\r\n\r\n&quot;;
 </pre>
@@ -74776,7 +76631,7 @@
 <p>Also see:
 </p>
 <ul>
-<li> <a href="perlos390.html#Top">(perlos390)</a>, <samp>README.os390</samp>, 
<samp>perlbs2000</samp>, <a href="#perlebcdic-NAME">perlebcdic NAME</a>.
+<li> <a href="perlos390.html#Top">(perlos390)</a>, <a 
href="perlos400.html#Top">(perlos400)</a>, <a 
href="perlbs2000.html#Top">(perlbs2000)</a>, <a 
href="#perlebcdic-NAME">perlebcdic NAME</a>.
 
 </li><li> The address@hidden list is for discussion of porting issues as well 
as
 general usage issues for all EBCDIC Perls.  Send a message body of
@@ -74958,7 +76813,7 @@
 <p>Be aware, moreover, that even among Unix-ish systems there are variations.
 </p>
 <p>For many functions, you can also query <code>%Config</code>, exported by
-default from the Config module.  For example, to check whether the
+default from the <code>Config</code> module.  For example, to check whether the
 platform has the <code>lstat</code> call, check <code>$Config{d_lstat}</code>. 
 See
 <a href="Config.html#Top">(Config)</a> for a full description of available 
variables.
 </p>
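+<p>For instance, a check of <code>$Config{d_lstat}</code> could be sketched as
+follows.  The current directory is used only because it is certain to
+exist; the fallback to <code>stat</code> is an illustrative choice.
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+    use Config;
+
+    my $path = '.';    # the current directory always exists
+
+    # Use lstat() where the platform provides it, stat() otherwise.
+    my @info = $Config{d_lstat} ? lstat($path) : stat($path);
+    die &quot;Cannot stat $path: $!\n&quot; unless @info;
+
+    printf &quot;lstat is %s; %s was last modified %s\n&quot;,
+        ($Config{d_lstat} ? 'available' : 'unavailable'),
+        $path, scalar localtime($info[9]);
+</pre>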
@@ -75052,7 +76907,7 @@
 <p>The actual permissions set depend on the value of the <code>CYGWIN</code>
 in the SYSTEM environment settings.  (Cygwin)
 </p>
-<p>Setting the exec bit on some locations (generally /sdcard) will return true
+<p>Setting the exec bit on some locations (generally <samp>/sdcard</samp>) 
will return true
 but not actually set the bit. (Android)
 </p>
 </dd>
@@ -75100,7 +76955,7 @@
 <dt>exec</dt>
 <dd><a name="perlport-exec"></a>
 <p><code>exec LIST</code> without the use of indirect object syntax 
(<code>exec PROGRAM LIST</code>)
-may fall back to trying the shell if the first spawn() fails.  (Win32)
+may fall back to trying the shell if the first <code>spawn()</code> fails.  
(Win32)
 </p>
 <p>Does not automatically flush output handles on some platforms.
 (SunOS, Solaris, HP-UX)
@@ -75110,11 +76965,12 @@
 </dd>
 <dt>exit</dt>
 <dd><a name="perlport-exit"></a>
-<p>Emulates Unix exit() (which considers <code>exit 1</code> to indicate an 
error) by
-mapping the <code>1</code> to SS$_ABORT (<code>44</code>).  This behavior may 
be overridden
-with the pragma <code>use vmsish 'exit'</code>.  As with the CRTL&rsquo;s 
exit()
-function, <code>exit 0</code> is also mapped to an exit status of SS$_NORMAL
-(<code>1</code>); this mapping cannot be overridden.  Any other argument to 
exit()
+<p>Emulates Unix <code>exit()</code> (which considers <code>exit 1</code> to 
indicate an error) by
+mapping the <code>1</code> to <code>SS$_ABORT</code> (<code>44</code>).  This 
behavior may be overridden
+with the pragma <code>use vmsish 'exit'</code>.  As with the CRTL&rsquo;s 
<code>exit()</code>
+function, <code>exit 0</code> is also mapped to an exit status of 
<code>SS$_NORMAL</code>
+(<code>1</code>); this mapping cannot be overridden.  Any other argument to
+<code>exit()</code>
 is used directly as Perl&rsquo;s exit status.  On VMS, unless the future
 POSIX_EXIT mode is enabled, the exit code should always be a valid
 VMS exit code and not a generic number.  When the POSIX_EXIT mode is
@@ -75314,13 +77170,13 @@
 </dd>
 <dt>glob</dt>
 <dd><a name="perlport-glob"></a>
-<p>This operator is implemented via the File::Glob extension on most
+<p>This operator is implemented via the <code>File::Glob</code> extension on 
most
 platforms.  See <a href="File-Glob.html#Top">(File-Glob)</a> for portability 
information.
 </p>
 </dd>
 <dt>gmtime</dt>
 <dd><a name="perlport-gmtime"></a>
-<p>In theory, gmtime() is reliable from -2**63 to 2**63-1.  However,
+<p>In theory, <code>gmtime()</code> is reliable from -2**63 to 2**63-1.  
However,
 because workarounds in the implementation use floating point numbers,
 it will become inaccurate as the time gets larger.  This is a bug and
 will be fixed in the future.
@@ -75332,7 +77188,7 @@
 <dd><a name="perlport-ioctl-FILEHANDLE_002cFUNCTION_002cSCALAR"></a>
 <p>Not implemented. (VMS)
 </p>
-<p>Available only for socket handles, and it does what the ioctlsocket() call
+<p>Available only for socket handles, and it does what the 
<code>ioctlsocket()</code> call
 in the Winsock API does. (Win32)
 </p>
 <p>Available only for socket handles. (RISC&nbsp;OS<!-- /@w -->)
@@ -75344,12 +77200,12 @@
 </p>
 <p><code>kill()</code> doesn&rsquo;t have the semantics of 
<code>raise()</code>, i.e. it doesn&rsquo;t send
 a signal to the identified process like it does on Unix platforms.
-Instead <code>kill($sig, $pid)</code> terminates the process identified by 
$pid,
+Instead <code>kill($sig, $pid)</code> terminates the process identified by 
<code>$pid</code>,
 and makes it exit immediately with exit status $sig.  As in Unix, if
 $sig is 0 and the specified process exists, it returns true without
 actually terminating it. (Win32)
 </p>
-<p><code>kill(-9, $pid)</code> will terminate the process specified by $pid and
+<p><code>kill(-9, $pid)</code> will terminate the process specified by 
<code>$pid</code> and
 recursively all child processes owned by it.  This is different from
 the Unix semantics, where the signal will be delivered to all
 processes in the same process group as the process specified by
@@ -75423,8 +77279,8 @@
 </dd>
 <dt>rewinddir</dt>
 <dd><a name="perlport-rewinddir"></a>
-<p>Will not cause readdir() to re-read the directory stream.  The entries
-already read before the rewinddir() call will just be returned again
+<p>Will not cause <code>readdir()</code> to re-read the directory stream.  The 
entries
+already read before the <code>rewinddir()</code> call will just be returned 
again
 from a cache buffer. (Win32)
 </p>
 </dd>
@@ -75490,7 +77346,7 @@
 <dt>sleep</dt>
 <dd><a name="perlport-sleep"></a>
 <p>Emulated using synchronization functions such that it can be
-interrupted by alarm(), and limited to a maximum of 4294967 seconds,
+interrupted by <code>alarm()</code>, and limited to a maximum of 4294967 
seconds,
 approximately 49 days. (Win32)
 </p>
 </dd>
@@ -75527,12 +77383,12 @@
 <p>dev, rdev, blksize, and blocks are not available.  inode is not
 meaningful and will differ between stat calls on the same file.  (os2)
 </p>
-<p>some versions of cygwin when doing a stat(&quot;foo&quot;) and if not 
finding it
-may then attempt to stat(&quot;foo.exe&quot;) (Cygwin)
+<p>Some versions of Cygwin, when doing a <code>stat(&quot;foo&quot;)</code> and 
not finding it,
+may then attempt to <code>stat(&quot;foo.exe&quot;)</code>. (Cygwin)
 </p>
-<p>On Win32 stat() needs to open the file to determine the link count
+<p>On Win32 <code>stat()</code> needs to open the file to determine the link 
count
 and update attributes that may have been changed through hard links.
-Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up stat() by
+Setting <code>${^WIN32_SLOPPY_STAT}</code> to a true value speeds up 
<code>stat()</code> by
 not performing this operation. (Win32)
 </p>
 </dd>
@@ -75562,9 +77418,9 @@
 <code>$ENV{PERL5SHELL}</code>.  <code>system(1, @args)</code> spawns an 
external
 process and immediately returns its process designator, without
 waiting for it to terminate.  Return value may be used subsequently
-in <code>wait</code> or <code>waitpid</code>.  Failure to spawn() a subprocess 
is indicated
-by setting $? to &quot;255 &lt;&lt; 8&quot;.  <code>$?</code> is set in a way 
compatible with
-Unix (i.e. the exitstatus of the subprocess is obtained by &quot;$? &gt;&gt; 
8&quot;,
+in <code>wait</code> or <code>waitpid</code>.  Failure to <code>spawn()</code> 
a subprocess is indicated
+by setting <code>$?</code> to 
<code>&quot;255&nbsp;&lt;&lt;&nbsp;8&quot;</code><!-- /@w -->.  <code>$?</code> 
is set in a way compatible with
+Unix (i.e. the exit status of the subprocess is obtained by 
<code>&quot;$?&nbsp;&gt;&gt;&nbsp;8&quot;</code><!-- /@w -->,
 as described in the documentation).  (Win32)
 </p>
 <p>There is no shell to process metacharacters, and the native standard is
@@ -75578,7 +77434,7 @@
 of a child Unix program will exist.  Mileage <strong>will</strong> vary.  
(RISC&nbsp;OS<!-- /@w -->)
 </p>
 <p><code>system LIST</code> without the use of indirect object syntax 
(<code>system PROGRAM LIST</code>)
-may fall back to trying the shell if the first spawn() fails.  (Win32)
+may fall back to trying the shell if the first <code>spawn()</code> fails.  
(Win32)
 </p>
 <p>Does not automatically flush output handles on some platforms.
 (SunOS, Solaris, HP-UX)
@@ -75600,7 +77456,7 @@
 <dd><a name="perlport-times"></a>
 <p>&quot;cumulative&quot; times will be bogus.  On anything other than Windows 
NT
 or Windows 2000, &quot;system&quot; time will be bogus, and &quot;user&quot; 
time is
-actually the time returned by the clock() function in the C runtime
+actually the time returned by the <code>clock()</code> function in the C 
runtime
 library. (Win32)
 </p>
 <p>Not useful. (RISC&nbsp;OS<!-- /@w -->)
@@ -75631,7 +77487,7 @@
 <p>Only the modification time is updated. (VMS, RISC&nbsp;OS<!-- /@w -->)
 </p>
 <p>May not behave as expected.  Behavior depends on the C runtime
-library&rsquo;s implementation of utime(), and the filesystem being
+library&rsquo;s implementation of <code>utime()</code>, and the filesystem 
being
 used.  The FAT filesystem typically does not support an &quot;access
 time&quot; field, and it may limit timestamps to a granularity of
 two seconds. (Win32)
@@ -76355,6 +78211,10 @@
 operations, plus various examples of the same, see discussions of
 <code>m//</code>, <code>s///</code>, <code>qr//</code> and <code>??</code> in 
<a href="#perlop-Regexp-Quote_002dLike-Operators">perlop Regexp Quote-Like 
Operators</a>.
 </p>
+<p>New in v5.22, <a href="re.html#g_t_0027strict_0027-mode">(re)<code>use re 
'strict'</code></a> applies stricter
+rules than otherwise when compiling regular expression patterns.  It can
+find things that, while legal, may not be what you intended.
+</p>
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a href="#perlre-Modifiers" 
accesskey="1">perlre Modifiers</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
@@ -76473,6 +78333,26 @@
 <a href="#perlre-Character-set-modifiers">Character set modifiers</a>.
 </p>
 </dd>
+<dt>n</dt>
+<dd><a name="perlre-n"></a>
+<p>Prevent the grouping metacharacters <code>()</code> from capturing. This 
modifier,
+new in 5.22, will stop <code>$1</code>, <code>$2</code>, etc... from being 
filled in.
+</p>
+<pre class="verbatim">  &quot;hello&quot; =~ /(hi|hello)/;   # $1 is 
&quot;hello&quot;
+  &quot;hello&quot; =~ /(hi|hello)/n;  # $1 is undef
+</pre>
+<p>This is equivalent to putting <code>?:</code> at the beginning of every 
capturing group:
+</p>
+<pre class="verbatim">  &quot;hello&quot; =~ /(?:hi|hello)/; # $1 is undef
+</pre>
+<p><code>/n</code> can be negated on a per-group basis. Alternatively, named 
captures
+may still be used.
+</p>
+<pre class="verbatim">  &quot;hello&quot; =~ /(?-n:(hi|hello))/n;   # $1 is 
&quot;hello&quot;
+  &quot;hello&quot; =~ /(?&lt;greet&gt;hi|hello)/n; # $1 is &quot;hello&quot;, 
$+{greet} is
+                                    # &quot;hello&quot;
+</pre>
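+<p>As a further sketch (an illustration, not part of the upstream text; the
+date string and the capture name <code>year</code> are hypothetical, and
+perl 5.22 or later is assumed), <code>/n</code> leaves a named group
+capturing while the plain groups around it only group:
+</p>
+<pre class="verbatim">  use strict;
+  use warnings;
+
+  # Only the named group captures; the other two groups just group.
+  if (&quot;2015-06-14&quot; =~ /(?&lt;year&gt;\d+)-(\d+)-(\d+)/n) {
+      print &quot;year: $+{year}\n&quot;;                     # named capture
+      print &quot;\$2 is &quot;, defined $2 ? &quot;set&quot; : &quot;undef&quot;, &quot;\n&quot;;
+  }
+</pre>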
+</dd>
 <dt>Other Modifiers</dt>
 <dd><a name="perlre-Other-Modifiers"></a>
 <p>There are a number of flags that can be found at the end of regular
@@ -76488,7 +78368,7 @@
 </pre>
 <p>Substitution-specific modifiers described in
 </p>
-<p><a href="#perlop-s_002fPATTERN_002fREPLACEMENT_002fmsixpodualgcer">perlop 
s/PATTERN/REPLACEMENT/msixpodualgcer</a> are:
+<p><a href="#perlop-s_002fPATTERN_002fREPLACEMENT_002fmsixpodualngcer">perlop 
<code>s/<em>PATTERN</em>/<em>REPLACEMENT</em>/msixpodualngcer</code></a> are:
 </p>
 <pre class="verbatim">  e  - evaluate the right-hand side as an expression
   ee - evaluate the right side as a string then eval the result
@@ -76578,6 +78458,21 @@
 in <code>\p{...}</code> there can be spaces that follow the Unicode rules, for 
which see
 <a 
href="perluniprops.html#Properties-accessible-through-_005cp_007b_007d-and-_005cP_007b_007d">(perluniprops)Properties
 accessible through \p{} and \P{}</a>.
 </p>
+<p>The characters that are deemed whitespace are those that Unicode
+calls &quot;Pattern White Space&quot;, namely:
+</p>
+<pre class="verbatim"> U+0009 CHARACTER TABULATION
+ U+000A LINE FEED
+ U+000B LINE TABULATION
+ U+000C FORM FEED
+ U+000D CARRIAGE RETURN
+ U+0020 SPACE
+ U+0085 NEXT LINE
+ U+200E LEFT-TO-RIGHT MARK
+ U+200F RIGHT-TO-LEFT MARK
+ U+2028 LINE SEPARATOR
+ U+2029 PARAGRAPH SEPARATOR
+</pre>
 <hr>
 <a name="perlre-Character-set-modifiers"></a>
 <div class="header">
@@ -76634,8 +78529,8 @@
 but the <code>/l</code> does not affect how the <code>\U</code> operates.  
Most likely you
 want both of them to use locale rules.  To do this, instead compile the
 regular expression within the scope of <code>use locale</code>.  This both
-implicitly adds the <code>/l</code> and applies locale rules to the 
<code>\U</code>.   The
-lesson is to <code>use locale</code> and not <code>/l</code> explicitly.
+implicitly adds the <code>/l</code>, and applies locale rules to the 
<code>\U</code>.   The
+lesson is to <code>use locale</code>, and not <code>/l</code> explicitly.
 </p>
 <p>Similarly, it would be better to use <code>use feature 
'unicode_strings'</code>
 instead of,
@@ -76757,7 +78652,9 @@
 
 </li><li> the pattern uses a Unicode name (<code>\N{...}</code>);  or
 
-</li><li> the pattern uses a Unicode property (<code>\p{...}</code>); or
+</li><li> the pattern uses a Unicode property (<code>\p{...}</code> or 
<code>\P{...}</code>); or
+
+</li><li> the pattern uses a Unicode break (<code>\b{...}</code> or 
<code>\B{...}</code>); or
 
 </li><li> the pattern uses <a href="#perlre-_0028_003f_005b-_005d_0029">(?[ 
])</a>
 
@@ -76803,8 +78700,8 @@
 the Posix character classes to match only in the ASCII range.  They thus
 revert to their pre-5.6, pre-Unicode meanings.  Under <code>/a</code>,  
<code>\d</code>
 always means precisely the digits <code>&quot;0&quot;</code> to 
<code>&quot;9&quot;</code>; <code>\s</code> means the five
-characters <code>[ \f\n\r\t]</code>, and starting in Perl v5.18, 
experimentally,
-the vertical tab; <code>\w</code> means the 63 characters
+characters <code>[ \f\n\r\t]</code>, and starting in Perl v5.18, the vertical 
tab;
+<code>\w</code> means the 63 characters
 <code>[A-Za-z0-9_]</code>; and likewise, all the Posix classes such as
 <code>[[:print:]]</code> match only the appropriate ASCII-range characters.
 </p>
@@ -77000,20 +78897,12 @@
 </pre>
 <p>(If a curly bracket occurs in any other context and does not form part of
 a backslashed sequence like <code>\x{...}</code>, it is treated as a regular
-character.  In particular, the lower quantifier bound is not optional,
-and a typo in a quantifier silently causes it to be treated as the
-literal characters.  For example,
-</p>
-<pre class="verbatim">    /o{4,a}/
-</pre>
-<p>compiles to match the sequence of six characters
-<code>&quot;o&nbsp;{&nbsp;4&nbsp;,&nbsp;a&nbsp;}&quot;</code><!-- /@w -->.  It 
is planned to eventually require literal uses
-of curly brackets to be escaped, say by preceding them with a backslash
-or enclosing them within square brackets, (<code>&quot;\{&quot;</code> or 
<code>&quot;[{]&quot;</code>).  This
-change will allow for future syntax extensions (like making the lower
-bound of a quantifier optional), and better error checking.  In the
-meantime, you should get in the habit of escaping all instances where
-you mean a literal &quot;{&quot;.)
+character.  However, a deprecation warning is raised for all such
+occurrences, and in Perl v5.26, literal uses of a curly bracket will be
+required to be escaped, say by preceding them with a backslash 
(<code>&quot;\{&quot;</code>)
+or enclosing them within square brackets  (<code>&quot;[{]&quot;</code>).  
This change will
+allow for future syntax extensions (like making the lower bound of a
+quantifier optional), and better error checking of quantifiers.)
 </p>
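+<p>A minimal sketch of the two escaping styles mentioned above (the sample
+string is an assumption chosen for illustration):
+</p>
+<pre class="verbatim">  my $text = &quot;o{4,a}&quot;;
+
+  # Both patterns match the six literal characters o { 4 , a }
+  print &quot;backslash form matches\n&quot; if $text =~ /o\{4,a\}/;
+  print &quot;bracket form matches\n&quot;   if $text =~ /o[{]4,a[}]/;
+</pre>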
 <p>The &quot;*&quot; quantifier is equivalent to <code>{0,}</code>, the 
&quot;+&quot;
 quantifier to <code>{1,}</code>, and the &quot;?&quot; quantifier to 
<code>{0,1}</code>.  n and m are limited
@@ -77219,7 +79108,9 @@
 
 <p>Perl defines the following zero-width assertions:
 </p>
-<pre class="verbatim">    \b  Match a word boundary
+<pre class="verbatim">    \b{} Match at Unicode boundary of specified type
+    \B{} Match where corresponding \b{} doesn't match
+    \b  Match a word boundary
     \B  Match except at a word boundary
     \A  Match only at beginning of string
     \Z  Match only at end of string, or before newline at the end
@@ -77227,6 +79118,12 @@
     \G  Match only at pos() (e.g. at the end-of-match position
         of prior m//g)
 </pre>
+<p>A Unicode boundary (<code>\b{}</code>), available starting in v5.22, is a 
spot
+between two characters, or before the first character in the string, or
+after the final character in the string where certain criteria defined
+by Unicode are met.  See <a 
href="#perlrebackslash-_005cb_007b_007d_002c-_005cb_002c-_005cB_007b_007d_002c-_005cB">perlrebackslash
 \b{}, \b, \B{}, \B</a> for
+details.
+</p>
 <p>A word boundary (<code>\b</code>) is a spot between two characters
 that has a <code>\w</code> on one side of it and a <code>\W</code> on the 
other side
 of it (in either order), counting the imaginary characters off the
@@ -78815,11 +80712,25 @@
 a range, the &quot;-&quot; is understood literally.
 </p>
 <p>Note also that the whole range idea is rather unportable between
-character sets&ndash;and even within character sets they may cause results
-you probably didn&rsquo;t expect.  A sound principle is to use only ranges
-that begin from and end at either alphabetics of equal case ([a-e],
-[A-E]), or digits ([0-9]).  Anything else is unsafe.  If in doubt,
-spell out the character sets in full.
+character sets, except for four situations that Perl handles specially.
+Any subset of the ranges <code>[A-Z]</code>, <code>[a-z]</code>, and 
<code>[0-9]</code> is guaranteed
+to match the expected subset of ASCII characters, no matter what
+character set the platform is running.  The fourth portable way to
+specify ranges is to use the <code>\N{...}</code> syntax to specify either end
+point of the range.  For example, <code>[\N{U+04}-\N{U+07}]</code> means to 
match
+the Unicode code points <code>\N{U+04}</code>, <code>\N{U+05}</code>, 
<code>\N{U+06}</code>, and
+<code>\N{U+07}</code>, whatever their native values may be on the platform.  
Under
+<a href="re.html#g_t_0027strict_0027-mode">(re)use re &rsquo;strict&rsquo;</a> 
or within a <a href="#perlre-_0028_003f_005b-_005d_0029">(?[ ])</a>, a warning
+is raised, if enabled, when the other end point of a range which has a
+<code>\N{...}</code> endpoint is not portably specified.  For example,
+</p>
+<pre class="verbatim"> [\N{U+00}-\x06]    # Warning under &quot;use re 
'strict'&quot;.
+</pre>
+<p>Without some digging, it is hard to tell exactly what is matched by ranges
+other than subsets of <code>[A-Z]</code>, <code>[a-z]</code>, and 
<code>[0-9]</code>.  A sound
+principle is to use only ranges that begin from and end at either
+alphabetics of equal case ([a-e], [A-E]), or digits ([0-9]).  Anything
+else is unsafe or unclear.  If in doubt, spell out the range in full.
 </p>
 <p>Characters may be specified using a metacharacter syntax much like that
 used in C: &quot;\n&quot; matches a newline, &quot;\t&quot; a tab, 
&quot;\r&quot; a carriage return,
@@ -79477,7 +81388,7 @@
 other engines have to.
 </p>
 <p>The <code>flags</code> parameter is a bitfield which indicates which of the
-<code>msixp</code> flags the regex was compiled with.  It also contains
+<code>msixpn</code> flags the regex was compiled with.  It also contains
 additional info, such as if <code>use locale</code> is in effect.
 </p>
 <p>The <code>eogc</code> flags are stripped out before being passed to the comp
@@ -80673,8 +82584,8 @@
  \1                Absolute backreference.  Not in [].
  \a                Alarm or bell.
  \A                Beginning of string.  Not in [].
- \b                Word/non-word boundary. (Backspace in []).
- \B                Not a word/non-word boundary.  Not in [].
+ \b{}, \b          Boundary. (\b is a backspace in []).
+ \B{}, \B          Not a boundary.  Not in [].
  \cX               Control-X.
  \C                Single octet, even under UTF-8.  Not in [].
                    (Deprecated)
@@ -80779,7 +82690,8 @@
 <dt>[1]</dt>
 <dd><a name="perlrebackslash-_005b1_005d-1"></a>
 <p><code>\b</code> is the backspace character only inside a character class. 
Outside a
-character class, <code>\b</code> is a word/non-word boundary.
+character class, <code>\b</code> alone is a word-character/non-word-character
+boundary, and <code>\b{}</code> is some other type of boundary.
 </p>
 </dd>
 <dt>[2]</dt>
@@ -81000,7 +82912,7 @@
 <h4 class="subsubsection">60.2.3.10 Hexadecimal escapes</h4>
 
 <p>Like octal escapes, there are two forms of hexadecimal escapes, but both 
start
-with the same thing, <code>\x</code>.  This is followed by either exactly two 
hexadecimal
+with the sequence <code>\x</code>.  This is followed by either exactly two 
hexadecimal
 digits forming a number, or a hexadecimal number of arbitrary length surrounded
 by curly braces. The hexadecimal number is the code point of the character you
 want to express.
@@ -81341,10 +83253,22 @@
 <p>Mnemonic: <em>G</em>lobal.
 </p>
 </dd>
-<dt>\b, \B</dt>
-<dd><a name="perlrebackslash-_005cb_002c-_005cB"></a>
-<p><code>\b</code> matches at any place between a word and a non-word 
character; <code>\B</code>
-matches at any place between characters where <code>\b</code> doesn&rsquo;t 
match. <code>\b</code>
+<dt>\b{}, \b, \B{}, \B</dt>
+<dd><a 
name="perlrebackslash-_005cb_007b_007d_002c-_005cb_002c-_005cB_007b_007d_002c-_005cB"></a>
+<p><code>\b{...}</code>, available starting in v5.22, matches a boundary 
(between two
+characters, or before the first character of the string, or after the
+final character of the string) based on the Unicode rules for the
+boundary type specified inside the braces.  The currently known boundary
+types are given a few paragraphs below.  <code>\B{...}</code> matches at any 
place
+between characters where <code>\b{...}</code> of the same type doesn&rsquo;t 
match.
+</p>
+<p><code>\b</code> when not immediately followed by a 
<code>&quot;{&quot;</code> matches at any place
+between a word (something matched by <code>\w</code>) and a non-word character
+(<code>\W</code>); <code>\B</code> when not immediately followed by a 
<code>&quot;{&quot;</code> matches at any
+place between characters where <code>\b</code> doesn&rsquo;t match.  To get 
better
+word matching of natural language text, see <a href="#perlrebackslash-_005cb_007bwb_007d"><code>\b{wb}</code></a> 
below.
+</p>
+<p><code>\b</code>
 and <code>\B</code> assume there&rsquo;s a non-word character before the 
beginning and after
 the end of the source string; so <code>\b</code> will match at the beginning 
(or end)
 of the source string if the source string begins (or ends) with a word
@@ -81353,13 +83277,84 @@
 <p>Do not use something like <code>\b=head\d\b</code> and expect it to match 
the
 beginning of a line.  It can&rsquo;t, because for there to be a boundary before
 the non-word &quot;=&quot;, there must be a word character immediately 
previous.  
-All boundary determinations look for word characters alone, not for
-non-words characters nor for string ends.  It may help to understand how
+All plain <code>\b</code> and <code>\B</code> boundary determinations look for 
word
+characters alone, not for
+non-word characters nor for string ends.  It may help to understand how
 <code>\b</code> and <code>\B</code> work by equating them as follows:
 </p>
 <pre class="verbatim">    \b  really means    
(?:(?&lt;=\w)(?!\w)|(?&lt;!\w)(?=\w))
     \B  really means    (?:(?&lt;=\w)(?=\w)|(?&lt;!\w)(?!\w))
 </pre>
+<p>In contrast, <code>\b{...}</code> and <code>\B{...}</code> may or may not 
match at the
+beginning and end of the line, depending on the boundary type.  These
+implement the Unicode default boundaries, specified in
+<a 
href="http://www.unicode.org/reports/tr29/">http://www.unicode.org/reports/tr29/</a>.
+The boundary types currently available are:
+</p>
+<dl compact="compact">
+<dt><code>\b{gcb}</code> or <code>\b{g}</code></dt>
+<dd><a name="perlrebackslash-_005cb_007bgcb_007d-or-_005cb_007bg_007d"></a>
+<p>This matches a Unicode &quot;Grapheme Cluster Boundary&quot;.  (Actually 
Perl
+always uses the improved &quot;extended&quot; grapheme cluster.)  These 
are
+explained below under <a href="#perlrebackslash-_005cX">\X</a>.  In fact, 
<code>\X</code> is another way to get
+the same functionality.  It is equivalent to <code>/.+?\b{gcb}/</code>.  Use
+whichever is most convenient for your situation.
+</p>
+</dd>
+<dt><code>\b{sb}</code></dt>
+<dd><a name="perlrebackslash-_005cb_007bsb_007d"></a>
+<p>This matches a Unicode &quot;Sentence Boundary&quot;.  This is an aid to 
parsing
+natural language sentences.  It gives good, but imperfect results.  For
+example, it thinks that &quot;Mr. Smith&quot; is two sentences.  More details 
are
+at <a 
href="http://www.unicode.org/reports/tr29/">http://www.unicode.org/reports/tr29/</a>.
  Note also that it thinks
+that anything matching <a href="#perlrebackslash-_005cR">\R</a> (except form 
feed and vertical tab) is a
+sentence boundary.  <code>\b{sb}</code> works with text designed for
+word-processors which wrap lines
+automatically for display, but hard-coded line boundaries are considered
+to be essentially the ends of text blocks (paragraphs really), and hence
+the ends of sentences.  <code>\b{sb}</code> doesn&rsquo;t do well with text 
containing
+embedded newlines, like the source text of the document you are reading.
+Such text needs to be preprocessed to get rid of the line separators
+before looking for sentence boundaries.  Some people view this as a bug
+in the Unicode standard, and this behavior is quite subject to change in
+future Perl versions.
+</p>
+</dd>
+<dt><code>\b{wb}</code></dt>
+<dd><a name="perlrebackslash-_005cb_007bwb_007d"></a>
+<p>This matches a Unicode &quot;Word Boundary&quot;.  This gives better 
(though not
+perfect) results for natural language processing than plain <code>\b</code>
+(without braces) does.  For example, it understands that apostrophes can
+be in the middle of words and that parentheses aren&rsquo;t (see the examples
+below).   More details are at <a 
href="http://www.unicode.org/reports/tr29/">http://www.unicode.org/reports/tr29/</a>.
+</p>
+</dd>
+</dl>
+
+<p>It is important to realize when you use these Unicode boundaries,
+that you are taking a risk that a future version of Perl which contains
+a later version of the Unicode Standard will not work precisely the same
+way as it did when your code was written.  These rules are not
+considered stable and have been somewhat more subject to change than the
+rest of the Standard.  Unicode reserves the right to change them at
+will, and Perl reserves the right to update its implementation to
+Unicode&rsquo;s new rules.  In the past, some changes have been because new
+characters have been added to the Standard which have different
+characteristics than all previous characters, so new rules are
+formulated for handling them.  These should not cause any backward
+compatibility issues.  But some changes have changed the treatment of
+existing characters because the Unicode Technical Committee has decided
+that the change is warranted for whatever reason.  This could be to fix
+a bug, or because they think better results are obtained with the new
+rule.
+</p>
+<p>It is also important to realize that these are default boundary
+definitions, and that implementations may wish to tailor the results for
+particular purposes and locales.
+</p>
+<p>Unicode defines a fourth boundary type, accessible through the
+<a href="Unicode-LineBreak.html#Top">(Unicode-LineBreak)</a> module.
+</p>
 <p>Mnemonic: <em>b</em>oundary.
 </p>
 </dd>
@@ -81395,6 +83390,13 @@
   while (&quot;cat dog&quot; =~ /\G(\w+)/g) {
       print $1;           # Prints 'cat'
   }
+
+  my $s = &quot;He said, \&quot;Is pi 3.14? (I'm not sure).\&quot;&quot;;
+  print join(&quot;|&quot;, $s =~ m/ ( .+? \b     ) /xg), &quot;\n&quot;;
+  print join(&quot;|&quot;, $s =~ m/ ( .+? \b{wb} ) /xg), &quot;\n&quot;;
+ prints
+  He| |said|, &quot;|Is| |pi| |3|.|14|? (|I|'|m| |not| |sure
+  He| |said|,| |&quot;|Is| |pi| |3.14|?| |(|I'm| |not| |sure|)|.|&quot;
 </pre>
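+<p>A small sketch of <code>\b{gcb}</code> next to <code>\X</code>, assuming
+perl 5.22 or later; the sample string (an &quot;e&quot; followed by U+0301
+COMBINING ACUTE ACCENT, then &quot;a&quot;) is hypothetical:
+</p>
+<pre class="verbatim">  my $s = &quot;e\x{301}a&quot;;           # three code points
+
+  my @by_gcb = $s =~ / ( .+? \b{gcb} ) /xg;
+  my @by_X   = $s =~ / ( \X ) /xg;
+
+  # Both lists should hold two clusters: the accented &quot;e&quot;, then &quot;a&quot;.
+  printf &quot;%d %d\n&quot;, scalar(@by_gcb), scalar(@by_X);
+</pre>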
 <hr>
 <a name="perlrebackslash-Misc"></a>
@@ -81486,6 +83488,8 @@
 <p>The match is greedy and non-backtracking, so that the cluster is never
 broken up into smaller components.
 </p>
+<p>See also <a 
href="#perlrebackslash-_005cb_007b_007d_002c-_005cb_002c-_005cB_007b_007d_002c-_005cB"><code>\b{gcb}</code></a>.
+</p>
 <p>Mnemonic: e<em>X</em>tended Unicode character.
 </p>
 </dd>
@@ -81825,7 +83829,7 @@
 <p>In all Perl versions, <code>\s</code> matches the 5 characters [\t\n\f\r ]; 
that
 is, the horizontal tab,
 the newline, the form feed, the carriage return, and the space.
-Starting in Perl v5.18, experimentally, it also matches the vertical tab, 
<code>\cK</code>.
+Starting in Perl v5.18, it also matches the vertical tab, <code>\cK</code>.
 See note <code>[1]</code> below for a discussion of this.
 </p>
 </dd>
@@ -81854,7 +83858,7 @@
 </dd>
 <dt>otherwise ...</dt>
 <dd><a name="perlrecharclass-otherwise-_002e_002e_002e-3"></a>
-<p><code>\s</code> matches [\t\n\f\r ] and, starting, experimentally in Perl
+<p><code>\s</code> matches [\t\n\f\r ] and, starting in Perl
 v5.18, the vertical tab, <code>\cK</code>.
 (See note <code>[1]</code> below for a discussion of this.)
 Note that this list doesn&rsquo;t include the non-breaking space.
@@ -81887,9 +83891,9 @@
 locale that may otherwise be in use.
 </p>
 <p><code>\R</code> matches anything that can be considered a newline under 
Unicode
-rules. It&rsquo;s not a character class, as it can match a multi-character
-sequence. Therefore, it cannot be used inside a bracketed character
-class; use <code>\v</code> instead (vertical whitespace).  It uses the 
platform&rsquo;s
+rules. It can match a multi-character sequence. It cannot be used inside
+a bracketed character class; use <code>\v</code> instead (vertical whitespace).
+It uses the platform&rsquo;s
 native character set, and does not consider any locale that may
 otherwise be in use.
 Details are discussed in <a href="#perlrebackslash-NAME">perlrebackslash 
NAME</a>.
@@ -81939,13 +83943,8 @@
 <dl compact="compact">
 <dt>[1]</dt>
 <dd><a name="perlrecharclass-_005b1_005d"></a>
-<p>Prior to Perl v5.18, <code>\s</code> did not match the vertical tab.  The 
change
-in v5.18 is considered an experiment, which means it could be backed out
-in v5.22 if experience indicates that it breaks too much
-existing code.  If this change adversely affects you, send email to
-<code>address@hidden</code>; if it affects you positively, email
-<code>address@hidden</code>.  In the meantime, <code>[^\S\cK]</code> 
(obscurely)
-matches what <code>\s</code> traditionally did.
+<p>Prior to Perl v5.18, <code>\s</code> did not match the vertical tab.
+<code>[^\S\cK]</code> (obscurely) matches what <code>\s</code> traditionally 
did.
 </p>
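+<p>For illustration (a sketch assuming perl 5.18 or later):
+</p>
+<pre class="verbatim">  my $vt = &quot;\cK&quot;;    # vertical tab
+
+  print &quot;\\s matches VT\n&quot;              if $vt =~ /\s/;
+  print &quot;[^\\S\\cK] does not match VT\n&quot; if $vt !~ /[^\S\cK]/;
+  print &quot;[^\\S\\cK] matches a space\n&quot;   if &quot; &quot;  =~ /[^\S\cK]/;
+</pre>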
 </dd>
 <dt>[2]</dt>
@@ -82094,30 +84093,54 @@
 
  -------
 </pre>
-<p>* There is an exception to a bracketed character class matching a
-single character only.  When the class is to match caselessly under 
<code>/i</code>
-matching rules, and a character that is explicitly mentioned inside the
-class matches a
-multiple-character sequence caselessly under Unicode rules, the class
-(when not <a href="#perlrecharclass-Negation">inverted</a>) will also match 
that sequence.  For
-example, Unicode says that the letter <code>LATIN SMALL LETTER SHARP S</code>
-should match the sequence <code>ss</code> under <code>/i</code> rules.  Thus,
+<p>* There are two exceptions to a bracketed character class matching a
+single character only.  Each requires special handling by Perl to make
+things work:
 </p>
+<ul>
+<li> When the class is to match caselessly under <code>/i</code> matching 
rules, and a
+character that is explicitly mentioned inside the class matches a
+multiple-character sequence caselessly under Unicode rules, the class
+will also match that sequence.  For example, Unicode says that the
+letter <code>LATIN SMALL LETTER SHARP S</code> should match the sequence 
<code>ss</code>
+under <code>/i</code> rules.  Thus,
+
 <pre class="verbatim"> 'ss' =~ /\A\N{LATIN SMALL LETTER SHARP S}\z/i           
  # Matches
  'ss' =~ /\A[aeioust\N{LATIN SMALL LETTER SHARP S}]\z/i    # Matches
 </pre>
-<p>For this to happen, the character must be explicitly specified, and not
-be part of a multi-character range (not even as one of its endpoints).
-(<a href="#perlrecharclass-Character-Ranges">Character Ranges</a> will be 
explained shortly.)  Therefore,
-</p>
-<pre class="verbatim"> 'ss' =~ /\A[\0-\x{ff}]\z/i        # Doesn't match
- 'ss' =~ /\A[\0-\N{LATIN SMALL LETTER SHARP S}]\z/i    # No match
- 'ss' =~ /\A[\xDF-\xDF]\z/i    # Matches on ASCII platforms, since \XDF
-                               # is LATIN SMALL LETTER SHARP S, and the
-                               # range is just a single element
+<p>For this to happen, the class must not be inverted (see <a 
href="#perlrecharclass-Negation">Negation</a>)
+and the character must be explicitly specified, and not be part of a
+multi-character range (not even as one of its endpoints).  (<a 
href="#perlrecharclass-Character-Ranges">Character
+Ranges</a> will be explained shortly.) Therefore,
+</p>
+<pre class="verbatim"> 'ss' =~ /\A[\0-\x{ff}]\z/ui       # Doesn't match
+ 'ss' =~ /\A[\0-\N{LATIN SMALL LETTER SHARP S}]\z/ui   # No match
+ 'ss' =~ /\A[\xDF-\xDF]\z/ui   # Matches on ASCII platforms, since
+                               # \xDF is LATIN SMALL LETTER SHARP S,
+                               # and the range is just a single
+                               # element
 </pre>
 <p>Note that it isn&rsquo;t a good idea to specify these types of ranges 
anyway.
 </p>
+</li><li> Some names known to <code>\N{...}</code> refer to a sequence of 
multiple characters,
+instead of the usual single character.  When one of these is included in
+the class, the entire sequence is matched.  For example,
+
+<pre class="verbatim">  &quot;\N{TAMIL LETTER KA}\N{TAMIL VOWEL SIGN AU}&quot;
+                              =~ / ^ [\N{TAMIL SYLLABLE KAU}]  $ /x;
+</pre>
+<p>matches, because <code>\N{TAMIL SYLLABLE KAU}</code> is a named sequence
+consisting of the two characters matched against.  Like the other
+instance where a bracketed class can match multiple characters, and for
+similar reasons, the class must not be inverted, and the named sequence
+may not appear in a range, even one where it is both endpoints.  If
+these happen, it is a fatal error if the character class is within an
+extended <a 
href="#perlrecharclass-Extended-Bracketed-Character-Classes"><code>(?[...])</code></a>
+class; and only the first code point is used (with
+a <code>regexp</code>-type warning raised) otherwise.
+</p>
+</li></ul>
+
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a 
href="#perlrecharclass-Special-Characters-Inside-a-Bracketed-Character-Class" 
accesskey="1">perlrecharclass Special Characters Inside a Bracketed Character 
Class</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -82179,9 +84202,7 @@
 and
 <code>\x</code>
 are also special and have the same meanings as they do outside a
-bracketed character class.  (However, inside a bracketed character
-class, if <code>\N{<em>NAME</em>}</code> expands to a sequence of characters, 
only the first
-one in the sequence is used, with a warning.)
+bracketed character class.
 </p>
 <p>Also, a backslash followed by two or three octal digits is considered an 
octal
 number.
@@ -82204,12 +84225,12 @@
 <p>Examples:
 </p>
 <pre class="verbatim"> &quot;+&quot;   =~ /[+?*]/     #  Match, &quot;+&quot; 
in a character class is not special.
- &quot;\cH&quot; =~ /[\b]/      #  Match, \b inside in a character class.
+ &quot;\cH&quot; =~ /[\b]/      #  Match, \b inside a character class
                       #  is equivalent to a backspace.
- &quot;]&quot;   =~ /[][]/      #  Match, as the character class contains.
+ &quot;]&quot;   =~ /[][]/      #  Match, as the character class contains
                       #  both [ and ].
  &quot;[]&quot;  =~ /[[]]/      #  Match, the pattern contains a character 
class
-                      #  containing just ], and the character class is
+                      #  containing just [, and the character class is
                       #  followed by a ].
 </pre>
 <hr>
@@ -82253,6 +84274,60 @@
              #  hyphen ('-'), or the letter 'm'.
  ['-?]       #  Matches any of the characters  '()*+,-./0123456789:;&lt;=&gt;?
              #  (But not on an EBCDIC platform).
+ [\N{APOSTROPHE}-\N{QUESTION MARK}]
+             #  Matches any of the characters  '()*+,-./0123456789:;&lt;=&gt;?
+             #  even on an EBCDIC platform.
+ [\N{U+27}-\N{U+3F}] # Same. (U+27 is &quot;'&quot;, and U+3F is &quot;?&quot;)
+</pre>
+<p>As the final two examples above show, you can achieve portability to
+non-ASCII platforms by using the <code>\N{...}</code> form for the range
+endpoints.  These indicate that the specified range is to be interpreted
+using Unicode values, so <code>[\N{U+27}-\N{U+3F}]</code> means to match
+<code>\N{U+27}</code>, <code>\N{U+28}</code>, <code>\N{U+29}</code>, ..., 
<code>\N{U+3D}</code>, <code>\N{U+3E}</code>,
+and <code>\N{U+3F}</code>, whatever the native code point versions for those 
are.
+These are called &quot;Unicode&quot; ranges.  If either end is of the 
<code>\N{...}</code>
+form, the range is considered Unicode.  A <code>regexp</code> warning is raised
+under <code>&quot;use&nbsp;re&nbsp;'strict'&quot;<!-- /@w --></code> if the 
other endpoint is specified
+non-portably:
+</p>
+<pre class="verbatim"> [\N{U+00}-\x09]    # Warning under re 'strict'; \x09 is 
non-portable
+ [\N{U+00}-\t]      # No warning;
+</pre>
+<p>Both of the above match the characters <code>\N{U+00}</code>, 
<code>\N{U+01}</code>, ...
+<code>\N{U+08}</code>, <code>\N{U+09}</code>, but the <code>\x09</code> looks 
like it could be a
+mistake so the warning is raised (under <code>re 'strict'</code>) for it.
+</p>
+<p>Perl also guarantees that the ranges <code>A-Z</code>, <code>a-z</code>, 
<code>0-9</code>, and any
+subranges of these match what an English-only speaker would expect them
+to match on any platform.  That is, <code>[A-Z]</code> matches the 26 ASCII
+uppercase letters;
+<code>[a-z]</code> matches the 26 lowercase letters; and <code>[0-9]</code> 
matches the 10
+digits.  Subranges, like <code>[h-k]</code>, match correspondingly, in this 
case
+just the four letters <code>&quot;h&quot;</code>, <code>&quot;i&quot;</code>, 
<code>&quot;j&quot;</code>, and <code>&quot;k&quot;</code>.  This is the
+natural behavior on ASCII platforms where the code points (ordinal
+values) for <code>&quot;h&quot;</code> through <code>&quot;k&quot;</code> are 
consecutive integers (0x68 through
+0x6B).  But special handling to achieve this may be needed on platforms
+with a non-ASCII native character set.  For example, on EBCDIC
+platforms, the code point for <code>&quot;h&quot;</code> is 0x88, 
<code>&quot;i&quot;</code> is 0x89, <code>&quot;j&quot;</code> is
+0x91, and <code>&quot;k&quot;</code> is 0x92.   Perl specially treats 
<code>[h-k]</code> to exclude the
+seven code points in the gap: 0x8A through 0x90.  This special handling is
+only invoked when the range is a subrange of one of the ASCII uppercase,
+lowercase, and digit ranges, AND each end of the range is expressed
+either as a literal, like <code>&quot;A&quot;</code>, or as a named character 
(<code>\N{...}</code>,
+including the <code>\N{U+...}</code> form).
+</p>
+<p>EBCDIC Examples:
+</p>
+<pre class="verbatim"> [i-j]               #  Matches either &quot;i&quot; or 
&quot;j&quot;
+ [i-\N{LATIN SMALL LETTER J}]  # Same
+ [i-\N{U+6A}]        #  Same
+ [\N{U+69}-\N{U+6A}] #  Same
+ [\x{89}-\x{91}]     #  Matches 0x89 (&quot;i&quot;), 0x8A .. 0x90, 0x91 
(&quot;j&quot;)
+ [i-\x{91}]          #  Same
+ [\x{89}-j]          #  Same
+ [i-J]               #  Matches 0x89 (&quot;i&quot;) .. 0xD1 (&quot;J&quot;); 
special
+                     #  handling doesn't apply because range is mixed
+                     #  case
 </pre>
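+<p>As a rough illustration of the guarantee described above, the following
+sketch (not part of the original text; the variable names are made up for
+the example) collects every code point from 0 to 255 whose character
+matches <code>[h-k]</code>, once with literal endpoints and once with
+<code>\N{U+...}</code> endpoints.  On both ASCII and EBCDIC platforms the
+expectation is that only the four letters are collected.
+</p>

```perl
use strict;
use warnings;

# Try every native code point from 0 to 255 against the literal subrange.
my @literal  = grep { /\A[h-k]\z/ } map { chr } 0 .. 255;

# The same range written portably with Unicode endpoints.
my @portable = grep { /\A[\N{U+68}-\N{U+6B}]\z/ } map { chr } 0 .. 255;

print "literal:  @literal\n";
print "portable: @portable\n";
```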
 <hr>
 <a name="perlrecharclass-Negation"></a>
@@ -82275,9 +84350,10 @@
 else don&rsquo;t list it first.
 </p>
 <p>In inverted bracketed character classes, Perl ignores the Unicode rules
-that normally say that certain characters should match a sequence of
-multiple characters under caseless <code>/i</code> matching.  Following those
-rules could lead to highly confusing situations:
+that normally say that named sequences, and certain characters, should
+match a sequence of multiple characters under caseless <code>/i</code>
+matching.  Following those rules could lead to highly confusing
+situations:
 </p>
 <pre class="verbatim"> &quot;ss&quot; =~ /^[^\xDF]+$/ui;   # Matches!
 </pre>
@@ -82286,7 +84362,7 @@
 says that <code>&quot;ss&quot;</code> is what <code>\xDF</code> matches under 
<code>/i</code>.  So which one
 &quot;wins&quot;? Do you fail the match because the string has <code>ss</code> 
or accept it
 because it has an <code>s</code> followed by another <code>s</code>?  Perl has 
chosen the
-latter.
+latter.  (See note in <a 
href="#perlrecharclass-Bracketed-Character-Classes">Bracketed Character 
Classes</a> above.)
 </p>
 <p>Examples:
 </p>
@@ -82378,6 +84454,12 @@
  word   A Perl extension (&quot;[A-Za-z0-9_]&quot;), equivalent to 
&quot;\w&quot;.
  xdigit Any hexadecimal digit (&quot;[0-9a-fA-F]&quot;).
 </pre>
+<p>Like the <a href="#perlrecharclass-Unicode-Properties">Unicode 
properties</a>, most of the POSIX
+properties match the same regardless of whether case-insensitive 
(<code>/i</code>)
+matching is in effect or not.  The two exceptions are <code>[:upper:]</code> 
and
+<code>[:lower:]</code>.  Under <code>/i</code>, they each match the union of 
<code>[:upper:]</code> and
+<code>[:lower:]</code>.
+</p>
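+<p>A small sketch of the rule just stated, with illustrative variable
+names: <code>[:upper:]</code> rejects a lowercase letter without
+<code>/i</code> but accepts it with <code>/i</code>, while a class such as
+<code>[:digit:]</code> is unaffected.
+</p>

```perl
use strict;
use warnings;

# Without /i, [:upper:] rejects a lowercase letter.
my $plain   = ("a" =~ /\A[[:upper:]]\z/)  ? 1 : 0;

# Under /i, [:upper:] and [:lower:] each match the union of both classes.
my $upper_i = ("a" =~ /\A[[:upper:]]\z/i) ? 1 : 0;
my $lower_i = ("A" =~ /\A[[:lower:]]\z/i) ? 1 : 0;

# Other POSIX classes match the same with or without /i.
my $digit_i = ("a" =~ /\A[[:digit:]]\z/i) ? 1 : 0;

print "plain=$plain upper_i=$upper_i lower_i=$lower_i digit_i=$digit_i\n";
```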
 <p>Most POSIX character classes have two Unicode-style <code>\p</code> property
 counterparts.  (They are not official Unicode properties, but Perl extensions
 derived from official Unicode properties.)  The table below shows the relation
@@ -82423,8 +84505,9 @@
 <dd><a name="perlrecharclass-_005b2_005d-1"></a>
 <p>Control characters don&rsquo;t produce output as such, but instead usually 
control
 the terminal somehow: for example, newline and backspace are control 
characters.
-In the ASCII range, characters whose code points are between 0 and 31 
inclusive,
-plus 127 (<code>DEL</code>) are control characters.
+On ASCII platforms, in the ASCII range, characters whose code points are
+between 0 and 31 inclusive, plus 127 (<code>DEL</code>) are control 
characters; on
+EBCDIC platforms, their counterparts are control characters.
 </p>
 </dd>
 <dt>[3]</dt>
@@ -82466,7 +84549,7 @@
 <dd><a name="perlrecharclass-_005b6_005d"></a>
 <p><code>\p{XPerlSpace}</code> and <code>\p{Space}</code> match identically 
starting with Perl
 v5.18.  In earlier versions, these differ only in that in non-locale
-matching, <code>\p{XPerlSpace}</code> does not match the vertical tab, 
<code>\cK</code>.
+matching, <code>\p{XPerlSpace}</code> did not match the vertical tab, 
<code>\cK</code>.
 Same for the two ASCII-only range forms.
 </p>
 </dd>
@@ -82666,14 +84749,11 @@
 </p>
 <pre class="verbatim"> !    complement
 </pre>
-<p>All the binary operators left associate, and are of equal precedence.
-The unary operator right associates, and has higher precedence.  Use
-parentheses to override the default associations.  Some feedback we&rsquo;ve
-received indicates a desire for intersection to have higher precedence
-than union.  This is something that feedback from the field may cause us
-to change in future releases; you may want to parenthesize copiously to
-avoid such changes affecting your code, until this feature is no longer
-considered experimental.
+<p>All the binary operators left associate; <code>&quot;&amp;&quot;</code> is 
higher precedence
+than the others, which all have equal precedence.  The unary operator
+right associates, and has highest precedence.  Thus this follows the
+normal Perl precedence rules for logical operators.  Use parentheses to
+override the default precedence and associativity.
 </p>
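+<p>To make the precedence rule concrete, here is a hedged sketch that
+assumes Perl 5.22 or later.  The two patterns differ only in their
+parentheses; the set contents were chosen purely for illustration.
+</p>

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';  # needed while the feature is experimental

# Without parentheses, "&" binds tighter than "+":
#   [a] + ([b-d] & [c-z])   which is the set {a, c, d}
my $default = qr/\A(?[ [a] + [b-d] & [c-z] ])\z/;

# Parentheses force the union to happen first:
#   ([a] + [b-d]) & [c-z]   which is the set {c, d}
my $grouped = qr/\A(?[ ( [a] + [b-d] ) & [c-z] ])\z/;

my @rows;
for my $ch (qw(a b c d)) {
    push @rows, sprintf "%s %d %d", $ch,
        ($ch =~ $default ? 1 : 0), ($ch =~ $grouped ? 1 : 0);
}
print "$_\n" for @rows;   # columns: character, default, grouped
```

+<p>Only <code>"a"</code> distinguishes the two patterns, which is why
+parenthesizing is worthwhile whenever intersection and union are mixed.
+</p>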
 <p>The main restriction is that everything is a metacharacter.  Thus,
 you cannot refer to single characters by doing something like this:
@@ -82763,16 +84843,6 @@
 
 </li></ol>
 
-<p>The <code>/x</code> processing within this class is an extended form.
-Besides the characters that are considered white space in normal 
<code>/x</code>
-processing, there are 5 others, recommended by the Unicode standard:
-</p>
-<pre class="verbatim"> U+0085 NEXT LINE
- U+200E LEFT-TO-RIGHT MARK
- U+200F RIGHT-TO-LEFT MARK
- U+2028 LINE SEPARATOR
- U+2029 PARAGRAPH SEPARATOR
-</pre>
 <p>Note that skipping white space applies only to the interior of this
 construct.  There must not be any space between any of the characters
 that form the initial <code>(?[</code>.  Nor may there be space between the
@@ -82844,7 +84914,9 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlref-Postfix-Dereference-Syntax" accesskey="5">perlref Postfix 
Dereference Syntax</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a href="#perlref-SEE-ALSO" 
accesskey="6">perlref SEE ALSO</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlref-Assigning-to-References" accesskey="6">perlref Assigning to 
References</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
+<tr><td align="left" valign="top">&bull; <a href="#perlref-SEE-ALSO" 
accesskey="7">perlref SEE ALSO</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
 </table>
 
@@ -83643,7 +85715,7 @@
 <a name="perlref-Postfix-Dereference-Syntax"></a>
 <div class="header">
 <p>
-Next: <a href="#perlref-SEE-ALSO" accesskey="n" rel="next">perlref SEE 
ALSO</a>, Previous: <a href="#perlref-WARNING" accesskey="p" rel="prev">perlref 
WARNING</a>, Up: <a href="#perlref" accesskey="u" rel="up">perlref</a> &nbsp; 
[<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlref-Assigning-to-References" accesskey="n" 
rel="next">perlref Assigning to References</a>, Previous: <a 
href="#perlref-WARNING" accesskey="p" rel="prev">perlref WARNING</a>, Up: <a 
href="#perlref" accesskey="u" rel="up">perlref</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Postfix-Dereference-Syntax"></a>
 <h3 class="section">62.5 Postfix Dereference Syntax</h3>
@@ -83717,13 +85789,131 @@
 if the additional <code>postderef_qq</code> <a 
href="feature.html#Top">(feature)</a> is enabled.
 </p>
 <hr>
+<a name="perlref-Assigning-to-References"></a>
+<div class="header">
+<p>
+Next: <a href="#perlref-SEE-ALSO" accesskey="n" rel="next">perlref SEE 
ALSO</a>, Previous: <a href="#perlref-Postfix-Dereference-Syntax" accesskey="p" 
rel="prev">perlref Postfix Dereference Syntax</a>, Up: <a href="#perlref" 
accesskey="u" rel="up">perlref</a> &nbsp; [<a href="#SEC_Contents" title="Table 
of contents" rel="contents">Contents</a>]</p>
+</div>
+<a name="Assigning-to-References"></a>
+<h3 class="section">62.6 Assigning to References</h3>
+
+<p>Beginning in v5.22.0, the referencing operator can be assigned to.  It
+performs an aliasing operation, so that the variable name referenced on the
+left-hand side becomes an alias for the thing referenced on the right-hand
+side:
+</p>
+<pre class="verbatim">    \$a = \$b; # $a and $b now point to the same scalar
+    \&amp;foo = \&amp;bar; # foo() now means bar()
+</pre>
+<p>This syntax must be enabled with <code>use feature 'refaliasing'</code>.  
It is
+experimental, and will warn by default unless <code>no warnings
+'experimental::refaliasing'</code> is in effect.
+</p>
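+<p>The following minimal sketch shows the scalar case end to end.  The
+variable names are hypothetical; the pragma lines are the ones named in
+the paragraph above.
+</p>

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my $original = 1;
my $alias;

\$alias = \$original;      # $alias now names the same scalar as $original

$original = 10;            # a write through one name...
print "alias is $alias\n"; # ...is visible through the other

# References to the two names compare equal because there is one scalar.
my $same = (\$alias == \$original) ? 1 : 0;
print "same scalar: $same\n";
```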
+<p>These forms may be assigned to, and cause the right-hand side to be
+evaluated in scalar context:
+</p>
+<pre class="verbatim">    \$scalar
+    \@array
+    \%hash
+    \&amp;sub
+    \my $scalar
+    \my @array
+    \my %hash
+    \state $scalar # or @array, etc.
+    \our $scalar   # etc.
+    \local $scalar # etc.
+    \local our $scalar # etc.
+    \$some_array[$index]
+    \$some_hash{$key}
+    \local $some_array[$index]
+    \local $some_hash{$key}
+    condition ? \$this : \$that[0] # etc.
+</pre>
+<p>Slicing operations and parentheses cause
+the right-hand side to be evaluated in
+list context:
+</p>
+<pre class="verbatim">    \@array[5..7]
+    (\@array[5..7])
+    \(@array[5..7])
+    \@hash{'foo','bar'}
+    (\@hash{'foo','bar'})
+    \(@hash{'foo','bar'})
+    (\$scalar)
+    \($scalar)
+    \(my $scalar)
+    \my($scalar)
+    (\@array)
+    (\%hash)
+    (\&amp;sub)
+    \(&amp;sub)
+    \($foo, @bar, %baz)
+    (\$foo, \@bar, \%baz)
+</pre>
+<p>Each element on the right-hand side must be a reference to a datum of the
+right type.  Parentheses immediately surrounding an array (and possibly
+also <code>my</code>/<code>state</code>/<code>our</code>/<code>local</code>) 
will make each element of the array an
+alias to the corresponding scalar referenced on the right-hand side:
+</p>
+<pre class="verbatim">    \(@a) = \(@b); # @a and @b now have the same elements
+    \my(@a) = \(@b); # likewise
+    \(my @a) = \(@b); # likewise
+    push @a, 3; # but now @a has an extra element that @b lacks
+    \(@a) = (\$a, \$b, \$c); # @a now contains $a, $b, and $c
+</pre>
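+<p>As a hedged, self-contained version of the element-aliasing case
+(with made-up array names), the sketch below shows that writes to an
+aliased element are shared while the arrays themselves stay distinct.
+</p>

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my @source = (1, 2, 3);
my @view;

\(@view) = \(@source);   # each element of @view aliases an element of @source

$view[0] = 99;           # changes $source[0] as well
push @view, 4;           # grows @view only; @source keeps three elements

print "source: @source\n";
print "view:   @view\n";
```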
+<p>Combining that form with <code>local</code> and putting parentheses 
immediately
+around a hash are forbidden (because it is not clear what they should do):
+</p>
+<pre class="verbatim">    \local(@array) = foo(); # WRONG
+    \(%hash)       = bar(); # WRONG
+</pre>
+<p>Assignment to references and non-references may be combined in lists and
+conditional ternary expressions, as long as the values on the right-hand
+side are the right type for each element on the left, though this may make
+for obfuscated code:
+</p>
+<pre class="verbatim">    (my $tom, \my $dick, \my @harry) = (\1, \2, [1..3]);
+    # $tom is now \1
+    # $dick is now 2 (read-only)
+    # @harry is (1,2,3)
+
+    my $type = ref $thingy;
+    ($type ? $type eq 'ARRAY' ? \@foo : \$bar : $baz) = $thingy;
+</pre>
+<p>The <code>foreach</code> loop can also take a reference constructor for its 
loop
+variable, though the syntax is limited to one of the following, with an
+optional <code>my</code>, <code>state</code>, or <code>our</code> after the 
backslash:
+</p>
+<pre class="verbatim">    \$s
+    \@a
+    \%h
+    \&amp;c
+</pre>
+<p>No parentheses are permitted.  This feature is particularly useful for
+arrays-of-arrays, or arrays-of-hashes:
+</p>
+<pre class="verbatim">    foreach \my @a (@array_of_arrays) {
+        frobnicate($a[0], $a[-1]);
+    }
+
+    foreach \my %h (@array_of_hashes) {
+        $h{gelastic}++ if $h{type} eq 'funny';
+    }
+</pre>
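+<p>A runnable sketch of the array-of-hashes loop follows.  The data in
+<code>@records</code> is invented for the example; the point is that
+<code>%h</code> aliases each referenced hash, so updates made inside the
+loop are visible in the original structure afterwards.
+</p>

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my @records = (
    { type => 'funny',   gelastic => 0 },
    { type => 'serious', gelastic => 0 },
);

# %h aliases each referenced hash in turn, so updates land in @records.
foreach \my %h (@records) {
    $h{gelastic}++ if $h{type} eq 'funny';   # string comparison, hence "eq"
}

my $summary = join ",", map { $_->{gelastic} } @records;
print "$summary\n";
```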
+<p><strong>CAVEAT:</strong> Aliasing does not work correctly with closures.  
If you try to
+alias lexical variables from an inner subroutine or <code>eval</code>, the 
aliasing
+will only be visible within that inner sub, and will not affect the outer
+subroutine where the variables are declared.  This bizarre behavior is
+subject to change.
+</p>
+<hr>
 <a name="perlref-SEE-ALSO"></a>
 <div class="header">
 <p>
-Previous: <a href="#perlref-Postfix-Dereference-Syntax" accesskey="p" 
rel="prev">perlref Postfix Dereference Syntax</a>, Up: <a href="#perlref" 
accesskey="u" rel="up">perlref</a> &nbsp; [<a href="#SEC_Contents" title="Table 
of contents" rel="contents">Contents</a>]</p>
+Previous: <a href="#perlref-Assigning-to-References" accesskey="p" 
rel="prev">perlref Assigning to References</a>, Up: <a href="#perlref" 
accesskey="u" rel="up">perlref</a> &nbsp; [<a href="#SEC_Contents" title="Table 
of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="SEE-ALSO-31"></a>
-<h3 class="section">62.6 SEE ALSO</h3>
+<h3 class="section">62.7 SEE ALSO</h3>
 
 <p>Besides the obvious documents, source code can be instructive.
 Some pathological examples of the use of references can be found
@@ -85720,6 +87910,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlrequick-The-split-operator" accesskey="9">perlrequick The split 
operator</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlrequick-use-re-_0027strict_0027">perlrequick <code>use re 
'strict'</code></a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
 </table>
 
 <hr>
@@ -85944,6 +88136,11 @@
 <p>In the last example, the end of the string is considered a word
 boundary.
 </p>
+<p>For natural language processing (so that, for example, apostrophes are
+included in words), use <code>\b{wb}</code> instead:
+</p>
+<pre class="verbatim">    &quot;don't&quot; =~ / .+? \b{wb} /x;  # matches the 
whole string
+</pre>
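+<p>For comparison, this sketch (assuming Perl 5.22 or later, with
+illustrative variable names) runs the same lazy match against plain
+<code>\b</code> and against <code>\b{wb}</code> and captures what each one
+consumed.
+</p>

```perl
use strict;
use warnings;

my $text = "don't";

# Plain \b sees a boundary between "n" and the apostrophe.
my ($plain)   = $text =~ /( .+? ) \b /x;

# \b{wb} follows the Unicode word-break rules, which keep "don't" together.
my ($unicode) = $text =~ /( .+? ) \b{wb} /x;

print "plain:   $plain\n";
print "unicode: $unicode\n";
```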
 <hr>
 <a name="perlrequick-Matching-this-or-that"></a>
 <div class="header">
@@ -86206,7 +88403,7 @@
 <a name="perlrequick-The-split-operator"></a>
 <div class="header">
 <p>
-Previous: <a href="#perlrequick-Search-and-replace" accesskey="p" 
rel="prev">perlrequick Search and replace</a>, Up: <a 
href="#perlrequick-The-Guide" accesskey="u" rel="up">perlrequick The Guide</a> 
&nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlrequick-use-re-_0027strict_0027" accesskey="n" 
rel="next">perlrequick <code>use re 'strict'</code></a>, Previous: <a 
href="#perlrequick-Search-and-replace" accesskey="p" rel="prev">perlrequick 
Search and replace</a>, Up: <a href="#perlrequick-The-Guide" accesskey="u" 
rel="up">perlrequick The Guide</a> &nbsp; [<a href="#SEC_Contents" title="Table 
of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="The-split-operator"></a>
 <h4 class="subsection">66.3.9 The split operator</h4>
@@ -86243,6 +88440,21 @@
 an empty initial element to the list.
 </p>
 <hr>
+<a name="perlrequick-use-re-_0027strict_0027"></a>
+<div class="header">
+<p>
+Previous: <a href="#perlrequick-The-split-operator" accesskey="p" 
rel="prev">perlrequick The split operator</a>, Up: <a 
href="#perlrequick-The-Guide" accesskey="u" rel="up">perlrequick The Guide</a> 
&nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+</div>
+<a name="use-re-_0027strict_0027"></a>
+<h4 class="subsection">66.3.10 <code>use re 'strict'</code></h4>
+
+<p>New in v5.22, this applies stricter rules than otherwise when compiling
+regular expression patterns.  It can find things that, while legal, may
+not be what you intended.
+</p>
+<p>See <a href="re.html#g_t_0027strict_0027-mode">(re)&rsquo;strict&rsquo; in 
re</a>.
+</p>
+<hr>
 <a name="perlrequick-BUGS"></a>
 <div class="header">
 <p>
@@ -86386,7 +88598,7 @@
 </p>
 <pre class="verbatim">    $var !~ /foo/;
 </pre>
-<p><code>m/pattern/msixpogcdual</code> searches a string for a pattern match,
+<p><code>m/pattern/msixpogcdualn</code> searches a string for a pattern match,
 applying the given options.
 </p>
 <pre class="verbatim">    m  Multiline mode - ^ and $ match internal lines
@@ -86404,13 +88616,14 @@
     u  match according to Unicode rules
     d  match according to native rules unless something indicates
        Unicode
+    n  Non-capture mode. Don't let () fill in $1, $2, etc...
 </pre>
 <p>If &rsquo;pattern&rsquo; is an empty string, the last <em>successfully</em> 
matched
 regex is used. Delimiters other than &rsquo;/&rsquo; may be used for both this
 operator and the following ones. The leading <code>m</code> can be omitted
 if the delimiter is &rsquo;/&rsquo;.
 </p>
-<p><code>qr/pattern/msixpodual</code> lets you store a regex in a variable,
+<p><code>qr/pattern/msixpodualn</code> lets you store a regex in a variable,
 or pass one around. Modifiers as for <code>m//</code>, and are stored
 within the regex.
 </p>
@@ -86593,6 +88806,8 @@
 </p>
 <pre class="verbatim">   ^  Match string start (or line, if /m is used)
    $  Match string end (or line, if /m is used) or before newline
+   \b{} Match boundary of type specified within the braces
+   \B{} Match wherever \b{} doesn't match
    \b Match word boundary (between \w and \W)
    \B Match except at word boundary (between \w and \w or \W and \W)
    \A Match string start (regardless of /m)
@@ -86649,6 +88864,7 @@
    (?&lt;name&gt;...)      Named capture
    (?'name'...)      Named capture
    (?P&lt;name&gt;...)     Named capture (python syntax)
+   (?[...])          Extended bracketed character class
    (?{ code })       Embedded code, return value becomes $^R
    (??{ code })      Dynamic regex, return value used as regex
    (?N)              Recurse into subpattern number N
@@ -86942,6 +89158,10 @@
 regexp vs regex; in Perl, there is more than one way to abbreviate it.
 We&rsquo;ll use regexp in this tutorial.
 </p>
+<p>New in v5.22, <a href="re.html#g_t_0027strict_0027-mode">(re)<code>use re 
'strict'</code></a> applies stricter
+rules than otherwise when compiling regular expression patterns.  It can
+find things that, while legal, may not be what you intended.
+</p>
 <hr>
 <a name="perlretut-Part-1_003a-The-basics"></a>
 <div class="header">
@@ -87385,6 +89605,11 @@
 <p>Note in the last example, the end of the string is considered a word
 boundary.
 </p>
+<p>For natural language processing (so that, for example, apostrophes are
+included in words), use <code>\b{wb}</code> instead:
+</p>
+<pre class="verbatim">    &quot;don't&quot; =~ / .+? \b{wb} /x;  # matches the 
whole string
+</pre>
 <p>You might wonder why <code>'.'</code> matches everything but 
<code>&quot;\n&quot;</code> - why not
 every character? The reason is that often one is matching against
 lines and would like to ignore the newline characters.  For instance,
@@ -87947,6 +90172,13 @@
     @num = split /(a|b)+/, $x;    # @num = ('12','a','34','a','5')
     @num = split /(?:a|b)+/, $x;  # @num = ('12','34','5')
 </pre>
+<p>In Perl 5.22 and later, all groups within a regexp can be set to
+non-capturing by using the new <code>/n</code> flag:
+</p>
+<pre class="verbatim">    &quot;hello&quot; =~ /(hi|hello)/n; # $1 is not set!
+</pre>
+<p>See <a href="#perlre-n">perlre n</a> for more information.
+</p>
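+<p>A slightly fuller sketch of the <code>/n</code> flag, assuming Perl 5.22
+or later.  It also exercises a named group, which perlre documents as
+still capturing under <code>/n</code>; the group name <code>word</code> is
+invented for the example.
+</p>

```perl
use strict;
use warnings;

my ($numbered, $named);

if ("hello" =~ /(hi|hello)/n) {
    # The match succeeded, but the parentheses only grouped.
    $numbered = defined $1 ? "set" : "not set";
}

if ("hello" =~ /(?<word>hi|hello)/n) {
    # Named groups are documented as still capturing under /n.
    $named = $+{word};
}

print "numbered capture: $numbered\n";
print "named capture:    $named\n";
```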
 <hr>
 <a name="perlretut-Matching-repetitions"></a>
 <div class="header">
@@ -90415,7 +92647,7 @@
               in UTF-8
     L    64   normally the &quot;IOEioA&quot; are unconditional, the L makes
               them conditional on the locale environment variables
-              (the LC_ALL, LC_TYPE, and LANG, in the order of
+              (the LC_ALL, LC_CTYPE, and LANG, in the order of
               decreasing precedence) -- if the variables indicate
               UTF-8, then the selected &quot;IOEioA&quot; are in effect
     a   256   Set ${^UTF8CACHE} to -1, to run the UTF-8 caching
@@ -90540,6 +92772,8 @@
  16777216  M  trace smart match resolution
  33554432  B  dump suBroutine definitions, including special Blocks
               like BEGIN
+ 67108864  L  trace Locale-related info; what gets output is very
+              subject to change
 </pre>
 <p>All these flags require <strong>-DDEBUGGING</strong> when you compile the 
Perl
 executable (but see <code>:opd</code> in <a 
href="Devel-Peek.html#Top">(Devel-Peek)</a> or <a 
href="re.html#g_t_0027debug_0027-mode">(re)'debug' mode</a>
@@ -92454,10 +94688,6 @@
 here. There are still some bits and pieces hanging around in here that
 need to be moved. Perhaps you could move them?  Thanks!
 </p>
-</li><li> <samp>t/x2p</samp>
-
-<p>A test suite for the s2p converter.
-</p>
 </li></ul>
 
 <hr>
@@ -92870,19 +95100,19 @@
 
     sub NAME BLOCK                # A declaration and a definition.
     sub NAME(PROTO) BLOCK         #  ditto, but with prototypes
-    sub NAME SIG BLOCK            #  with signature
+    sub NAME(SIG) BLOCK           #  with a signature instead
     sub NAME : ATTRS BLOCK        #  with attributes
     sub NAME(PROTO) : ATTRS BLOCK #  with prototypes and attributes
-    sub NAME : ATTRS SIG BLOCK    #  with attributes and signature
+    sub NAME(SIG) : ATTRS BLOCK   #  with a signature and attributes
 </pre>
 <p>To define an anonymous subroutine at runtime:
 </p>
 <pre class="verbatim">    $subref = sub BLOCK;                 # no proto
     $subref = sub (PROTO) BLOCK;         # with proto
-    $subref = sub SIG BLOCK;             # with signature
+    $subref = sub (SIG) BLOCK;           # with signature
     $subref = sub : ATTRS BLOCK;         # with attributes
     $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
-    $subref = sub : ATTRS SIG BLOCK;     # with attribs and signature
+    $subref = sub (SIG) : ATTRS BLOCK;   # with signature and attributes
 </pre>
 <p>To import subroutines:
 </p>
@@ -93089,7 +95319,7 @@
       return($x * __SUB__-&gt;( $x - 1 ) );
     };
 </pre>
-<p>The behaviour of <code>__SUB__</code> within a regex code block (such as 
<code>/(?{...})/</code>)
+<p>The behavior of <code>__SUB__</code> within a regex code block (such as 
<code>/(?{...})/</code>)
 is subject to change.
 </p>
 <p>Subroutines whose names are in all upper case are reserved to the Perl
@@ -93214,8 +95444,8 @@
 </p>
 <p>The signature is part of a subroutine&rsquo;s body.  Normally the body of a
 subroutine is simply a braced block of code.  When using a signature,
-the signature is a parenthesised list that goes immediately before
-the braced block.  The signature declares lexical variables that are
+the signature is a parenthesised list that goes immediately after
+the subroutine name.  The signature declares lexical variables that are
 in scope for the block.  When the subroutine is called, the signature
 takes control first.  It populates the signature variables from the
 list of arguments that were passed.  If the argument list doesn&rsquo;t meet
@@ -93385,13 +95615,12 @@
 of calls to the subroutine, and the signature puts argument values into
 lexical variables at runtime.  You can therefore write
 </p>
-<pre class="verbatim">    sub foo :prototype($$) ($left, $right) {
+<pre class="verbatim">    sub foo ($left, $right) : prototype($$) {
         return $left + $right;
     }
 </pre>
-<p>The prototype attribute, and any other attributes, must come before
-the signature.  The signature always immediately precedes the block of
-the subroutine&rsquo;s body.
+<p>The prototype attribute, and any other attributes, come after 
+the signature.
 </p>
 <hr>
 <a name="perlsub-Private-Variables-via-my_0028_0029"></a>
@@ -93851,7 +96080,7 @@
 <p><strong>WARNING</strong>: Localization of tied arrays and hashes does not 
currently
 work as described.
 This will be fixed in a future release of Perl; in the meantime, avoid
-code that relies on any particular behaviour of localising tied arrays
+code that relies on any particular behavior of localising tied arrays
 or hashes (localising individual elements is still okay).
 See <a 
href="perl58delta.html#Localising-Tied-Arrays-and-Hashes-Is-Broken">(perl58delta)Localising
 Tied Arrays and Hashes Is Broken</a> for more
 details.
@@ -94074,9 +96303,6 @@
         baz();          # recursive call
     }
 </pre>
-<p>It is a known bug that lexical subroutines cannot be used as the 
<code>SUBNAME</code>
-argument to <code>sort</code>.  This will be fixed in a future version of Perl.
-</p>
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a 
href="#perlsub-state-sub-vs-my-sub" accesskey="1">perlsub <code>state 
sub</code> vs <code>my sub</code></a>:</td><td>&nbsp;&nbsp;</td><td 
align="left" valign="top">
 </td></tr>
@@ -94667,13 +96893,14 @@
     sub N () { int(OPT_BAZ) / 3 }
 
     sub FOO_SET () { 1 if FLAG_MASK &amp; FLAG_FOO }
+    sub FOO_SET2 () { if (FLAG_MASK &amp; FLAG_FOO) { 1 } }
 </pre>
-<p>Be aware that these will not be inlined; as they contain inner scopes,
-the constant folding doesn&rsquo;t reduce them to a single constant:
+<p>(Be aware that the last example was not always inlined in Perl 5.20 and
+earlier, which did not behave consistently with subroutines containing
+inner scopes.)  You can countermand inlining by using an explicit
+<code>return</code>:
 </p>
-<pre class="verbatim">    sub foo_set () { if (FLAG_MASK &amp; FLAG_FOO) { 1 } 
}
-
-    sub baz_val () {
+<pre class="verbatim">    sub baz_val () {
         if (OPT_BAZ) {
             return 23;
         }
@@ -94681,6 +96908,7 @@
             return 42;
         }
     }
+    sub bonk_val () { return 12345 }
 </pre>
 <p>As alluded to earlier you can also declare inlined subs dynamically at
 BEGIN time if their body consists of a lexically-scoped scalar which
@@ -94710,6 +96938,39 @@
     }
     print RT_79908(); # prints 79907
 </pre>
+<p>As of Perl 5.22, this buggy behavior, while preserved for backward
+compatibility, is detected and emits a deprecation warning.  If you want
+the subroutine to be inlined (with no warning), make sure the variable is
+not used in a context where it could be modified aside from where it is
+declared.
+</p>
+<pre class="verbatim">    # Fine, no warning
+    BEGIN {
+        my $x = 54321;
+        *INLINED = sub () { $x };
+    }
+    # Warns.  Future Perl versions will stop inlining it.
+    BEGIN {
+        my $x;
+        $x = 54321;
+        *ALSO_INLINED = sub () { $x };
+    }
+</pre>
+<p>Perl 5.22 also introduces the experimental &quot;const&quot; attribute as an
+alternative.  (Disable the &quot;experimental::const_attr&quot; warnings if 
you want
+to use it.)  When applied to an anonymous subroutine, it forces the sub to
+be called when the <code>sub</code> expression is evaluated.  The return value 
is
+captured and turned into a constant subroutine:
+</p>
+<pre class="verbatim">    my $x = 54321;
+    *INLINED = sub : const { $x };
+    $x++;
+</pre>
+<p>The return value of <code>INLINED</code> in this example will always be 
54321,
+regardless of later modifications to $x.  You can also put any arbitrary
+code inside the sub, as it will be executed immediately and its return
+value captured the same way.
+</p>
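+<p>The same idea as a complete script, offered as a sketch rather than a
+definitive recipe.  The sub name <code>FROZEN</code> is hypothetical, and
+the attribute is experimental, so its behavior may change.
+</p>

```perl
use strict;
use warnings;
no warnings 'experimental::const_attr';

my $x = 54321;

# The anonymous sub runs once, right here; its return value is frozen.
*FROZEN = sub : const { $x };

$x++;   # later changes to $x are not seen by FROZEN()

my $frozen = FROZEN();
print "x is now $x, FROZEN() returns $frozen\n";
```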
 <p>If you really want a subroutine with a <code>()</code> prototype that 
returns a
 lexical variable you can easily force it to not be inlined by adding
 an explicit <code>return</code>:
@@ -94722,7 +96983,7 @@
     print RT_79908(); # prints 79908
 </pre>
 <p>The easiest way to tell if a subroutine was inlined is by using
-<a href="B-Deparse.html#Top">(B-Deparse)</a>, consider this example of two 
subroutines returning
+<a href="B-Deparse.html#Top">(B-Deparse)</a>.  Consider this example of two 
subroutines returning
 <code>1</code>, one with a <code>()</code> prototype causing it to be inlined, 
and one
 without (with deparse output truncated for clarity):
 </p>
@@ -94755,7 +97016,8 @@
 you need to be able to redefine the subroutine, you need to ensure
 that it isn&rsquo;t inlined, either by dropping the <code>()</code> prototype 
(which
 changes calling semantics, so beware) or by thwarting the inlining
-mechanism in some other way, e.g. by adding an explicit <code>return</code>:
+mechanism in some other way, e.g. by adding an explicit <code>return</code>, as
+mentioned above:
 </p>
 <pre class="verbatim">    sub not_inlined () { return 23 }
 </pre>
@@ -94866,7 +97128,7 @@
 possible) with the built-in native syntax.  You can achieve this by using
 a suitable prototype.  To get the prototype of an overridable built-in,
 use the <code>prototype</code> function with an argument of 
<code>&quot;CORE::builtin_name&quot;</code>
-(see &lsquo;perlfunc prototype&rsquo;).
+(see <a href="#perlfunc-prototype">perlfunc prototype</a>).
 </p>
 <p>Note however that some built-ins can&rsquo;t have their syntax expressed by 
a
 prototype (such as <code>system</code> or <code>chomp</code>).  If you 
override them you won&rsquo;t
@@ -95594,7 +97856,7 @@
 <a name="Foreach-Loops"></a>
 <h4 class="subsection">74.2.9 Foreach Loops</h4>
 
-<p>The <code>foreach</code> loop iterates over a normal list value and sets the
+<p>The <code>foreach</code> loop iterates over a normal list value and sets 
the scalar
 variable VAR to be each element of the list in turn.  If the variable
 is preceded with the keyword <code>my</code>, then it is lexically scoped, and
 is therefore visible only within the loop.  Otherwise, the variable is
@@ -95620,6 +97882,14 @@
 <p><code>foreach</code> probably won&rsquo;t do what you expect if VAR is a 
tied or other
 special variable.   Don&rsquo;t do that either.
 </p>
+<p>As of Perl 5.22, there is an experimental variant of this loop that accepts
+a variable preceded by a backslash for VAR, in which case the items in the
+LIST must be references.  The backslashed variable will become an alias
+to each referenced item in the LIST, which must be of the correct type.
+The variable needn&rsquo;t be a scalar in this case, and the backslash may be
+followed by <code>my</code>.  To use this form, you must enable the 
<code>refaliasing</code>
+feature via <code>use feature</code>.  (See <a 
href="feature.html#Top">(feature)</a>.  See also <a 
href="#perlref-Assigning-to-References">perlref Assigning to References</a>.)
+</p>
 <p>Examples:
 </p>
 <pre class="verbatim">    for (@ary) { s/foo/bar/ }
@@ -95638,6 +97908,12 @@
     foreach $item (split(/:[\\\n:]*/, $ENV{TERMCAP})) {
         print &quot;Item: $item\n&quot;;
     }
+
+    use feature &quot;refaliasing&quot;;
+    no warnings &quot;experimental::refaliasing&quot;;
+    foreach \my %hash (@array_of_hash_references) {
+        # do something with each %hash
+    }
 </pre>
 <p>Here&rsquo;s how a C programmer might code up a particular algorithm in 
Perl:
 </p>
@@ -96124,6 +98400,9 @@
 <dd><a name="perlsyn-3_002e"></a>
 <p>A smart match that uses an explicit <code>~~</code> operator, such as 
<code>EXPR ~~ EXPR</code>.
 </p>
+<p><strong>NOTE:</strong> You will often have to use <code>$c ~~ $_</code> 
because the default case
+uses <code>$_ ~~ $c</code>, which is frequently the opposite of what you want.
+</p>
 </dd>
 <dt>4.</dt>
 <dd><a name="perlsyn-4_002e"></a>
@@ -96132,10 +98411,6 @@
 (<code>&lt;</code>, <code>&gt;</code>, <code>&lt;=</code>, <code>&gt;=</code>, 
<code>==</code>, and <code>!=</code>), and
 the six string comparisons (<code>lt</code>, <code>gt</code>, <code>le</code>, 
<code>ge</code>, <code>eq</code>, and <code>ne</code>).
 </p>
-<p><strong>NOTE:</strong> You will often have to use <code>$c ~~ $_</code> 
because
-the default case uses <code>$_ ~~ $c</code> , which is frequently
-the opposite of what you want.
-</p>
 </dd>
 <dt>5.</dt>
 <dd><a name="perlsyn-5_002e"></a>
@@ -99972,26 +102247,50 @@
 <a name="DESCRIPTION-79"></a>
 <h3 class="section">81.2 DESCRIPTION</h3>
 
+<p>If you haven&rsquo;t already, before reading this document, you should 
become
+familiar with both <a href="#perlunitut-NAME">perlunitut NAME</a> and <a 
href="#perluniintro-NAME">perluniintro NAME</a>.
+</p>
+<p>Unicode aims to <strong>UNI</strong>-fy the en-<strong>CODE</strong>-ings 
of all the world&rsquo;s
+character sets into a single Standard.   For quite a few of the various
+coding standards that existed when Unicode was first created, converting
+from each to Unicode essentially meant adding a constant to each code
+point in the original standard, and converting back meant just
+subtracting that same constant.  For ASCII and ISO-8859-1, the constant
+is 0.  For ISO-8859-5 (Cyrillic), the constant is 864; for Hebrew
+(ISO-8859-8), it&rsquo;s 1264; Thai (ISO-8859-11), 3424; and so forth.  This
+made it easy to do the conversions, and facilitated the adoption of
+Unicode.
+</p>
+<p>And it worked; nowadays, those legacy standards are rarely used.  Most
+everyone uses Unicode.
+</p>
+<p>Unicode is a comprehensive standard.  It specifies many things outside
+the scope of Perl, such as how to display sequences of characters.  For
+a full discussion of all aspects of Unicode, see
+<a href="http://www.unicode.org">http://www.unicode.org</a>.
+</p>
 <table class="menu" border="0" cellspacing="0">
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Important-Caveats" accesskey="1">perlunicode Important 
Caveats</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Byte-and-Character-Semantics" accesskey="2">perlunicode Byte 
and Character Semantics</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Effects-of-Character-Semantics" accesskey="3">perlunicode 
Effects of Character Semantics</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-ASCII-Rules-versus-Unicode-Rules" accesskey="3">perlunicode 
ASCII Rules versus Unicode Rules</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Unicode-Character-Properties" accesskey="4">perlunicode 
Unicode Character Properties</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Extended-Grapheme-Clusters-_0028Logical-characters_0029" 
accesskey="4">perlunicode Extended Grapheme Clusters (Logical 
characters)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-User_002dDefined-Character-Properties" 
accesskey="5">perlunicode User-Defined Character 
Properties</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Unicode-Character-Properties" accesskey="5">perlunicode 
Unicode Character Properties</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029"
 accesskey="6">perlunicode User-Defined Case Mappings (for serious hackers 
only)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-User_002dDefined-Character-Properties" 
accesskey="6">perlunicode User-Defined Character 
Properties</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Character-Encodings-for-Input-and-Output" 
accesskey="7">perlunicode Character Encodings for Input and 
Output</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029"
 accesskey="7">perlunicode User-Defined Case Mappings (for serious hackers 
only)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Unicode-Regular-Expression-Support-Level" 
accesskey="8">perlunicode Unicode Regular Expression Support 
Level</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Character-Encodings-for-Input-and-Output" 
accesskey="8">perlunicode Character Encodings for Input and 
Output</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Unicode-Encodings" accesskey="9">perlunicode Unicode 
Encodings</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Unicode-Regular-Expression-Support-Level" 
accesskey="9">perlunicode Unicode Regular Expression Support 
Level</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Non_002dcharacter-code-points">perlunicode Non-character 
code points</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Unicode-Encodings">perlunicode Unicode 
Encodings</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
+</td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Noncharacter-code-points">perlunicode Noncharacter code 
points</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Beyond-Unicode-code-points">perlunicode Beyond Unicode code 
points</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
@@ -100011,6 +102310,8 @@
 </td></tr>
 <tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029">perlunicode
 Hacking Perl to work on earlier Unicode versions (for very serious hackers 
only)</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
 </td></tr>
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Porting-code-from-perl_002d5_002e6_002eX">perlunicode 
Porting code from perl-5.6.X</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
 </table>
 
 <hr>
@@ -100022,15 +102323,14 @@
 <a name="Important-Caveats"></a>
 <h4 class="subsection">81.2.1 Important Caveats</h4>
 
+<p>Even though some of this section may not be understandable to you on
+first reading, we think it&rsquo;s important enough to highlight some of the
+gotchas before delving further, so here goes:
+</p>
 <p>Unicode support is an extensive requirement. While Perl does not
 implement the Unicode standard or the accompanying technical reports
 from cover to cover, Perl does support many Unicode features.
 </p>
-<p>People who want to learn to use Unicode in Perl, should probably read
-the <a href="#perlunitut-NAME">Perl Unicode tutorial, perlunitut</a> and
-<a href="#perluniintro-NAME">perluniintro NAME</a>, before reading
-this reference document.
-</p>
 <p>Also, the use of Unicode may present security issues that aren&rsquo;t 
obvious.
 Read <a href="http://www.unicode.org/reports/tr36">Unicode Security 
Considerations</a>.
 </p>
@@ -100039,8 +102339,9 @@
 <dd><a 
name="perlunicode-Safest-if-you-use-feature-_0027unicode_005fstrings_0027"></a>
 <p>In order to preserve backward compatibility, Perl does not turn
 on full internal Unicode support unless the pragma
-<code>use feature 'unicode_strings'</code> is specified.  (This is 
automatically
-selected if you use <code>use 5.012</code> or higher.)  Failure to do this can
+<a 
href="feature.html#The-_0027unicode_005fstrings_0027-feature">(feature)<code>use&nbsp;feature&nbsp;<span
 class="nolinebreak">'unicode_strings'</span></code><!-- /@w --></a>
+is specified.  (This is automatically
+selected if you <code>use&nbsp;5.012</code><!-- /@w --> or higher.)  Failure 
to do this can
 trigger unexpected surprises.  See <a 
href="#perlunicode-The-_0022Unicode-Bug_0022">The &quot;Unicode Bug&quot;</a> 
below.
 </p>
 <p>This pragma doesn&rsquo;t affect I/O.  Nor does it change the internal
@@ -100051,43 +102352,31 @@
 </dd>
 <dt>Input and Output Layers</dt>
 <dd><a name="perlunicode-Input-and-Output-Layers"></a>
-<p>Perl knows when a filehandle uses Perl&rsquo;s internal Unicode encodings
-(UTF-8, or UTF-EBCDIC if in EBCDIC) if the filehandle is opened with
-the <code>:encoding(utf8)</code> layer.  Other encodings can be converted to 
Perl&rsquo;s
-encoding on input or from Perl&rsquo;s encoding on output by use of the
-<code>:encoding(...)</code>  layer.  See <a href="open.html#Top">(open)</a>.
-</p>
-<p>To indicate that Perl source itself is in UTF-8, use <code>use utf8;</code>.
-</p>
-</dd>
-<dt><code>use utf8</code> still needed to enable UTF-8/UTF-EBCDIC in 
scripts</dt>
-<dd><a 
name="perlunicode-use-utf8-still-needed-to-enable-UTF_002d8_002fUTF_002dEBCDIC-in-scripts"></a>
-<p>As a compatibility measure, the <code>use utf8</code> pragma must be 
explicitly
-included to enable recognition of UTF-8 in the Perl scripts themselves
-(in string or regular expression literals, or in identifier names) on
-ASCII-based machines or to recognize UTF-EBCDIC on EBCDIC-based
-machines.  <strong>These are the only times when an explicit <code>use 
utf8</code>
-is needed.</strong>  See <a href="utf8.html#Top">(utf8)</a>.
-</p>
-</dd>
-<dt><code>BOM</code>-marked scripts and UTF-16 scripts autodetected</dt>
-<dd><a 
name="perlunicode-BOM_002dmarked-scripts-and-UTF_002d16-scripts-autodetected"></a>
-<p>If a Perl script begins marked with the Unicode <code>BOM</code> (UTF-16LE, 
UTF16-BE,
-or UTF-8), or if the script looks like non-<code>BOM</code>-marked UTF-16 of 
either
-endianness, Perl will correctly read in the script as Unicode.
-(<code>BOM</code>less UTF-8 cannot be effectively recognized or differentiated 
from
-ISO 8859-1 or other eight-bit encodings.)
-</p>
-</dd>
-<dt><code>use encoding</code> needed to upgrade non-Latin-1 byte strings</dt>
-<dd><a 
name="perlunicode-use-encoding-needed-to-upgrade-non_002dLatin_002d1-byte-strings"></a>
-<p>By default, there is a fundamental asymmetry in Perl&rsquo;s Unicode model:
-implicit upgrading from byte strings to Unicode strings assumes that
-they were encoded in <em>ISO 8859-1 (Latin-1)</em>, but Unicode strings are
-downgraded with UTF-8 encoding.  This happens because the first 256
-codepoints in Unicode happens to agree with Latin-1.
+<p>Use the <code>:encoding(...)</code> layer to read from and write to
+filehandles using the specified encoding.  (See <a 
href="open.html#Top">(open)</a>.)
 </p>
-<p>See <a href="#perlunicode-Byte-and-Character-Semantics">Byte and Character 
Semantics</a> for more details.
+</dd>
+<dt>You should convert your non-ASCII, non-UTF-8 Perl scripts to be UTF-8.</dt>
+<dd><a 
name="perlunicode-You-should-convert-your-non_002dASCII_002c-non_002dUTF_002d8-Perl-scripts-to-be-UTF_002d8_002e"></a>
+<p>See <a href="encoding.html#Top">(encoding)</a>.
+</p>
+</dd>
+<dt><code>use utf8</code> still needed to enable <a 
href="#perlunicode-Unicode-Encodings">UTF-8</a> in scripts</dt>
+<dd><a 
name="perlunicode-use-utf8-still-needed-to-enable-perlunicode-Unicode-Encodings-in-scripts"></a>
+<p>If your Perl script is itself encoded in <a 
href="#perlunicode-Unicode-Encodings">UTF-8</a>,
+the <code>use&nbsp;utf8</code><!-- /@w --> pragma must be explicitly included 
to enable
+recognition of that (in string or regular expression literals, or in
+identifier names).  <strong>This is the only time when an explicit 
<code>use&nbsp;utf8</code><!-- /@w --> is needed.</strong>  (See <a 
href="utf8.html#Top">(utf8)</a>.)
+</p>
+</dd>
+<dt><code>BOM</code>-marked scripts and <a 
href="#perlunicode-Unicode-Encodings">UTF-16</a> scripts autodetected</dt>
+<dd><a 
name="perlunicode-BOM_002dmarked-scripts-and-perlunicode-Unicode-Encodings-scripts-autodetected"></a>
+<p>However, if a Perl script begins with the Unicode <code>BOM</code> 
(UTF-16LE,
+UTF-16BE, or UTF-8), or if the script looks like non-<code>BOM</code>-marked
+UTF-16 of either endianness, Perl will correctly read in the script as
+the appropriate Unicode encoding.  (<code>BOM</code>-less UTF-8 cannot be
+effectively recognized or differentiated from ISO 8859-1 or other
+eight-bit encodings.)
 </p>
 </dd>
 </dl>
@@ -100096,200 +102385,295 @@
 <a name="perlunicode-Byte-and-Character-Semantics"></a>
 <div class="header">
 <p>
-Next: <a href="#perlunicode-Effects-of-Character-Semantics" accesskey="n" 
rel="next">perlunicode Effects of Character Semantics</a>, Previous: <a 
href="#perlunicode-Important-Caveats" accesskey="p" rel="prev">perlunicode 
Important Caveats</a>, Up: <a href="#perlunicode-DESCRIPTION" accesskey="u" 
rel="up">perlunicode DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlunicode-ASCII-Rules-versus-Unicode-Rules" accesskey="n" 
rel="next">perlunicode ASCII Rules versus Unicode Rules</a>, Previous: <a 
href="#perlunicode-Important-Caveats" accesskey="p" rel="prev">perlunicode 
Important Caveats</a>, Up: <a href="#perlunicode-DESCRIPTION" accesskey="u" 
rel="up">perlunicode DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Byte-and-Character-Semantics"></a>
 <h4 class="subsection">81.2.2 Byte and Character Semantics</h4>
 
-<p>Perl uses logically-wide characters to represent strings internally.
+<p>Before Unicode, most encodings used 8 bits (a single byte) to encode
+each character.  Thus a character was a byte, and a byte was a
+character, and there could be only 256 or fewer possible characters.
+&quot;Byte Semantics&quot; in the title of this section refers to
+this behavior.  There was no need to distinguish between &quot;Byte&quot; and
+&quot;Character&quot;.
+</p>
+<p>Then along came Unicode, which has room for over a million characters
+(and Perl allows for even more).  This means that a character may
+require more than a single byte to represent it, and so the two terms
+are no longer equivalent.  What matter are the characters as whole
+entities, and not usually the bytes that comprise them.  That&rsquo;s what the
+term &quot;Character Semantics&quot; in the title of this section refers to.
+</p>
+<p>Perl had to change internally to decouple &quot;bytes&quot; from 
&quot;characters&quot;.
+It is important that you too change your ideas, if you haven&rsquo;t already,
+so that &quot;byte&quot; and &quot;character&quot; no longer mean the same 
thing in your
+mind.
+</p>
+<p>The basic building block of Perl strings has always been a 
&quot;character&quot;.
+The changes basically come down to this: the implementation no longer
+assumes that a character is always just a single byte.
 </p>
-<p>Starting in Perl 5.14, Perl-level operations work with
-characters rather than bytes within the scope of a
-<code><a href="feature.html#Top">(feature)use feature 
'unicode_strings'</a></code> (or equivalently
-<code>use 5.012</code> or higher).  (This is not true if bytes have been
-explicitly requested by <code><a href="bytes.html#Top">(bytes)use 
bytes</a></code>, nor necessarily true
-for interactions with the platform&rsquo;s operating system.)
-</p>
-<p>For earlier Perls, and when <code>unicode_strings</code> is not in effect, 
Perl
-provides a fairly safe environment that can handle both types of
-semantics in programs.  For operations where Perl can unambiguously
-decide that the input data are characters, Perl switches to character
-semantics.  For operations where this determination cannot be made
-without additional information from the user, Perl decides in favor of
-compatibility and chooses to use byte semantics.
-</p>
-<p>When <code>use locale</code> (but not <code>use locale 
':not_characters'</code>) is in
-effect, Perl uses the rules associated with the current locale.
-(<code>use locale</code> overrides <code>use feature 'unicode_strings'</code> 
in the same scope;
-while <code>use locale ':not_characters'</code> effectively also selects
-<code>use feature 'unicode_strings'</code> in its scope; see <a 
href="#perllocale-NAME">perllocale NAME</a>.)
-Otherwise, Perl uses the platform&rsquo;s native
-byte semantics for characters whose code points are less than 256, and
-Unicode rules for those greater than 255.  That means that non-ASCII
-characters are undefined except for their
-ordinal numbers.  This means that none have case (upper and lower), nor are any
-a member of character classes, like <code>[:alpha:]</code> or <code>\w</code>. 
 (But all do belong
-to the <code>\W</code> class or the Perl regular expression extension 
<code>[:^alpha:]</code>.)
-</p>
-<p>This behavior preserves compatibility with earlier versions of Perl,
-which allowed byte semantics in Perl operations only if
-none of the program&rsquo;s inputs were marked as being a source of Unicode
-character data.  Such data may come from filehandles, from calls to
-external programs, from information provided by the system (such as 
<code>%ENV</code>),
-or from literals and constants in the source text.
-</p>
-<p>The <code>utf8</code> pragma is primarily a compatibility device that 
enables
-recognition of UTF-(8|EBCDIC) in literals encountered by the parser.
-Note that this pragma is only required while Perl defaults to byte
-semantics; when character semantics become the default, this pragma
-may become a no-op.  See <a href="utf8.html#Top">(utf8)</a>.
-</p>
-<p>If strings operating under byte semantics and strings with Unicode
-character data are concatenated, the new string will have
-character semantics.  This can cause surprises: See <a 
href="#perlunicode-BUGS">BUGS</a>, below.
-You can choose to be warned when this happens.  See <code><a 
href="encoding-warnings.html#Top">(encoding-warnings)</a></code>.
-</p>
-<p>Under character semantics, many operations that formerly operated on
-bytes now operate on characters. A character in Perl is
-logically just a number ranging from 0 to 2**31 or so. Larger
-characters may encode into longer sequences of bytes internally, but
-this internal detail is mostly hidden for Perl code.
-See <a href="#perluniintro-NAME">perluniintro NAME</a> for more.
-</p>
-<hr>
-<a name="perlunicode-Effects-of-Character-Semantics"></a>
-<div class="header">
-<p>
-Next: <a href="#perlunicode-Unicode-Character-Properties" accesskey="n" 
rel="next">perlunicode Unicode Character Properties</a>, Previous: <a 
href="#perlunicode-Byte-and-Character-Semantics" accesskey="p" 
rel="prev">perlunicode Byte and Character Semantics</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
-</div>
-<a name="Effects-of-Character-Semantics"></a>
-<h4 class="subsection">81.2.3 Effects of Character Semantics</h4>
+<p>There are various things to note:
+</p>
+<ul>
+<li> String handling functions, for the most part, continue to operate in
+terms of characters.  <code>length()</code>, for example, returns the number of
+characters in a string, just as before.  But that number no longer is
+necessarily the same as the number of bytes in the string (there may be
+more bytes than characters).  The other such functions include
+<code>chop()</code>, <code>chomp()</code>, <code>substr()</code>, 
<code>pos()</code>, <code>index()</code>, <code>rindex()</code>,
+<code>sort()</code>, <code>sprintf()</code>, and <code>write()</code>.
 
-<p>Character semantics have the following effects:
+<p>The exceptions are:
 </p>
 <ul>
-<li> Strings&ndash;including hash keys&ndash;and regular expression patterns 
may
-contain characters that have an ordinal value larger than 255.
+<li> the bit-oriented <code>vec</code>
+
+<p>&nbsp;
+</p>
+</li><li> the byte-oriented <code>pack</code>/<code>unpack</code> 
<code>&quot;C&quot;</code> format
+
+<p>However, the <code>W</code> specifier does operate on whole characters, as 
does the
+<code>U</code> specifier.
+</p>
+</li><li> some operators that interact with the platform&rsquo;s operating 
system
+
+<p>Operators dealing with filenames are examples.
+</p>
+</li><li> when the functions are called from within the scope of the
+<code><a href="bytes.html#Top">(bytes)use&nbsp;bytes</a></code><!-- /@w --> 
pragma
+
+<p>Likely, you should use this only for debugging anyway.
+</p>
+</li></ul>
+
+</li><li> Strings&ndash;including hash keys&ndash;and regular expression 
patterns may
+contain characters that have ordinal values larger than 255.
 
 <p>If you use a Unicode editor to edit your program, Unicode characters may
 occur directly within the literal strings in UTF-8 encoding, or UTF-16.
 (The former requires a <code>BOM</code> or <code>use utf8</code>, the latter 
requires a <code>BOM</code>.)
 </p>
-<p>Unicode characters can also be added to a string by using the 
<code>\N{U+...}</code>
-notation.  The Unicode code for the desired character, in hexadecimal,
-should be placed in the braces, after the <code>U</code>. For instance, a 
smiley face is
-<code>\N{U+263A}</code>.
-</p>
-<p>Alternatively, you can use the <code>\x{...}</code> notation for characters 
<code>0x100</code> and
-above.  For characters below <code>0x100</code> you may get byte semantics 
instead of
-character semantics;  see <a href="#perlunicode-The-_0022Unicode-Bug_0022">The 
&quot;Unicode Bug&quot;</a>.  On EBCDIC machines there is
-the additional problem that the value for such characters gives the EBCDIC
-character rather than the Unicode one, thus it is more portable to use
-<code>\N{U+...}</code> instead.
-</p>
-<p>Additionally, you can use the <code>\N{...}</code> notation and put the 
official
-Unicode character name within the braces, such as
-<code>\N{WHITE SMILING FACE}</code>.  This automatically loads the <a 
href="charnames.html#Top">(charnames)</a>
-module with the <code>:full</code> and <code>:short</code> options.  If you 
prefer different
-options for this module, you can instead, before the <code>\N{...}</code>,
-explicitly load it with your desired options; for example,
-</p>
-<pre class="verbatim">   use charnames ':loose';
-</pre>
-</li><li> If an appropriate <a href="encoding.html#Top">(encoding)</a> is 
specified, identifiers within the
-Perl script may contain Unicode alphanumeric characters, including
-ideographs.  Perl does not currently attempt to canonicalize variable
-names.
-
-</li><li> Regular expressions match characters instead of bytes.  
<code>&quot;.&quot;</code> matches
-a character instead of a byte.
-
-</li><li> Bracketed character classes in regular expressions match characters 
instead of
-bytes and match against the character properties specified in the
-Unicode properties database.  <code>\w</code> can be used to match a Japanese
-ideograph, for instance.
-
-</li><li> Named Unicode properties, scripts, and block ranges may be used 
(like bracketed
-character classes) by using the <code>\p{}</code> &quot;matches property&quot; 
construct and
-the <code>\P{}</code> negation, &quot;doesn&rsquo;t match property&quot;.
-See <a href="#perlunicode-Unicode-Character-Properties">Unicode Character 
Properties</a> for more details.
+<p><a href="#perluniintro-Creating-Unicode">perluniintro Creating Unicode</a> 
gives other ways to place non-ASCII
+characters in your strings.
+</p>
+</li><li> The <code>chr()</code> and <code>ord()</code> functions work on 
whole characters.
 
-<p>You can define your own character properties and use them
-in the regular expression with the <code>\p{}</code> or <code>\P{}</code> 
construct.
-See <a href="#perlunicode-User_002dDefined-Character-Properties">User-Defined 
Character Properties</a> for more details.
+</li><li> Regular expressions match whole characters.  For example, 
<code>&quot;.&quot;</code> matches
+a whole character instead of only a single byte.
+
+</li><li> The <code>tr///</code> operator translates whole characters.  (Note 
that the
+<code>tr///CU</code> functionality has been removed.  For similar 
functionality to
+that, see <code>pack('U0', ...)</code> and <code>pack('C0', ...)</code>.)
+
+</li><li> <code>scalar reverse()</code> reverses by character rather than by 
byte.
+
+</li><li> The bit string operators, <code>&amp; | ^ ~</code> and (starting in 
v5.22)
<code>&amp;. |. ^. ~.</code>, can operate on characters that don&rsquo;t fit 
into a byte.
+However, the current behavior is likely to change.  You should not use
+these operators on strings that are encoded in UTF-8.  If you&rsquo;re not
+sure about the encoding of a string, downgrade it before using any of
+these operators; you can use
+<a 
href="utf8.html#Utility-functions">(utf8)<code>utf8::downgrade()</code></a>.
+
+</li></ul>
+
+<p>The bottom line is that Perl has always practiced &quot;Character 
Semantics&quot;,
+but with the advent of Unicode, that is now different than &quot;Byte
+Semantics&quot;.
 </p>
-</li><li> The special pattern <code>\X</code> matches a logical character, an 
&quot;extended grapheme
-cluster&quot; in Standardese.  In Unicode what appears to the user to be a 
single
-character, for example an accented <code>G</code>, may in fact be composed of 
a sequence
-of characters, in this case a <code>G</code> followed by an accent character.  
<code>\X</code>
-will match the entire sequence.
-
-</li><li> The <code>tr///</code> operator translates characters instead of 
bytes.  Note
-that the <code>tr///CU</code> functionality has been removed.  For similar
-functionality see pack(&rsquo;U0&rsquo;, ...) and pack(&rsquo;C0&rsquo;, ...).
-
-</li><li> Case translation operators use the Unicode case translation tables
-when character input is provided.  Note that <code>uc()</code>, or 
<code>\U</code> in
-interpolated strings, translates to uppercase, while <code>ucfirst</code>,
-or <code>\u</code> in interpolated strings, translates to titlecase in 
languages
-that make the distinction (which is equivalent to uppercase in languages
-without the distinction).
+<hr>
+<a name="perlunicode-ASCII-Rules-versus-Unicode-Rules"></a>
+<div class="header">
+<p>
+Next: <a 
href="#perlunicode-Extended-Grapheme-Clusters-_0028Logical-characters_0029" 
accesskey="n" rel="next">perlunicode Extended Grapheme Clusters (Logical 
characters)</a>, Previous: <a href="#perlunicode-Byte-and-Character-Semantics" 
accesskey="p" rel="prev">perlunicode Byte and Character Semantics</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+</div>
+<a name="ASCII-Rules-versus-Unicode-Rules"></a>
+<h4 class="subsection">81.2.3 ASCII Rules versus Unicode Rules</h4>
 
-</li><li> Most operators that deal with positions or lengths in a string will
-automatically switch to using character positions, including
-<code>chop()</code>, <code>chomp()</code>, <code>substr()</code>, 
<code>pos()</code>, <code>index()</code>, <code>rindex()</code>,
-<code>sprintf()</code>, <code>write()</code>, and <code>length()</code>.  An 
operator that
-specifically does not switch is <code>vec()</code>.  Operators that really 
don&rsquo;t
-care include operators that treat strings as a bucket of bits such as
-<code>sort()</code>, and operators dealing with filenames.
-
-</li><li> The <code>pack()</code>/<code>unpack()</code> letter <code>C</code> 
does <em>not</em> change, since it is often
-used for byte-oriented formats.  Again, think <code>char</code> in the C 
language.
-
-<p>There is a new <code>U</code> specifier that converts between Unicode 
characters
-and code points. There is also a <code>W</code> specifier that is the 
equivalent of
-<code>chr</code>/<code>ord</code> and properly handles character values even 
if they are above 255.
-</p>
-</li><li> The <code>chr()</code> and <code>ord()</code> functions work on 
characters, similar to
-<code>pack(&quot;W&quot;)</code> and <code>unpack(&quot;W&quot;)</code>, 
<em>not</em> <code>pack(&quot;C&quot;)</code> and
-<code>unpack(&quot;C&quot;)</code>.  <code>pack(&quot;C&quot;)</code> and 
<code>unpack(&quot;C&quot;)</code> are methods for
-emulating byte-oriented <code>chr()</code> and <code>ord()</code> on Unicode 
strings.
-While these methods reveal the internal encoding of Unicode strings,
-that is not something one normally needs to care about at all.
-
-</li><li> The bit string operators, <code>&amp; | ^ ~</code>, can operate on 
character data.
-However, for backward compatibility, such as when using bit string
-operations when characters are all less than 256 in ordinal value, one
-should not use <code>~</code> (the bit complement) with characters of both
-values less than 256 and values greater than 256.  Most importantly,
-DeMorgan&rsquo;s laws (<code>~($x|$y) eq ~$x&amp;~$y</code> and 
<code>~($x&amp;$y) eq ~$x|~$y</code>)
-will not hold.  The reason for this mathematical <em>faux pas</em> is that
-the complement cannot return <strong>both</strong> the 8-bit (byte-wide) bit
-complement <strong>and</strong> the full character-wide bit complement.
-
-</li><li> There is a CPAN module, <code><a 
href="Unicode-Casing.html#Top">(Unicode-Casing)</a></code>, which allows you to 
define
-your own mappings to be used in <code>lc()</code>, <code>lcfirst()</code>, 
<code>uc()</code>,
-<code>ucfirst()</code>, and <code>fc</code> (or their double-quoted string 
inlined
-versions such as <code>\U</code>).
-(Prior to Perl 5.16, this functionality was partially provided
-in the Perl core, but suffered from a number of insurmountable
-drawbacks, so the CPAN module was written instead.)
+<p>Before Unicode, when a character was a byte was a character,
+Perl knew only about the 128 characters defined by ASCII, code points 0
+through 127 (except for under <code>use&nbsp;locale</code><!-- /@w -->).  That 
left the code
+points 128 to 255 as unassigned, and available for whatever use a
+program might want.  The only semantics they have is their ordinal
+numbers, and that they are members of none of the non-negated character
+classes.  None are considered to match <code>\w</code> for example, but all 
match
+<code>\W</code>.
+</p>
+<p>Unicode, of course, assigns each of those code points a particular
+meaning (along with ones above 255).  To preserve backward
+compatibility, Perl only uses the Unicode meanings when there is some
+indication that Unicode is what is intended; otherwise the non-ASCII
+code points remain treated as if they are unassigned.
+</p>
+<p>Here are the ways that Perl knows that a string should be treated as
+Unicode:
+</p>
+<ul>
+<li> Within the scope of <code>use&nbsp;utf8</code><!-- /@w -->
+
+<p>If the whole program is Unicode (signified by using 8-bit 
<strong>U</strong>nicode
+<strong>T</strong>ransformation <strong>F</strong>ormat), then all strings 
within it must be
+Unicode.
+</p>
+</li><li> Within the scope of
+<a 
href="feature.html#The-_0027unicode_005fstrings_0027-feature">(feature)<code>use&nbsp;feature&nbsp;<span
 class="nolinebreak">'unicode_strings'</span></code><!-- /@w --></a>
+
+<p>This pragma was created so you can explicitly tell Perl that operations
+executed within its scope are to use Unicode rules.  More operations are
+affected with newer perls.  See <a 
href="#perlunicode-The-_0022Unicode-Bug_0022">The &quot;Unicode Bug&quot;</a>.
+</p>
+</li><li> Within the scope of <code>use&nbsp;5.012</code><!-- /@w --> or higher
+
+<p>This implicitly turns on <code>use&nbsp;feature&nbsp;<span 
class="nolinebreak">'unicode_strings'</span></code><!-- /@w -->.
+</p>
+</li><li> Within the scope of
+<a href="#perllocale-Unicode-and-UTF_002d8"><code>use&nbsp;locale&nbsp;<span 
class="nolinebreak">':not_characters'</span></code><!-- /@w --></a>,
+or <a href="#perllocale-NAME"><code>use&nbsp;locale</code><!-- /@w --></a> and 
the current
+locale is a UTF-8 locale.
+
+<p>The former is defined to imply Unicode handling; and the latter
+indicates a Unicode locale, hence a Unicode interpretation of all
+strings within it.
+</p>
+</li><li> When the string contains a Unicode-only code point
 
+<p>Perl has never accepted code points above 255 without them being
+Unicode, so their use implies Unicode for the whole string.
+</p>
+</li><li> When the string contains a Unicode named code point 
<code>\N{...}</code>
+
+<p>The <code>\N{...}</code> construct explicitly refers to a Unicode code 
point,
+even if it is one that is also in ASCII.  Therefore the string
+containing it must be Unicode.
+</p>
+</li><li> When the string has come from an external source marked as
+Unicode
+
+<p>The <a href="#perlrun-_002dC-_005bnumber_002flist_005d"><code>-C</code></a> 
command line option can
+specify that certain inputs to the program are Unicode, and the values
+of this can be read by your Perl code; see <a 
href="#perlvar-_0024_007b_005eUNICODE_007d">perlvar ${^UNICODE}</a>.
+</p>
+</li><li> When the string has been upgraded to UTF-8
+
+<p>The function <a 
href="utf8.html#Utility-functions">(utf8)<code>utf8::upgrade()</code></a>
+can be explicitly used to permanently (unless a subsequent
+<code>utf8::downgrade()</code> is called) cause a string to be treated as
+Unicode.
+</p>
+</li><li> There are additional methods for regular expression patterns
+
+<p>A pattern that is compiled with the <code>/u</code> or <code>/a</code> 
modifiers is
+treated as Unicode (though there are some restrictions with <code>/a</code>).
+Under the <code>/d</code> and <code>/l</code> modifiers, there are several 
other
+indications for Unicode; see <a href="#perlre-Character-set-modifiers">perlre 
Character set modifiers</a>.
+</p>
 </li></ul>
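+<p>As a rough illustration of a few of these indications (a minimal sketch,
+not taken from the Perl documentation; the variable name is arbitrary, and
+it assumes perl v5.14 or later with no locale in effect), the same
+one-character string can be tested under native and under Unicode rules:
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+
+    # Code point 233: LATIN SMALL LETTER E WITH ACUTE in Unicode,
+    # but treated as unassigned under the default (native) rules.
+    my $e_acute = &quot;\xE9&quot;;
+
+    {
+        no feature 'unicode_strings';
+        print &quot;native rules:    &quot;,
+            ($e_acute =~ /\w/ ? &quot;word&quot; : &quot;not a word&quot;), &quot;\n&quot;;
+    }
+    {
+        use feature 'unicode_strings';
+        print &quot;unicode_strings: &quot;,
+            ($e_acute =~ /\w/ ? &quot;word&quot; : &quot;not a word&quot;), &quot;\n&quot;;
+    }
+
+    # The /u modifier requests Unicode rules for just this pattern.
+    print &quot;/u modifier:     &quot;,
+        ($e_acute =~ /\w/u ? &quot;word&quot; : &quot;not a word&quot;), &quot;\n&quot;;
+</pre>
+<p>Only the first test is expected to report &quot;not a word&quot;; the
+string itself is identical in all three cases.
+</p>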
 
+<p>Note that all of the above are overridden within the scope of
+<code><a href="bytes.html#Top">(bytes)use bytes</a></code>; but you should be 
using this pragma only for
+debugging.
+</p>
+<p>Note also that some interactions with the platform&rsquo;s operating system
+never use Unicode rules.
+</p>
+<p>When Unicode rules are in effect:
+</p>
 <ul>
-<li> And finally, <code>scalar reverse()</code> reverses by character rather 
than by byte.
+<li> Case translation operators use the Unicode case translation tables.
+
+<p>Note that <code>uc()</code>, or <code>\U</code> in interpolated strings, 
translates to
+uppercase, while <code>ucfirst</code>, or <code>\u</code> in interpolated 
strings,
+translates to titlecase in languages that make the distinction (which is
+equivalent to uppercase in languages without the distinction).
+</p>
+<p>There is a CPAN module, <code><a 
href="Unicode-Casing.html#Top">(Unicode-Casing)</a></code>, which allows you to
+define your own mappings to be used in <code>lc()</code>, 
<code>lcfirst()</code>, <code>uc()</code>,
+<code>ucfirst()</code>, and <code>fc</code> (or their double-quoted string 
inlined versions
+such as <code>\U</code>).  (Prior to Perl 5.16, this functionality was 
partially
+provided in the Perl core, but suffered from a number of insurmountable
+drawbacks, so the CPAN module was written instead.)
+</p>
+</li><li> Character classes in regular expressions match based on the character
+properties specified in the Unicode properties database.
+
+<p><code>\w</code> can be used to match a Japanese ideograph, for instance; and
+<code>[[:digit:]]</code> a Bengali number.
+</p>
+</li><li> Named Unicode properties, scripts, and block ranges may be used (like
+bracketed character classes) by using the <code>\p{}</code> &quot;matches 
property&quot;
+construct and the <code>\P{}</code> negation, &quot;doesn&rsquo;t match 
property&quot;.
 
+<p>See <a href="#perlunicode-Unicode-Character-Properties">Unicode Character 
Properties</a> for more details.
+</p>
+<p>You can define your own character properties and use them
+in the regular expression with the <code>\p{}</code> or <code>\P{}</code> 
construct.
+See <a href="#perlunicode-User_002dDefined-Character-Properties">User-Defined 
Character Properties</a> for more details.
+</p>
 </li></ul>
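+<p>A short sketch of these effects (an illustration only; the code points
+are examples chosen here, and are written as escapes so that the encoding
+of the source file does not matter):
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+    use feature 'unicode_strings';
+
+    # U+01C6 is a digraph with distinct uppercase and titlecase forms.
+    my $dz = &quot;\x{1C6}&quot;;
+    printf &quot;uc:      U+%04X\n&quot;, ord(uc($dz));        # U+01C4
+    printf &quot;ucfirst: U+%04X\n&quot;, ord(ucfirst($dz));   # U+01C5
+
+    # \w matches a CJK ideograph; [[:digit:]] matches a Bengali digit.
+    print &quot;ideograph is a word character\n&quot; if &quot;\x{65E5}&quot; =~ /^\w$/;
+    print &quot;Bengali digit is a digit\n&quot;      if &quot;\x{9E9}&quot;  =~ /^[[:digit:]]$/;
+
+    # \p{} and its negation \P{}.
+    print &quot;alpha is Greek\n&quot;   if &quot;\x{3B1}&quot; =~ /\p{Script=Greek}/;
+    print &quot;'a' is not Greek\n&quot; if &quot;a&quot;       =~ /\P{Script=Greek}/;
+</pre>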
 
 <hr>
+<a 
name="perlunicode-Extended-Grapheme-Clusters-_0028Logical-characters_0029"></a>
+<div class="header">
+<p>
+Next: <a href="#perlunicode-Unicode-Character-Properties" accesskey="n" 
rel="next">perlunicode Unicode Character Properties</a>, Previous: <a 
href="#perlunicode-ASCII-Rules-versus-Unicode-Rules" accesskey="p" 
rel="prev">perlunicode ASCII Rules versus Unicode Rules</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+</div>
+<a name="Extended-Grapheme-Clusters-_0028Logical-characters_0029"></a>
+<h4 class="subsection">81.2.4 Extended Grapheme Clusters (Logical 
characters)</h4>
+
+<p>Consider a character, say <code>H</code>.  It could appear with various 
marks around it,
+such as an acute accent, or a circumflex, or various hooks, circles, arrows,
+<em>etc.</em>, above, below, to one side or the other, <em>etc</em>.  There 
are many
+possibilities among the world&rsquo;s languages.  The number of combinations is
+astronomical, and if there were a character for each combination, it would
+soon exhaust Unicode&rsquo;s more than a million possible characters.  So 
Unicode
+took a different approach: there is a character for the base <code>H</code>, 
and a
+character for each of the possible marks, and these can be variously combined
+to get a final logical character.  So a logical character&ndash;what appears 
to be a
+single character&ndash;can be a sequence of more than one individual 
character.
+The Unicode standard calls these &quot;extended grapheme clusters&quot; (which
+is an improved version of the no longer much used &quot;grapheme 
cluster&quot;);
+Perl furnishes the <code>\X</code> regular expression construct to match such
+sequences in their entirety.
+</p>
+<p>But Unicode&rsquo;s intent is to unify the existing character set standards 
and
+practices, and several pre-existing standards have single characters that
+mean the same thing as some of these combinations; ISO-8859-1, for one,
+has quite a few of them.  For example, <code>&quot;LATIN CAPITAL LETTER E
+WITH ACUTE&quot;</code> was already in this standard when Unicode came along.
+Unicode therefore added it to its repertoire as that single character.
+But this character is considered by Unicode to be equivalent to the
+sequence consisting of the character <code>&quot;LATIN CAPITAL LETTER 
E&quot;</code>
+followed by the character <code>&quot;COMBINING ACUTE ACCENT&quot;</code>.
+</p>
+<p><code>&quot;LATIN CAPITAL LETTER E WITH ACUTE&quot;</code> is called a 
&quot;pre-composed&quot;
+character, and its equivalence with the &quot;E&quot; and the &quot;COMBINING 
ACUTE ACCENT&quot;
+sequence is called canonical equivalence.  All pre-composed characters
+are said to have a decomposition (into the equivalent sequence), and the
+decomposition type is also called canonical.  A string may consist
+as much as possible of pre-composed characters, or it may consist
+entirely of decomposed characters.  Unicode calls these, respectively,
+&quot;Normalization Form Composed&quot; (NFC) and &quot;Normalization Form 
Decomposed&quot; (NFD).
+The <code><a href="Unicode-Normalize.html#Top">(Unicode-Normalize)</a></code> 
module contains functions that convert
+between the two.  A string may also have both composed characters and
+decomposed characters; this module can be used to make it all one or the
+other.
+</p>
+<p>You may be presented with strings in any of these equivalent forms.
+There is currently nothing in Perl 5 that ignores the differences.  So
+you&rsquo;ll have to handle them specially.  The usual advice is to convert your
+inputs to <code>NFD</code> before processing further.
+</p>
+<p>For more detailed information, see <a 
href="http://unicode.org/reports/tr15/">http://unicode.org/reports/tr15/</a>.
+</p>
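+<p>The following sketch (an illustration, assuming the
+<code>Unicode::Normalize</code> module is installed, as it is with core
+Perl; the variable names are arbitrary) shows the two equivalent forms
+comparing unequal until they are normalized, and <code>\X</code> matching
+the decomposed sequence as a single unit:
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+    use Unicode::Normalize qw(NFC NFD);
+
+    my $composed   = &quot;\x{C9}&quot;;      # LATIN CAPITAL LETTER E WITH ACUTE
+    my $decomposed = &quot;E\x{301}&quot;;    # E + COMBINING ACUTE ACCENT
+
+    print &quot;raw strings differ\n&quot; if $composed ne $decomposed;
+    print &quot;equal after NFD\n&quot;    if NFD($composed) eq NFD($decomposed);
+    print &quot;equal after NFC\n&quot;    if NFC($composed) eq NFC($decomposed);
+
+    # Two code points, but a single extended grapheme cluster.
+    my @clusters = $decomposed =~ /\X/g;
+    printf &quot;%d code points, %d cluster(s)\n&quot;,
+        length($decomposed), scalar @clusters;
+</pre>
+<p>Normalizing both sides to the same form before comparing is the design
+choice here; which form to use depends on what the rest of the program
+expects.
+</p>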
+<hr>
 <a name="perlunicode-Unicode-Character-Properties"></a>
 <div class="header">
 <p>
-Next: <a href="#perlunicode-User_002dDefined-Character-Properties" 
accesskey="n" rel="next">perlunicode User-Defined Character Properties</a>, 
Previous: <a href="#perlunicode-Effects-of-Character-Semantics" accesskey="p" 
rel="prev">perlunicode Effects of Character Semantics</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlunicode-User_002dDefined-Character-Properties" 
accesskey="n" rel="next">perlunicode User-Defined Character Properties</a>, 
Previous: <a 
href="#perlunicode-Extended-Grapheme-Clusters-_0028Logical-characters_0029" 
accesskey="p" rel="prev">perlunicode Extended Grapheme Clusters (Logical 
characters)</a>, Up: <a href="#perlunicode-DESCRIPTION" accesskey="u" 
rel="up">perlunicode DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Unicode-Character-Properties"></a>
-<h4 class="subsection">81.2.4 Unicode Character Properties</h4>
+<h4 class="subsection">81.2.5 Unicode Character Properties</h4>
 
 <p>(The only time that Perl considers a sequence of individual code
 points as a single logical character is in the <code>\X</code> construct, 
already
@@ -100318,7 +102702,7 @@
 values, such as <code>Left</code>, <code>Right</code>, 
<code>Whitespace</code>, and others.  To match these, one needs
 to specify both the property name (<code>Bidi_Class</code>), AND the value 
being
 matched against
-(<code>Left</code>, <code>Right</code>, etc.).  This is done, as in the 
examples above, by having the
+(<code>Left</code>, <code>Right</code>, <em>etc.</em>).  This is done, as in 
the examples above, by having the
 two components separated by an equal sign (or interchangeably, a colon), like
 <code>\p{Bidi_Class: Left}</code>.
 </p>
@@ -100379,8 +102763,8 @@
 This set also includes its subsets <code>PosixUpper</code> and 
<code>PosixLower</code> both
 of which under <code>/i</code> match <code>PosixAlpha</code>.
 (The difference between these sets is that some things, such as Roman
-numerals, come in both upper and lower case so they are <code>Cased</code>, 
but aren&rsquo;t considered
-letters, so they aren&rsquo;t <code>Cased_Letter</code>s.)
+numerals, come in both upper and lower case so they are <code>Cased</code>, but
+aren&rsquo;t considered letters, so they aren&rsquo;t 
<code>Cased_Letter</code>&rsquo;s.)
 </p>
 <p>See <a href="#perlunicode-Beyond-Unicode-code-points">Beyond Unicode code 
points</a> for special considerations when
 matching Unicode properties against non-Unicode code points.
@@ -100407,14 +102791,14 @@
 Next: <a href="#perlunicode-Bidirectional-Character-Types" accesskey="n" 
rel="next">perlunicode <strong>Bidirectional Character Types</strong></a>, Up: 
<a href="#perlunicode-Unicode-Character-Properties" accesskey="u" 
rel="up">perlunicode Unicode Character Properties</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="General_005fCategory"></a>
-<h4 class="subsubsection">81.2.4.1 <strong>General_Category</strong></h4>
+<h4 class="subsubsection">81.2.5.1 <strong>General_Category</strong></h4>
 
 <p>Every Unicode character is assigned a general category, which is the 
&quot;most
 usual categorization of a character&quot; (from
 <a 
href="http://www.unicode.org/reports/tr44">http://www.unicode.org/reports/tr44</a>).
 </p>
 <p>The compound way of writing these is like 
<code>\p{General_Category=Number}</code>
-(short, <code>\p{gc:n}</code>).  But Perl furnishes shortcuts in which 
everything up
+(short: <code>\p{gc:n}</code>).  But Perl furnishes shortcuts in which 
everything up
 through the equal or colon separator is omitted.  So you can instead just write
 <code>\pN</code>.
 </p>
@@ -100481,7 +102865,7 @@
 Next: <a href="#perlunicode-Scripts" accesskey="n" rel="next">perlunicode 
<strong>Scripts</strong></a>, Previous: <a 
href="#perlunicode-General_005fCategory" accesskey="p" rel="prev">perlunicode 
<strong>General_Category</strong></a>, Up: <a 
href="#perlunicode-Unicode-Character-Properties" accesskey="u" 
rel="up">perlunicode Unicode Character Properties</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Bidirectional-Character-Types"></a>
-<h4 class="subsubsection">81.2.4.2 <strong>Bidirectional Character 
Types</strong></h4>
+<h4 class="subsubsection">81.2.5.2 <strong>Bidirectional Character 
Types</strong></h4>
 
 <p>Because scripts differ in their directionality (Hebrew and Arabic are
 written right to left, for example) Unicode supplies a <code>Bidi_Class</code> 
property.
@@ -100526,14 +102910,14 @@
 Next: <a href="#perlunicode-Use-of-the-_0022Is_0022-Prefix" accesskey="n" 
rel="next">perlunicode <strong>Use of the <code>&quot;Is&quot;</code> 
Prefix</strong></a>, Previous: <a 
href="#perlunicode-Bidirectional-Character-Types" accesskey="p" 
rel="prev">perlunicode <strong>Bidirectional Character Types</strong></a>, Up: 
<a href="#perlunicode-Unicode-Character-Properties" accesskey="u" 
rel="up">perlunicode Unicode Character Properties</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Scripts"></a>
-<h4 class="subsubsection">81.2.4.3 <strong>Scripts</strong></h4>
+<h4 class="subsubsection">81.2.5.3 <strong>Scripts</strong></h4>
 
 <p>The world&rsquo;s languages are written in many different scripts.  This 
sentence
 (unless you&rsquo;re reading it in translation) is written in Latin, while 
Russian is
 written in Cyrillic, and Greek is written in, well, Greek; Japanese mainly in
 Hiragana or Katakana.  There are many more.
 </p>
-<p>The Unicode Script and Script_Extensions properties give what script a
+<p>The Unicode <code>Script</code> and <code>Script_Extensions</code> 
properties give what script a
 given character is in.  Either property can be specified with the
 compound form like
 <code>\p{Script=Hebrew}</code> (short: <code>\p{sc=hebr}</code>), or
@@ -100575,10 +102959,12 @@
 fewer characters in the <code>Common</code> script, and correspondingly more in
 other scripts.  It is new in Unicode version 6.0, and its data are likely
 to change significantly in later releases, as things get sorted out.
+New code should probably be using <code>Script_Extensions</code> and not plain
+<code>Script</code>.
 </p>
 <p>(Actually, besides <code>Common</code>, the <code>Inherited</code> script 
contains
 characters that are used in multiple scripts.  These are modifier
-characters which modify other characters, and inherit the script value
+characters which inherit the script value
 of the controlling character.  Some of these are used in many scripts,
 and so go into <code>Inherited</code> in both <code>Script</code> and 
<code>Script_Extensions</code>.
 Others are used in just a few scripts, so are in <code>Inherited</code> in
@@ -100600,9 +102986,10 @@
 Next: <a href="#perlunicode-Blocks" accesskey="n" rel="next">perlunicode 
<strong>Blocks</strong></a>, Previous: <a href="#perlunicode-Scripts" 
accesskey="p" rel="prev">perlunicode <strong>Scripts</strong></a>, Up: <a 
href="#perlunicode-Unicode-Character-Properties" accesskey="u" 
rel="up">perlunicode Unicode Character Properties</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Use-of-the-_0022Is_0022-Prefix"></a>
-<h4 class="subsubsection">81.2.4.4 <strong>Use of the 
<code>&quot;Is&quot;</code> Prefix</strong></h4>
+<h4 class="subsubsection">81.2.5.4 <strong>Use of the 
<code>&quot;Is&quot;</code> Prefix</strong></h4>
 
-<p>For backward compatibility (with Perl 5.6), all properties mentioned
+<p>For backward compatibility (with Perl 5.6), all properties writable
+without using the compound form mentioned
 so far may have <code>Is</code> or <code>Is_</code> prepended to their name, 
so <code>\P{Is_Lu}</code>, for
 example, is equal to <code>\P{Lu}</code>, and <code>\p{IsScript:Arabic}</code> 
is equal to
 <code>\p{Arabic}</code>.
@@ -100614,17 +103001,17 @@
 Next: <a href="#perlunicode-Other-Properties" accesskey="n" 
rel="next">perlunicode <strong>Other Properties</strong></a>, Previous: <a 
href="#perlunicode-Use-of-the-_0022Is_0022-Prefix" accesskey="p" 
rel="prev">perlunicode <strong>Use of the <code>&quot;Is&quot;</code> 
Prefix</strong></a>, Up: <a href="#perlunicode-Unicode-Character-Properties" 
accesskey="u" rel="up">perlunicode Unicode Character Properties</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Blocks"></a>
-<h4 class="subsubsection">81.2.4.5 <strong>Blocks</strong></h4>
+<h4 class="subsubsection">81.2.5.5 <strong>Blocks</strong></h4>
 
 <p>In addition to <strong>scripts</strong>, Unicode also defines 
<strong>blocks</strong> of
 characters.  The difference between scripts and blocks is that the
 concept of scripts is closer to natural languages, while the concept
 of blocks is more of an artificial grouping based on groups of Unicode
 characters with consecutive ordinal values. For example, the <code>&quot;Basic 
Latin&quot;</code>
-block is all characters whose ordinals are between 0 and 127, inclusive; in
+block is all the characters whose ordinals are between 0 and 127, inclusive; in
 other words, the ASCII characters.  The <code>&quot;Latin&quot;</code> script 
contains some letters
 from this as well as several other blocks, like <code>&quot;Latin-1 
Supplement&quot;</code>,
-<code>&quot;Latin Extended-A&quot;</code>, etc., but it does not contain all 
the characters from
+<code>&quot;Latin Extended-A&quot;</code>, <em>etc.</em>, but it does not 
contain all the characters from
 those blocks. It does not, for example, contain the digits 0-9, because
 those digits are shared across many scripts, and hence are in the
 <code>Common</code> script.
@@ -100639,29 +103026,28 @@
 </p>
 <p>Block names are matched in the compound form, like <code>\p{Block: 
Arrows}</code> or
 <code>\p{Blk=Hebrew}</code>.  Unlike most other properties, only a few block 
names have a
-Unicode-defined short name.  But Perl does provide a (slight) shortcut:  You
-can say, for example <code>\p{In_Arrows}</code> or <code>\p{In_Hebrew}</code>. 
 For backwards
-compatibility, the <code>In</code> prefix may be omitted if there is no naming 
conflict
-with a script or any other property, and you can even use an <code>Is</code> 
prefix
-instead in those cases.  But it is not a good idea to do this, for a couple
-reasons:
-</p>
-<ol>
-<li> It is confusing.  There are many naming conflicts, and you may forget 
some.
-For example, <code>\p{Hebrew}</code> means the <em>script</em> Hebrew, and NOT 
the <em>block</em>
-Hebrew.  But would you remember that 6 months from now?
-
-</li><li> It is unstable.  A new version of Unicode may preempt the current 
meaning by
-creating a property with the same name.  There was a time in very early Unicode
-releases when <code>\p{Hebrew}</code> would have matched the <em>block</em> 
Hebrew; now it
-doesn&rsquo;t.
-
-</li></ol>
-
-<p>Some people prefer to always use <code>\p{Block: foo}</code> and 
<code>\p{Script: bar}</code>
-instead of the shortcuts, whether for clarity, because they can&rsquo;t 
remember the
-difference between &rsquo;In&rsquo; and &rsquo;Is&rsquo; anyway, or they 
aren&rsquo;t confident that those who
-eventually will read their code will know that difference.
+Unicode-defined short name.  But Perl does provide a (slight, no longer
+recommended) shortcut:  You can say, for example <code>\p{In_Arrows}</code> or
+<code>\p{In_Hebrew}</code>.
+</p>
+<p>For backwards compatibility, the <code>In</code> prefix may be
+omitted if there is no naming conflict with a script or any other
+property, and you can even use an <code>Is</code> prefix instead in those 
cases.
+But don&rsquo;t do this for new code because your code could break in new
+releases, and this has already happened: There was a time in very
+early Unicode releases when <code>\p{Hebrew}</code> would have matched the
+<em>block</em> Hebrew; now it doesn&rsquo;t.
+</p>
+<p>Using the <code>In</code> prefix avoids this ambiguity, so far.  But new 
versions
+of Unicode continue to add new properties whose names begin with 
<code>In</code>.
+There is a possibility that one of them someday will conflict with your
+usage.  Since this is just a Perl extension, Unicode&rsquo;s name will take
+precedence and your code will become broken.  Also, Unicode is free to
+add a script whose name begins with <code>In</code>; that would cause problems.
+</p>
+<p>So it&rsquo;s clearer and safer to use the compound form when specifying
+blocks.  And be sure that blocks are really what you want.  In most
+cases scripts are what you want instead.
 </p>
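+<p>To illustrate the difference (a sketch only; the two code points are
+examples chosen for this purpose, and the variable names are arbitrary), a
+character can be in the Hebrew script without being in the Hebrew block,
+and vice versa:
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+
+    # U+FB1D HEBREW LETTER YOD WITH HIRIQ: Hebrew script, but located
+    # in the Alphabetic Presentation Forms block.
+    my $yod = &quot;\x{FB1D}&quot;;
+
+    # U+0590: a reserved (unassigned) code point inside the Hebrew block.
+    my $gap = &quot;\x{590}&quot;;
+
+    for my $char ($yod, $gap) {
+        printf &quot;U+%04X  script: %-3s  block: %s\n&quot;, ord($char),
+            ($char =~ /\p{Script=Hebrew}/ ? &quot;yes&quot; : &quot;no&quot;),
+            ($char =~ /\p{Block: Hebrew}/ ? &quot;yes&quot; : &quot;no&quot;);
+    }
+</pre>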
 <p>A complete list of blocks and their shortcuts is in <a 
href="perluniprops.html#Top">(perluniprops)</a>.
 </p>
@@ -100672,7 +103058,7 @@
 Previous: <a href="#perlunicode-Blocks" accesskey="p" rel="prev">perlunicode 
<strong>Blocks</strong></a>, Up: <a 
href="#perlunicode-Unicode-Character-Properties" accesskey="u" 
rel="up">perlunicode Unicode Character Properties</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Other-Properties"></a>
-<h4 class="subsubsection">81.2.4.6 <strong>Other Properties</strong></h4>
+<h4 class="subsubsection">81.2.5.6 <strong>Other Properties</strong></h4>
 
 <p>There are many more properties than the very basic ones described here.
 A complete list is in <a href="perluniprops.html#Top">(perluniprops)</a>.
@@ -100730,45 +103116,19 @@
 <dd><a 
name="perlunicode-_005cp_007bDecomposition_005fType_003a-Non_005fCanonical_007d-_0028Short_003a-_005cp_007bDt_003dNonCanon_007d_0029"></a>
 <p>Matches a character that has a non-canonical decomposition.
 </p>
-<p>To understand the use of this rarely used <em>property=value</em> 
combination, it is
-necessary to know some basics about decomposition.
-Consider a character, say H.  It could appear with various marks around it,
-such as an acute accent, or a circumflex, or various hooks, circles, arrows,
-<em>etc.</em>, above, below, to one side or the other, etc.  There are many
-possibilities among the world&rsquo;s languages.  The number of combinations is
-astronomical, and if there were a character for each combination, it would
-soon exhaust Unicode&rsquo;s more than a million possible characters.  So 
Unicode
-took a different approach: there is a character for the base H, and a
-character for each of the possible marks, and these can be variously combined
-to get a final logical character.  So a logical character&ndash;what appears 
to be a
-single character&ndash;can be a sequence of more than one individual 
characters.
-This is called an &quot;extended grapheme cluster&quot;;  Perl furnishes the 
<code>\X</code>
-regular expression construct to match such sequences.
-</p>
-<p>But Unicode&rsquo;s intent is to unify the existing character set standards 
and
-practices, and several pre-existing standards have single characters that
-mean the same thing as some of these combinations.  An example is ISO-8859-1,
-which has quite a few of these in the Latin-1 range, an example being 
<code>&quot;LATIN
-CAPITAL LETTER E WITH ACUTE&quot;</code>.  Because this character was in this 
pre-existing
-standard, Unicode added it to its repertoire.  But this character is considered
-by Unicode to be equivalent to the sequence consisting of the character
-<code>&quot;LATIN CAPITAL LETTER E&quot;</code> followed by the character 
<code>&quot;COMBINING ACUTE ACCENT&quot;</code>.
-</p>
-<p><code>&quot;LATIN CAPITAL LETTER E WITH ACUTE&quot;</code> is called a 
&quot;pre-composed&quot; character, and
-its equivalence with the sequence is called canonical equivalence.  All
-pre-composed characters are said to have a decomposition (into the equivalent
-sequence), and the decomposition type is also called canonical.
-</p>
-<p>However, many more characters have a different type of decomposition, a
-&quot;compatible&quot; or &quot;non-canonical&quot; decomposition.  The 
sequences that form these
-decompositions are not considered canonically equivalent to the pre-composed
-character.  An example, again in the Latin-1 range, is the 
<code>&quot;SUPERSCRIPT ONE&quot;</code>.
-It is somewhat like a regular digit 1, but not exactly; its decomposition
-into the digit 1 is called a &quot;compatible&quot; decomposition, 
specifically a
+<p>The <a 
href="#perlunicode-Extended-Grapheme-Clusters-_0028Logical-characters_0029">Extended
 Grapheme Clusters (Logical characters)</a> section above
+talked about canonical decompositions.  However, many more characters
+have a different type of decomposition, a &quot;compatible&quot; or
+&quot;non-canonical&quot; decomposition.  The sequences that form these
+decompositions are not considered canonically equivalent to the
+pre-composed character.  An example is the <code>&quot;SUPERSCRIPT 
ONE&quot;</code>.  It is
+somewhat like a regular digit 1, but not exactly; its decomposition into
+the digit 1 is called a &quot;compatible&quot; decomposition, specifically a
 &quot;super&quot; decomposition.  There are several such compatibility
-decompositions (see <a 
href="http://www.unicode.org/reports/tr44">http://www.unicode.org/reports/tr44</a>),
 including one
-called &quot;compat&quot;, which means some miscellaneous type of decomposition
-that doesn&rsquo;t fit into the decomposition categories that Unicode has 
chosen.
+decompositions (see <a 
href="http://www.unicode.org/reports/tr44">http://www.unicode.org/reports/tr44</a>),
 including
+one called &quot;compat&quot;, which means some miscellaneous type of
+decomposition that doesn&rsquo;t fit into the other decomposition categories
+that Unicode has chosen.
 </p>
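+<p>As a hedged illustration (assuming <code>Unicode::Normalize</code> is
+available; the variable names are arbitrary), the distinction shows up both
+in the property and in which normalization functions change the character:
+</p>
+<pre class="verbatim">    use strict;
+    use warnings;
+    use Unicode::Normalize qw(NFD NFKD);
+
+    my $sup_one = &quot;\x{B9}&quot;;   # SUPERSCRIPT ONE: compatibility decomposition
+    my $e_acute = &quot;\x{C9}&quot;;   # E WITH ACUTE: canonical decomposition
+
+    print &quot;SUPERSCRIPT ONE is non-canonical\n&quot;
+        if $sup_one =~ /\p{Dt=NonCanon}/;
+    print &quot;E WITH ACUTE is not non-canonical\n&quot;
+        if $e_acute =~ /\P{Dt=NonCanon}/;
+
+    # NFD applies only canonical decompositions; NFKD also applies
+    # the compatibility ones.
+    print &quot;NFD leaves SUPERSCRIPT ONE alone\n&quot; if NFD($sup_one)  eq $sup_one;
+    print &quot;NFKD turns it into digit 1\n&quot;       if NFKD($sup_one) eq &quot;1&quot;;
+</pre>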
 <p>Note that most Unicode characters don&rsquo;t have a decomposition, so their
 decomposition type is <code>&quot;None&quot;</code>.
@@ -100797,7 +103157,7 @@
 <dt><strong><code>\p{PerlSpace}</code></strong></dt>
 <dd><a name="perlunicode-_005cp_007bPerlSpace_007d"></a>
 <p>This is the same as <code>\s</code>, restricted to ASCII, namely 
<code>[&nbsp;\f\n\r\t]<!-- /@w --></code>
-and starting in Perl v5.18, experimentally, a vertical tab.
+and starting in Perl v5.18, a vertical tab.
 </p>
 <p>Mnemonic: Perl&rsquo;s (original) space
 </p>
@@ -100811,8 +103171,8 @@
 </dd>
 <dt><strong><code>\p{Posix...}</code></strong></dt>
 <dd><a name="perlunicode-_005cp_007bPosix_002e_002e_002e_007d"></a>
-<p>There are several of these, which are equivalents using the 
<code>\p{}</code>
-notation for Posix classes and are described in
+<p>There are several of these, which are equivalents, using the 
<code>\p{}</code>
+notation, for Posix classes and are described in
 <a href="#perlrecharclass-POSIX-Character-Classes">perlrecharclass POSIX 
Character Classes</a>.
 </p>
 </dd>
@@ -100861,7 +103221,7 @@
 <p>This is the same as <code>\s</code>, including beyond ASCII.
 </p>
-<p>Mnemonic: Space, as modified by Perl.  (It doesn&rsquo;t include the vertical tab
-which both the Posix standard and Unicode consider white space.)
+<p>Mnemonic: Space, as modified by Perl.  (Until v5.18, it didn&rsquo;t include the
+vertical tab, which both the Posix standard and Unicode consider white space.)
 </p>
 </dd>
 <dt><strong><code>\p{Title}</code></strong> and  
<strong><code>\p{Titlecase}</code></strong></dt>
@@ -100904,7 +103264,7 @@
 Next: <a 
href="#perlunicode-User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029"
 accesskey="n" rel="next">perlunicode User-Defined Case Mappings (for serious 
hackers only)</a>, Previous: <a 
href="#perlunicode-Unicode-Character-Properties" accesskey="p" 
rel="prev">perlunicode Unicode Character Properties</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="User_002dDefined-Character-Properties"></a>
-<h4 class="subsection">81.2.5 User-Defined Character Properties</h4>
+<h4 class="subsection">81.2.6 User-Defined Character Properties</h4>
 
 <p>You can define your own binary character properties by defining subroutines
 whose names begin with <code>&quot;In&quot;</code> or 
<code>&quot;Is&quot;</code>.  (The experimental feature
@@ -100993,7 +103353,7 @@
 </pre>
 <p>Suppose you wanted to match only the allocated characters,
 not the raw block ranges: in other words, you want to remove
-the non-characters:
+the unassigned characters:
 </p>
 <pre class="verbatim">    sub InKana {
         return &lt;&lt;'END';
@@ -101044,7 +103404,7 @@
 Next: <a href="#perlunicode-Character-Encodings-for-Input-and-Output" 
accesskey="n" rel="next">perlunicode Character Encodings for Input and 
Output</a>, Previous: <a 
href="#perlunicode-User_002dDefined-Character-Properties" accesskey="p" 
rel="prev">perlunicode User-Defined Character Properties</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a 
name="User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029"></a>
-<h4 class="subsection">81.2.6 User-Defined Case Mappings (for serious hackers 
only)</h4>
+<h4 class="subsection">81.2.7 User-Defined Case Mappings (for serious hackers 
only)</h4>
 
 <p><strong>This feature has been removed as of Perl 5.16.</strong>
 The CPAN module <code><a 
href="Unicode-Casing.html#Top">(Unicode-Casing)</a></code> provides better 
functionality without
@@ -101060,7 +103420,7 @@
 Next: <a href="#perlunicode-Unicode-Regular-Expression-Support-Level" 
accesskey="n" rel="next">perlunicode Unicode Regular Expression Support 
Level</a>, Previous: <a 
href="#perlunicode-User_002dDefined-Case-Mappings-_0028for-serious-hackers-only_0029"
 accesskey="p" rel="prev">perlunicode User-Defined Case Mappings (for serious 
hackers only)</a>, Up: <a href="#perlunicode-DESCRIPTION" accesskey="u" 
rel="up">perlunicode DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Character-Encodings-for-Input-and-Output"></a>
-<h4 class="subsection">81.2.7 Character Encodings for Input and Output</h4>
+<h4 class="subsection">81.2.8 Character Encodings for Input and Output</h4>
 
 <p>See <a href="Encode.html#Top">(Encode)</a>.
 </p>
@@ -101071,7 +103431,7 @@
 Next: <a href="#perlunicode-Unicode-Encodings" accesskey="n" 
rel="next">perlunicode Unicode Encodings</a>, Previous: <a 
href="#perlunicode-Character-Encodings-for-Input-and-Output" accesskey="p" 
rel="prev">perlunicode Character Encodings for Input and Output</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Unicode-Regular-Expression-Support-Level"></a>
-<h4 class="subsection">81.2.8 Unicode Regular Expression Support Level</h4>
+<h4 class="subsection">81.2.9 Unicode Regular Expression Support Level</h4>
 
 <p>The following list of Unicode supported features for regular expressions 
describes
 all features currently directly supported by core Perl.  The references to 
&quot;Level N&quot;
@@ -101091,31 +103451,22 @@
  RL1.7   Supplementary Code Points        - done          [10]
 </pre>
 <dl compact="compact">
-<dt>[1]</dt>
-<dd><a name="perlunicode-_005b1_005d"></a>
-<p><code>\x{...}</code>
-</p>
-</dd>
-<dt>[2]</dt>
-<dd><a name="perlunicode-_005b2_005d"></a>
-<p><code>\p{...}</code> <code>\P{...}</code>
-</p>
+<dt>[1] <code>\N{U+...}</code> and <code>\x{...}</code></dt>
+<dd><a 
name="perlunicode-_005b1_005d-_005cN_007bU_002b_002e_002e_002e_007d-and-_005cx_007b_002e_002e_002e_007d"></a>
 </dd>
-<dt>[3]</dt>
-<dd><a name="perlunicode-_005b3_005d"></a>
-<p>supports not only minimal list, but all Unicode character properties (see 
Unicode Character Properties above)
-</p>
+<dt>[2] <code>\p{...}</code> <code>\P{...}</code></dt>
+<dd><a 
name="perlunicode-_005b2_005d-_005cp_007b_002e_002e_002e_007d-_005cP_007b_002e_002e_002e_007d"></a>
 </dd>
-<dt>[4]</dt>
-<dd><a name="perlunicode-_005b4_005d"></a>
-<p><code>\d</code> <code>\D</code> <code>\s</code> <code>\S</code> 
<code>\w</code> <code>\W</code> <code>\X</code> <code>[:<em>prop</em>:]</code> 
<code>[:^<em>prop</em>:]</code>
-</p>
+<dt>[3] supports not only minimal list, but all Unicode character properties 
(see Unicode Character Properties above)</dt>
+<dd><a 
name="perlunicode-_005b3_005d-supports-not-only-minimal-list_002c-but-all-Unicode-character-properties-_0028see-Unicode-Character-Properties-above_0029"></a>
 </dd>
-<dt>[5]</dt>
-<dd><a name="perlunicode-_005b5_005d"></a>
-<p>The experimental feature in v5.18 <code>&quot;(?[...])&quot;</code> 
accomplishes this.  See
-<a href="#perlre-_0028_003f_005b-_005d_0029">perlre <code>(?[ ])</code></a>.  
If you don&rsquo;t want to use an experimental feature,
-you can use one of the following:
+<dt>[4] <code>\d</code> <code>\D</code> <code>\s</code> <code>\S</code> 
<code>\w</code> <code>\W</code> <code>\X</code> <code>[:<em>prop</em>:]</code> 
<code>[:^<em>prop</em>:]</code></dt>
+<dd><a 
name="perlunicode-_005b4_005d-_005cd-_005cD-_005cs-_005cS-_005cw-_005cW-_005cX-_005b_003aprop_003a_005d-_005b_003a_005eprop_003a_005d"></a>
+</dd>
+<dt>[5] The experimental feature starting in v5.18 
<code>&quot;(?[...])&quot;</code> accomplishes this.</dt>
+<dd><a 
name="perlunicode-_005b5_005d-The-experimental-feature-starting-in-v5_002e18-_0022_0028_003f_005b_002e_002e_002e_005d_0029_0022-accomplishes-this_002e"></a>
+<p>See <a href="#perlre-_0028_003f_005b-_005d_0029">perlre <code>(?[ 
])</code></a>.  If you don&rsquo;t want to use an experimental
+feature, you can use one of the following:
 </p>
 <ul>
 <li> Regular expression look-ahead
@@ -101148,58 +103499,63 @@
 </li></ul>
 
 </dd>
-<dt>[6]</dt>
-<dd><a name="perlunicode-_005b6_005d"></a>
-<p><code>\b</code> <code>\B</code>
-</p>
+<dt>[6] <code>\b</code> <code>\B</code></dt>
+<dd><a name="perlunicode-_005b6_005d-_005cb-_005cB"></a>
 </dd>
-<dt>[7]</dt>
-<dd><a name="perlunicode-_005b7_005d"></a>
-<p>Note that Perl does Full case-folding in matching (but with bugs), not
-Simple: for example <code>U+1F88</code> is equivalent to <code>U+1F00 
U+03B9</code>, instead of
-just <code>U+1F80</code>.  This difference matters mainly for certain Greek 
capital
+<dt>[7] Note that Perl does Full case-folding in matching, not Simple:</dt>
+<dd><a 
name="perlunicode-_005b7_005d-Note-that-Perl-does-Full-case_002dfolding-in-matching_002c-not-Simple_003a"></a>
+<p>For example <code>U+1F88</code> is equivalent to <code>U+1F00 
U+03B9</code>, instead of just
+<code>U+1F80</code>.  This difference matters mainly for certain Greek capital
 letters with certain modifiers: the Full case-folding decomposes the
 letter, while the Simple case-folding would map it to a single
 character.
 </p>
 </dd>
-<dt>[8]</dt>
-<dd><a name="perlunicode-_005b8_005d"></a>
-<p>Should do <code>^</code> and <code>$</code> also on <code>U+000B</code> 
(<code>\v</code> in C), <code>FF</code> (<code>\f</code>),
-<code>CR</code> (<code>\r</code>), <code>CRLF</code> (<code>\r\n</code>), 
<code>NEL</code> (<code>U+0085</code>), <code>LS</code> (<code>U+2028</code>),
-and <code>PS</code> (<code>U+2029</code>); should also affect 
<code>&lt;&gt;</code>, <code>$.</code>, and
-script line numbers; should not split lines within <code>CRLF</code> (i.e. 
there
-is no empty line between <code>\r</code> and <code>\n</code>).  For 
<code>CRLF</code>, try the
-<code>:crlf</code> layer (see <a href="PerlIO.html#Top">(PerlIO)</a>).
-</p>
-</dd>
-<dt>[9]</dt>
-<dd><a name="perlunicode-_005b9_005d"></a>
-<p>Linebreaking conformant with <a 
href="http://www.unicode.org/reports/tr14">UAX#14 &quot;Unicode Line Breaking
-Algorithm&quot;</a>
-is available through the <code><a 
href="Unicode-LineBreak.html#Top">(Unicode-LineBreak)</a></code> module.
-</p>
-</dd>
-<dt>[10]</dt>
-<dd><a name="perlunicode-_005b10_005d"></a>
-<p>UTF-8/UTF-EBDDIC used in Perl allows not only <code>U+10000</code> to
-<code>U+10FFFF</code> but also beyond <code>U+10FFFF</code>
+<dt>[8] Perl treats <code>\n</code> as the start- and end-line delimiter.  
Unicode specifies more characters that should be so-interpreted.</dt>
+<dd><a 
name="perlunicode-_005b8_005d-Perl-treats-_005cn-as-the-start_002d-and-end_002dline-delimiter_002e-Unicode-specifies-more-characters-that-should-be-so_002dinterpreted_002e"></a>
+<p>These are:
+</p>
+<pre class="verbatim"> VT   U+000B  (\v in C)
+ FF   U+000C  (\f)
+ CR   U+000D  (\r)
+ NEL  U+0085
+ LS   U+2028
+ PS   U+2029
+</pre>
+<p><code>^</code> and <code>$</code> in regular expression patterns are 
supposed to match all
+these, but don&rsquo;t.
+These characters also don&rsquo;t, but should, affect <code>&lt;&gt;</code>, 
<code>$.</code>, and
+script line numbers.
+</p>
+<p>Also, lines should not be split within <code>CRLF</code> (i.e. there is no
+empty line between <code>\r</code> and <code>\n</code>).  For 
<code>CRLF</code>, try the <code>:crlf</code>
+layer (see <a href="PerlIO.html#Top">(PerlIO)</a>).
+</p>
+</dd>
+<dt>[9] But <code><a 
href="Unicode-LineBreak.html#Top">(Unicode-LineBreak)</a></code> is 
available.</dt>
+<dd><a 
name="perlunicode-_005b9_005d-But-Unicode_002dLineBreak-is-available_002e"></a>
+<p>This module supplies line breaking conformant with
+<a href="http://www.unicode.org/reports/tr14">UAX#14 &quot;Unicode Line 
Breaking Algorithm&quot;</a>.
 </p>
 </dd>
+<dt>[10] UTF-8/UTF-EBCDIC used in Perl allows not only <code>U+10000</code> to 
<code>U+10FFFF</code> but also beyond <code>U+10FFFF</code></dt>
+<dd><a 
name="perlunicode-_005b10_005d-UTF_002d8_002fUTF_002dEBDDIC-used-in-Perl-allows-not-only-U_002b10000-to-U_002b10FFFF-but-also-beyond-U_002b10FFFF"></a>
+</dd>
 </dl>
 
 </li><li> Level 2 - Extended Unicode Support
 
 <pre class="verbatim"> RL2.1   Canonical Equivalents           - MISSING       
[10][11]
  RL2.2   Default Grapheme Clusters       - MISSING       [12]
- RL2.3   Default Word Boundaries         - MISSING       [14]
+ RL2.3   Default Word Boundaries         - DONE          [14]
  RL2.4   Default Loose Matches           - MISSING       [15]
  RL2.5   Name Properties                 - DONE
  RL2.6   Wildcard Properties             - MISSING
 
  [10] see UAX#15 &quot;Unicode Normalization Forms&quot;
  [11] have Unicode::Normalize but not integrated to regexes
- [12] have \X but we don't have a &quot;Grapheme Cluster Mode&quot;
+ [12] have \X and \b{gcb} but we don't have a &quot;Grapheme Cluster
+      Mode&quot;
  [14] see UAX#29, Word Boundaries
  [15] This is covered in Chapter 3.13 (in Unicode 6.0)
 </pre>
@@ -101232,10 +103588,10 @@
 <a name="perlunicode-Unicode-Encodings"></a>
 <div class="header">
 <p>
-Next: <a href="#perlunicode-Non_002dcharacter-code-points" accesskey="n" 
rel="next">perlunicode Non-character code points</a>, Previous: <a 
href="#perlunicode-Unicode-Regular-Expression-Support-Level" accesskey="p" 
rel="prev">perlunicode Unicode Regular Expression Support Level</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlunicode-Noncharacter-code-points" accesskey="n" 
rel="next">perlunicode Noncharacter code points</a>, Previous: <a 
href="#perlunicode-Unicode-Regular-Expression-Support-Level" accesskey="p" 
rel="prev">perlunicode Unicode Regular Expression Support Level</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Unicode-Encodings"></a>
-<h4 class="subsection">81.2.9 Unicode Encodings</h4>
+<h4 class="subsection">81.2.10 Unicode Encodings</h4>
 
 <p>Unicode characters are assigned to <em>code points</em>, which are abstract
 numbers.  To use these numbers, various encodings are needed.
@@ -101244,8 +103600,11 @@
 <li> UTF-8
 
 <p>UTF-8 is a variable-length (1 to 4 bytes), byte-order independent
-encoding. For ASCII (and we really do mean 7-bit ASCII, not another
-8-bit encoding), UTF-8 is transparent.
+encoding.  In most of Perl&rsquo;s documentation, including elsewhere in this
+document, the term &quot;UTF-8&quot; means also &quot;UTF-EBCDIC&quot;.  But 
in this section,
+&quot;UTF-8&quot; refers only to the encoding used on ASCII platforms.  It is a
+superset of 7-bit US-ASCII, so anything encoded in ASCII has the
+identical representation when encoded in UTF-8.
 </p>
 <p>The following table is from Unicode 3.2.
 </p>
@@ -101288,14 +103647,19 @@
 these as being non-portable; and under strict UTF-8 input protocols,
 they are forbidden.
 </p>
-<p>The Unicode non-character code points are also disallowed in UTF-8 in
-&quot;open interchange&quot;.  See <a 
href="#perlunicode-Non_002dcharacter-code-points">Non-character code points</a>.
-</p>
 </li><li> UTF-EBCDIC
 
-<p>Like UTF-8 but EBCDIC-safe, in the way that UTF-8 is ASCII-safe.
+<p>Like UTF-8, but EBCDIC-safe, in the way that UTF-8 is ASCII-safe.
+This means that all the basic characters (which include all
+those that have ASCII equivalents, like <code>&quot;A&quot;</code>, 
<code>&quot;0&quot;</code>, <code>&quot;%&quot;</code>, <em>etc.</em>)
+are the same in both EBCDIC and UTF-EBCDIC.
+</p>
+<p>UTF-EBCDIC is used on EBCDIC platforms.  The largest Unicode code points
+take 5 bytes to represent (instead of 4 in UTF-8), and Perl extends it
+to a maximum of 7 bytes to encode code points up to what can fit in a
+32-bit word (instead of 13 bytes and a 64-bit word in UTF-8).
 </p>
-</li><li> UTF-16, UTF-16BE, UTF-16LE, Surrogates, and <code>BOM</code>s (Byte 
Order Marks)
+</li><li> UTF-16, UTF-16BE, UTF-16LE, Surrogates, and <code>BOM</code>&rsquo;s 
(Byte Order Marks)
 
 <p>The following items are mostly for reference and general Unicode
 knowledge; Perl doesn&rsquo;t use these constructs internally.
@@ -101327,7 +103691,7 @@
 </p>
 <p>This introduces another problem: what if you just know that your data
 is UTF-16, but you don&rsquo;t know which endianness?  Byte Order Marks, or
-<code>BOM</code>s, are a solution to this.  A special character has been 
reserved
+<code>BOM</code>&rsquo;s, are a solution to this.  A special character has 
been reserved
 in Unicode to function as a byte order marker: the character with the
 code point <code>U+FEFF</code> is the <code>BOM</code>.
 </p>
@@ -101335,7 +103699,8 @@
 since if it was written on a big-endian platform, you will read the
 bytes <code>0xFE 0xFF</code>, but if it was written on a little-endian 
platform,
 you will read the bytes <code>0xFF 0xFE</code>.  (And if the originating 
platform
-was writing in UTF-8, you will read the bytes <code>0xEF 0xBB 0xBF</code>.)
+was writing in ASCII platform UTF-8, you will read the bytes
+<code>0xEF 0xBB 0xBF</code>.)
 </p>
 <p>The way this trick works is that the character with the code point
 <code>U+FFFE</code> is not supposed to be in input streams, so the
@@ -101358,7 +103723,7 @@
 </p>
 </li><li> UTF-32, UTF-32BE, UTF-32LE
 
-<p>The UTF-32 family is pretty much like the UTF-16 family, expect that
+<p>The UTF-32 family is pretty much like the UTF-16 family, except that
 the units are 32-bit, and therefore the surrogate scheme is not
 needed.  UTF-32 is a fixed-width encoding.  The <code>BOM</code> signatures are
 <code>0x00 0x00 0xFE 0xFF</code> for BE and <code>0xFF 0xFE 0x00 0x00</code> 
for LE.
@@ -101379,40 +103744,90 @@
 </li></ul>
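<p>The <code>BOM</code> signatures described above can be checked with a few byte comparisons.  What follows is only a minimal sketch, not something Perl provides: <code>bom_encoding</code> is a hypothetical helper name, and it assumes the caller passes the first raw bytes of a stream, read without any decoding layer.  The 4-byte UTF-32 signatures are tested before the 2-byte UTF-16 ones because the UTF-32LE signature begins with the UTF-16LE one.
</p>

```perl
use strict;
use warnings;

# Hypothetical helper (not a core Perl function): guess the encoding
# of a byte string from a leading Byte Order Mark, if there is one.
sub bom_encoding {
    my ($bytes) = @_;

    # 4-byte signatures first: the UTF-32LE BOM starts with the
    # same two bytes as the UTF-16LE BOM.
    return 'UTF-32BE' if $bytes =~ /\A\x00\x00\xFE\xFF/;
    return 'UTF-32LE' if $bytes =~ /\A\xFF\xFE\x00\x00/;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return;    # no recognizable BOM
}

for my $sample ("\xEF\xBB\xBFhello", "\xFF\xFE\x41\x00", "hello") {
    my $enc = bom_encoding($sample);
    print defined($enc) ? "$enc\n" : "no BOM\n";
}
```

<p>A check like this is only a heuristic: UTF-16LE data that begins with a <code>BOM</code> followed by <code>U+0000</code> has the same first four bytes as the UTF-32LE signature, so the caller may still need outside knowledge of the encoding.
</p>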
 
 <hr>
-<a name="perlunicode-Non_002dcharacter-code-points"></a>
+<a name="perlunicode-Noncharacter-code-points"></a>
 <div class="header">
 <p>
 Next: <a href="#perlunicode-Beyond-Unicode-code-points" accesskey="n" 
rel="next">perlunicode Beyond Unicode code points</a>, Previous: <a 
href="#perlunicode-Unicode-Encodings" accesskey="p" rel="prev">perlunicode 
Unicode Encodings</a>, Up: <a href="#perlunicode-DESCRIPTION" accesskey="u" 
rel="up">perlunicode DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
-<a name="Non_002dcharacter-code-points"></a>
-<h4 class="subsection">81.2.10 Non-character code points</h4>
+<a name="Noncharacter-code-points"></a>
+<h4 class="subsection">81.2.11 Noncharacter code points</h4>
 
-<p>66 code points are set aside in Unicode as &quot;non-character code 
points&quot;.
+<p>66 code points are set aside in Unicode as &quot;noncharacter code 
points&quot;.
 These all have the <code>Unassigned</code> (<code>Cn</code>) <code><a 
href="#perlunicode-General_005fCategory">General_Category</a></code>, and
-they never will
-be assigned.  These are never supposed to be in legal Unicode input
-streams, so that code can use them as sentinels that can be mixed in
-with character data, and they always will be distinguishable from that data.
-To keep them out of Perl input streams, strict UTF-8 should be
-specified, such as by using the layer <code>:encoding('UTF-8')</code>.  The
-non-character code points are the 32 between <code>U+FDD0</code> and 
<code>U+FDEF</code>, and the
-34 code points <code>U+FFFE</code>, <code>U+FFFF</code>, <code>U+1FFFE</code>, 
<code>U+1FFFF</code>, ... <code>U+10FFFE</code>, <code>U+10FFFF</code>.
-Some people are under the mistaken impression that these are 
&quot;illegal&quot;,
-but that is not true.  An application or cooperating set of applications
-can legally use them at will internally; but these code points are
-&quot;illegal for open interchange&quot;.  Therefore, Perl will not accept 
these
-from input streams unless lax rules are being used, and will warn
-(using the warning category <code>&quot;nonchar&quot;</code>, which is a 
sub-category of <code>&quot;utf8&quot;</code>) if
-an attempt is made to output them.
+no character will ever be assigned to any of them.  They are the 32 code
+points between <code>U+FDD0</code> and <code>U+FDEF</code> inclusive, and the 
34 code
+points:
+</p>
+<pre class="verbatim"> U+FFFE   U+FFFF
+ U+1FFFE  U+1FFFF
+ U+2FFFE  U+2FFFF
+ ...
+ U+EFFFE  U+EFFFF
+ U+FFFFE  U+FFFFF
+ U+10FFFE U+10FFFF
+</pre>
+<p>Until Unicode 7.0, the noncharacters were &quot;<strong>forbidden</strong> 
for use in open
+interchange of Unicode text data&quot;, so that code that processed those
+streams could use these code points as sentinels that could be mixed in
+with character data, and would always be distinguishable from that data.
+(Emphasis above and in the next paragraph is added in this document.)
+</p>
+<p>Unicode 7.0 changed the wording so that they are &quot;<strong>not 
recommended</strong> for
+use in open interchange of Unicode text data&quot;.  The 7.0 Standard goes on
+to say:
+</p>
+<blockquote>
+<p>&quot;If a noncharacter is received in open interchange, an application is
+not required to interpret it in any way.  It is good practice, however,
+to recognize it as a noncharacter and to take appropriate action, such
+as replacing it with <code>U+FFFD</code> replacement character, to indicate the
+problem in the text.  It is not recommended to simply delete
+noncharacter code points from such text, because of the potential
+security issues caused by deleting uninterpreted characters.  (See
+conformance clause C7 in Section 3.2, Conformance Requirements, and
+<a 
href="http://www.unicode.org/reports/tr36/#Substituting_for_Ill_Formed_Subsequences">Unicode
 Technical Report #36, &quot;Unicode Security
+Considerations&quot;</a>).&quot;
+</p>
+</blockquote>
+
+<p>This change was made because it was found that various commercial tools
+like editors, or for things like source code control, had been written
+so that they would not handle program files that used these code points,
+effectively precluding their use almost entirely!  And that was never
+the intent.  They&rsquo;ve always been meant to be usable within an
+application, or cooperating set of applications, at will.
+</p>
+<p>If you&rsquo;re writing code, such as an editor, that is supposed to be able
+to handle any Unicode text data, then you shouldn&rsquo;t be using these code
+points yourself, and should instead allow them in the input.  If you need
+sentinels, they should instead be something that isn&rsquo;t legal Unicode.
+For UTF-8 data, you can use the bytes 0xC0 and 0xC1 as sentinels, as
+they never appear in well-formed UTF-8.  (There are equivalents for
+UTF-EBCDIC).  You can also store your Unicode code points in integer
+variables and use negative values as sentinels.
+</p>
+<p>If you&rsquo;re not writing such a tool, then whether you accept 
noncharacters
+as input is up to you (though the Standard recommends that you not).  If
+you do strict input stream checking with Perl, these code points
+continue to be forbidden.  This is to maintain backward compatibility
+(otherwise potential security holes could open up, as an unsuspecting
+application that was written assuming the noncharacters would be
+filtered out before getting to it, could now, without warning, start
+getting them).  To do strict checking, you can use the layer
+<code>:encoding('UTF-8')</code>.
+</p>
+<p>Perl continues to warn (using the warning category 
<code>&quot;nonchar&quot;</code>, which
+is a sub-category of <code>&quot;utf8&quot;</code>) if an attempt is made to 
output
+noncharacters.
 </p>
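<p>The two ranges of noncharacters listed above are regular enough to test arithmetically.  The following is a minimal sketch under that description only: <code>is_noncharacter</code> is a hypothetical helper, not a Perl built-in, and it takes a code point as a plain integer.  It counts the matching code points as a sanity check against the figure of 66 given above.
</p>

```perl
use strict;
use warnings;

# Hypothetical helper (not a core Perl function): true if the integer
# code point is one of the Unicode noncharacters.
sub is_noncharacter {
    my ($cp) = @_;

    # Only code points within Unicode can be noncharacters.
    return 0 if $cp < 0 || $cp > 0x10FFFF;

    # The contiguous block of 32: U+FDD0 through U+FDEF inclusive.
    return 1 if $cp >= 0xFDD0 && $cp <= 0xFDEF;

    # The last two code points of each of the 17 planes:
    # U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ... U+10FFFE, U+10FFFF.
    return ( ( $cp & 0xFFFE ) == 0xFFFE ) ? 1 : 0;
}

my $count = 0;
for my $cp ( 0 .. 0x10FFFF ) {
    $count++ if is_noncharacter($cp);
}
print "$count\n";    # prints 66 (32 + 2 * 17)
```

<p>Working on integers, rather than on decoded strings, avoids triggering the <code>&quot;nonchar&quot;</code> warnings mentioned above while the test itself runs.
</p>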
 <hr>
 <a name="perlunicode-Beyond-Unicode-code-points"></a>
 <div class="header">
 <p>
-Next: <a href="#perlunicode-Security-Implications-of-Unicode" accesskey="n" 
rel="next">perlunicode Security Implications of Unicode</a>, Previous: <a 
href="#perlunicode-Non_002dcharacter-code-points" accesskey="p" 
rel="prev">perlunicode Non-character code points</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlunicode-Security-Implications-of-Unicode" accesskey="n" 
rel="next">perlunicode Security Implications of Unicode</a>, Previous: <a 
href="#perlunicode-Noncharacter-code-points" accesskey="p" 
rel="prev">perlunicode Noncharacter code points</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Beyond-Unicode-code-points"></a>
-<h4 class="subsection">81.2.11 Beyond Unicode code points</h4>
+<h4 class="subsection">81.2.12 Beyond Unicode code points</h4>
 
 <p>The maximum Unicode code point is <code>U+10FFFF</code>, and Unicode only 
defines
 operations on code points up through that.  But Perl works on code
@@ -101427,8 +103842,8 @@
 category.  For example, <code>uc(&quot;\x{11_0000}&quot;)</code> will generate 
such a
 warning, returning the input parameter as its result, since Perl defines
 the uppercase of every non-Unicode code point to be the code point
-itself.  In fact, all the case changing operations, not just
-uppercasing, work this way.
+itself.  (All the case changing operations, not just uppercasing, work
+this way.)
 </p>
 <p>The situation with matching Unicode properties in regular expressions,
 the <code>\p{}</code> and <code>\P{}</code> constructs, against these code 
points is not as
@@ -101462,7 +103877,7 @@
 <p>As a result of these problems, starting in v5.20, what Perl does is
 to treat non-Unicode code points as just typical unassigned Unicode
 characters, and matches accordingly.  (Note: Unicode has atypical
-unassigned code points.  For example, it has non-character code points,
+unassigned code points.  For example, it has noncharacter code points,
 and ones that, when they do get assigned, are destined to be written
 Right-to-left, as Arabic and Hebrew are.  Perl assumes that no
 non-Unicode code point has any atypical properties.)
@@ -101533,10 +103948,12 @@
 Next: <a href="#perlunicode-Unicode-in-Perl-on-EBCDIC" accesskey="n" 
rel="next">perlunicode Unicode in Perl on EBCDIC</a>, Previous: <a 
href="#perlunicode-Beyond-Unicode-code-points" accesskey="p" 
rel="prev">perlunicode Beyond Unicode code points</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Security-Implications-of-Unicode"></a>
-<h4 class="subsection">81.2.12 Security Implications of Unicode</h4>
+<h4 class="subsection">81.2.13 Security Implications of Unicode</h4>
 
-<p>Read <a href="http://www.unicode.org/reports/tr36">Unicode Security 
Considerations</a>.
-Also, note the following:
+<p>First, read
+<a href="http://www.unicode.org/reports/tr36">Unicode Security 
Considerations</a>.
+</p>
+<p>Also, note the following:
 </p>
 <ul>
 <li> Malformed UTF-8
@@ -101559,14 +103976,10 @@
 </li></ul>
 
 <p>As discussed elsewhere, Perl has one foot (two hooves?) planted in
-each of two worlds: the old world of bytes and the new world of
-characters, upgrading from bytes to characters when necessary.
+each of two worlds: the old world of ASCII and single-byte locales, and
+the new world of Unicode, upgrading when necessary.
 If your legacy code does not explicitly use Unicode, no automatic
-switch-over to characters should happen.  Characters shouldn&rsquo;t get
-downgraded to bytes, either.  It is possible to accidentally mix bytes
-and characters, however (see <a href="#perluniintro-NAME">perluniintro 
NAME</a>), in which case <code>\w</code> in
-regular expressions might start behaving differently (unless the 
<code>/a</code>
-modifier is in effect).  Review your code.  Use warnings and the 
<code>strict</code> pragma.
+switch-over to Unicode should happen.
 </p>
 <hr>
 <a name="perlunicode-Unicode-in-Perl-on-EBCDIC"></a>
@@ -101575,16 +103988,20 @@
 Next: <a href="#perlunicode-Locales" accesskey="n" rel="next">perlunicode 
Locales</a>, Previous: <a href="#perlunicode-Security-Implications-of-Unicode" 
accesskey="p" rel="prev">perlunicode Security Implications of Unicode</a>, Up: 
<a href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Unicode-in-Perl-on-EBCDIC"></a>
-<h4 class="subsection">81.2.13 Unicode in Perl on EBCDIC</h4>
+<h4 class="subsection">81.2.14 Unicode in Perl on EBCDIC</h4>
 
-<p>The way Unicode is handled on EBCDIC platforms is still
-experimental.  On such platforms, references to UTF-8 encoding in this
-document and elsewhere should be read as meaning the UTF-EBCDIC
-specified in Unicode Technical Report 16, unless ASCII vs. EBCDIC issues
-are specifically discussed. There is no <code>utfebcdic</code> pragma or
-<code>&quot;:utfebcdic&quot;</code> layer; rather, 
<code>&quot;utf8&quot;</code> and <code>&quot;:utf8&quot;</code> are reused to 
mean
-the platform&rsquo;s &quot;natural&quot; 8-bit encoding of Unicode. See <a 
href="#perlebcdic-NAME">perlebcdic NAME</a>
-for more discussion of the issues.
+<p>Unicode is supported on EBCDIC platforms.  See <a 
href="#perlebcdic-NAME">perlebcdic NAME</a>.
+</p>
+<p>Unless ASCII vs. EBCDIC issues are specifically being discussed,
+references to UTF-8 encoding in this document and elsewhere should be
+read as meaning UTF-EBCDIC on EBCDIC platforms.
+See <a href="#perlebcdic-Unicode-and-UTF">perlebcdic Unicode and UTF</a>.
+</p>
+<p>Because UTF-EBCDIC is so similar to UTF-8, the differences are mostly
+hidden from you; <code>use&nbsp;utf8</code><!-- /@w --> (and NOT something like
+<code>use&nbsp;utfebcdic</code><!-- /@w -->) declares that the script is in the 
platform&rsquo;s
+&quot;native&quot; 8-bit encoding of Unicode.  (Similarly for the 
<code>&quot;:utf8&quot;</code>
+layer.)
 </p>
 <hr>
 <a name="perlunicode-Locales"></a>
@@ -101593,7 +104010,7 @@
 Next: <a href="#perlunicode-When-Unicode-Does-Not-Happen" accesskey="n" 
rel="next">perlunicode When Unicode Does Not Happen</a>, Previous: <a 
href="#perlunicode-Unicode-in-Perl-on-EBCDIC" accesskey="p" 
rel="prev">perlunicode Unicode in Perl on EBCDIC</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Locales"></a>
-<h4 class="subsection">81.2.14 Locales</h4>
+<h4 class="subsection">81.2.15 Locales</h4>
 
 <p>See <a href="#perllocale-Unicode-and-UTF_002d8">perllocale Unicode and 
UTF-8</a>
 </p>
@@ -101604,18 +104021,18 @@
 Next: <a href="#perlunicode-The-_0022Unicode-Bug_0022" accesskey="n" 
rel="next">perlunicode The &quot;Unicode Bug&quot;</a>, Previous: <a 
href="#perlunicode-Locales" accesskey="p" rel="prev">perlunicode Locales</a>, 
Up: <a href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="When-Unicode-Does-Not-Happen"></a>
-<h4 class="subsection">81.2.15 When Unicode Does Not Happen</h4>
+<h4 class="subsection">81.2.16 When Unicode Does Not Happen</h4>
 
-<p>While Perl does have extensive ways to input and output in Unicode,
-and a few other &quot;entry points&quot; like the <code>@ARGV</code> array 
(which can sometimes be
-interpreted as UTF-8), there are still many places where Unicode
-(in some encoding or another) could be given as arguments or received as
-results, or both, but it is not.
+<p>There are still many places where Unicode (in some encoding or
+another) could be given as arguments or received as results, or both in
+Perl, but it is not, in spite of Perl having extensive ways to input and
+output in Unicode, and a few other &quot;entry points&quot; like the 
<code>@ARGV</code>
+array (which can sometimes be interpreted as UTF-8).
 </p>
 <p>The following are such interfaces.  Also, see <a 
href="#perlunicode-The-_0022Unicode-Bug_0022">The &quot;Unicode Bug&quot;</a>.
 For all of these interfaces Perl
 currently (as of v5.16.0) simply assumes byte strings both as arguments
-and results, or UTF-8 strings if the (problematic) <code>encoding</code> 
pragma has been used.
+and results, or UTF-8 strings if the (deprecated) <code>encoding</code> pragma 
has been used.
 </p>
 <p>One reason that Perl does not attempt to resolve the role of Unicode in
 these situations is that the answers are highly dependent on the operating
@@ -101647,96 +104064,99 @@
 Next: <a 
href="#perlunicode-Forcing-Unicode-in-Perl-_0028Or-Unforcing-Unicode-in-Perl_0029"
 accesskey="n" rel="next">perlunicode Forcing Unicode in Perl (Or Unforcing 
Unicode in Perl)</a>, Previous: <a 
href="#perlunicode-When-Unicode-Does-Not-Happen" accesskey="p" 
rel="prev">perlunicode When Unicode Does Not Happen</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="The-_0022Unicode-Bug_0022"></a>
-<h4 class="subsection">81.2.16 The &quot;Unicode Bug&quot;</h4>
+<h4 class="subsection">81.2.17 The &quot;Unicode Bug&quot;</h4>
 
-<p>The term, &quot;Unicode bug&quot; has been applied to an inconsistency
-on ASCII platforms with the
-Unicode code points in the <code>Latin-1 Supplement</code> block, that
-is, between 128 and 255.  Without a locale specified, unlike all other
-characters or code points, these characters have very different semantics in
-byte semantics versus character semantics, unless
-<code>use feature 'unicode_strings'</code> is specified, directly or 
indirectly.
-(It is indirectly specified by a <code>use v5.12</code> or higher.)
-</p>
-<p>In character semantics these upper-Latin1 characters are interpreted as
-Unicode code points, which means
-they have the same semantics as Latin-1 (ISO-8859-1).
-</p>
-<p>In byte semantics (without <code>unicode_strings</code>), they are 
considered to
-be unassigned characters, meaning that the only semantics they have is
-their ordinal numbers, and that they are
-not members of various character classes.  None are considered to match 
<code>\w</code>
-for example, but all match <code>\W</code>.
-</p>
-<p>Perl 5.12.0 added <code>unicode_strings</code> to force character semantics 
on
-these code points in some circumstances, which fixed portions of the
-bug; Perl 5.14.0 fixed almost all of it; and Perl 5.16.0 fixed the
-remainder (so far as we know, anyway).  The lesson here is to enable
-<code>unicode_strings</code> to avoid the headaches described below.
+<p>The term &quot;Unicode bug&quot; has been applied to an inconsistency with 
the
+code points in the <code>Latin-1 Supplement</code> block, that is, between
+128 and 255.  Without a locale specified, unlike all other characters or
+code points, these characters can have very different semantics
+depending on the rules in effect.  (Characters whose code points are
+above 255 force Unicode rules; whereas the rules for ASCII characters
+are the same under both ASCII and Unicode rules.)
+</p>
+<p>Under Unicode rules, these upper-Latin1 characters are interpreted as
+Unicode code points, which means they have the same semantics as Latin-1
+(ISO-8859-1) and C1 controls.
+</p>
+<p>As explained in <a 
href="#perlunicode-ASCII-Rules-versus-Unicode-Rules">ASCII Rules versus Unicode 
Rules</a>, under ASCII rules,
+they are considered to be unassigned characters.
+</p>
+<p>This can lead to unexpected results.  For example, a string&rsquo;s
+semantics can suddenly change if a code point above 255 is appended to
+it, which changes the rules from ASCII to Unicode.  As an
+example, consider the following program and its output:
 </p>
-<p>The old, problematic behavior affects these areas:
+<pre class="verbatim"> $ perl -le'
+     no feature 'unicode_strings';
+     $s1 = &quot;\xC2&quot;;
+     $s2 = &quot;\x{2660}&quot;;
+     for ($s1, $s2, $s1.$s2) {
+         print /\w/ || 0;
+     }
+ '
+ 0
+ 0
+ 1
+</pre>
+<p>If there&rsquo;s no <code>\w</code> in <code>$s1</code> or in 
<code>$s2</code>, why does their concatenation
+have one?
+</p>
+<p>This anomaly stems from Perl&rsquo;s attempt to not disturb older programs 
that
+didn&rsquo;t use Unicode, along with Perl&rsquo;s desire to add Unicode support
+seamlessly.  But the result turned out to not be seamless.  (By the way,
+you can choose to be warned when things like this happen.  See
+<code><a href="encoding-warnings.html#Top">(encoding-warnings)</a></code>.)
+</p>
+<p><a 
href="feature.html#The-_0027unicode_005fstrings_0027-feature">(feature)<code>use&nbsp;feature&nbsp;<span
 class="nolinebreak">'unicode_strings'</span></code><!-- /@w --></a>
+was added, starting in Perl v5.12, to address this problem.  It affects
+these things:
 </p>
 <ul>
 <li> Changing the case of a scalar, that is, using <code>uc()</code>, 
<code>ucfirst()</code>, <code>lc()</code>,
 and <code>lcfirst()</code>, or <code>\L</code>, <code>\U</code>, 
<code>\u</code> and <code>\l</code> in double-quotish
 contexts, such as regular expression substitutions.
-Under <code>unicode_strings</code> starting in Perl 5.12.0, character 
semantics are
+
+<p>Under <code>unicode_strings</code> starting in Perl 5.12.0, Unicode rules 
are
 generally used.  See <a href="#perlfunc-lc">perlfunc lc</a> for details on how 
this works
 in combination with various other pragmas.
-
+</p>
 </li><li> Using caseless (<code>/i</code>) regular expression matching.
-Starting in Perl 5.14.0, regular expressions compiled within
-the scope of <code>unicode_strings</code> use character semantics
+
+<p>Starting in Perl 5.14.0, regular expressions compiled within
+the scope of <code>unicode_strings</code> use Unicode rules
 even when executed or compiled into larger
 regular expressions outside the scope.
+</p>
+</li><li> Matching any of several properties in regular expressions.
 
-</li><li> Matching any of several properties in regular expressions, namely 
<code>\b</code>,
-<code>\B</code>, <code>\s</code>, <code>\S</code>, <code>\w</code>, 
<code>\W</code>, and all the Posix character classes
+<p>These properties are <code>\b</code> (without braces), <code>\B</code> 
(without braces),
+<code>\s</code>, <code>\S</code>, <code>\w</code>, <code>\W</code>, and all 
the Posix character classes
 <em>except</em> <code>[[:ascii:]]</code>.
-Starting in Perl 5.14.0, regular expressions compiled within
-the scope of <code>unicode_strings</code> use character semantics
+</p>
+<p>Starting in Perl 5.14.0, regular expressions compiled within
+the scope of <code>unicode_strings</code> use Unicode rules
 even when executed or compiled into larger
 regular expressions outside the scope.
+</p>
+</li><li> In <code>quotemeta</code> or its inline equivalent <code>\Q</code>.
 
-</li><li> In <code>quotemeta</code> or its inline equivalent <code>\Q</code>, 
no code points above 127
-are quoted in UTF-8 encoded strings, but in byte encoded strings, code
-points between 128-255 are always quoted.
-Starting in Perl 5.16.0, consistent quoting rules are used within the
+<p>Starting in Perl 5.16.0, consistent quoting rules are used within the
 scope of <code>unicode_strings</code>, as described in <a 
href="#perlfunc-quotemeta">perlfunc quotemeta</a>.
-
+Prior to that, or outside its scope, no code points above 127 are quoted
+in UTF-8 encoded strings, but in byte encoded strings, code points
+between 128-255 are always quoted.
+</p>
 </li></ul>
 
-<p>This behavior can lead to unexpected results in which a string&rsquo;s 
semantics
-suddenly change if a code point above 255 is appended to or removed from it,
-which changes the string&rsquo;s semantics from byte to character or vice 
versa.  As
-an example, consider the following program and its output:
-</p>
-<pre class="verbatim"> $ perl -le'
-     no feature 'unicode_strings';
-     $s1 = &quot;\xC2&quot;;
-     $s2 = &quot;\x{2660}&quot;;
-     for ($s1, $s2, $s1.$s2) {
-         print /\w/ || 0;
-     }
- '
- 0
- 0
- 1
-</pre>
-<p>If there&rsquo;s no <code>\w</code> in <code>s1</code> or in 
<code>s2</code>, why does their concatenation have one?
-</p>
-<p>This anomaly stems from Perl&rsquo;s attempt to not disturb older programs 
that
-didn&rsquo;t use Unicode, and hence had no semantics for characters outside of 
the
-ASCII range (except in a locale), along with Perl&rsquo;s desire to add Unicode
-support seamlessly.  The result wasn&rsquo;t seamless: these characters were
-orphaned.
+<p>You can see from the above that the effect of <code>unicode_strings</code>
+increased over several Perl releases.  (And Perl&rsquo;s support for Unicode
+continues to improve; it&rsquo;s best to use the latest available release in
+order to get the most complete and accurate results possible.)  Note that
+<code>unicode_strings</code> is automatically chosen if you 
<code>use&nbsp;5.012</code><!-- /@w --> or
+higher.
 </p>
 <p>For Perls earlier than those described above, or when a string is passed
-to a function outside the subpragma&rsquo;s scope, a workaround is to always
-call <a 
href="utf8.html#Utility-functions">(utf8)<code>utf8::upgrade($string)</code></a>,
-or to use the standard module <a href="Encode.html#Top">(Encode)</a>.   Also, 
a scalar that has any characters
-whose ordinal is <code>0x100</code> or above, or which were specified using 
either of the
-<code>\N{...}</code> notations, will automatically have character semantics.
+to a function outside the scope of <code>unicode_strings</code>, see the next 
section.
 </p>
 <hr>
 <a 
name="perlunicode-Forcing-Unicode-in-Perl-_0028Or-Unforcing-Unicode-in-Perl_0029"></a>
@@ -101745,14 +104165,14 @@
 Next: <a href="#perlunicode-Using-Unicode-in-XS" accesskey="n" 
rel="next">perlunicode Using Unicode in XS</a>, Previous: <a 
href="#perlunicode-The-_0022Unicode-Bug_0022" accesskey="p" 
rel="prev">perlunicode The &quot;Unicode Bug&quot;</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Forcing-Unicode-in-Perl-_0028Or-Unforcing-Unicode-in-Perl_0029"></a>
-<h4 class="subsection">81.2.17 Forcing Unicode in Perl (Or Unforcing Unicode 
in Perl)</h4>
+<h4 class="subsection">81.2.18 Forcing Unicode in Perl (Or Unforcing Unicode 
in Perl)</h4>
 
 <p>Sometimes (see <a href="#perlunicode-When-Unicode-Does-Not-Happen">When 
Unicode Does Not Happen</a> or <a 
href="#perlunicode-The-_0022Unicode-Bug_0022">The &quot;Unicode Bug&quot;</a>)
 there are situations where you simply need to force a byte
-string into UTF-8, or vice versa.  The low-level calls
+string into UTF-8, or vice versa.  The standard module <a 
href="Encode.html#Top">(Encode)</a> can be
+used for this, or the low-level calls
 <a 
href="utf8.html#Utility-functions">(utf8)<code>utf8::upgrade($bytestring)</code></a>
 and
-<a 
href="utf8.html#Utility-functions">(utf8)<code>utf8::downgrade($utf8string[, 
FAIL_OK])</code></a> are
-the answers.
+<a 
href="utf8.html#Utility-functions">(utf8)<code>utf8::downgrade($utf8string[, 
FAIL_OK])</code></a>.
 </p>
 <p>Note that <code>utf8::downgrade()</code> can fail if the string contains 
characters
 that don&rsquo;t fit into a byte.
@@ -101760,6 +104180,9 @@
 <p>Calling either function on a string that already is in the desired state is 
a
 no-op.
 </p>
+<p><a href="#perlunicode-ASCII-Rules-versus-Unicode-Rules">ASCII Rules versus 
Unicode Rules</a> gives all the ways that a string is
+made to use Unicode rules.
+</p>
 <hr>
 <a name="perlunicode-Using-Unicode-in-XS"></a>
 <div class="header">
@@ -101767,100 +104190,24 @@
 Next: <a 
href="#perlunicode-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029"
 accesskey="n" rel="next">perlunicode Hacking Perl to work on earlier Unicode 
versions (for very serious hackers only)</a>, Previous: <a 
href="#perlunicode-Forcing-Unicode-in-Perl-_0028Or-Unforcing-Unicode-in-Perl_0029"
 accesskey="p" rel="prev">perlunicode Forcing Unicode in Perl (Or Unforcing 
Unicode in Perl)</a>, Up: <a href="#perlunicode-DESCRIPTION" accesskey="u" 
rel="up">perlunicode DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Using-Unicode-in-XS"></a>
-<h4 class="subsection">81.2.18 Using Unicode in XS</h4>
-
-<p>If you want to handle Perl Unicode in XS extensions, you may find the
-following C APIs useful.  See also <a 
href="#perlguts-Unicode-Support">perlguts Unicode Support</a> for an
-explanation about Unicode at the XS level, and <a 
href="perlapi.html#Top">(perlapi)</a> for the API
-details.
-</p>
-<ul>
-<li> <code>DO_UTF8(sv)</code> returns true if the <code>UTF8</code> flag is on 
and the bytes
-pragma is not in effect.  <code>SvUTF8(sv)</code> returns true if the 
<code>UTF8</code>
-flag is on; the <code>bytes</code> pragma is ignored.  The <code>UTF8</code> 
flag being on
-does <strong>not</strong> mean that there are any characters of code points 
greater
-than 255 (or 127) in the scalar or that there are even any characters
-in the scalar.  What the <code>UTF8</code> flag means is that the sequence of
-octets in the representation of the scalar is the sequence of UTF-8
-encoded code points of the characters of a string.  The <code>UTF8</code> flag
-being off means that each octet in this representation encodes a
-single character with code point 0..255 within the string.  Perl&rsquo;s
-Unicode model is not to use UTF-8 until it is absolutely necessary.
-
-</li><li> <code>uvchr_to_utf8(buf, chr)</code> writes a Unicode character code 
point into
-a buffer encoding the code point as UTF-8, and returns a pointer
-pointing after the UTF-8 bytes.  It works appropriately on EBCDIC machines.
-
-</li><li> <code>utf8_to_uvchr_buf(buf, bufend, lenp)</code> reads UTF-8 
encoded bytes from a
-buffer and
-returns the Unicode character code point and, optionally, the length of
-the UTF-8 byte sequence.  It works appropriately on EBCDIC machines.
+<h4 class="subsection">81.2.19 Using Unicode in XS</h4>
 
-</li><li> <code>utf8_length(start, end)</code> returns the length of the UTF-8 
encoded buffer
-in characters.  <code>sv_len_utf8(sv)</code> returns the length of the UTF-8 
encoded
-scalar.
-
-</li><li> <code>sv_utf8_upgrade(sv)</code> converts the string of the scalar 
to its UTF-8
-encoded form.  <code>sv_utf8_downgrade(sv)</code> does the opposite, if
-possible.  <code>sv_utf8_encode(sv)</code> is like sv_utf8_upgrade except that
-it does not set the <code>UTF8</code> flag.  <code>sv_utf8_decode()</code> 
does the
-opposite of <code>sv_utf8_encode()</code>.  Note that none of these are to be
-used as general-purpose encoding or decoding interfaces: <code>use 
Encode</code>
-for that.  <code>sv_utf8_upgrade()</code> is affected by the encoding pragma
-but <code>sv_utf8_downgrade()</code> is not (since the encoding pragma is
-designed to be a one-way street).
-
-</li><li> <code>is_utf8_string(buf, len)</code> returns true if 
<code>len</code> bytes of the buffer
-are valid UTF-8.
-
-</li><li> <code>is_utf8_char_buf(buf, buf_end)</code> returns true if the 
pointer points to
-a valid UTF-8 character.
-
-</li><li> <code>UTF8SKIP(buf)</code> will return the number of bytes in the 
UTF-8 encoded
-character in the buffer.  <code>UNISKIP(chr)</code> will return the number of 
bytes
-required to UTF-8-encode the Unicode character code point.  
<code>UTF8SKIP()</code>
-is useful for example for iterating over the characters of a UTF-8
-encoded buffer; <code>UNISKIP()</code> is useful, for example, in computing
-the size required for a UTF-8 encoded buffer.
-
-</li><li> <code>utf8_distance(a, b)</code> will tell the distance in 
characters between the
-two pointers pointing to the same UTF-8 encoded buffer.
-
-</li><li> <code>utf8_hop(s, off)</code> will return a pointer to a UTF-8 
encoded buffer
-that is <code>off</code> (positive or negative) Unicode characters displaced
-from the UTF-8 buffer <code>s</code>.  Be careful not to overstep the buffer:
-<code>utf8_hop()</code> will merrily run off the end or the beginning of the
-buffer if told to do so.
-
-</li><li> <code>pv_uni_display(dsv, spv, len, pvlim, flags)</code> and
-<code>sv_uni_display(dsv, ssv, pvlim, flags)</code> are useful for debugging 
the
-output of Unicode strings and scalars.  By default they are useful
-only for debugging&ndash;they display <strong>all</strong> characters as 
hexadecimal code
-points&ndash;but with the flags <code>UNI_DISPLAY_ISPRINT</code>,
-<code>UNI_DISPLAY_BACKSLASH</code>, and <code>UNI_DISPLAY_QQ</code> you can 
make the
-output more readable.
-
-</li><li> <code>foldEQ_utf8(s1, pe1, l1, u1, s2, pe2, l2, u2)</code> can be 
used to
-compare two strings case-insensitively in Unicode.  For case-sensitive
-comparisons you can just use <code>memEQ()</code> and <code>memNE()</code> as 
usual, except
-if one string is in utf8 and the other isn&rsquo;t.
-
-</li></ul>
-
-<p>For more information, see <a href="perlapi.html#Top">(perlapi)</a>, and 
<samp>utf8.c</samp> and <samp>utf8.h</samp>
-in the Perl source code distribution.
+<p>See <a href="#perlguts-Unicode-Support">perlguts Unicode Support</a> for an 
introduction to Unicode at
+the XS level, and <a href="perlapi.html#Unicode-Support">(perlapi)Unicode 
Support</a> for the API details.
 </p>
 <hr>
 <a 
name="perlunicode-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029"></a>
 <div class="header">
 <p>
-Previous: <a href="#perlunicode-Using-Unicode-in-XS" accesskey="p" 
rel="prev">perlunicode Using Unicode in XS</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
+Next: <a href="#perlunicode-Porting-code-from-perl_002d5_002e6_002eX" 
accesskey="n" rel="next">perlunicode Porting code from perl-5.6.X</a>, 
Previous: <a href="#perlunicode-Using-Unicode-in-XS" accesskey="p" 
rel="prev">perlunicode Using Unicode in XS</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a 
name="Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029"></a>
-<h4 class="subsection">81.2.19 Hacking Perl to work on earlier Unicode 
versions (for very serious hackers only)</h4>
+<h4 class="subsection">81.2.20 Hacking Perl to work on earlier Unicode 
versions (for very serious hackers only)</h4>
 
-<p>Perl by default comes with the latest supported Unicode version built in, 
but
-you can change to use any earlier one.
+<p>Perl by default comes with the latest supported Unicode version built-in, 
but
+the goal is to allow you to change to use any earlier one.  In Perls
+v5.20 and v5.22, however, the earliest usable version is Unicode 5.1.
+Perl v5.18 is able to handle all earlier versions.
 </p>
 <p>Download the files in the desired version of Unicode from the Unicode web
 site <a href="http://www.unicode.org">http://www.unicode.org</a>).  These 
should replace the existing files in
@@ -101869,59 +104216,135 @@
 perl (see <a href="INSTALL.html#Top">(INSTALL)</a>).
 </p>
 <hr>
-<a name="perlunicode-BUGS"></a>
+<a name="perlunicode-Porting-code-from-perl_002d5_002e6_002eX"></a>
 <div class="header">
 <p>
-Next: <a href="#perlunicode-SEE-ALSO" accesskey="n" rel="next">perlunicode SEE 
ALSO</a>, Previous: <a href="#perlunicode-DESCRIPTION" accesskey="p" 
rel="prev">perlunicode DESCRIPTION</a>, Up: <a href="#perlunicode" 
accesskey="u" rel="up">perlunicode</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
+Previous: <a 
href="#perlunicode-Hacking-Perl-to-work-on-earlier-Unicode-versions-_0028for-very-serious-hackers-only_0029"
 accesskey="p" rel="prev">perlunicode Hacking Perl to work on earlier Unicode 
versions (for very serious hackers only)</a>, Up: <a 
href="#perlunicode-DESCRIPTION" accesskey="u" rel="up">perlunicode 
DESCRIPTION</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
-<a name="BUGS-10"></a>
-<h3 class="section">81.3 BUGS</h3>
+<a name="Porting-code-from-perl_002d5_002e6_002eX"></a>
+<h4 class="subsection">81.2.21 Porting code from perl-5.6.X</h4>
 
-<table class="menu" border="0" cellspacing="0">
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Interaction-with-Locales" accesskey="1">perlunicode 
Interaction with Locales</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Problems-with-characters-in-the-Latin_002d1-Supplement-range"
 accesskey="2">perlunicode Problems with characters in the Latin-1 Supplement 
range</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Interaction-with-Extensions" accesskey="3">perlunicode 
Interaction with Extensions</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a href="#perlunicode-Speed" 
accesskey="4">perlunicode Speed</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Problems-on-EBCDIC-platforms" accesskey="5">perlunicode 
Problems on EBCDIC platforms</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
-</td></tr>
-<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Porting-code-from-perl_002d5_002e6_002eX" 
accesskey="6">perlunicode Porting code from 
perl-5.6.X</a>:</td><td>&nbsp;&nbsp;</td><td align="left" valign="top">
-</td></tr>
-</table>
+<p>Perls starting in 5.8 have a different Unicode model from 5.6. In 5.6 the
+programmer was required to use the <code>utf8</code> pragma to declare that a
+given scope expected to deal with Unicode data and had to make sure that
+only Unicode data were reaching that scope. If you have code that is
+working with 5.6, you will need some of the following adjustments to
+your code. The examples are written such that the code will continue to
+work under 5.6, so you should be safe to try them out.
+</p>
+<ul>
+<li> A filehandle that should read or write UTF-8
 
-<hr>
-<a name="perlunicode-Interaction-with-Locales"></a>
-<div class="header">
-<p>
-Next: <a 
href="#perlunicode-Problems-with-characters-in-the-Latin_002d1-Supplement-range"
 accesskey="n" rel="next">perlunicode Problems with characters in the Latin-1 
Supplement range</a>, Up: <a href="#perlunicode-BUGS" accesskey="u" 
rel="up">perlunicode BUGS</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
-</div>
-<a name="Interaction-with-Locales"></a>
-<h4 class="subsection">81.3.1 Interaction with Locales</h4>
+<pre class="verbatim">  if ($] &gt; 5.008) {
+    binmode $fh, &quot;:encoding(utf8)&quot;;
+  }
+</pre>
+</li><li> A scalar that is going to be passed to some extension
 
-<p>See <a href="#perllocale-Unicode-and-UTF_002d8">perllocale Unicode and 
UTF-8</a>
+<p>Be it <code>Compress::Zlib</code>, <code>Apache::Request</code> or any 
extension that has no
+mention of Unicode in the manpage, you need to make sure that the
+UTF8 flag is stripped off. Note that at the time of this writing
+(January 2012) the mentioned modules are not UTF-8-aware. Please
+check the documentation to verify if this is still true.
+</p>
+<pre class="verbatim">  if ($] &gt; 5.008) {
+    require Encode;
+    $val = Encode::encode_utf8($val); # make octets
+  }
+</pre>
+</li><li> A scalar we got back from an extension
+
+<p>If you believe the scalar comes back as UTF-8, you will most likely
+want the UTF8 flag restored:
+</p>
+<pre class="verbatim">  if ($] &gt; 5.008) {
+    require Encode;
+    $val = Encode::decode_utf8($val);
+  }
+</pre>
+</li><li> Same thing, if you are really sure it is UTF-8
+
+<pre class="verbatim">  if ($] &gt; 5.008) {
+    require Encode;
+    Encode::_utf8_on($val);
+  }
+</pre>
+</li><li> A wrapper for <a href="DBI.html#Top">(DBI)</a> 
<code>fetchrow_array</code> and <code>fetchrow_hashref</code>
+
+<p>When the database contains only UTF-8, a wrapper function or method is
+a convenient way to replace all your <code>fetchrow_array</code> and
+<code>fetchrow_hashref</code> calls. A wrapper function will also make it 
easier to
+adapt to future enhancements in your database driver. Note that at the
+time of this writing (January 2012), the DBI has no standardized way
+to deal with UTF-8 data. Please check the <a href="DBI.html#Top">(DBI)DBI 
documentation</a> to verify if
+that is still true.
 </p>
+<pre class="verbatim">  sub fetchrow {
+    # $what is one of fetchrow_{array,hashref}
+    my($self, $sth, $what) = @_;
+    if ($] &lt; 5.008) {
+      return $sth-&gt;$what;
+    } else {
+      require Encode;
+      if (wantarray) {
+        my @arr = $sth-&gt;$what;
+        for (@arr) {
+          defined &amp;&amp; /[^\000-\177]/ &amp;&amp; Encode::_utf8_on($_);
+        }
+        return @arr;
+      } else {
+        my $ret = $sth-&gt;$what;
+        if (ref $ret) {
+          for my $k (keys %$ret) {
+            defined
+            &amp;&amp; /[^\000-\177]/
+            &amp;&amp; Encode::_utf8_on($_) for $ret-&gt;{$k};
+          }
+          return $ret;
+        } else {
+          defined &amp;&amp; /[^\000-\177]/ &amp;&amp; Encode::_utf8_on($_) for $ret;
+          return $ret;
+        }
+      }
+    }
+  }
+</pre>
+</li><li> A large scalar that you know can only contain ASCII
+
+<p>Scalars that contain only ASCII and are marked as UTF-8 are sometimes
+a drag to your program. If you recognize such a situation, just remove
+the UTF8 flag:
+</p>
+<pre class="verbatim">  utf8::downgrade($val) if $] &gt; 5.008;
+</pre>
+</li></ul>
+
 <hr>
-<a 
name="perlunicode-Problems-with-characters-in-the-Latin_002d1-Supplement-range"></a>
+<a name="perlunicode-BUGS"></a>
 <div class="header">
 <p>
-Next: <a href="#perlunicode-Interaction-with-Extensions" accesskey="n" 
rel="next">perlunicode Interaction with Extensions</a>, Previous: <a 
href="#perlunicode-Interaction-with-Locales" accesskey="p" 
rel="prev">perlunicode Interaction with Locales</a>, Up: <a 
href="#perlunicode-BUGS" accesskey="u" rel="up">perlunicode BUGS</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlunicode-SEE-ALSO" accesskey="n" rel="next">perlunicode SEE 
ALSO</a>, Previous: <a href="#perlunicode-DESCRIPTION" accesskey="p" 
rel="prev">perlunicode DESCRIPTION</a>, Up: <a href="#perlunicode" 
accesskey="u" rel="up">perlunicode</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
-<a name="Problems-with-characters-in-the-Latin_002d1-Supplement-range"></a>
-<h4 class="subsection">81.3.2 Problems with characters in the Latin-1 
Supplement range</h4>
+<a name="BUGS-10"></a>
+<h3 class="section">81.3 BUGS</h3>
 
-<p>See <a href="#perlunicode-The-_0022Unicode-Bug_0022">The &quot;Unicode 
Bug&quot;</a>
+<p>See also <a href="#perlunicode-The-_0022Unicode-Bug_0022">The &quot;Unicode 
Bug&quot;</a> above.
 </p>
+<table class="menu" border="0" cellspacing="0">
+<tr><td align="left" valign="top">&bull; <a 
href="#perlunicode-Interaction-with-Extensions" accesskey="1">perlunicode 
Interaction with Extensions</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
+<tr><td align="left" valign="top">&bull; <a href="#perlunicode-Speed" 
accesskey="2">perlunicode Speed</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
+</table>
+
 <hr>
 <a name="perlunicode-Interaction-with-Extensions"></a>
 <div class="header">
 <p>
-Next: <a href="#perlunicode-Speed" accesskey="n" rel="next">perlunicode 
Speed</a>, Previous: <a 
href="#perlunicode-Problems-with-characters-in-the-Latin_002d1-Supplement-range"
 accesskey="p" rel="prev">perlunicode Problems with characters in the Latin-1 
Supplement range</a>, Up: <a href="#perlunicode-BUGS" accesskey="u" 
rel="up">perlunicode BUGS</a> &nbsp; [<a href="#SEC_Contents" title="Table of 
contents" rel="contents">Contents</a>]</p>
+Next: <a href="#perlunicode-Speed" accesskey="n" rel="next">perlunicode 
Speed</a>, Up: <a href="#perlunicode-BUGS" accesskey="u" rel="up">perlunicode 
BUGS</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
 </div>
 <a name="Interaction-with-Extensions"></a>
-<h4 class="subsection">81.3.3 Interaction with Extensions</h4>
+<h4 class="subsection">81.3.1 Interaction with Extensions</h4>
 
 <p>When Perl exchanges data with an extension, the extension should be
 able to understand the UTF8 flag and act accordingly. If the
@@ -101956,7 +104379,7 @@
     }
 </pre>
 <p>Sometimes, when the extension does not convert data but just stores
-and retrieves them, you will be able to use the otherwise
+and retrieves it, you will be able to use the otherwise
 dangerous <a 
href="Encode.html#g_t_005futf8_005fon">(Encode)<code>Encode::_utf8_on()</code></a>
 function. Let&rsquo;s say
 the popular <code>Foo::Bar</code> extension, written in C, provides a 
<code>param</code>
 method that lets you store and retrieve data according to these prototypes:
@@ -101982,17 +104405,17 @@
 </pre>
 <p>Some extensions provide filters on data entry/exit points, such as
 <code>DB_File::filter_store_key</code> and family. Look out for such filters in
-the documentation of your extensions, they can make the transition to
+the documentation of your extensions; they can make the transition to
 Unicode data much easier.
 </p>
 <hr>
 <a name="perlunicode-Speed"></a>
 <div class="header">
 <p>
-Next: <a href="#perlunicode-Problems-on-EBCDIC-platforms" accesskey="n" 
rel="next">perlunicode Problems on EBCDIC platforms</a>, Previous: <a 
href="#perlunicode-Interaction-with-Extensions" accesskey="p" 
rel="prev">perlunicode Interaction with Extensions</a>, Up: <a 
href="#perlunicode-BUGS" accesskey="u" rel="up">perlunicode BUGS</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
+Previous: <a href="#perlunicode-Interaction-with-Extensions" accesskey="p" 
rel="prev">perlunicode Interaction with Extensions</a>, Up: <a 
href="#perlunicode-BUGS" accesskey="u" rel="up">perlunicode BUGS</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
 </div>
 <a name="Speed"></a>
-<h4 class="subsection">81.3.4 Speed</h4>
+<h4 class="subsection">81.3.2 Speed</h4>
 
 <p>Some functions are slower when working on UTF-8 encoded strings than
 on byte encoded strings.  All functions that need to hop over
@@ -102001,137 +104424,13 @@
 byte-encoded.
 </p>
 <p>In Perl 5.8.0 the slowness was often quite spectacular; in Perl 5.8.1
-a caching scheme was introduced which will hopefully make the slowness
-somewhat less spectacular, at least for some operations.  In general,
+a caching scheme was introduced which improved the situation.  In general,
 operations with UTF-8 encoded strings are still slower. As an example,
 the Unicode properties (character classes) like <code>\p{Nd}</code> are known 
to
 be quite a bit slower (5-20 times) than their simpler counterparts
-like <code>\d</code> (then again, there are hundreds of Unicode characters 
matching <code>Nd</code>
-compared with the 10 ASCII characters matching <code>d</code>).
-</p>
-<hr>
-<a name="perlunicode-Problems-on-EBCDIC-platforms"></a>
-<div class="header">
-<p>
-Next: <a href="#perlunicode-Porting-code-from-perl_002d5_002e6_002eX" 
accesskey="n" rel="next">perlunicode Porting code from perl-5.6.X</a>, 
Previous: <a href="#perlunicode-Speed" accesskey="p" rel="prev">perlunicode 
Speed</a>, Up: <a href="#perlunicode-BUGS" accesskey="u" rel="up">perlunicode 
BUGS</a> &nbsp; [<a href="#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>]</p>
-</div>
-<a name="Problems-on-EBCDIC-platforms"></a>
-<h4 class="subsection">81.3.5 Problems on EBCDIC platforms</h4>
-
-<p>There are several known problems with Perl on EBCDIC platforms.  If you
-want to use Perl there, send email to address@hidden
-</p>
-<p>In earlier versions, when byte and character data were concatenated,
-the new string was sometimes created by
-decoding the byte strings as <em>ISO 8859-1 (Latin-1)</em>, even if the
-old Unicode string used EBCDIC.
-</p>
-<p>If you find any of these, please report them as bugs.
-</p>
-<hr>
-<a name="perlunicode-Porting-code-from-perl_002d5_002e6_002eX"></a>
-<div class="header">
-<p>
-Previous: <a href="#perlunicode-Problems-on-EBCDIC-platforms" accesskey="p" 
rel="prev">perlunicode Problems on EBCDIC platforms</a>, Up: <a 
href="#perlunicode-BUGS" accesskey="u" rel="up">perlunicode BUGS</a> &nbsp; [<a 
href="#SEC_Contents" title="Table of contents" rel="contents">Contents</a>]</p>
-</div>
-<a name="Porting-code-from-perl_002d5_002e6_002eX"></a>
-<h4 class="subsection">81.3.6 Porting code from perl-5.6.X</h4>
-
-<p>Perl 5.8 has a different Unicode model from 5.6. In 5.6 the programmer
-was required to use the <code>utf8</code> pragma to declare that a given scope
-expected to deal with Unicode data and had to make sure that only
-Unicode data were reaching that scope. If you have code that is
-working with 5.6, you will need some of the following adjustments to
-your code. The examples are written such that the code will continue
-to work under 5.6, so you should be safe to try them out.
-</p>
-<ul>
-<li> A filehandle that should read or write UTF-8
-
-<pre class="verbatim">  if ($] &gt; 5.008) {
-    binmode $fh, &quot;:encoding(utf8)&quot;;
-  }
-</pre>
-</li><li> A scalar that is going to be passed to some extension
-
-<p>Be it <code>Compress::Zlib</code>, <code>Apache::Request</code> or any 
extension that has no
-mention of Unicode in the manpage, you need to make sure that the
-UTF8 flag is stripped off. Note that at the time of this writing
-(January 2012) the mentioned modules are not UTF-8-aware. Please
-check the documentation to verify if this is still true.
-</p>
-<pre class="verbatim">  if ($] &gt; 5.008) {
-    require Encode;
-    $val = Encode::encode_utf8($val); # make octets
-  }
-</pre>
-</li><li> A scalar we got back from an extension
-
-<p>If you believe the scalar comes back as UTF-8, you will most likely
-want the UTF8 flag restored:
-</p>
-<pre class="verbatim">  if ($] &gt; 5.008) {
-    require Encode;
-    $val = Encode::decode_utf8($val);
-  }
-</pre>
-</li><li> Same thing, if you are really sure it is UTF-8
-
-<pre class="verbatim">  if ($] &gt; 5.008) {
-    require Encode;
-    Encode::_utf8_on($val);
-  }
-</pre>
-</li><li> A wrapper for <a href="DBI.html#Top">(DBI)</a> 
<code>fetchrow_array</code> and <code>fetchrow_hashref</code>
-
-<p>When the database contains only UTF-8, a wrapper function or method is
-a convenient way to replace all your <code>fetchrow_array</code> and
-<code>fetchrow_hashref</code> calls. A wrapper function will also make it 
easier to
-adapt to future enhancements in your database driver. Note that at the
-time of this writing (January 2012), the DBI has no standardized way
-to deal with UTF-8 data. Please check the <a href="DBI.html#Top">(DBI)DBI 
documentation</a> to verify if
-that is still true.
-</p>
-<pre class="verbatim">  sub fetchrow {
-    # $what is one of fetchrow_{array,hashref}
-    my($self, $sth, $what) = @_;
-    if ($] &lt; 5.008) {
-      return $sth-&gt;$what;
-    } else {
-      require Encode;
-      if (wantarray) {
-        my @arr = $sth-&gt;$what;
-        for (@arr) {
-          defined &amp;&amp; /[^\000-\177]/ &amp;&amp; Encode::_utf8_on($_);
-        }
-        return @arr;
-      } else {
-        my $ret = $sth-&gt;$what;
-        if (ref $ret) {
-          for my $k (keys %$ret) {
-            defined
-            &amp;&amp; /[^\000-\177]/
-            &amp;&amp; Encode::_utf8_on($_) for $ret-&gt;{$k};
-          }
-          return $ret;
-        } else {
-          defined &amp;&amp; /[^\000-\177]/ &amp;&amp; Encode::_utf8_on($_) for $ret;
-          return $ret;
-        }
-      }
-    }
-  }
-</pre>
-</li><li> A large scalar that you know can only contain ASCII
-
-<p>Scalars that contain only ASCII and are marked as UTF-8 are sometimes
-a drag to your program. If you recognize such a situation, just remove
-the UTF8 flag:
+like <code>[0-9]</code> (then again, there are hundreds of Unicode characters 
matching
+<code>Nd</code> compared with the 10 ASCII characters matching 
<code>[0-9]</code>).
 </p>
-<pre class="verbatim">  utf8::downgrade($val) if $] &gt; 5.008;
-</pre>
-</li></ul>
-
 <hr>
 <a name="perlunicode-SEE-ALSO"></a>
 <div class="header">
@@ -102142,7 +104441,7 @@
 <h3 class="section">81.4 SEE ALSO</h3>
 
 <p><a href="#perlunitut-NAME">perlunitut NAME</a>, <a 
href="#perluniintro-NAME">perluniintro NAME</a>, <a 
href="perluniprops.html#Top">(perluniprops)</a>, <a 
href="Encode.html#Top">(Encode)</a>, <a href="open.html#Top">(open)</a>, <a 
href="utf8.html#Top">(utf8)</a>, <a href="bytes.html#Top">(bytes)</a>,
-<a href="#perlretut-NAME">perlretut NAME</a>, <a 
href="#perlvar-_0024_007b_005eUNICODE_007d">perlvar ${^UNICODE}</a>
+<a href="#perlretut-NAME">perlretut NAME</a>, <a 
href="#perlvar-_0024_007b_005eUNICODE_007d">perlvar ${^UNICODE}</a>,
 <a 
href="http://www.unicode.org/reports/tr44">http://www.unicode.org/reports/tr44</a>).
 </p>
 <hr>
@@ -102542,14 +104841,12 @@
 <a name="What-is-a-_0022wide-character_0022_003f"></a>
 <h4 class="subsection">82.2.17 What is a &quot;wide character&quot;?</h4>
 
-<p>This is a term used both for characters with an ordinal value greater than 
127,
-characters with an ordinal value greater than 255, or any character occupying
-more than one byte, depending on the context.
-</p>
-<p>The Perl warning &quot;Wide character in ...&quot; is caused by a character 
with an
-ordinal value greater than 255. With no specified encoding layer, Perl tries to
-fit things in ISO-8859-1 for backward compatibility reasons. When it 
can&rsquo;t, it
-emits this warning (if warnings are enabled), and outputs UTF-8 encoded data
+<p>This is a term used for characters occupying more than one byte.
+</p>
+<p>The Perl warning &quot;Wide character in ...&quot; is caused by such a 
character.
+With no specified encoding layer, Perl tries to
+fit things into a single byte.  When it can&rsquo;t, it
+emits this warning (if warnings are enabled), and uses UTF-8 encoded data
 instead.
 </p>
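A minimal sketch of the behaviour described above, assuming an ASCII platform and a temporary file created with File::Temp (the variable names are illustrative only). The first print has no encoding layer and should trigger the warning; the second, after an explicit layer is set, should not:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my ($fh, $path) = tempfile(UNLINK => 1);

# No encoding layer yet: U+263A does not fit in one byte, so Perl warns.
print {$fh} "\x{263A}\n";

# With an explicit layer the same print is silent.
binmode $fh, ':encoding(UTF-8)';
print {$fh} "\x{263A}\n";
close $fh;

print "warnings seen: ", scalar(@warnings), "\n";
print $warnings[0] if @warnings;
```

Setting the layer once, right after opening the handle, keeps all output on that handle in a single encoding.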
 <p>To avoid this warning and to avoid having different output encodings in a 
single
@@ -102904,15 +105201,15 @@
 number for every character&quot; idea breaks down a bit: instead, there is
 &quot;at least one number for every character&quot;.  The same character could
 be represented differently in several legacy encodings.  The
-converse is not also true: some code points do not have an assigned
+converse is not true: some code points do not have an assigned
 character.  Firstly, there are unallocated code points within
 otherwise used blocks.  Secondly, there are special Unicode control
 characters that do not represent true characters.
 </p>
 <p>When Unicode was first conceived, it was thought that all the world&rsquo;s
 characters could be represented using a 16-bit word; that is a maximum of
-<code>0x10000</code> (or 65536) characters from <code>0x0000</code> to 
<code>0xFFFF</code> would be
-needed.  This soon proved to be false, and since Unicode 2.0 (July
+<code>0x10000</code> (or 65,536) characters would be needed, from 
<code>0x0000</code> to
+<code>0xFFFF</code>.  This soon proved to be wrong, and since Unicode 2.0 (July
 1996), Unicode has been defined all the way up to 21 bits 
(<code>0x10FFFF</code>),
 and Unicode 3.1 (March 2001) defined the first characters above 
<code>0xFFFF</code>.
 The first <code>0x10000</code> characters are called the <em>Plane 0</em>, or 
the
@@ -102949,8 +105246,8 @@
 <p>The Unicode code points are just abstract numbers.  To input and
 output these abstract numbers, the numbers must be <em>encoded</em> or
 <em>serialised</em> somehow.  Unicode defines several <em>character encoding
-forms</em>, of which <em>UTF-8</em> is perhaps the most popular.  UTF-8 is a
-variable length encoding that encodes Unicode characters as 1 to 6
+forms</em>, of which <em>UTF-8</em> is the most popular.  UTF-8 is a
+variable length encoding that encodes Unicode characters as 1 to 4
 bytes.  Other encodings
 include UTF-16 and UTF-32 and their big- and little-endian variants
 (UTF-8 is byte-order independent).  The ISO/IEC 10646 defines the UCS-2
@@ -102975,7 +105272,7 @@
 regular expressions still do not work with Unicode in 5.6.1.
 Perl v5.14.0 is the first release where Unicode support is
 (almost) seamlessly integrable without some gotchas (the exception being
-some differences in <a href="#perlfunc-quotemeta">quotemeta</a>, which is fixed
+some differences in <a href="#perlfunc-quotemeta">quotemeta</a>, and that is 
fixed
 starting in Perl 5.16.0).   To enable this
 seamless support, you should <code>use feature 'unicode_strings'</code> (which 
is
 automatically selected if you <code>use 5.012</code> or higher).  See <a 
href="feature.html#Top">(feature)</a>.
@@ -103067,18 +105364,19 @@
 <a name="Unicode-and-EBCDIC"></a>
 <h4 class="subsection">83.2.4 Unicode and EBCDIC</h4>
 
-<p>Perl 5.8.0 also supports Unicode on EBCDIC platforms.  There,
-Unicode support is somewhat more complex to implement since
-additional conversions are needed at every step.
-</p>
-<p>Later Perl releases have added code that will not work on EBCDIC platforms, 
and
-no one has complained, so the divergence has continued.  If you want to run
-Perl on an EBCDIC platform, send email to address@hidden
+<p>Perl 5.8.0 added support for Unicode on EBCDIC platforms.  This support
+was allowed to lapse in later releases, but was revived in 5.22.
+Unicode support is somewhat more complex to implement since additional
+conversions are needed.  See <a href="#perlebcdic-NAME">perlebcdic NAME</a> 
for more information.
 </p>
 <p>On EBCDIC platforms, the internal Unicode encoding form is UTF-EBCDIC
 instead of UTF-8.  The difference is that UTF-8 is &quot;ASCII-safe&quot; in
 that ASCII characters encode to UTF-8 as-is, while UTF-EBCDIC is
-&quot;EBCDIC-safe&quot;.
+&quot;EBCDIC-safe&quot;, in that all the basic characters (which include all
+those that have ASCII equivalents, like <code>&quot;A&quot;</code>, 
<code>&quot;0&quot;</code>, <code>&quot;%&quot;</code>, <em>etc.</em>)
+are the same in both EBCDIC and UTF-EBCDIC.  Often, documentation
+will use the term &quot;UTF-8&quot; to mean UTF-EBCDIC as well.  This is the 
case
+in this document.
 </p>
 <hr>
 <a name="perluniintro-Creating-Unicode"></a>
@@ -103089,55 +105387,95 @@
 <a name="Creating-Unicode"></a>
 <h4 class="subsection">83.2.5 Creating Unicode</h4>
 
-<p>To create Unicode characters in literals for code points above 
<code>0xFF</code>,
-use the <code>\x{...}</code> notation in double-quoted strings:
+<p>This section applies fully to Perls starting with v5.22.  Various
+caveats for earlier releases are in the <a 
href="#perluniintro-Earlier-releases-caveats">Earlier releases caveats</a>
+subsection below.
 </p>
-<pre class="verbatim">    my $smiley = &quot;\x{263a}&quot;;
-</pre>
-<p>Similarly, it can be used in regular expression literals
+<p>To create Unicode characters in literals,
+use the <code>\N{...}</code> notation in double-quoted strings:
 </p>
-<pre class="verbatim">    $smiley =~ /\x{263a}/;
+<pre class="verbatim"> my $smiley_from_name = &quot;\N{WHITE SMILING 
FACE}&quot;;
+ my $smiley_from_code_point = &quot;\N{U+263a}&quot;;
 </pre>
-<p>At run-time you can use <code>chr()</code>:
+<p>Similarly, they can be used in regular expression literals
 </p>
-<pre class="verbatim">    my $hebrew_alef = chr(0x05d0);
+<pre class="verbatim"> $smiley =~ /\N{WHITE SMILING FACE}/;
+ $smiley =~ /\N{U+263a}/;
 </pre>
-<p>See <a href="#perluniintro-Further-Resources">Further Resources</a> for how 
to find all these numeric codes.
+<p>At run-time you can use:
 </p>
+<pre class="verbatim"> use charnames ();
+ my $hebrew_alef_from_name
+                      = charnames::string_vianame(&quot;HEBREW LETTER 
ALEF&quot;);
+ my $hebrew_alef_from_code_point = 
charnames::string_vianame(&quot;U+05D0&quot;);
+</pre>
 <p>Naturally, <code>ord()</code> will do the reverse: it turns a character into
 a code point.
 </p>
-<p>Note that <code>\x..</code> (no <code>{}</code> and only two hexadecimal 
digits), <code>\x{...}</code>,
-and <code>chr(...)</code> for arguments less than <code>0x100</code> (decimal 
256)
-generate an eight-bit character for backward compatibility with older
-Perls.  For arguments of <code>0x100</code> or more, Unicode characters are
-always produced. If you want to force the production of Unicode
-characters regardless of the numeric value, use <code>pack(&quot;U&quot;, 
...)</code>
-instead of <code>\x..</code>, <code>\x{...}</code>, or <code>chr()</code>.
+<p>There are other runtime options as well.  You can use <code>pack()</code>:
 </p>
-<p>You can invoke characters
-by name in double-quoted strings:
+<pre class="verbatim"> my $hebrew_alef_from_code_point = pack(&quot;U&quot;, 
0x05d0);
+</pre>
+<p>Or you can use <code>chr()</code>, though it is less convenient in the 
general
+case:
 </p>
-<pre class="verbatim">    my $arabic_alef = &quot;\N{ARABIC LETTER ALEF}&quot;;
+<pre class="verbatim"> $hebrew_alef_from_code_point = 
chr(utf8::unicode_to_native(0x05d0));
+ utf8::upgrade($hebrew_alef_from_code_point);
 </pre>
-<p>And, as mentioned above, you can also <code>pack()</code> numbers into 
Unicode
-characters:
+<p>The <code>utf8::unicode_to_native()</code> and <code>utf8::upgrade()</code> 
aren&rsquo;t needed if
+the argument is above 0xFF, so the above could have been written as
 </p>
-<pre class="verbatim">   my $georgian_an  = pack(&quot;U&quot;, 0x10a0);
+<pre class="verbatim"> $hebrew_alef_from_code_point = chr(0x05d0);
 </pre>
-<p>Note that both <code>\x{...}</code> and <code>\N{...}</code> are 
compile-time string
-constants: you cannot use variables in them.  if you want similar
-run-time functionality, use <code>chr()</code> and 
<code>charnames::string_vianame()</code>.
+<p>since 0x5d0 is above 255.
 </p>
-<p>If you want to force the result to Unicode characters, use the special
-<code>&quot;U0&quot;</code> prefix.  It consumes no arguments but causes the 
following bytes
-to be interpreted as the UTF-8 encoding of Unicode characters:
+<p><code>\x{}</code> and <code>\o{}</code> can also be used to specify code 
points at compile
+time in double-quotish strings, but, for backward compatibility with
+older Perls, the same rules apply as with <code>chr()</code> for code points 
less
+than 256.
 </p>
-<pre class="verbatim">   my $chars = pack(&quot;U0W*&quot;, 0x80, 0x42);
-</pre>
-<p>Likewise, you can stop such UTF-8 interpretation by using the special
-<code>&quot;C0&quot;</code> prefix.
+<p><code>utf8::unicode_to_native()</code> is used so that the Perl code is 
portable
+to EBCDIC platforms.  You can omit it if you&rsquo;re <em>really</em> sure no 
one
+will ever want to use your code on a non-ASCII platform.  Starting in
+Perl v5.22, calls to it on ASCII platforms are optimized out, so there&rsquo;s
+no performance penalty at all in adding it.  Or you can simply use the
+other constructs that don&rsquo;t require it.
+</p>
+<p>See <a href="#perluniintro-Further-Resources">Further Resources</a> for how 
to find all these names and numeric
+codes.
 </p>
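The constructs in this section can be compared side by side. The following is only a sketch, assuming an ASCII platform and a Perl recent enough to provide charnames::string_vianame() (v5.14 or later); the variable names are illustrative:

```perl
use strict;
use warnings;
use charnames ();

# Four ways to obtain WHITE SMILING FACE (U+263A).
my $from_literal = "\N{U+263A}";                                 # compile time
my $from_name    = charnames::string_vianame("WHITE SMILING FACE");  # run time, by name
my $from_pack    = pack("U", 0x263A);                            # run time, by code point
my $from_chr     = chr(0x263A);   # above 0xFF, so no unicode_to_native()/upgrade() needed

# ord() reverses the process and reports the code point.
for my $pair (
    [ literal => $from_literal ],
    [ name    => $from_name ],
    [ pack    => $from_pack ],
    [ chr     => $from_chr ],
) {
    my ($label, $char) = @$pair;
    printf "%-8s U+%04X\n", $label, ord $char;
}
```

All four strings should compare equal with eq, which is one way to check that a given construct behaves as expected on the Perl in use.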
+<table class="menu" border="0" cellspacing="0">
+<tr><td align="left" valign="top">&bull; <a 
href="#perluniintro-Earlier-releases-caveats" accesskey="1">perluniintro 
Earlier releases caveats</a>:</td><td>&nbsp;&nbsp;</td><td align="left" 
valign="top">
+</td></tr>
+</table>
+
+<hr>
+<a name="perluniintro-Earlier-releases-caveats"></a>
+<div class="header">
+<p>
+Up: <a href="#perluniintro-Creating-Unicode" accesskey="u" 
rel="up">perluniintro Creating Unicode</a> &nbsp; [<a href="#SEC_Contents" 
title="Table of contents" rel="contents">Contents</a>]</p>
+</div>
+<a name="Earlier-releases-caveats"></a>
+<h4 class="subsubsection">83.2.5.1 Earlier releases caveats</h4>
+
+<p>On EBCDIC platforms, prior to v5.22, using <code>\N{U+...}</code> 
doesn&rsquo;t work
+properly.
+</p>
+<p>Prior to v5.16, using <code>\N{...}</code> with a character name (as 
opposed to a
+<code>U+...</code> code point) required a 
<code>use&nbsp;charnames&nbsp;:full</code><!-- /@w -->.
+</p>
+<p>Prior to v5.14, there were some bugs in <code>\N{...}</code> with a 
character name
+(as opposed to a <code>U+...</code> code point).
+</p>
+<p><code>charnames::string_vianame()</code> was introduced in v5.14.  Prior to 
that,
+<code>charnames::vianame()</code> should work, but only if the argument is of 
the
+form <code>&quot;U+...&quot;</code>.  Your best bet there for runtime Unicode 
by character
+name is probably:
+</p>
+<pre class="verbatim"> use charnames ();
+ my $hebrew_alef_from_name
+                  = pack(&quot;U&quot;, charnames::vianame(&quot;HEBREW LETTER 
ALEF&quot;));
+</pre>
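Code that must run on both older and newer Perls could choose between the two approaches at run time. This is a sketch only; char_by_name() is a hypothetical helper name, not part of charnames:

```perl
use strict;
use warnings;
use charnames ();

# Hypothetical helper: return the character with the given Unicode name.
# Prefers string_vianame() (v5.14 and later) and falls back to
# pack("U", vianame()) on older Perls.
sub char_by_name {
    my ($name) = @_;
    return charnames::string_vianame($name)
        if defined &charnames::string_vianame;
    return pack("U", charnames::vianame($name));
}

my $alef = char_by_name("HEBREW LETTER ALEF");
printf "U+%04X\n", ord $alef;
```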
 <hr>
 <a name="perluniintro-Handling-Unicode"></a>
 <div class="header">
@@ -103364,6 +105702,9 @@
 </pre>
 <p>which is ready to be printed.
 </p>
+<p>(<code>\\x{}</code> is used here instead of <code>\\N{}</code>, since 
it&rsquo;s most likely that
+you want to see what the native values are.)
+</p>
 <hr>
 <a name="perluniintro-Special-Cases"></a>
 <div class="header">
@@ -103485,8 +105826,17 @@
 character classes that are Unicode-aware.  There are dozens of them, see
 <a href="perluniprops.html#Top">(perluniprops)</a>.
 </p>
-<p>You can use Unicode code points as the end points of character ranges, and 
the
-range will include all Unicode code points that lie between those end points.
+<p>Starting in v5.22, you can use Unicode code points as the end points of
+regular expression pattern character ranges, and the range will include
+all Unicode code points that lie between those end points, inclusive.
+</p>
+<pre class="verbatim"> qr/ [\N{U+03}-\N{U+20}] /x
+</pre>
+<p>includes the code points
+<code>\N{U+03}</code>, <code>\N{U+04}</code>, ..., <code>\N{U+20}</code>.
+</p>
+<p>(It is planned to extend this behavior to ranges in <code>tr///</code> in 
Perl
+v5.24.)
 </p>
 </li><li> String-To-Number Conversions
 
@@ -103532,7 +105882,7 @@
 
 <p>You shouldn&rsquo;t have to care.  But you may if your Perl is before 5.14.0
 or you haven&rsquo;t specified <code>use feature 'unicode_strings'</code> or 
<code>use
-5.012</code> (or higher) because otherwise the semantics of the code points
+5.012</code> (or higher) because otherwise the rules for the code points
 in the range 128 to 255 are different depending on
 whether the string they are contained within is in Unicode or not.
 (See <a href="#perlunicode-When-Unicode-Does-Not-Happen">perlunicode When 
Unicode Does Not Happen</a>.)
@@ -103835,7 +106185,8 @@
 <a name="AUTHOR_002c-COPYRIGHT_002c-AND-LICENSE"></a>
 <h3 class="section">83.6 AUTHOR, COPYRIGHT, AND LICENSE</h3>
 
-<p>Copyright 2001-2011 Jarkko Hietaniemi &lt;address@hidden&gt;
+<p>Copyright 2001-2011 Jarkko Hietaniemi &lt;address@hidden&gt;.
+Now maintained by Perl 5 Porters.
 </p>
 <p>This document may be distributed under the same terms as Perl itself.
 </p>
@@ -103984,8 +106335,8 @@
 the world has standardized on UTF-8. 
 </p>
 <p>UTF-8 treats the first 128 codepoints, 0..127, the same as ASCII. They take
-only one byte per character. All other characters are encoded as two or more
-(up to six) bytes using a complex scheme. Fortunately, Perl handles this for
+only one byte per character. All other characters are encoded as two to
+four bytes using a complex scheme. Fortunately, Perl handles this for
 us, so we don&rsquo;t have to worry about this.
 </p>
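To see the variable-length scheme in action, the following sketch encodes four sample code points with the Encode module and counts the resulting bytes. The choice of sample characters is arbitrary:

```perl
use strict;
use warnings;
use Encode qw(encode);

# One sample code point from each UTF-8 length class.
my @code_points = (0x41, 0xE9, 0x263A, 0x1F600);

my @lengths;
for my $cp (@code_points) {
    my $bytes = encode('UTF-8', chr($cp));   # character string -> byte string
    push @lengths, length $bytes;
    printf "U+%04X -> %d byte(s)\n", $cp, length $bytes;
}
```

The ASCII letter takes one byte, and the other three take two, three, and four bytes respectively.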
 <hr>
@@ -104170,7 +106521,8 @@
 <a name="Q-and-A-_0028or-FAQ_0029"></a>
 <h3 class="section">84.4 Q and A (or FAQ)</h3>
 
-<p>After reading this document, you ought to read <a 
href="#perlunifaq-NAME">perlunifaq NAME</a> too. 
+<p>After reading this document, you ought to read <a 
href="#perlunifaq-NAME">perlunifaq NAME</a> too, then
+<a href="#perluniintro-NAME">perluniintro NAME</a>.
 </p>
 <hr>
 <a name="perlunitut-ACKNOWLEDGEMENTS"></a>
@@ -104373,58 +106725,9 @@
 <a name="Converters"></a>
 <h4 class="subsection">85.3.2 Converters</h4>
 
-<p>To help you convert legacy programs to Perl, we&rsquo;ve included three
-conversion filters:
-</p>
-<dl compact="compact">
-<dt><a href="a2p.html#Top">(a2p)a2p</a></dt>
-<dd><a name="perlutil-a2p"></a>
-<p><samp>a2p</samp> converts <samp>awk</samp> scripts to Perl programs; for 
example, <code>a2p -F:</code>
-on the simple <samp>awk</samp> script <code>{print $2}</code> will produce a 
Perl program
-based around this code:
-</p>
-<pre class="verbatim">    while (&lt;&gt;) {
-        ($Fld1,$Fld2) = split(/[:\n]/, $_, -1);
-        print $Fld2;
-    }
-</pre>
-</dd>
-<dt><a href="s2p.html#Top">(s2p)s2p</a> and <a 
href="psed.html#Top">(psed)</a></dt>
-<dd><a name="perlutil-s2p-and-psed"></a>
-<p>Similarly, <samp>s2p</samp> converts <samp>sed</samp> scripts to Perl 
programs. <samp>s2p</samp> run
-on <code>s/foo/bar</code> will produce a Perl program based around this:
-</p>
-<pre class="verbatim">    while (&lt;&gt;) {
-        chomp;
-        s/foo/bar/g;
-        print if $printit;
-    }
-</pre>
-<p>When invoked as <samp>psed</samp>, it behaves as a <samp>sed</samp> 
implementation, written in
-Perl.
-</p>
-</dd>
-<dt><a href="find2perl.html#Top">(find2perl)find2perl</a></dt>
-<dd><a name="perlutil-find2perl"></a>
-<p>Finally, <samp>find2perl</samp> translates <code>find</code> commands to 
Perl equivalents which 
-use the <a href="File-Find.html#Top">(File-Find)File::Find</a> module. As an 
example, 
-<code>find2perl . -user root -perm 4000 -print</code> produces the following 
callback
-subroutine for <code>File::Find</code>:
-</p>
-<pre class="verbatim">    sub wanted {
-        my ($dev,$ino,$mode,$nlink,$uid,$gid);
-        (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &amp;&amp;
-        $uid == $uid{'root'}) &amp;&amp;
-        (($mode &amp; 0777) == 04000);
-        print(&quot;$name\n&quot;);
-    }
-</pre>
-</dd>
-</dl>
-
-<p>As well as these filters for converting other languages, the
-<a href="pl2pm.html#Top">(pl2pm)pl2pm</a> utility will help you convert 
old-style Perl 4 libraries to 
-new-style Perl5 modules.
+<p>To bring legacy programs up to more modern Perl, the
+<a href="pl2pm.html#Top">(pl2pm)pl2pm</a> utility helps you convert 
old-style Perl 4 libraries
+to new-style Perl 5 modules.
 </p>
 <hr>
 <a name="perlutil-Administration"></a>
@@ -104436,12 +106739,6 @@
 <h4 class="subsection">85.3.3 Administration</h4>
 
 <dl compact="compact">
-<dt><a href="config_data.html#Top">(config_data)config_data</a></dt>
-<dd><a name="perlutil-config_005fdata"></a>
-<p>Query or change configuration of Perl modules that use Module::Build-based
-configuration files for features and config data.
-</p>
-</dd>
 <dt><a href="libnetcfg.html#Top">(libnetcfg)libnetcfg</a></dt>
 <dd><a name="perlutil-libnetcfg"></a>
 <p>To display and change the libnet configuration run the libnetcfg command.
@@ -104639,11 +106936,10 @@
 <p><a href="perldoc.html#Top">(perldoc)perldoc</a>, <a 
href="pod2man.html#Top">(pod2man)pod2man</a>, <a href="#perlpod-NAME">perlpod 
NAME</a>,
 <a href="pod2html.html#Top">(pod2html)pod2html</a>, <a 
href="pod2usage.html#Top">(pod2usage)pod2usage</a>, <a 
href="podselect.html#Top">(podselect)podselect</a>,
 <a href="podchecker.html#Top">(podchecker)podchecker</a>, <a 
href="splain.html#Top">(splain)splain</a>, <a href="#perldiag-NAME">perldiag 
NAME</a>,
-<code>roffitall|roffitall</code>, <a href="a2p.html#Top">(a2p)a2p</a>, <a 
href="s2p.html#Top">(s2p)s2p</a>, <a 
href="find2perl.html#Top">(find2perl)find2perl</a>,
-<a href="File-Find.html#Top">(File-Find)File::Find</a>, <a 
href="pl2pm.html#Top">(pl2pm)pl2pm</a>, <a 
href="perlbug.html#Top">(perlbug)perlbug</a>,
-<a href="h2ph.html#Top">(h2ph)h2ph</a>, <a 
href="c2ph.html#Top">(c2ph)c2ph</a>, <a href="h2xs.html#Top">(h2xs)h2xs</a>, <a 
href="enc2xs.html#Top">(enc2xs)</a>, <a href="xsubpp.html#Top">(xsubpp)</a>,
-<a href="cpan.html#Top">(cpan)</a>, <a 
href="instmodsh.html#Top">(instmodsh)</a>, <a 
href="piconv.html#Top">(piconv)</a>, <a href="prove.html#Top">(prove)</a>,
-<a href="corelist.html#Top">(corelist)</a>, <a 
href="ptar.html#Top">(ptar)</a>, <a href="ptardiff.html#Top">(ptardiff)</a>, <a 
href="shasum.html#Top">(shasum)</a>, <a 
href="zipdetails.html#Top">(zipdetails)</a>
+<code>roffitall|roffitall</code>, <a 
href="File-Find.html#Top">(File-Find)File::Find</a>, <a 
href="pl2pm.html#Top">(pl2pm)pl2pm</a>,
+<a href="perlbug.html#Top">(perlbug)perlbug</a>, <a 
href="h2ph.html#Top">(h2ph)h2ph</a>, <a href="c2ph.html#Top">(c2ph)c2ph</a>, <a 
href="h2xs.html#Top">(h2xs)h2xs</a>, <a href="enc2xs.html#Top">(enc2xs)</a>,
+<a href="xsubpp.html#Top">(xsubpp)</a>, <a href="cpan.html#Top">(cpan)</a>, <a 
href="instmodsh.html#Top">(instmodsh)</a>, <a 
href="piconv.html#Top">(piconv)</a>, <a href="prove.html#Top">(prove)</a>, <a 
href="corelist.html#Top">(corelist)</a>, <a href="ptar.html#Top">(ptar)</a>,
+<a href="ptardiff.html#Top">(ptardiff)</a>, <a 
href="shasum.html#Top">(shasum)</a>, <a 
href="zipdetails.html#Top">(zipdetails)</a>
 </p>
 <hr>
 <a name="perlvar"></a>
@@ -104716,8 +107012,9 @@
 control-<code>W</code>.  This is better than typing a literal 
control-<code>W</code>
 into your program.
 </p>
-<p>Since Perl v5.6.0, Perl variable names may be alphanumeric
-strings that begin with control characters (or better yet, a caret).
+<p>Since Perl v5.6.0, Perl variable names may be alphanumeric strings that
+begin with a caret (or a control character, but this form is
+deprecated).
 These variables must be written in the form <code>${^Foo}</code>; the braces
 are not optional.  <code>${^Foo}</code> denotes the scalar variable whose
 name is a control-<code>F</code> followed by two <code>o</code>&rsquo;s.  
These variables are
@@ -105154,6 +107451,41 @@
 foreign processes.
 </p>
 </dd>
+<dt>$OLD_PERL_VERSION</dt>
+<dd><a name="perlvar-_0024OLD_005fPERL_005fVERSION"></a>
+</dd>
+<dt>$]</dt>
+<dd><a name="perlvar-_0024_005d"></a>
+<p>The revision, version, and subversion of the Perl interpreter, represented
+as a decimal of the form 5.XXXYYY, where XXX is the version / 1e3 and YYY
+is the subversion / 1e6.  For example, Perl v5.10.1 would be 
&quot;5.010001&quot;.
+</p>
+<p>This variable can be used to determine whether the Perl interpreter
+executing a script is in the right range of versions:
+</p>
+<pre class="verbatim">    warn &quot;No PerlIO!\n&quot; if $] lt '5.008';
+</pre>
+<p>When comparing <code>$]</code>, string comparison operators are 
<strong>highly
+recommended</strong>.  The inherent limitations of binary floating point
+representation can sometimes lead to incorrect comparisons for some
+numbers on some architectures.
+</p>
+<p>See also the documentation of <code>use VERSION</code> and <code>require 
VERSION</code>
+for a convenient way to fail if the running Perl interpreter is too old.
+</p>
+<p>See <a href="#perlvar-_0024_005eV">$^V</a> for a representation of the Perl 
version as a <a href="version.html#Top">(version)</a>
+object, which allows more flexible string comparisons.
+</p>
+<p>The main advantage of <code>$]</code> over <code>$^V</code> is that it 
works the same on any
+version of Perl.  The disadvantages are that it can&rsquo;t easily be compared
+to versions in other formats (e.g. literal v-strings, &quot;v1.2.3&quot; or
+version objects) and numeric comparisons can occasionally fail; it&rsquo;s good
+for string literal version checks and bad for comparing to a variable
+that hasn&rsquo;t been sanity-checked.
+</p>
+<p>Mnemonic: Is this version of perl in the right bracket?
+</p>
+</dd>
 <dt>$SYSTEM_FD_MAX</dt>
 <dd><a name="perlvar-_0024SYSTEM_005fFD_005fMAX"></a>
 </dd>
@@ -105382,30 +107714,35 @@
 <dt>$^V</dt>
 <dd><a name="perlvar-_0024_005eV"></a>
 <p>The revision, version, and subversion of the Perl interpreter,
-represented as a <code>version</code> object.
+represented as a <a href="version.html#Top">(version)</a> object.
 </p>
 <p>This variable first appeared in perl v5.6.0; earlier versions of perl
 will see an undefined value.  Before perl v5.10.0 <code>$^V</code> was 
represented
-as a v-string.
+as a v-string rather than a <a href="version.html#Top">(version)</a> object.
 </p>
 <p><code>$^V</code> can be used to determine whether the Perl interpreter 
executing
 a script is in the right range of versions.  For example:
 </p>
 <pre class="verbatim">    warn &quot;Hashes not randomized!\n&quot; if !$^V or 
$^V lt v5.8.1;
 </pre>
-<p>To convert <code>$^V</code> into its string representation use 
<code>sprintf()</code>&rsquo;s
-<code>&quot;%vd&quot;</code> conversion:
+<p>While version objects overload stringification, to portably convert
+<code>$^V</code> into its string representation, use 
<code>sprintf()</code>&rsquo;s <code>&quot;%vd&quot;</code>
+conversion, which works for both v-strings and version objects:
 </p>
 <pre class="verbatim">    printf &quot;version is v%vd\n&quot;, $^V;  # Perl's 
version
 </pre>
 <p>See the documentation of <code>use VERSION</code> and <code>require 
VERSION</code>
 for a convenient way to fail if the running Perl interpreter is too old.
 </p>
-<p>See also <code>$]</code> for an older representation of the Perl version.
+<p>See also <code>$]</code> for a decimal representation of the Perl version.
 </p>
-<p>This variable was added in Perl v5.6.0.
+<p>The main advantage of <code>$^V</code> over <code>$]</code> is that, for 
Perl v5.10.0 or
+later, it overloads operators, allowing easy comparison against other
+version representations (e.g. decimal, literal v-string, &quot;v1.2.3&quot;, or
+objects).  The disadvantage is that prior to v5.10.0, it was only a
+literal v-string, which can&rsquo;t be easily printed or compared.
 </p>
-<p>Mnemonic: use ^V for Version Control.
+<p>Mnemonic: use ^V for a version object.
 </p>
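The two version variables can be used for the same check. The sketch below assumes Perl v5.10.0 or later, so that $^V is a version object; the variable names are illustrative:

```perl
use strict;
use warnings;

# String comparison against the decimal form held in $].
my $by_decimal = ($] ge '5.010') ? 1 : 0;

# Overloaded comparison of the version object in $^V against a v-string.
my $by_object = ($^V && $^V ge v5.10.0) ? 1 : 0;

# "%vd" works whether $^V is a v-string or a version object.
printf "running Perl v%vd (\$] is %s)\n", $^V, $];
print "decimal check: $by_decimal, object check: $by_object\n";
```

Both checks should agree. The $] form is the one to prefer if the code may also run on Perls older than v5.10.0.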
 </dd>
 <dt>${^WIN32_SLOPPY_STAT}</dt>
@@ -106581,10 +108918,6 @@
 to set the exit value, or to inspect the system error string
 corresponding to error <em>n</em>, or to restore <code>$!</code> to a 
meaningful state.
 </p>
-<p>Note that when stringified, the text is always returned as if both
-<a href="#perllocale-NAME"><code>&quot;use&nbsp;locale&quot;</code></a><!-- 
/@w --> and <a 
href="bytes.html#Top">(bytes)<code>&quot;use&nbsp;bytes&quot;</code></a><!-- 
/@w --> are in
-effect.  This is likely to change in v5.22.
-</p>
 <p>Mnemonic: What just went bang?
 </p>
 </dd>
@@ -106702,10 +109035,34 @@
 </dd>
 <dt>${^ENCODING}</dt>
 <dd><a name="perlvar-_0024_007b_005eENCODING_007d"></a>
+<p>DEPRECATED!!!
+</p>
 <p>The <em>object reference</em> to the <code>Encode</code> object that is 
used to convert
 the source code to Unicode.  Thanks to this variable your Perl script
-does not have to be written in UTF-8.  Default is <em>undef</em>.  The direct
-manipulation of this variable is highly discouraged.
+does not have to be written in UTF-8.  Default is <code>undef</code>.
+</p>
+<p>Setting this variable to any other value than <code>undef</code> is 
deprecated due
+to fundamental defects in its design and implementation.  It is planned
+to remove it from a future Perl version.  Its purpose was to allow your
+non-ASCII Perl scripts to not have to be written in UTF-8; this was
+useful before editors that worked on UTF-8 encoded text were common, but
+that was long ago.  It causes problems, such as affecting the operation
+of other modules that aren&rsquo;t expecting it, causing general mayhem.  Its
+use can lead to segfaults.
+</p>
+<p>If you need something like this functionality, you should use the
+<a href="encoding.html#Top">(encoding)</a> pragma, which is also deprecated, 
but has fewer nasty side
+effects.
+</p>
+<p>If you are coming here because code of yours is being adversely affected
+by someone&rsquo;s use of this variable, you can usually work around it by
+doing this:
+</p>
+<pre class="verbatim"> local ${^ENCODING};
+</pre>
+<p>near the beginning of the functions that are getting broken.  This
+undefines the variable during the scope of execution of the enclosing
+function.
 </p>
 <p>This variable was added in Perl 5.8.2.
 </p>
@@ -106876,7 +109233,9 @@
 <dd><a name="perlvar-_0025_005eH"></a>
 <p>The <code>%^H</code> hash provides the same scoping semantic as 
<code>$^H</code>.  This makes
 it useful for implementation of lexically scoped pragmas.  See
-<a href="#perlpragma-NAME">perlpragma NAME</a>.
+<a href="#perlpragma-NAME">perlpragma NAME</a>.   All the entries are 
stringified when accessed at
+runtime, so only simple values can be accommodated.  This means no
+pointers to objects, for example.
 </p>
 <p>When putting items into <code>%^H</code>, in order to avoid conflicting 
with other
 users of the hash there is a convention regarding which keys to use.
@@ -107101,26 +109460,6 @@
 <p>Deprecated in Perl v5.12.0.
 </p>
 </dd>
-<dt>$]</dt>
-<dd><a name="perlvar-_0024_005d"></a>
-<p>See <a href="#perlvar-_0024_005eV">$^V</a> for a more modern representation 
of the Perl version that allows
-accurate string comparisons.
-</p>
-<p>The version + patchlevel / 1000 of the Perl interpreter.  This variable
-can be used to determine whether the Perl interpreter executing a
-script is in the right range of versions:
-</p>
-<pre class="verbatim">    warn &quot;No PerlIO!\n&quot; if $] lt '5.008';
-</pre>
-<p>The floating point representation can sometimes lead to inaccurate
-numeric comparisons, so string comparisons are recommended.
-</p>
-<p>See also the documentation of <code>use VERSION</code> and <code>require 
VERSION</code>
-for a convenient way to fail if the running Perl interpreter is too old.
-</p>
-<p>Mnemonic: Is this version of perl in the right bracket?
-</p>
-</dd>
 </dl>
 
 <hr>
@@ -107209,7 +109548,7 @@
 
 <p>Directions for building and installing Perl 5 can be found in 
 the file <samp>README.vms</samp> in the main source directory of the 
-Perl distribution..
+Perl distribution.
 </p>
 <hr>
 <a name="perlvms-Organization-of-Perl-Images"></a>
@@ -107240,22 +109579,21 @@
 <a name="Core-Images"></a>
 <h4 class="subsection">87.4.1 Core Images</h4>
 
-<p>During the installation process, three Perl images are produced.
+<p>During the build process, three Perl images are produced.
 <samp>Miniperl.Exe</samp> is an executable image which contains all of
 the basic functionality of Perl, but cannot take advantage of
-Perl extensions.  It is used to generate several files needed
-to build the complete Perl and various extensions.  Once you&rsquo;ve
-finished installing Perl, you can delete this image.
-</p>
-<p>Most of the complete Perl resides in the shareable image
-<samp>PerlShr.Exe</samp>, which provides a core to which the Perl executable
-image and all Perl extensions are linked.  You should place this
-image in <samp>Sys$Share</samp>, or define the logical name 
<samp>PerlShr</samp> to
-translate to the full file specification of this image.  It should
-be world readable.  (Remember that if a user has execute only access
-to <samp>PerlShr</samp>, VMS will treat it as if it were a privileged shareable
-image, and will therefore require all downstream shareable images to be
-INSTALLed, etc.)
+Perl XS extensions and has a hard-wired list of library locations
+for loading pure-Perl modules.  It is used extensively to build and
+test Perl and various extensions, but is not installed.
+</p>
+<p>Most of the complete Perl resides in the shareable image 
<samp>PerlShr.Exe</samp>,
+which provides a core to which the Perl executable image and all Perl
+extensions are linked. It is generally located via the logical name
+<samp>PERLSHR</samp>.  While it&rsquo;s possible to put the image in 
<samp>SYS$SHARE</samp> to
+make it loadable, that&rsquo;s not recommended. And while you may wish to
+INSTALL the image for performance reasons, you should not install it
+with privileges; if you do, the result will not be what you expect as
+image privileges are disabled during Perl start-up.
 </p>
 <p>Finally, <samp>Perl.Exe</samp> is an executable image containing the main
 entry point for Perl, as well as some initialization code.  It
@@ -107349,35 +109687,10 @@
     $ mmk test          ! Run test code, if supplied
     $ mmk install       ! Install into public Perl tree
 </pre>
-<p><em>N.B.</em> The procedure by which extensions are built and
-tested creates several levels (at least 4) under the
-directory in which the extension&rsquo;s source files live.
-For this reason if you are running a version of VMS prior
-to V7.1 you shouldn&rsquo;t nest the source directory
-too deeply in your directory structure lest you exceed RMS&rsquo;
-maximum of 8 levels of subdirectory in a filespec.  (You
-can use rooted logical names to get another 8 levels of
-nesting, if you can&rsquo;t place the files near the top of
-the physical directory structure.)
-</p>
 <p>VMS support for this process in the current release of Perl
-is sufficient to handle most extensions.  However, it does
-not yet recognize extra libraries required to build shareable
-images which are part of an extension, so these must be added
-to the linker options file for the extension by hand.  For
-instance, if the <samp>PGPLOT</samp> extension to Perl requires the
-<samp>PGPLOTSHR.EXE</samp> shareable image in order to properly link
-the Perl extension, then the line <code>PGPLOTSHR/Share</code> must
-be added to the linker options file <samp>PGPLOT.Opt</samp> produced
-during the build process for the Perl extension.
-</p>
-<p>By default, the shareable image for an extension is placed in
-the <samp>[.lib.site_perl.auto</samp><em>Arch</em>.<em>Extname</em><samp>]</samp> directory of the
-installed Perl directory tree (where <em>Arch</em> is <samp>VMS_VAX</samp> or
-<samp>VMS_AXP</samp>, and <em>Extname</em> is the name of the extension, with
-each <code>::</code> translated to <code>.</code>).  (See the MakeMaker documentation
-for more details on installation options for extensions.)
-However, it can be manually placed in any of several locations:
+is sufficient to handle most extensions.  (See the MakeMaker
+documentation for more details on installation options for
+extensions.)
 </p>
 <ul>
 <li> the <samp>[.Lib.Auto.</samp><em>Arch</em><em>$PVers</em><em>Extname</em><samp>]</samp>
 subdirectory
@@ -107463,12 +109776,6 @@
 expects, if a VMS path cannot be translated to a Unix path, it is
 passed through unchanged, so <code>unixify(&quot;[...]&quot;)</code> will return <code>[...]</code>.
 </p>
-<p>The handling of extended characters is largely complete in the
-VMS-specific C infrastructure of Perl, but more work is still needed to
-fully support extended syntax filenames in several core modules.  In
-particular, at this writing PathTools has only partial support for
-directories containing some extended characters.
-</p>
 <p>There are several ambiguous cases where a conversion routine cannot
 determine whether an input filename is in Unix format or in VMS format,
 since now both VMS and Unix file specifications may have characters in
@@ -107504,15 +109811,12 @@
 <a name="Filename-Case"></a>
 <h4 class="subsection">87.5.2 Filename Case</h4>
 
-<p>Perl follows VMS defaults and override settings in preserving (or not
-preserving) filename case.  Case is not preserved on ODS-2 formatted
-volumes on any architecture.  On ODS-5 volumes, filenames may be case
-preserved depending on process and feature settings.  Perl now honors
-DECC$EFS_CASE_PRESERVE and DECC$ARGV_PARSE_STYLE on those systems where
-the CRTL supports these features.  When these features are not enabled
-or the CRTL does not support them, Perl follows the traditional CRTL
-behavior of downcasing command-line arguments and returning file
-specifications in lower case only.
+<p>Perl enables DECC$EFS_CASE_PRESERVE and DECC$ARGV_PARSE_STYLE by
+default.  Note that the latter only takes effect when extended parse
+is set in the process in which Perl is running.  When these features
+are explicitly disabled in the environment or the CRTL does not support
+them, Perl follows the traditional CRTL behavior of downcasing command-line
+arguments and returning file specifications in lower case only.
 </p>
 <p><em>N. B.</em>  It is very easy to get tripped up using a mixture of other
 programs, external utilities, and Perl scripts that are in varying
@@ -107521,7 +109825,7 @@
 such as MMK or MMS may generate a filename in all upper case even on an
 ODS-5 volume.  If this filename is later retrieved by a Perl script or
 module in a case preserving environment, that upper case name may not
-match the mixed-case or lower-case exceptions of the Perl code.  Your
+match the mixed-case or lower-case expectations of the Perl code.  Your
 best bet is to follow an all-or-nothing approach to case preservation:
 either don&rsquo;t use it at all, or make sure your entire toolchain and
 application environment support and use it.
@@ -108283,12 +110587,12 @@
 <dl compact="compact">
 <dt>CRTL_ENV</dt>
 <dd><a name="perlvms-CRTL_005fENV"></a>
-<p>This string tells Perl to consult the CRTL&rsquo;s internal <code>environ</code>
-array of key-value pairs, using <em>name</em> as the key.  In most cases,
-this contains only a few keys, but if Perl was invoked via the C
-<code>exec[lv]e()</code> function, as is the case for CGI processing by some
-HTTP servers, then the <code>environ</code> array may have been populated by
-the calling program.
+<p>This string tells Perl to consult the CRTL&rsquo;s internal <code>environ</code> array
+of key-value pairs, using <em>name</em> as the key.  In most cases, this
+contains only a few keys, but if Perl was invoked via the C
+<code>exec[lv]e()</code> function, as is the case for some embedded Perl
+applications or when running under a shell such as GNV bash, the
+<code>environ</code> array may have been populated by the calling program.
 </p>
 </dd>
 <dt>CLISYM_[LOCAL]</dt>
@@ -108316,7 +110620,9 @@
 you make while Perl is running do not affect the behavior of <code>%ENV</code>.
 If <samp>PERL_ENV_TABLES</samp> is not defined, then Perl defaults to consulting
 first the logical name tables specified by <samp>LNM$FILE_DEV</samp>, and then
-the CRTL <code>environ</code> array.
+the CRTL <code>environ</code> array.  This default order is reversed when the
+logical name <samp>GNV$UNIX_SHELL</samp> is defined, such as when running under
+GNV bash.
 </p>
 <p>In all operations on %ENV, the key string is treated as if it 
 were entirely uppercase, regardless of the case actually 
@@ -108353,23 +110659,16 @@
 (ASCII <code>\0</code>) character, since a logical name cannot translate to a
 zero-length string.  (This restriction does not apply to CLI symbols
 or CRTL <code>environ</code> values; they are set to the empty string.)
-An element of the CRTL <code>environ</code> array can be set only if your
-copy of Perl knows about the CRTL&rsquo;s <code>setenv()</code> function.  (This is
-present only in some versions of the DECCRTL; check <code>$Config{d_setenv}</code>
-to see whether your copy of Perl was built with a CRTL that has this
-function.)
-</p>
-<p>When an element of <code>%ENV</code> is set to <code>undef</code>,
-the element is looked up as if it were being read, and if it is
-found, it is deleted.  (An item &quot;deleted&quot; from the CRTL <code>environ</code>
-array is set to the empty string; this can only be done if your
-copy of Perl knows about the CRTL <code>setenv()</code> function.)  Using
-<code>delete</code> to remove an element from <code>%ENV</code> has a similar effect,
-but after the element is deleted, another attempt is made to
-look up the element, so an inner-mode logical name or a name in
-another location will replace the logical name just deleted.
-In either case, only the first value found searching PERL_ENV_TABLES
-is altered.  It is not possible at present to define a search list
+</p>
+<p>When an element of <code>%ENV</code> is set to <code>undef</code>, the element is looked
+up as if it were being read, and if it is found, it is deleted.  (An
+item &quot;deleted&quot; from the CRTL <code>environ</code> array is set to the empty
+string.)  Using <code>delete</code> to remove an element from <code>%ENV</code> has a
+similar effect, but after the element is deleted, another attempt is
+made to look up the element, so an inner-mode logical name or a name
+in another location will replace the logical name just deleted. In
+either case, only the first value found searching PERL_ENV_TABLES is
+altered.  It is not possible at present to define a search list
 logical name via %ENV.
 </p>
 <p>The element <code>$ENV{DEFAULT}</code> is special: when read, it returns

Index: perldoc-all.html.gz
===================================================================
RCS file: /web/www/www/software/perl/manual/perldoc-all.html.gz,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -b -r1.8 -r1.9
Binary files /tmp/cvs1jB1UW and /tmp/cvsiihguG differ

Index: perldoc-all.html_chapter.tar.gz
===================================================================
RCS file: /web/www/www/software/perl/manual/perldoc-all.html_chapter.tar.gz,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -b -r1.8 -r1.9
Binary files /tmp/cvs5oTws5 and /tmp/cvsONvkgP differ

Index: perldoc-all.info.tar.gz
===================================================================
RCS file: /web/www/www/software/perl/manual/perldoc-all.info.tar.gz,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -b -r1.8 -r1.9
Binary files /tmp/cvsUtP2Fc and /tmp/cvsxqX4EW differ

Index: perldoc-all.pdf
===================================================================
RCS file: /web/www/www/software/perl/manual/perldoc-all.pdf,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -b -r1.8 -r1.9
Binary files /tmp/cvs010fHf and /tmp/cvsad5xq0 differ

Index: perldoc-all.texi.tar.gz
===================================================================
RCS file: /web/www/www/software/perl/manual/perldoc-all.texi.tar.gz,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -b -r1.8 -r1.9
Binary files /tmp/cvsfOejHE and /tmp/cvsCYT1Ep differ