gnunet-svn
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[lsd0003] branch master updated: minor points


From: gnunet
Subject: [lsd0003] branch master updated: minor points
Date: Tue, 08 Jun 2021 10:14:38 +0200

This is an automated email from the git hooks/post-receive script.

grothoff pushed a commit to branch master
in repository lsd0003.

The following commit(s) were added to refs/heads/master by this push:
     new 645fdcc  minor points
645fdcc is described below

commit 645fdcc94eceb374a55878bdf03c128ce935eb32
Author: Christian Grothoff <christian@grothoff.org>
AuthorDate: Tue Jun 8 10:11:58 2021 +0200

    minor points
---
 draft-summermatter-set-union.xml | 77 +++++++++++++++++++++-------------------
 1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml
index cef2baa..8ed3591 100644
--- a/draft-summermatter-set-union.xml
+++ b/draft-summermatter-set-union.xml
@@ -62,7 +62,7 @@
             <name>Introduction</name>
             <t>
               This document describes a Byzantine fault-tolerant set 
reconciliation protocol used to efficient and securely
-              synchronize two sets of elements between two peers.
+              compute the union of two sets across a network.
             </t>
             <t>
               This Byzantine fault-tolerant set reconciliation
@@ -97,7 +97,7 @@
             <t>
               The protocol described in this report is generic and
               suitable for a wide range of applications. As a result,
-              the internal structure of the elements in the sets must
+              the internal structure of the elements in the sets MUST
               be defined and verified by the application using the
               protocol.  This document thus does not cover the element
               structure, except for imposing a limit on the maximum
@@ -108,7 +108,7 @@
               the number of network round-trips and the number of bytes
               sent over the network.  Thus, for the protocol to choose
               the right parameters for a given situation, applications
-              using the protocol must provide a parameter that specifies
+              using the protocol SHOULD provide a parameter that specifies
               the cost-ratio of round-trips vs. bandwidth usage.  Given
               this trade-off factor, the protocol will then choose parameters
               that minimize the total execution costs.  In particular, there
@@ -241,7 +241,7 @@
                      ]]></artwork>
                 </figure>
                 <t>
-                  The parameters L and k depend on the set size and must be
+                  The parameters L and k depend on the set size and MUST be
                   chosen carefully to ensure that the BF does not return too
                   many false-positives.
                 </t>
@@ -296,7 +296,7 @@
                   order of a bucket reaches "infinity", it is no longer 
incremented or decremented.
                 </t>
                 <t>
-                  The parameters L and k and the number of bits allocated to 
the counters should depend on the set size.
+                  The parameters L and k and the number of bits allocated to 
the counters depend on the set size.
                   An IBF will degenerate when subjected to insert and remove 
iterations of different elements, and eventually all
                   buckets will reach "infinity".  The speed of the degradation 
will depend on the choice of L and k in
                   relation to the number of elements stored in the IBF.
@@ -487,12 +487,12 @@ hashSum |    0x0000   |   0x4242    |    0x0000   |   
0x4242    |
                         decoder returns an IDSUM value that is actually the 
XOR of several IDSUMs.
                         This is primarily detected by checking that the 
HASHSUM is the hash of the IDSUM.
                         Only if the HASHSUM also matches, the bucket could be 
pure.  Additionally,
-                        one should check that the IDSUM value actually would 
be mapped by M to
+                        one MUST check that the IDSUM value actually would be 
mapped by M to
                         the respective bucket. If not, there was a hash 
collision.
                     </t>
                     <t>
                         The very rare case that after all these checks a 
bucket is still
-                        falsely identified as pure must be detected (say by 
determining that
+                        falsely identified as pure MUST be detected (say by 
determining that
                         extracted element IDs do not match any actual 
elements), and addressed
                         at a higher level in the protocol. As these failures 
are probabilistic
                         and depend on element IDs and the IBF construction, 
they can typically
@@ -501,7 +501,7 @@ hashSum |    0x0000   |   0x4242    |    0x0000   |   
0x4242    |
                         a different mapping function M.
                         A more common scenario (especially if L was too small) 
is that
                         IBF decoding fails because there is no pure bucket. In 
this case, the
-                        higher-level protocol should also retry using 
different parameters.
+                        higher-level protocol SHOULD also retry using 
different parameters.
                     </t>
                     <t>
                         Suppose the IBF contains a pure bucket. In this case, 
the IDSUM in the
@@ -509,7 +509,7 @@ hashSum |    0x0000   |   0x4242    |    0x0000   |   
0x4242    |
                         to remove that element from the IBF (by inserting it 
if the counter
                         was negative, and by removing it if the counter was 
positive). This
                         is likely to cause other buckets to become pure, 
allowing further
-                        elements to be decoded.  Eventually, decoding should 
succeed with
+                        elements to be decoded.  Eventually, decoding ought to 
succeed with
                         all counters and IDSUM and HASHSUM values reach zero. 
However, it is also
                         possible that an IBF only partly decodes and then 
decoding fails after
                         obtaining some elements.
@@ -576,10 +576,10 @@ hashSum |    0x0000   |    0x0000   |    0x0000   |    
0x0000   |
                         IBF2 represents set B.  Then this set difference 
operation will compute IBF3 which
                         represents the set A - B --- without having to 
transfer the elements from set A or B.
 
-                        To calculate the IBF representing this set difference, 
both IBFs must have the same
+                        To calculate the IBF representing this set difference, 
both IBFs MUST have the same
                         length L, the same number of buckets per element k and 
use the same map M.  Given this,
                         one can compute the IBF representing the set 
difference by taking the XOR of the IDSUM and HASHSUM values
-                        of the respective buckets and subtracting the 
respective counters.  Care should be taken
+                        of the respective buckets and subtracting the 
respective counters.  Care MUST be taken
                         to handle overflows and underflows by setting the 
counter to "infinity" as necessary.
                         The result is a new IBF with the same number of 
buckets representing the set difference.
                     </t>
@@ -645,10 +645,12 @@ hashSum |    0x0101   |    0x0101   |    0x5050   |    
0x0000   |
                 <t>
                     To facilitate a reasonably CPU-efficient
                     implementation, this specification requires the
-                    IBF counter always to use 8 bits.  Fewer bits
+                    IBF counter always to use 8 bits.
+                    FIXME: I thought we lifted this constraint!?
+                    Fewer bits
                     would result in a particularly inefficient
                     implementation, while more bits are rarely useful
-                    as sets with so many elements should likely be
+                    as sets with so many elements should be
                     represented using a larger number of buckets. This
                     means the counter of this design can reach a
                     minimum of -127 and a maximum of 127 before the
@@ -656,7 +658,7 @@ hashSum |    0x0101   |    0x0101   |    0x5050   |    
0x0000   |
                 </t>
                 <t>
                     For the "IDSUM", we always use a 64-bit representation.
-                    The IDSUM value must have sufficient entropy for the
+                    The IDSUM value MUST have sufficient entropy for the
                     mapping function M to yield reasonably random buckets
                     even for very large values of L.  With a 32 bit
                     value the chance that multiple elements may be mapped
@@ -679,7 +681,7 @@ hashSum |    0x0101   |    0x0101   |    0x5050   |    
0x0000   |
                     to handle occasional collisions, so while with
                     32-bits there remains a chance of
                     accidental collisions, at 32 bit the chance is
-                    generally believed to be sufficiently small 
+                    generally believed to be sufficiently small
                     for the protocol to handle those cases efficiently
                     for a wide range of use-cases.  Smaller hash
                     values would safe bandwidth, but also drastically
@@ -812,7 +814,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, 
ibf_size)
                     Basically a Strata Estimator (SE) is a series of IBFs 
(with a rather small value of L)
                     in which increasingly large subsets of the full set
                     of elements are added to each IBF.  For the n-th IBF, the 
function selecting the
-                    subset of elements should sample to select 
(probabilistically) 1/(2^n) of all
+                    subset of elements MUST sample to select 
(probabilistically) 1/(2^n) of all
                     elements.  This can be done by counting the number of 
trailing bits set to "1"
                     in an element ID, and then inserting the element into the 
IBF identified by that
                     counter.  As a result, all elements will be mapped to one 
IBF, with the n-th
@@ -852,7 +854,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, 
ibf_size)
             <t>
                 The second possibility is that the difference of the sets is 
small compared to the set size.
                 In this case, an efficient "delta" synchronization mode is 
more efficient. These two possibilities given,
-                the first steps of the protocol are used to determine which 
mode should be used.
+                the first steps of the protocol are used to determine which 
mode MUST be used.
             </t>
 
             <t>
@@ -979,7 +981,7 @@ FUNCTION get_bucket_id (key, number_of_buckets_per_element, 
ibf_size)
                             <dt><em><xref target="messages_ibf" format="title" 
/></em> message:</dt>
                             <dd>
                                 If an <em><xref target="messages_ibf" 
format="title" /></em> message is received, this
-                                indicates that decoding of the IBF on the 
active site has failed and roles should be swapped.
+                                indicates that decoding of the IBF on the 
active site has failed and roles will be swapped.
                                 The receiving passive peer transitions into 
the <strong>Expecting IBF Last</strong> state,
                                 and waits for more <em><xref 
target="messages_ibf" format="title" /></em> messages
                                 or the final <em><xref 
target="messages_ibf_last" format="title" /></em> message to be received.
@@ -1050,8 +1052,10 @@ FUNCTION get_bucket_id (key, 
number_of_buckets_per_element, ibf_size)
                                 set there was probably a bucket decoded that 
was not pure so potentially all <em><xref target="messages_offer" 
format="title" /></em>
                                 and <em><xref target="messages_demand" 
format="title" /></em> messages sent later are invalid
                                 in this case a role change active -> passive 
with a new IBF is easiest.
+
                                 If a demand for the same element is received 
multiple times the demands should be
                                 discarded.
+                                FIXME: I thought demanding the same element 
multiple times is now verboten?
                             </dd>
                             <dt><em><xref target="messages_elements" 
format="title" /></em> message:</dt>
                             <dd>
@@ -1098,7 +1102,7 @@ FUNCTION get_bucket_id (key, 
number_of_buckets_per_element, ibf_size)
                     differences or if the byte-size of the elements is large. 
If the set difference is estimated to be large
                     the <xref target="modeofoperation_full-sync" 
format="title" /> is
                     more efficient. The exact heuristics and parameters on 
which the protocol decides which mode
-                    should be used are described in the <xref 
target="performance" format="title" /> section of this document.
+                    MUST be used are described in the <xref 
target="performance" format="title" /> section of this document.
                 </t>
                 <t>
                     There are two main cases when a <xref 
target="modeofoperation_full-sync" format="title" />
@@ -1106,7 +1110,7 @@ FUNCTION get_bucket_id (key, 
number_of_buckets_per_element, ibf_size)
                     The first case is when one of the peers announces having 
an empty set. This is announced by setting
                     the SETSIZE field in the <em><xref target="messages_se" 
format="title" /></em> to 0.
                     The second case is if the application requests full 
synchronization explicitly.
-                    This is useful for testing and should not be used in 
production.
+                    This is useful for testing and MUST NOT be used in 
production.
                 </t>
             </section>
         </section>
@@ -1211,7 +1215,7 @@ FUNCTION get_bucket_id (key, 
number_of_buckets_per_element, ibf_size)
                         <dd>
                             the type of SETU_P2P_REQUEST_IBF as registered in 
<xref target="gana" format="title" /> in network byte order.
                         </dd>
-                        
+
                         <dt>SIZE</dt>
                         <dd>
                             is a 32-bit unsigned integer which signals the 
number of buckets in the IBF.
@@ -1223,7 +1227,7 @@ FUNCTION get_bucket_id (key, 
number_of_buckets_per_element, ibf_size)
                             in the <xref 
target="performance_counter_variable_size" format="title" /> section.
                         </dd>
 
-                    
+
                         <dt>OFFSET</dt>
                         <dd>
                             is a 32-bit unsigned integer which signals the 
offset to the following ibf slices in the original.
@@ -1235,7 +1239,7 @@ FUNCTION get_bucket_id (key, 
number_of_buckets_per_element, ibf_size)
                         </dd>
                         <dt>IBF-SLICE</dt>
                         <dd>
-                            
+
                             <t>
                                 are variable numbers of slices in an array. A 
single slice contains multiple 64-bit IDSUMS,
                                 32-bit HASHSUMS and 1-64bit COUNTERS of 
variable size. In the network order the array of IDSUMS is first, followed
@@ -1243,7 +1247,7 @@ FUNCTION get_bucket_id (key, 
number_of_buckets_per_element, ibf_size)
                                 by MIN( SIZE - OFFSET, 
MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as
                                 32768 divided by the BUCKET_SIZE which is 
13-byte (104-bit).
                             </t>
-                            
+
                             <t>
                                 To get the IDSUM field, all IDs hitting a 
bucket are added up with a binary XOR operation.
                                 See <xref target="ibf_format_id_generation" 
format="title" /> details about ID generation.
@@ -1257,7 +1261,7 @@ FUNCTION get_bucket_id (key, 
number_of_buckets_per_element, ibf_size)
                                 The algorithm to find the correct bucket in 
which the ID and the HASH have to be added
                                 is described in detail in section <xref 
target="ibf_format_bucket_identification" format="title" />.
                             </t>
-                            
+
                             <t>
                                 Test vectors for an implementation can be 
found in the <xref target="test_vectors" format="title" /> section
                             </t>
@@ -1976,8 +1980,8 @@ FUNCTION ibf_get_max_counter(ibf)
 
 # INPUTS:
 # ibf: The IBF
-# offset: The offset which defines the starting point from which bucket the 
compress operation should start
-# count: The number of buckets to in the array that should be compressed
+# offset: The offset which defines the starting point from which bucket the 
compress operation starts
+# count: The number of buckets to in the array that will be compressed
 # OUTPUTS:
 # returns: An byte array of compressed counters to send over the network
 
@@ -2031,8 +2035,8 @@ FUNCTION pack_counter(ibf, offset, count)
 
 # INPUTS:
 # ibf: The IBF
-# offset: The offset which defines the starting point from which bucket the 
compress operation should start
-# count: The number of buckets to in the array that should be compressed
+# offset: The offset which defines the starting point from which bucket the 
compress operation starts
+# count: The number of buckets to in the array that will be compressed
 # counter_bit_length: The bit length of the counter can be found in the ibf 
message in the ibf_counter_bit_length field
 # packed_data: A byte array which contains the data packed with the 
pack_counter function
 # OUTPUTS:
@@ -2106,7 +2110,7 @@ FUNCTION unpack_counter(ibf, offset, count, 
counter_bit_length, packed_data)
         </t>
         <t>
             The formal format of all messages needs to be properly validated. 
This is important to prevent many
-            attacks on the code. The application data should be validated by 
the application using
+            attacks on the code. The application data MUST be validated by the 
application using
             the protocol not by the implementation of the protocol.
             In case the format validation fails the set operation MUST be 
terminated.
         </t>
@@ -2539,7 +2543,7 @@ FUNCTION validate_messages_full_done(full_done_msg, 
full_element_msg_store,
                     In the Active Decoding state it is important to prevent an 
attacker from
                     generating and passing an unlimited amount of IBFs, that 
do not decode or
                     even worse, generate an IBF constructed, to send the peers 
in an endless loop.
-                    To prevent an endless loop in decoding, a loop detection 
should be implemented.
+                    To prevent an endless loop in decoding, a loop detection 
MUST be implemented.
                     The simplest solution would be to prevent decoding of more 
than a given amount of elements.
                     A more robust solution is to implement a algorithm that 
detects a loop by
                     analyzing past partially decoded IBFs. This can be archived
@@ -2749,7 +2753,7 @@ FUNCTION validate_messages_done(done_msg, offer_msg_store,
                             to ensure this, multiple factors need to be 
considered: The absolute plausible maximum of
                             elements in a set, which has to be predefined 
according
                             to the use case and the maximal plausible element 
increase since the last
-                            successful set reconciliation, which should be 
either predefined or can be calculated
+                            successful set reconciliation, which SHOULD be 
either predefined or can be calculated
                             with the gaussian distribution function over all 
passed set reconciliations.
                         </t>
                         <t>
@@ -2811,7 +2815,7 @@ FUNCTION validate_messages_se(se_msg, remote_setsize, 
local_setsize)
                             When the full done message is received from the 
remote peer all
                             elements, that the remote peer has committed to, 
need to be received,
                             otherwise the operation MUST be terminated. After 
receiving the
-                            full done message no future elements should be 
accepted.
+                            full done message, future elements MUST NOT be 
accepted.
                             The 512-bit hash of the complete reconciled set 
contained in
                             the full done message is required to ensure that 
both sets are truly identical. If
                             the sets differ, a resynchronisation is required. 
The number of possible
@@ -2831,10 +2835,10 @@ FUNCTION validate_messages_se(se_msg, remote_setsize, 
local_setsize)
                     <dd>
                         <t>
                             In case an IBF message is received by the peer a 
active/passive role switch
-                            is initiated by the active decoding remote peer. 
In this moment the peer should
+                            is initiated by the active decoding remote peer. 
In this moment the peer MUST
                             wait for all open offers and demands to be 
fulfilled, to prevent
                             retransmission before switching into active 
decoding operation mode.
-                            A switch into active decoding mode should only be 
permitted for
+                            A switch into active decoding mode MUST only be 
permitted for
                             a predefined number of times as described in the 
top section
                             of the security section.
                         </t>
@@ -2885,7 +2889,7 @@ FUNCTION validate_messages_ibf(offer_msg_store, 
demand_msg_store, element_msg_st
                         predefined absolute maximum of elements in the set, 
which is defined
                         by real world limitations.
                         To implement this restrictions, a list with all 
received inquiries
-                        should be stored and new inquiries should be checked 
against.
+                        MUST be stored and new inquiries MUST be checked 
against.
                         </t>
                         <figure 
anchor="security_states_passive_decoding_inquiry_code">
                             <artwork name="" type="" align="left" 
alt=""><![CDATA[
@@ -3179,4 +3183,3 @@ counter serie 3: 0x440B
         </section>
     </back>
 </rfc>
-

-- 
To stop receiving notification emails like this one, please contact
gnunet@gnunet.org.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]