[Monotone-devel] Netsync performance improvement patch

monotone-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] Netsync performance improvement patch

From:	Eric Anderson
Subject:	[Monotone-devel] Netsync performance improvement patch
Date:	Wed, 10 Aug 2005 00:29:30 -0700

This is the patch promised in my previous message that makes it
possible to get through performance test in a reasonable amount of
time.

Summary: The attached patch changes the recieve buffer from a string
to a string_queue.  This changes an O(n^2) algorithm to an O(n)
algorithm.  The practical effect on a smallish database is a 3.48x
CPU usage reduction on the pull side.  On a somewhat extreme case, it
resulted in a 24.7x CPU reduction on the pull side.

This patch will apply cleanly against the 0.22 revision of monotone
with the accounting patch applied.  If you don't have the accounting
patch applied, the ChangeLog patch will fail.

The ChangeLog entry (included in the patch):

2005-08-09  Eric Anderson  <address@hidden>

        * Changes to significantly improve network pull performance *
        string_queue.hh: created to store pending data and allow for
        efficient removal from the front.  The string queue automatically
        reduces its buffer size if it is very empty.    
        * hmac.{cc,hh}: Add in a version of chained_hmac::process that can
        operate on a string_queue for use during read.
        * netcmd.{cc,hh}: update netcmd::read to use a string_queue rather
        than a string, update all the regression tests also.  This required
        the somewhat ugly creation of a read_string function because the
        netcmd read and write functions are no longer using the same type.
        * netio.hh: introduce functions for operating on a string_queue. They
        are identical to the equivalent string functions except for the type
        of the argument.
        * netsync.cc: Use a string_queue rather than a string for storing the 
        input and output buffers.

Detailed discussion:

The current recieve side algorithm appends "packets" to a string as
they come in over the network.  Once an entire packet has been
recieved, the packet is removed from the front of the string.  If a
number of small packets arrive, and then a larger packet before the
smaller packets are removed, then the net effect will be to copy the
large packet multiple times before the smaller packet can be
processed.  In the worst case this makes the amount of copies O(n^2)
where n is the amount of data put into the buffer.  The problem seems
to be exacerbated the larger the amount of data that needs to be
transferred.

The solution to the problem is to maintain a string_queue that stores
the data.  Instead of copying the data to remove something from the
front, a pointer is simply moved forward.  This is almost the standard
implementation of a double-ended queue using an array with rotating
pointers in the array.  The only difference is that because a large
amount of the monotone netsync code assumes that the recieve data is
contiguous, when data would be appended that would overflow the end of
the buffer, the remaining data is copied to the front.

In addition, to avoid maintaining a large buffer after processing a
single large file, the string queue will resize itself downward if the
amount of data occupying it is significantly less than the buffer
size.

There is one ugly part of the patch.  The regression tests in
netcmd.cc using a string buffer to test writing and reading netcmd's.
To make this work, and to properly exercise the removal from a string
queue function, a function only used by the regression tests was
created which converts the string to a string_queue, uses the existing
function and then converts the string_queue back into a string.  This
ugliness could be eliminated by converting the output buffer to being
a string queue instead of a list of string, offset pairs.  This change
would probably have no effect on the performance, and would happen to
simplify the output code a small bit because the output code would no
longer have to track the fact that it has multiple output buffers.

The patch also notes a TODO in the netsync code where a large file
will be copied twice in a row; it was not clear what the appropriate
fix would be so the problem was just noted and left alone.  The patch
also has two QUESTIONS in it where there were pieces in the existing
code that seemed either redundant or unnecessary, however I didn't
want to remove them because I may have been missing something.

The patch has no performance or memory effect on almost all of the
tests in the performance test because they are either very small (just
the 0.19 version of monotone), or a single file.  The resident set
size is effectively unchanged by the patch, and the process size
decreases in a few cases.

For the case of having all the files from 0.10-0.22 in the checkout,
we have a 3.48x CPU usage reduction on the pull side, and 1.08x on the
serve side.  The copies and allocations on the pull side dropped by
54x and 39x respectively.

For the case of having all test files in the repository (an extreme
test), we have a 24.7x CPU usage reduction on the pull size, and 1.04x
on the serve side.  The copies and alllocations on the pull side
dropped by 192x and 189x respectively.  In this case, moving a 279MB
database via netsync caused 1.1 TiB of data to be copied, and took 41
CPU minutes.  Note that a 279MiB database is not completely
unreasonable.  I have a 117MiB database from real usage right now, and
they only get larger.

--- Performance Before:

monotone 0.22 (base revision: 28058ae3e850229a5d8fae65415cbbf82b435377)
                                     Maximum (MiB)     Copied   Malloc
     *Test*     Operation  CPU(s)   Size  Resident     (MiB)    (MiB)
--------------- ---------  ------  -------  -------   -------  -------
zero_small      add files    0.02     7.29     2.75       0.3      1.0
zero_small      commit       0.28     7.62     4.07       8.5     10.8
zero_small      checkout     0.03     7.36     3.52       0.4      1.3
zero_small      serve        0.25     9.38     4.78       5.5     10.4
zero_small      pull         0.05     9.38     4.71       0.6      2.0

zero_large      add files    3.91   420.42   414.09    1447.6    101.0
zero_large      commit       8.79   308.04   204.42     711.7    317.0
zero_large      checkout     4.45   208.66   204.50     401.1    208.2
zero_large      serve        7.23   210.41   205.79     606.6    221.1
zero_large      pull         6.76   310.55   205.79     602.6    310.5

random_medium   add files    0.25    51.67    45.34     126.3     11.0
random_medium   commit       2.79    73.93    69.79     196.8    114.9
random_medium   checkout     1.94    56.67    39.18      77.5     64.6
random_medium   serve        3.50    58.37    46.64     162.9    107.8
random_medium   pull         2.53    85.85    80.96     229.2    150.0

halfzero_large  add files    3.77   414.81   409.91    1739.4    101.0
halfzero_large  commit      18.07   413.28   409.19    1302.4    660.2
halfzero_large  checkout    12.35   345.79   274.10     586.2    413.2
halfzero_large  serve       21.20   347.49   307.53    1091.7    583.2
halfzero_large  pull        17.14   465.63   460.70    1444.6    868.9

random_large    add files    3.75   414.81   409.33    1963.3    101.0
random_large    commit      28.71   651.73   647.56    1894.2   1038.4
random_large    checkout    20.68   480.88   341.64     771.3    616.2
random_large    serve       35.50   482.55   407.45    1577.2    940.2
random_large    pull        28.12   753.71   748.77    2286.6   1459.7

monotone        add files    0.38     9.23     4.45      26.5     11.1
monotone        commit       3.33    12.74     9.06     131.5    109.5
monotone        checkout     1.57    19.60    15.68      55.8     45.3
monotone        serve        3.04    20.68    15.76     122.4     99.4
monotone        pull         3.41    15.18    10.32     138.8    137.0

mt_multiple     add files    4.86    12.29     7.40     238.9    103.5
mt_multiple     commit      29.04    39.20    35.25     893.7    678.6
mt_multiple     checkout    14.84    25.71    21.79     477.1    367.8
mt_multiple     serve       10.12    19.99    14.95     320.0    261.0
mt_multiple     pull        88.47    53.00    37.57   30291.3  20461.8

mt_bigfiles     add files    3.71    73.76    69.06    1318.2    106.8
mt_bigfiles     commit      17.61    50.80    42.32    1141.2    466.4
mt_bigfiles     checkout    10.28    50.20    41.49     519.6    261.2
mt_bigfiles     serve       20.97    52.14    42.58     934.4    404.9
mt_bigfiles     pull        17.24    56.63    47.30    1167.3    608.4

everything      add files   19.94   500.49   495.10    6707.3    519.4
everything      commit     104.94   658.30   654.24    6098.8   2997.4
everything      checkout    68.16   497.13   392.59    2831.1   1693.8
everything      serve      105.79   497.19   416.21    4905.3   2363.4
everything      pull       2480.6   875.84   779.97   1168339   654884


--- Performance After:
Test CPU: Intel(R) Pentium(R) M processor 1700MHz
monotone 0.22 (base revision: 072f38da9450e2e2e406332a480c8c7a50736f8b)
                                     Maximum (MiB)     Copied   Malloc
     *Test*     Operation  CPU(s)   Size  Resident     (MiB)    (MiB)
--------------- ---------  ------  -------  -------   -------  -------
zero_small      add files    0.02     7.30     2.72       0.3      1.0
zero_small      commit       0.27     7.63     4.04       8.3      9.8
zero_small      checkout     0.03     7.37     3.49       0.4      1.3
zero_small      serve        0.26     9.34     4.81       5.5     12.0
zero_small      pull         0.04     9.39     4.74       0.6      2.0

zero_large      add files    3.84   420.43   415.03    1447.6    101.0
zero_large      commit       8.73   308.08   203.94     711.3    316.4
zero_large      checkout     4.42   208.67   204.48     401.1    208.2
zero_large      serve        7.28   210.42   205.82     606.5    222.6
zero_large      pull         6.72   310.64   205.78     602.4    310.1

random_medium   add files    0.26    51.68    43.94     126.3     11.0
random_medium   commit       2.80    73.94    69.74     196.9    115.2
random_medium   checkout     1.95    56.68    39.16      77.5     64.6
random_medium   serve        3.47    58.42    46.66     162.4    105.4
random_medium   pull         2.56    85.68    80.99     219.2    137.4

halfzero_large  add files    3.77   414.82   407.79    1739.4    101.0
halfzero_large  commit      18.00   413.29   409.14    1302.6    660.8
halfzero_large  checkout    12.34   345.80   274.08     586.2    413.5
halfzero_large  serve       21.67   347.38   307.55    1092.1    583.2
halfzero_large  pull        16.94   465.37   460.75    1394.5    806.2

random_large    add files    3.61   414.82   409.60    1963.3    101.0
random_large    commit      28.50   651.74   647.52    1893.8   1039.7
random_large    checkout    20.30   480.88   341.62     771.3    615.6
random_large    serve       35.27   482.50   407.48    1577.1    942.3
random_large    pull        27.93   753.48   748.80    2186.5   1335.0

monotone        add files    0.38     9.24     4.42      26.5     11.1
monotone        commit       3.28    12.64     9.03     131.4    108.1
monotone        checkout     1.58    19.75    15.72      55.8     45.3
monotone        serve        3.02    20.78    15.52     101.5     91.1
monotone        pull         3.42    15.52    10.69     130.2    104.4

mt_multiple     add files    4.86    12.88     7.56     238.9    103.5
mt_multiple     commit      29.05    39.21    35.21     894.0    680.8
mt_multiple     checkout    14.73    25.65    21.77     477.9    368.3
mt_multiple     serve        9.38    19.80    14.98     319.5    254.0
mt_multiple     pull        25.41    42.98    38.43     557.2    524.5

mt_bigfiles     add files    3.70    73.77    69.03    1318.2    106.8
mt_bigfiles     commit      17.61    50.81    42.28    1141.7    468.8
mt_bigfiles     checkout    10.34    50.20    41.46     519.6    261.2
mt_bigfiles     serve       21.03    52.16    42.60     934.2    405.8
mt_bigfiles     pull        17.29    56.27    47.29    1134.3    566.4

everything      add files   19.77   500.50   494.79    6707.3    519.4
everything      commit     105.09   658.31   654.20    6098.6   2997.3
everything      checkout    66.96   497.14   392.41    2832.1   1694.9
everything      serve      101.61   496.60   415.63    4664.8   2237.3
everything      pull       100.17   764.49   759.98    6095.8   3469.3

netsync-string-queue.patch
Description: Binary data

[Prev in Thread]

Current Thread

[Next in Thread]

[Monotone-devel] Netsync performance improvement patch, Eric Anderson <=
- Re: [Monotone-devel] Netsync performance improvement patch, Eric Anderson, 2005/08/12
  - Re: [Monotone-devel] Netsync performance improvement patch, Matt Johnston, 2005/08/14
    - Re: [Monotone-devel] Netsync performance improvement patch, Eric Anderson, 2005/08/15
    - Re: [Monotone-devel] Netsync performance improvement patch, Nathaniel Smith, 2005/08/15
    - Re: [Monotone-devel] Netsync performance improvement patch, Eric Anderson, 2005/08/15
    - Re: [Monotone-devel] Netsync performance improvement patch, Petr Ovtchenkov, 2005/08/15
    - Re: [Monotone-devel] Netsync performance improvement patch, Eric Anderson, 2005/08/15

Prev by Date: Re: [Monotone-devel] [Fwd: monotone_0.22-1_i386.changes ACCEPTED]
Next by Date: Re: [Monotone-devel] [ANNOUNCE] monotone 0.22
Previous by thread: [Monotone-devel] Interesting bug in monotone-viz
Next by thread: Re: [Monotone-devel] Netsync performance improvement patch
Index(es):
- Date
- Thread