
[SCM] GNU gnutls branch, master, updated. gnutls_2_99_2-40-ga039caa


From: Nikos Mavrogiannopoulos
Subject: [SCM] GNU gnutls branch, master, updated. gnutls_2_99_2-40-ga039caa
Date: Wed, 01 Jun 2011 15:04:59 +0000

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU gnutls".

http://git.savannah.gnu.org/cgit/gnutls.git/commit/?id=a039caa12418bda113f0cc6fd0457b1ea194b9e2

The branch, master has been updated
       via  a039caa12418bda113f0cc6fd0457b1ea194b9e2 (commit)
       via  33f9812bf5dc8bd3f6c838a479709a55bc472b59 (commit)
       via  3e5615a029acbec8f76fc90246c05ab880ecf15d (commit)
       via  eaad59046eb13534548c45b0d107fdbdbc007d0d (commit)
       via  b948e80596ba5810043ef53072a8f712a07bcafc (commit)
       via  f96480d46bb129810ceb4af0110f680fa508a5e7 (commit)
       via  2cfb3af0a6b8cd3a7be436420317da37643d064f (commit)
       via  63b7a5903ea3d7fbc9378ea5e4cc2dfc96c495f5 (commit)
       via  48b8c4ae5f2fd5cb8d0933c619e4777fdc89228d (commit)
       via  3fa64d9b9821d12d69033b5889281f39aa6e6b92 (commit)
      from  dc760beb15db99f654b0b9d1186b3b0f8ebd3ab1 (commit)

Those revisions listed above that are new to this repository have
not appeared in any other notification email, so they are listed
in full below.

- Log -----------------------------------------------------------------
commit a039caa12418bda113f0cc6fd0457b1ea194b9e2
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 16:52:09 2011 +0200

    Added benchmark on GCM ciphersuites and arcfour for comparison.

commit 33f9812bf5dc8bd3f6c838a479709a55bc472b59
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 16:46:55 2011 +0200

    corrected typo.

commit 3e5615a029acbec8f76fc90246c05ab880ecf15d
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 16:45:55 2011 +0200

    indented code

commit eaad59046eb13534548c45b0d107fdbdbc007d0d
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 16:42:02 2011 +0200

    properly initialize benchmarks.

commit b948e80596ba5810043ef53072a8f712a07bcafc
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 16:33:42 2011 +0200

    bumped version.

commit f96480d46bb129810ceb4af0110f680fa508a5e7
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 16:33:28 2011 +0200

    Corrections in encryption and decryption of incomplete blocks.
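The pattern used for trailing partial blocks (visible as ctr_encrypt_last in the patch below) can be shown in isolation: the block cipher only processes whole 16-byte blocks, so the trailing bytes are copied into a zero-padded scratch block, processed, and only the valid output bytes are kept. In this sketch a toy keystream stands in for aesni_ctr32_encrypt_blocks.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK 16

/* Stand-in for one block of CTR keystream output. */
static void toy_process_block(const uint8_t in[BLOCK], uint8_t out[BLOCK])
{
  for (int i = 0; i < BLOCK; i++)
    out[i] = in[i] ^ 0xaa;
}

/* Encrypt the last `rest` bytes of src (starting at offset pos)
 * via a padded scratch block; only `rest` bytes reach dst. */
static void encrypt_last(const uint8_t *src, uint8_t *dst,
                         size_t pos, size_t rest)
{
  uint8_t tmp[BLOCK] = { 0 };
  uint8_t out[BLOCK];

  memcpy(tmp, src + pos, rest);   /* pad the partial block with zeros */
  toy_process_block(tmp, out);
  memcpy(dst + pos, out, rest);   /* keep only the valid bytes */
}
```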

commit 2cfb3af0a6b8cd3a7be436420317da37643d064f
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 14:36:44 2011 +0200

    Use nettle's memxor or gnulib's if it doesn't exist.
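As background on what memxor provides (this sketch is illustrative; the actual implementations ship with nettle and gnulib): it XORs a source buffer into a destination buffer in place, which is the core step when folding data into the GHASH state.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal sketch of memxor's semantics: XOR `len` bytes of
 * src into dst and return dst, like memcpy but with XOR. */
static void *my_memxor(void *dst, const void *src, size_t len)
{
  uint8_t *d = dst;
  const uint8_t *s = src;

  for (size_t i = 0; i < len; i++)
    d[i] ^= s[i];
  return dst;
}
```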

commit 63b7a5903ea3d7fbc9378ea5e4cc2dfc96c495f5
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 14:33:12 2011 +0200

    Added AES-GCM optimizations using the PCLMULQDQ instruction. Uses Andy Polyakov's assembly code.
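As background, a carry-less multiply is an ordinary shift-and-add multiply with the additions replaced by XOR (addition in GF(2)); PCLMULQDQ performs it on 64-bit operands in hardware, and GHASH builds its GF(2^128) multiplications out of such products. A miniature 8-bit illustration (not part of the patch):

```c
#include <stdint.h>

/* Carry-less multiply of two 8-bit values: like long multiplication,
 * but partial products are combined with XOR, so no carries propagate.
 * PCLMULQDQ does the same for 64-bit operands in one instruction. */
static uint16_t clmul8(uint8_t a, uint8_t b)
{
  uint16_t r = 0;

  for (int i = 0; i < 8; i++)
    if (b & (1u << i))
      r ^= (uint16_t)a << i;
  return r;
}
```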

commit 48b8c4ae5f2fd5cb8d0933c619e4777fdc89228d
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 14:04:11 2011 +0200

    documented usage of gnutls_cipher_add_auth().

commit 3fa64d9b9821d12d69033b5889281f39aa6e6b92
Author: Nikos Mavrogiannopoulos <address@hidden>
Date:   Wed Jun 1 00:04:47 2011 +0200

    updates.

-----------------------------------------------------------------------

Summary of changes:
 NEWS                                             |    3 +
 configure.ac                                     |    2 +-
 doc/cha-intro-tls.texi                           |    6 +-
 lib/accelerated/intel/Makefile.am                |    6 +-
 lib/accelerated/intel/aes-gcm-x86.c              |  271 ++++++
 lib/accelerated/intel/aes-x86.c                  |   68 +-
 lib/accelerated/intel/aes-x86.h                  |   41 +
 lib/accelerated/intel/asm/appro-aes-gcm-x86-64.s | 1065 ++++++++++++++++++++++
 lib/accelerated/intel/asm/appro-aes-gcm-x86.s    |  991 ++++++++++++++++++++
 lib/crypto-api.c                                 |    4 +-
 lib/gnutls_int.h                                 |    7 +
 lib/gnutls_num.c                                 |    9 +
 lib/gnutls_num.h                                 |    1 +
 lib/gnutls_state.c                               |   16 +-
 m4/hooks.m4                                      |    2 +-
 src/benchmark-tls.c                              |   20 +-
 src/benchmark.c                                  |    2 +-
 tests/cipher-test.c                              |  735 +++++++++------
 18 files changed, 2887 insertions(+), 362 deletions(-)
 create mode 100644 lib/accelerated/intel/aes-gcm-x86.c
 create mode 100644 lib/accelerated/intel/asm/appro-aes-gcm-x86-64.s
 create mode 100644 lib/accelerated/intel/asm/appro-aes-gcm-x86.s

diff --git a/NEWS b/NEWS
index 86abf15..02b87ea 100644
--- a/NEWS
+++ b/NEWS
@@ -5,6 +5,9 @@ See the end for copying conditions.
 
 * Version 2.99.3 (unreleased)
 
+** libgnutls: Added AES-GCM optimizations using the PCLMULQDQ
+instruction. Uses Andy Polyakov's assembly code.
+
 ** libgnutls: Added ECDHE-PSK ciphersuites for TLS (RFC 5489).
 
 ** API and ABI modifications:
diff --git a/configure.ac b/configure.ac
index 00f4a7e..e02ed71 100644
--- a/configure.ac
+++ b/configure.ac
@@ -22,7 +22,7 @@ dnl Process this file with autoconf to produce a configure script.
 # USA
 
 AC_PREREQ(2.61)
-AC_INIT([GnuTLS], [2.99.2], address@hidden)
+AC_INIT([GnuTLS], [2.99.3], address@hidden)
 AC_CONFIG_AUX_DIR([build-aux])
 AC_CONFIG_MACRO_DIR([m4])
 
diff --git a/doc/cha-intro-tls.texi b/doc/cha-intro-tls.texi
index 2109d2b..7646992 100644
--- a/doc/cha-intro-tls.texi
+++ b/doc/cha-intro-tls.texi
@@ -394,7 +394,7 @@ To initiate the handshake.
 * Client Authentication::       Requesting a certificate from the client.
 * Resuming Sessions::           Reusing previously established keys.
 * Resuming Internals::          More information on reusing previously established keys.
-* Interoperability Issues::     Interoperability issues with other implementations.
+* Interoperability::            About interoperability with other implementations.
 @end menu
 
 @node TLS Cipher Suites
@@ -661,8 +661,8 @@ It might also be useful to be able to check for expired sessions in
 order to remove them, and save space. The function
 @ref{gnutls_db_check_entry} is provided for that reason.
 
address@hidden Interoperability Issues
address@hidden Interoperability Issues
address@hidden Interoperability
address@hidden Interoperability
 
 The @acronym{TLS} handshake is a complex procedure that negotiates all
 required parameters for a secure session. @acronym{GnuTLS} supports
diff --git a/lib/accelerated/intel/Makefile.am b/lib/accelerated/intel/Makefile.am
index 013fd9d..c0d380e 100644
--- a/lib/accelerated/intel/Makefile.am
+++ b/lib/accelerated/intel/Makefile.am
@@ -37,12 +37,12 @@ EXTRA_DIST = aes-x86.h README license.txt
 
 noinst_LTLIBRARIES = libintel.la
 
-libintel_la_SOURCES = aes-x86.c
+libintel_la_SOURCES = aes-x86.c aes-gcm-x86.c
 libintel_la_LIBADD =
 
 if ASM_X86_64
-libintel_la_SOURCES += asm/appro-aes-x86-64.s
+libintel_la_SOURCES += asm/appro-aes-x86-64.s asm/appro-aes-gcm-x86-64.s
 else
-libintel_la_SOURCES += asm/appro-aes-x86.s
+libintel_la_SOURCES += asm/appro-aes-x86.s asm/appro-aes-gcm-x86.s
 endif
 
diff --git a/lib/accelerated/intel/aes-gcm-x86.c b/lib/accelerated/intel/aes-gcm-x86.c
new file mode 100644
index 0000000..22bbac9
--- /dev/null
+++ b/lib/accelerated/intel/aes-gcm-x86.c
@@ -0,0 +1,271 @@
+/*
+ * Copyright (C) 2011, Free Software Foundation
+ *
+ * Author: Nikos Mavrogiannopoulos
+ *
+ * This file is part of GnuTLS.
+ *
+ * The GnuTLS is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ * USA
+ *
+ * The following code is an implementation of the AES-128-GCM cipher
+ * using intel's AES instruction set. It is based on Intel reference
+ * code.
+ */
+
+#include <gnutls_errors.h>
+#include <gnutls_int.h>
+#include <gnutls/crypto.h>
+#include <gnutls_errors.h>
+#include <aes-x86.h>
+#include <x86.h>
+#include <byteswap.h>
+
+#define GCM_BLOCK_SIZE 16
+
+/* GCM mode */
+
+typedef struct
+{
+  uint64_t hi, lo;
+} u128;
+
+/* This is the gcm128 structure used in openssl. It
+ * is compatible with the included assembly code.
+ */
+struct gcm128_context
+{
+  union
+  {
+    uint64_t u[2];
+    uint32_t d[4];
+    uint8_t c[16];
+  } Yi, EKi, EK0, len, Xi, H;
+  u128 Htable[16];
+};
+
+struct aes_gcm_ctx
+{
+  AES_KEY expanded_key;
+  struct gcm128_context gcm;
+};
+
+void gcm_init_clmul (u128 Htable[16], const u64 Xi[2]);
+void gcm_ghash_clmul (uint64_t Xi[2], const u128 Htable[16],
+                      const uint8_t * inp, size_t len);
+void gcm_gmult_clmul (u64 Xi[2], const u128 Htable[16]);
+
+static void
+aes_gcm_deinit (void *_ctx)
+{
+  gnutls_free (_ctx);
+}
+
+static int
+aes_gcm_cipher_init (gnutls_cipher_algorithm_t algorithm, void **_ctx)
+{
+  struct aes_gcm_ctx *ctx;
+
+  /* only AES-128-GCM is implemented by this backend */
+  if (algorithm != GNUTLS_CIPHER_AES_128_GCM)
+    return GNUTLS_E_INVALID_REQUEST;
+
+  *_ctx = gnutls_calloc (1, sizeof (struct aes_gcm_ctx));
+  if (*_ctx == NULL)
+    {
+      gnutls_assert ();
+      return GNUTLS_E_MEMORY_ERROR;
+    }
+
+  ctx = *_ctx;
+
+  return 0;
+}
+
+static int
+aes_gcm_cipher_setkey (void *_ctx, const void *userkey, size_t keysize)
+{
+  struct aes_gcm_ctx *ctx = _ctx;
+  int ret;
+
+  ret = aesni_set_encrypt_key (userkey, keysize * 8, &ctx->expanded_key);
+  if (ret != 0)
+    return gnutls_assert_val (GNUTLS_E_ENCRYPTION_FAILED);
+
+  aesni_ecb_encrypt (ctx->gcm.H.c, ctx->gcm.H.c,
+                     GCM_BLOCK_SIZE, &ctx->expanded_key, 1);
+
+  ctx->gcm.H.u[0] = bswap_64 (ctx->gcm.H.u[0]);
+  ctx->gcm.H.u[1] = bswap_64 (ctx->gcm.H.u[1]);
+
+  gcm_init_clmul (ctx->gcm.Htable, ctx->gcm.H.u);
+
+  return 0;
+}
+
+static int
+aes_gcm_setiv (void *_ctx, const void *iv, size_t iv_size)
+{
+  struct aes_gcm_ctx *ctx = _ctx;
+
+  if (iv_size != GCM_BLOCK_SIZE - 4)
+    return GNUTLS_E_INVALID_REQUEST;
+
+  memset (ctx->gcm.Xi.c, 0, sizeof (ctx->gcm.Xi.c));
+  memset (ctx->gcm.len.c, 0, sizeof (ctx->gcm.len.c));
+
+  memcpy (ctx->gcm.Yi.c, iv, GCM_BLOCK_SIZE - 4);
+  ctx->gcm.Yi.c[GCM_BLOCK_SIZE - 4] = 0;
+  ctx->gcm.Yi.c[GCM_BLOCK_SIZE - 3] = 0;
+  ctx->gcm.Yi.c[GCM_BLOCK_SIZE - 2] = 0;
+  ctx->gcm.Yi.c[GCM_BLOCK_SIZE - 1] = 1;
+
+  aesni_ecb_encrypt (ctx->gcm.Yi.c, ctx->gcm.EK0.c,
+                     GCM_BLOCK_SIZE, &ctx->expanded_key, 1);
+  ctx->gcm.Yi.c[GCM_BLOCK_SIZE - 1] = 2;
+  return 0;
+}
+
+static void
+gcm_ghash (struct aes_gcm_ctx *ctx, const uint8_t * src, size_t src_size)
+{
+  size_t rest = src_size % GCM_BLOCK_SIZE;
+  size_t aligned_size = src_size - rest;
+
+  if (aligned_size > 0)
+    gcm_ghash_clmul (ctx->gcm.Xi.u, ctx->gcm.Htable, src, aligned_size);
+
+  if (rest > 0)
+    {
+      memxor (ctx->gcm.Xi.c, src + aligned_size, rest);
+      gcm_gmult_clmul (ctx->gcm.Xi.u, ctx->gcm.Htable);
+    }
+}
+
+static inline void
+ctr_encrypt_last (struct aes_gcm_ctx *ctx, const uint8_t * src,
+                  uint8_t * dst, size_t pos, size_t length)
+{
+  uint8_t tmp[GCM_BLOCK_SIZE];
+  uint8_t out[GCM_BLOCK_SIZE];
+
+  memcpy (tmp, &src[pos], length);
+  aesni_ctr32_encrypt_blocks (tmp, out, 1, &ctx->expanded_key, ctx->gcm.Yi.c);
+
+  memcpy (&dst[pos], out, length);
+
+}
+
+static int
+aes_gcm_encrypt (void *_ctx, const void *src, size_t src_size,
+                 void *dst, size_t length)
+{
+  struct aes_gcm_ctx *ctx = _ctx;
+  int blocks = src_size / GCM_BLOCK_SIZE;
+  int exp_blocks = blocks * GCM_BLOCK_SIZE;
+  int rest = src_size - (exp_blocks);
+  uint32_t counter;
+
+  if (blocks > 0)
+    {
+      aesni_ctr32_encrypt_blocks (src, dst,
+                                  blocks, &ctx->expanded_key, ctx->gcm.Yi.c);
+
+      counter = _gnutls_read_uint32 (ctx->gcm.Yi.c + 12);
+      counter += blocks;
+      _gnutls_write_uint32 (counter, ctx->gcm.Yi.c + 12);
+    }
+
+  if (rest > 0)                 /* last incomplete block */
+    ctr_encrypt_last (ctx, src, dst, exp_blocks, rest);
+
+  gcm_ghash (ctx, dst, src_size);
+  ctx->gcm.len.u[1] += src_size;
+
+  return 0;
+}
+
+static int
+aes_gcm_decrypt (void *_ctx, const void *src, size_t src_size,
+                 void *dst, size_t dst_size)
+{
+  struct aes_gcm_ctx *ctx = _ctx;
+  int blocks = src_size / GCM_BLOCK_SIZE;
+  int exp_blocks = blocks * GCM_BLOCK_SIZE;
+  int rest = src_size - (exp_blocks);
+  uint32_t counter;
+
+  gcm_ghash (ctx, src, src_size);
+  ctx->gcm.len.u[1] += src_size;
+
+  if (blocks > 0)
+    {
+      aesni_ctr32_encrypt_blocks (src, dst,
+                                  blocks, &ctx->expanded_key, ctx->gcm.Yi.c);
+
+      counter = _gnutls_read_uint32 (ctx->gcm.Yi.c + 12);
+      counter += blocks;
+      _gnutls_write_uint32 (counter, ctx->gcm.Yi.c + 12);
+    }
+
+  if (rest > 0)                 /* last incomplete block */
+    ctr_encrypt_last (ctx, src, dst, exp_blocks, rest);
+
+  return 0;
+}
+
+static int
+aes_gcm_auth (void *_ctx, const void *src, size_t src_size)
+{
+  struct aes_gcm_ctx *ctx = _ctx;
+
+  gcm_ghash (ctx, src, src_size);
+  ctx->gcm.len.u[0] += src_size;
+
+  return 0;
+}
+
+
+static void
+aes_gcm_tag (void *_ctx, void *tag, size_t tagsize)
+{
+  struct aes_gcm_ctx *ctx = _ctx;
+  uint8_t buffer[GCM_BLOCK_SIZE];
+  uint64_t alen, clen;
+
+  alen = ctx->gcm.len.u[0] * 8;
+  clen = ctx->gcm.len.u[1] * 8;
+
+  _gnutls_write_uint64 (alen, buffer);
+  _gnutls_write_uint64 (clen, &buffer[8]);
+
+  gcm_ghash_clmul (ctx->gcm.Xi.u, ctx->gcm.Htable, buffer, GCM_BLOCK_SIZE);
+
+  ctx->gcm.Xi.u[0] ^= ctx->gcm.EK0.u[0];
+  ctx->gcm.Xi.u[1] ^= ctx->gcm.EK0.u[1];
+
+  memcpy (tag, ctx->gcm.Xi.c, MIN (GCM_BLOCK_SIZE, tagsize));
+}
+
+const gnutls_crypto_cipher_st aes_gcm_struct = {
+  .init = aes_gcm_cipher_init,
+  .setkey = aes_gcm_cipher_setkey,
+  .setiv = aes_gcm_setiv,
+  .encrypt = aes_gcm_encrypt,
+  .decrypt = aes_gcm_decrypt,
+  .deinit = aes_gcm_deinit,
+  .tag = aes_gcm_tag,
+  .auth = aes_gcm_auth,
+};
diff --git a/lib/accelerated/intel/aes-x86.c b/lib/accelerated/intel/aes-x86.c
index daffccd..02c4549 100644
--- a/lib/accelerated/intel/aes-x86.c
+++ b/lib/accelerated/intel/aes-x86.c
@@ -32,27 +32,6 @@
 #include <aes-x86.h>
 #include <x86.h>
 
-#ifdef __GNUC__
-# define ALIGN16 __attribute__ ((aligned (16)))
-#else
-# define ALIGN16
-#endif
-
-#define AES_MAXNR 14
-typedef struct
-{
-  uint32_t ALIGN16 rd_key[4 * (AES_MAXNR + 1)];
-  int rounds;
-} AES_KEY;
-
-void aesni_cbc_encrypt (const unsigned char *in, unsigned char *out,
-                        size_t len, const AES_KEY * key,
-                        unsigned char *ivec, const int enc);
-int aesni_set_decrypt_key (const unsigned char *userKey, const int bits,
-                           AES_KEY * key);
-int aesni_set_encrypt_key (const unsigned char *userKey, const int bits,
-                           AES_KEY * key);
-
 struct aes_ctx
 {
   AES_KEY expanded_key;
@@ -110,23 +89,22 @@ aes_setiv (void *_ctx, const void *iv, size_t iv_size)
 }
 
 static int
-aes_encrypt (void *_ctx, const void *plain, size_t plainsize,
-             void *encr, size_t length)
+aes_encrypt (void *_ctx, const void *src, size_t src_size,
+             void *dst, size_t dst_size)
 {
   struct aes_ctx *ctx = _ctx;
 
-  aesni_cbc_encrypt (plain, encr, plainsize, &ctx->expanded_key, ctx->iv, 1);
+  aesni_cbc_encrypt (src, dst, src_size, &ctx->expanded_key, ctx->iv, 1);
   return 0;
 }
 
 static int
-aes_decrypt (void *_ctx, const void *encr, size_t encrsize,
-             void *plain, size_t length)
+aes_decrypt (void *_ctx, const void *src, size_t src_size,
+             void *dst, size_t dst_size)
 {
   struct aes_ctx *ctx = _ctx;
 
-  aesni_cbc_encrypt (encr, plain, encrsize,
-                     &ctx->expanded_key_dec, ctx->iv, 0);
+  aesni_cbc_encrypt (src, dst, src_size, &ctx->expanded_key_dec, ctx->iv, 0);
 
   return 0;
 }
@@ -156,17 +134,25 @@ check_optimized_aes (void)
 }
 
 static unsigned
+check_pclmul (void)
+{
+  unsigned int a, b, c, d;
+  cpuid (1, a, b, c, d);
+
+  return (c & 0x2);
+}
+
+static unsigned
 check_intel_or_amd (void)
 {
   unsigned int a, b, c, d;
   cpuid (0, a, b, c, d);
 
-  if ((memcmp(&b, "Genu", 4) == 0 &&
-               memcmp(&d, "ineI", 4) == 0 &&
-               memcmp(&c, "ntel", 4) == 0) ||
-     (memcmp(&b, "Auth", 4) == 0 &&
-               memcmp(&d, "enti", 4) == 0 &&
-               memcmp(&c, "cAMD", 4) == 0))
+  if ((memcmp (&b, "Genu", 4) == 0 &&
+       memcmp (&d, "ineI", 4) == 0 &&
+       memcmp (&c, "ntel", 4) == 0) ||
+      (memcmp (&b, "Auth", 4) == 0 &&
+       memcmp (&d, "enti", 4) == 0 && memcmp (&c, "cAMD", 4) == 0))
     {
       return 1;
     }
@@ -179,7 +165,7 @@ register_x86_crypto (void)
 {
   int ret;
 
-  if (check_intel_or_amd() == 0)
+  if (check_intel_or_amd () == 0)
     return;
 
   if (check_optimized_aes ())
@@ -208,6 +194,18 @@ register_x86_crypto (void)
         {
           gnutls_assert ();
         }
+
+      if (check_pclmul ())
+        {
+          /* register GCM ciphers */
+          ret =
+            gnutls_crypto_single_cipher_register (GNUTLS_CIPHER_AES_128_GCM,
+                                                  80, &aes_gcm_struct);
+          if (ret < 0)
+            {
+              gnutls_assert ();
+            }
+        }
     }
 
   return;
diff --git a/lib/accelerated/intel/aes-x86.h b/lib/accelerated/intel/aes-x86.h
index 40d6a0c..8f49ff3 100644
--- a/lib/accelerated/intel/aes-x86.h
+++ b/lib/accelerated/intel/aes-x86.h
@@ -1 +1,42 @@
+#ifndef AES_X86_H
+# define AES_X86_H
+
+#include <gnutls_int.h>
+
 void register_x86_crypto (void);
+
+#ifdef __GNUC__
+# define ALIGN16 __attribute__ ((aligned (16)))
+#else
+# define ALIGN16
+#endif
+
+#define AES_MAXNR 14
+typedef struct
+{
+  uint32_t ALIGN16 rd_key[4 * (AES_MAXNR + 1)];
+  int rounds;
+} AES_KEY;
+
+void aesni_ecb_encrypt (const unsigned char *in, unsigned char *out,
+                        size_t len, const AES_KEY * key,
+                        int enc);
+
+void aesni_cbc_encrypt (const unsigned char *in, unsigned char *out,
+                        size_t len, const AES_KEY * key,
+                        unsigned char *ivec, const int enc);
+int aesni_set_decrypt_key (const unsigned char *userKey, const int bits,
+                           AES_KEY * key);
+int aesni_set_encrypt_key (const unsigned char *userKey, const int bits,
+                           AES_KEY * key);
+
+void aesni_ctr32_encrypt_blocks(const unsigned char *in,
+                           unsigned char *out,
+                           size_t blocks,
+                           const void *key,
+                           const unsigned char *ivec);
+
+
+const gnutls_crypto_cipher_st aes_gcm_struct;
+
+#endif
diff --git a/lib/accelerated/intel/asm/appro-aes-gcm-x86-64.s b/lib/accelerated/intel/asm/appro-aes-gcm-x86-64.s
new file mode 100644
index 0000000..4235cd2
--- /dev/null
+++ b/lib/accelerated/intel/asm/appro-aes-gcm-x86-64.s
@@ -0,0 +1,1065 @@
+# Copyright (c) 2006, Andy Polyakov by <address@hidden>
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+# 
+#     *        Redistributions of source code must retain copyright
+#      notices, this list of conditions and the following disclaimer.
+#
+#     *        Redistributions in binary form must reproduce the above
+#      copyright notice, this list of conditions and the following
+#      disclaimer in the documentation and/or other materials
+#      provided with the distribution.
+#
+#     *        Neither the name of the Andy Polyakov nor the names of its
+#      copyright holder and contributors may be used to endorse or
+#      promote products derived from this software without specific
+#      prior written permission.
+#
+# ALTERNATIVELY, provided that this notice is retained in full, this
+# product may be distributed under the terms of the GNU General Public
+# License (GPL), in which case the provisions of the GPL apply INSTEAD OF
+# those given above.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+.text  
+
+.globl gcm_gmult_4bit
+.type  gcm_gmult_4bit,@function
+.align 16
+gcm_gmult_4bit:
+       pushq   %rbx
+       pushq   %rbp
+       pushq   %r12
+.Lgmult_prologue:
+
+       movzbq  15(%rdi),%r8
+       leaq    .Lrem_4bit(%rip),%r11
+       xorq    %rax,%rax
+       xorq    %rbx,%rbx
+       movb    %r8b,%al
+       movb    %r8b,%bl
+       shlb    $4,%al
+       movq    $14,%rcx
+       movq    8(%rsi,%rax,1),%r8
+       movq    (%rsi,%rax,1),%r9
+       andb    $240,%bl
+       movq    %r8,%rdx
+       jmp     .Loop1
+
+.align 16
+.Loop1:
+       shrq    $4,%r8
+       andq    $15,%rdx
+       movq    %r9,%r10
+       movb    (%rdi,%rcx,1),%al
+       shrq    $4,%r9
+       xorq    8(%rsi,%rbx,1),%r8
+       shlq    $60,%r10
+       xorq    (%rsi,%rbx,1),%r9
+       movb    %al,%bl
+       xorq    (%r11,%rdx,8),%r9
+       movq    %r8,%rdx
+       shlb    $4,%al
+       xorq    %r10,%r8
+       decq    %rcx
+       js      .Lbreak1
+
+       shrq    $4,%r8
+       andq    $15,%rdx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       xorq    8(%rsi,%rax,1),%r8
+       shlq    $60,%r10
+       xorq    (%rsi,%rax,1),%r9
+       andb    $240,%bl
+       xorq    (%r11,%rdx,8),%r9
+       movq    %r8,%rdx
+       xorq    %r10,%r8
+       jmp     .Loop1
+
+.align 16
+.Lbreak1:
+       shrq    $4,%r8
+       andq    $15,%rdx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       xorq    8(%rsi,%rax,1),%r8
+       shlq    $60,%r10
+       xorq    (%rsi,%rax,1),%r9
+       andb    $240,%bl
+       xorq    (%r11,%rdx,8),%r9
+       movq    %r8,%rdx
+       xorq    %r10,%r8
+
+       shrq    $4,%r8
+       andq    $15,%rdx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       xorq    8(%rsi,%rbx,1),%r8
+       shlq    $60,%r10
+       xorq    (%rsi,%rbx,1),%r9
+       xorq    %r10,%r8
+       xorq    (%r11,%rdx,8),%r9
+
+       bswapq  %r8
+       bswapq  %r9
+       movq    %r8,8(%rdi)
+       movq    %r9,(%rdi)
+
+       movq    16(%rsp),%rbx
+       leaq    24(%rsp),%rsp
+.Lgmult_epilogue:
+       .byte   0xf3,0xc3
+.size  gcm_gmult_4bit,.-gcm_gmult_4bit
+.globl gcm_ghash_4bit
+.type  gcm_ghash_4bit,@function
+.align 16
+gcm_ghash_4bit:
+       pushq   %rbx
+       pushq   %rbp
+       pushq   %r12
+       pushq   %r13
+       pushq   %r14
+       pushq   %r15
+       subq    $280,%rsp
+.Lghash_prologue:
+       movq    %rdx,%r14
+       movq    %rcx,%r15
+       subq    $-128,%rsi
+       leaq    16+128(%rsp),%rbp
+       xorl    %edx,%edx
+       movq    0+0-128(%rsi),%r8
+       movq    0+8-128(%rsi),%rax
+       movb    %al,%dl
+       shrq    $4,%rax
+       movq    %r8,%r10
+       shrq    $4,%r8
+       movq    16+0-128(%rsi),%r9
+       shlb    $4,%dl
+       movq    16+8-128(%rsi),%rbx
+       shlq    $60,%r10
+       movb    %dl,0(%rsp)
+       orq     %r10,%rax
+       movb    %bl,%dl
+       shrq    $4,%rbx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       movq    %r8,0(%rbp)
+       movq    32+0-128(%rsi),%r8
+       shlb    $4,%dl
+       movq    %rax,0-128(%rbp)
+       movq    32+8-128(%rsi),%rax
+       shlq    $60,%r10
+       movb    %dl,1(%rsp)
+       orq     %r10,%rbx
+       movb    %al,%dl
+       shrq    $4,%rax
+       movq    %r8,%r10
+       shrq    $4,%r8
+       movq    %r9,8(%rbp)
+       movq    48+0-128(%rsi),%r9
+       shlb    $4,%dl
+       movq    %rbx,8-128(%rbp)
+       movq    48+8-128(%rsi),%rbx
+       shlq    $60,%r10
+       movb    %dl,2(%rsp)
+       orq     %r10,%rax
+       movb    %bl,%dl
+       shrq    $4,%rbx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       movq    %r8,16(%rbp)
+       movq    64+0-128(%rsi),%r8
+       shlb    $4,%dl
+       movq    %rax,16-128(%rbp)
+       movq    64+8-128(%rsi),%rax
+       shlq    $60,%r10
+       movb    %dl,3(%rsp)
+       orq     %r10,%rbx
+       movb    %al,%dl
+       shrq    $4,%rax
+       movq    %r8,%r10
+       shrq    $4,%r8
+       movq    %r9,24(%rbp)
+       movq    80+0-128(%rsi),%r9
+       shlb    $4,%dl
+       movq    %rbx,24-128(%rbp)
+       movq    80+8-128(%rsi),%rbx
+       shlq    $60,%r10
+       movb    %dl,4(%rsp)
+       orq     %r10,%rax
+       movb    %bl,%dl
+       shrq    $4,%rbx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       movq    %r8,32(%rbp)
+       movq    96+0-128(%rsi),%r8
+       shlb    $4,%dl
+       movq    %rax,32-128(%rbp)
+       movq    96+8-128(%rsi),%rax
+       shlq    $60,%r10
+       movb    %dl,5(%rsp)
+       orq     %r10,%rbx
+       movb    %al,%dl
+       shrq    $4,%rax
+       movq    %r8,%r10
+       shrq    $4,%r8
+       movq    %r9,40(%rbp)
+       movq    112+0-128(%rsi),%r9
+       shlb    $4,%dl
+       movq    %rbx,40-128(%rbp)
+       movq    112+8-128(%rsi),%rbx
+       shlq    $60,%r10
+       movb    %dl,6(%rsp)
+       orq     %r10,%rax
+       movb    %bl,%dl
+       shrq    $4,%rbx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       movq    %r8,48(%rbp)
+       movq    128+0-128(%rsi),%r8
+       shlb    $4,%dl
+       movq    %rax,48-128(%rbp)
+       movq    128+8-128(%rsi),%rax
+       shlq    $60,%r10
+       movb    %dl,7(%rsp)
+       orq     %r10,%rbx
+       movb    %al,%dl
+       shrq    $4,%rax
+       movq    %r8,%r10
+       shrq    $4,%r8
+       movq    %r9,56(%rbp)
+       movq    144+0-128(%rsi),%r9
+       shlb    $4,%dl
+       movq    %rbx,56-128(%rbp)
+       movq    144+8-128(%rsi),%rbx
+       shlq    $60,%r10
+       movb    %dl,8(%rsp)
+       orq     %r10,%rax
+       movb    %bl,%dl
+       shrq    $4,%rbx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       movq    %r8,64(%rbp)
+       movq    160+0-128(%rsi),%r8
+       shlb    $4,%dl
+       movq    %rax,64-128(%rbp)
+       movq    160+8-128(%rsi),%rax
+       shlq    $60,%r10
+       movb    %dl,9(%rsp)
+       orq     %r10,%rbx
+       movb    %al,%dl
+       shrq    $4,%rax
+       movq    %r8,%r10
+       shrq    $4,%r8
+       movq    %r9,72(%rbp)
+       movq    176+0-128(%rsi),%r9
+       shlb    $4,%dl
+       movq    %rbx,72-128(%rbp)
+       movq    176+8-128(%rsi),%rbx
+       shlq    $60,%r10
+       movb    %dl,10(%rsp)
+       orq     %r10,%rax
+       movb    %bl,%dl
+       shrq    $4,%rbx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       movq    %r8,80(%rbp)
+       movq    192+0-128(%rsi),%r8
+       shlb    $4,%dl
+       movq    %rax,80-128(%rbp)
+       movq    192+8-128(%rsi),%rax
+       shlq    $60,%r10
+       movb    %dl,11(%rsp)
+       orq     %r10,%rbx
+       movb    %al,%dl
+       shrq    $4,%rax
+       movq    %r8,%r10
+       shrq    $4,%r8
+       movq    %r9,88(%rbp)
+       movq    208+0-128(%rsi),%r9
+       shlb    $4,%dl
+       movq    %rbx,88-128(%rbp)
+       movq    208+8-128(%rsi),%rbx
+       shlq    $60,%r10
+       movb    %dl,12(%rsp)
+       orq     %r10,%rax
+       movb    %bl,%dl
+       shrq    $4,%rbx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       movq    %r8,96(%rbp)
+       movq    224+0-128(%rsi),%r8
+       shlb    $4,%dl
+       movq    %rax,96-128(%rbp)
+       movq    224+8-128(%rsi),%rax
+       shlq    $60,%r10
+       movb    %dl,13(%rsp)
+       orq     %r10,%rbx
+       movb    %al,%dl
+       shrq    $4,%rax
+       movq    %r8,%r10
+       shrq    $4,%r8
+       movq    %r9,104(%rbp)
+       movq    240+0-128(%rsi),%r9
+       shlb    $4,%dl
+       movq    %rbx,104-128(%rbp)
+       movq    240+8-128(%rsi),%rbx
+       shlq    $60,%r10
+       movb    %dl,14(%rsp)
+       orq     %r10,%rax
+       movb    %bl,%dl
+       shrq    $4,%rbx
+       movq    %r9,%r10
+       shrq    $4,%r9
+       movq    %r8,112(%rbp)
+       shlb    $4,%dl
+       movq    %rax,112-128(%rbp)
+       shlq    $60,%r10
+       movb    %dl,15(%rsp)
+       orq     %r10,%rbx
+       movq    %r9,120(%rbp)
+       movq    %rbx,120-128(%rbp)
+       addq    $-128,%rsi
+       movq    8(%rdi),%r8
+       movq    0(%rdi),%r9
+       addq    %r14,%r15
+       leaq    .Lrem_8bit(%rip),%r11
+       jmp     .Louter_loop
+.align 16
+.Louter_loop:
+       xorq    (%r14),%r9
+       movq    8(%r14),%rdx
+       leaq    16(%r14),%r14
+       xorq    %r8,%rdx
+       movq    %r9,(%rdi)
+       movq    %rdx,8(%rdi)
+       shrq    $32,%rdx
+       xorq    %rax,%rax
+       roll    $8,%edx
+       movb    %dl,%al
+       movzbl  %dl,%ebx
+       shlb    $4,%al
+       shrl    $4,%ebx
+       roll    $8,%edx
+       movq    8(%rsi,%rax,1),%r8
+       movq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       movzbl  %dl,%ecx
+       shlb    $4,%al
+       movzbq  (%rsp,%rbx,1),%r12
+       shrl    $4,%ecx
+       xorq    %r8,%r12
+       movq    %r9,%r10
+       shrq    $8,%r8
+       movzbq  %r12b,%r12
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rbx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rbx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r12,2),%r12
+       movzbl  %dl,%ebx
+       shlb    $4,%al
+       movzbq  (%rsp,%rcx,1),%r13
+       shrl    $4,%ebx
+       shlq    $48,%r12
+       xorq    %r8,%r13
+       movq    %r9,%r10
+       xorq    %r12,%r9
+       shrq    $8,%r8
+       movzbq  %r13b,%r13
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rcx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rcx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r13,2),%r13
+       movzbl  %dl,%ecx
+       shlb    $4,%al
+       movzbq  (%rsp,%rbx,1),%r12
+       shrl    $4,%ecx
+       shlq    $48,%r13
+       xorq    %r8,%r12
+       movq    %r9,%r10
+       xorq    %r13,%r9
+       shrq    $8,%r8
+       movzbq  %r12b,%r12
+       movl    8(%rdi),%edx
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rbx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rbx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r12,2),%r12
+       movzbl  %dl,%ebx
+       shlb    $4,%al
+       movzbq  (%rsp,%rcx,1),%r13
+       shrl    $4,%ebx
+       shlq    $48,%r12
+       xorq    %r8,%r13
+       movq    %r9,%r10
+       xorq    %r12,%r9
+       shrq    $8,%r8
+       movzbq  %r13b,%r13
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rcx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rcx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r13,2),%r13
+       movzbl  %dl,%ecx
+       shlb    $4,%al
+       movzbq  (%rsp,%rbx,1),%r12
+       shrl    $4,%ecx
+       shlq    $48,%r13
+       xorq    %r8,%r12
+       movq    %r9,%r10
+       xorq    %r13,%r9
+       shrq    $8,%r8
+       movzbq  %r12b,%r12
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rbx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rbx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r12,2),%r12
+       movzbl  %dl,%ebx
+       shlb    $4,%al
+       movzbq  (%rsp,%rcx,1),%r13
+       shrl    $4,%ebx
+       shlq    $48,%r12
+       xorq    %r8,%r13
+       movq    %r9,%r10
+       xorq    %r12,%r9
+       shrq    $8,%r8
+       movzbq  %r13b,%r13
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rcx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rcx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r13,2),%r13
+       movzbl  %dl,%ecx
+       shlb    $4,%al
+       movzbq  (%rsp,%rbx,1),%r12
+       shrl    $4,%ecx
+       shlq    $48,%r13
+       xorq    %r8,%r12
+       movq    %r9,%r10
+       xorq    %r13,%r9
+       shrq    $8,%r8
+       movzbq  %r12b,%r12
+       movl    4(%rdi),%edx
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rbx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rbx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r12,2),%r12
+       movzbl  %dl,%ebx
+       shlb    $4,%al
+       movzbq  (%rsp,%rcx,1),%r13
+       shrl    $4,%ebx
+       shlq    $48,%r12
+       xorq    %r8,%r13
+       movq    %r9,%r10
+       xorq    %r12,%r9
+       shrq    $8,%r8
+       movzbq  %r13b,%r13
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rcx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rcx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r13,2),%r13
+       movzbl  %dl,%ecx
+       shlb    $4,%al
+       movzbq  (%rsp,%rbx,1),%r12
+       shrl    $4,%ecx
+       shlq    $48,%r13
+       xorq    %r8,%r12
+       movq    %r9,%r10
+       xorq    %r13,%r9
+       shrq    $8,%r8
+       movzbq  %r12b,%r12
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rbx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rbx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r12,2),%r12
+       movzbl  %dl,%ebx
+       shlb    $4,%al
+       movzbq  (%rsp,%rcx,1),%r13
+       shrl    $4,%ebx
+       shlq    $48,%r12
+       xorq    %r8,%r13
+       movq    %r9,%r10
+       xorq    %r12,%r9
+       shrq    $8,%r8
+       movzbq  %r13b,%r13
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rcx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rcx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r13,2),%r13
+       movzbl  %dl,%ecx
+       shlb    $4,%al
+       movzbq  (%rsp,%rbx,1),%r12
+       shrl    $4,%ecx
+       shlq    $48,%r13
+       xorq    %r8,%r12
+       movq    %r9,%r10
+       xorq    %r13,%r9
+       shrq    $8,%r8
+       movzbq  %r12b,%r12
+       movl    0(%rdi),%edx
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rbx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rbx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r12,2),%r12
+       movzbl  %dl,%ebx
+       shlb    $4,%al
+       movzbq  (%rsp,%rcx,1),%r13
+       shrl    $4,%ebx
+       shlq    $48,%r12
+       xorq    %r8,%r13
+       movq    %r9,%r10
+       xorq    %r12,%r9
+       shrq    $8,%r8
+       movzbq  %r13b,%r13
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rcx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rcx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r13,2),%r13
+       movzbl  %dl,%ecx
+       shlb    $4,%al
+       movzbq  (%rsp,%rbx,1),%r12
+       shrl    $4,%ecx
+       shlq    $48,%r13
+       xorq    %r8,%r12
+       movq    %r9,%r10
+       xorq    %r13,%r9
+       shrq    $8,%r8
+       movzbq  %r12b,%r12
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rbx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rbx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r12,2),%r12
+       movzbl  %dl,%ebx
+       shlb    $4,%al
+       movzbq  (%rsp,%rcx,1),%r13
+       shrl    $4,%ebx
+       shlq    $48,%r12
+       xorq    %r8,%r13
+       movq    %r9,%r10
+       xorq    %r12,%r9
+       shrq    $8,%r8
+       movzbq  %r13b,%r13
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rcx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rcx,8),%r9
+       roll    $8,%edx
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       movb    %dl,%al
+       xorq    %r10,%r8
+       movzwq  (%r11,%r13,2),%r13
+       movzbl  %dl,%ecx
+       shlb    $4,%al
+       movzbq  (%rsp,%rbx,1),%r12
+       andl    $240,%ecx
+       shlq    $48,%r13
+       xorq    %r8,%r12
+       movq    %r9,%r10
+       xorq    %r13,%r9
+       shrq    $8,%r8
+       movzbq  %r12b,%r12
+       movl    -4(%rdi),%edx
+       shrq    $8,%r9
+       xorq    -128(%rbp,%rbx,8),%r8
+       shlq    $56,%r10
+       xorq    (%rbp,%rbx,8),%r9
+       movzwq  (%r11,%r12,2),%r12
+       xorq    8(%rsi,%rax,1),%r8
+       xorq    (%rsi,%rax,1),%r9
+       shlq    $48,%r12
+       xorq    %r10,%r8
+       xorq    %r12,%r9
+       movzbq  %r8b,%r13
+       shrq    $4,%r8
+       movq    %r9,%r10
+       shlb    $4,%r13b
+       shrq    $4,%r9
+       xorq    8(%rsi,%rcx,1),%r8
+       movzwq  (%r11,%r13,2),%r13
+       shlq    $60,%r10
+       xorq    (%rsi,%rcx,1),%r9
+       xorq    %r10,%r8
+       shlq    $48,%r13
+       bswapq  %r8
+       xorq    %r13,%r9
+       bswapq  %r9
+       cmpq    %r15,%r14
+       jb      .Louter_loop
+       movq    %r8,8(%rdi)
+       movq    %r9,(%rdi)
+
+       leaq    280(%rsp),%rsi
+       movq    0(%rsi),%r15
+       movq    8(%rsi),%r14
+       movq    16(%rsi),%r13
+       movq    24(%rsi),%r12
+       movq    32(%rsi),%rbp
+       movq    40(%rsi),%rbx
+       leaq    48(%rsi),%rsp
+.Lghash_epilogue:
+       .byte   0xf3,0xc3
+.size  gcm_ghash_4bit,.-gcm_ghash_4bit
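Both the 4-bit table routine above (`gcm_ghash_4bit`) and the PCLMULQDQ routines that follow (`gcm_init_clmul`, `gcm_gmult_clmul`, `gcm_ghash_clmul`) compute the same operation: multiplication in GF(2^128) modulo x^128 + x^7 + x^2 + x + 1, using GCM's bit-reflected element encoding. As a rough reference for what that means, here is a bit-at-a-time sketch of the algorithm from NIST SP 800-38D; it is an illustration only, not the table-driven or carry-less-multiply method the assembly uses:

```python
# Bit-at-a-time GF(2^128) multiply as specified in NIST SP 800-38D.
# GCM stores field elements bit-reflected: integer bit 127 holds the
# coefficient of x^0, so the multiplicative identity is 1 << 127.

R = 0xE1 << 120  # reflected reduction constant for x^128 + x^7 + x^2 + x + 1

def gf128_mul(x: int, y: int) -> int:
    z, v = 0, y
    for i in range(127, -1, -1):  # walk the bits of x, x^0 coefficient first
        if (x >> i) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1  # v := v * x  (mod polynomial)
    return z
```

The 4-bit code above reaches the same result by tabulating the sixteen nibble multiples of H, while the CLMUL code computes a 256-bit carry-less product with `pclmulqdq` and folds it back into 128 bits with the shift/XOR reduction sequences visible in the listing.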
+.globl gcm_init_clmul
+.type  gcm_init_clmul,@function
+.align 16
+gcm_init_clmul:
+       movdqu  (%rsi),%xmm2
+       pshufd  $78,%xmm2,%xmm2
+
+
+       pshufd  $255,%xmm2,%xmm4
+       movdqa  %xmm2,%xmm3
+       psllq   $1,%xmm2
+       pxor    %xmm5,%xmm5
+       psrlq   $63,%xmm3
+       pcmpgtd %xmm4,%xmm5
+       pslldq  $8,%xmm3
+       por     %xmm3,%xmm2
+
+
+       pand    .L0x1c2_polynomial(%rip),%xmm5
+       pxor    %xmm5,%xmm2
+
+
+       movdqa  %xmm2,%xmm0
+       movdqa  %xmm0,%xmm1
+       pshufd  $78,%xmm0,%xmm3
+       pshufd  $78,%xmm2,%xmm4
+       pxor    %xmm0,%xmm3
+       pxor    %xmm2,%xmm4
+.byte  102,15,58,68,194,0
+.byte  102,15,58,68,202,17
+.byte  102,15,58,68,220,0
+       pxor    %xmm0,%xmm3
+       pxor    %xmm1,%xmm3
+
+       movdqa  %xmm3,%xmm4
+       psrldq  $8,%xmm3
+       pslldq  $8,%xmm4
+       pxor    %xmm3,%xmm1
+       pxor    %xmm4,%xmm0
+
+       movdqa  %xmm0,%xmm3
+       psllq   $1,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $5,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $57,%xmm0
+       movdqa  %xmm0,%xmm4
+       pslldq  $8,%xmm0
+       psrldq  $8,%xmm4
+       pxor    %xmm3,%xmm0
+       pxor    %xmm4,%xmm1
+
+
+       movdqa  %xmm0,%xmm4
+       psrlq   $5,%xmm0
+       pxor    %xmm4,%xmm0
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+       pxor    %xmm1,%xmm4
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+       movdqu  %xmm2,(%rdi)
+       movdqu  %xmm0,16(%rdi)
+       .byte   0xf3,0xc3
+.size  gcm_init_clmul,.-gcm_init_clmul
+.globl gcm_gmult_clmul
+.type  gcm_gmult_clmul,@function
+.align 16
+gcm_gmult_clmul:
+       movdqu  (%rdi),%xmm0
+       movdqa  .Lbswap_mask(%rip),%xmm5
+       movdqu  (%rsi),%xmm2
+.byte  102,15,56,0,197
+       movdqa  %xmm0,%xmm1
+       pshufd  $78,%xmm0,%xmm3
+       pshufd  $78,%xmm2,%xmm4
+       pxor    %xmm0,%xmm3
+       pxor    %xmm2,%xmm4
+.byte  102,15,58,68,194,0
+.byte  102,15,58,68,202,17
+.byte  102,15,58,68,220,0
+       pxor    %xmm0,%xmm3
+       pxor    %xmm1,%xmm3
+
+       movdqa  %xmm3,%xmm4
+       psrldq  $8,%xmm3
+       pslldq  $8,%xmm4
+       pxor    %xmm3,%xmm1
+       pxor    %xmm4,%xmm0
+
+       movdqa  %xmm0,%xmm3
+       psllq   $1,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $5,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $57,%xmm0
+       movdqa  %xmm0,%xmm4
+       pslldq  $8,%xmm0
+       psrldq  $8,%xmm4
+       pxor    %xmm3,%xmm0
+       pxor    %xmm4,%xmm1
+
+
+       movdqa  %xmm0,%xmm4
+       psrlq   $5,%xmm0
+       pxor    %xmm4,%xmm0
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+       pxor    %xmm1,%xmm4
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+.byte  102,15,56,0,197
+       movdqu  %xmm0,(%rdi)
+       .byte   0xf3,0xc3
+.size  gcm_gmult_clmul,.-gcm_gmult_clmul
+.globl gcm_ghash_clmul
+.type  gcm_ghash_clmul,@function
+.align 16
+gcm_ghash_clmul:
+       movdqa  .Lbswap_mask(%rip),%xmm5
+
+       movdqu  (%rdi),%xmm0
+       movdqu  (%rsi),%xmm2
+.byte  102,15,56,0,197
+
+       subq    $16,%rcx
+       jz      .Lodd_tail
+
+       movdqu  16(%rsi),%xmm8
+
+
+
+
+
+       movdqu  (%rdx),%xmm3
+       movdqu  16(%rdx),%xmm6
+.byte  102,15,56,0,221
+.byte  102,15,56,0,245
+       pxor    %xmm3,%xmm0
+       movdqa  %xmm6,%xmm7
+       pshufd  $78,%xmm6,%xmm3
+       pshufd  $78,%xmm2,%xmm4
+       pxor    %xmm6,%xmm3
+       pxor    %xmm2,%xmm4
+.byte  102,15,58,68,242,0
+.byte  102,15,58,68,250,17
+.byte  102,15,58,68,220,0
+       pxor    %xmm6,%xmm3
+       pxor    %xmm7,%xmm3
+
+       movdqa  %xmm3,%xmm4
+       psrldq  $8,%xmm3
+       pslldq  $8,%xmm4
+       pxor    %xmm3,%xmm7
+       pxor    %xmm4,%xmm6
+       movdqa  %xmm0,%xmm1
+       pshufd  $78,%xmm0,%xmm3
+       pshufd  $78,%xmm8,%xmm4
+       pxor    %xmm0,%xmm3
+       pxor    %xmm8,%xmm4
+
+       leaq    32(%rdx),%rdx
+       subq    $32,%rcx
+       jbe     .Leven_tail
+
+.Lmod_loop:
+.byte  102,65,15,58,68,192,0
+.byte  102,65,15,58,68,200,17
+.byte  102,15,58,68,220,0
+       pxor    %xmm0,%xmm3
+       pxor    %xmm1,%xmm3
+
+       movdqa  %xmm3,%xmm4
+       psrldq  $8,%xmm3
+       pslldq  $8,%xmm4
+       pxor    %xmm3,%xmm1
+       pxor    %xmm4,%xmm0
+       movdqu  (%rdx),%xmm3
+       pxor    %xmm6,%xmm0
+       pxor    %xmm7,%xmm1
+
+       movdqu  16(%rdx),%xmm6
+.byte  102,15,56,0,221
+.byte  102,15,56,0,245
+
+       movdqa  %xmm6,%xmm7
+       pshufd  $78,%xmm6,%xmm9
+       pshufd  $78,%xmm2,%xmm10
+       pxor    %xmm6,%xmm9
+       pxor    %xmm2,%xmm10
+       pxor    %xmm3,%xmm1
+
+       movdqa  %xmm0,%xmm3
+       psllq   $1,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $5,%xmm0
+       pxor    %xmm3,%xmm0
+.byte  102,15,58,68,242,0
+       psllq   $57,%xmm0
+       movdqa  %xmm0,%xmm4
+       pslldq  $8,%xmm0
+       psrldq  $8,%xmm4
+       pxor    %xmm3,%xmm0
+       pxor    %xmm4,%xmm1
+
+.byte  102,15,58,68,250,17
+       movdqa  %xmm0,%xmm4
+       psrlq   $5,%xmm0
+       pxor    %xmm4,%xmm0
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+       pxor    %xmm1,%xmm4
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+
+.byte  102,69,15,58,68,202,0
+       movdqa  %xmm0,%xmm1
+       pshufd  $78,%xmm0,%xmm3
+       pshufd  $78,%xmm8,%xmm4
+       pxor    %xmm0,%xmm3
+       pxor    %xmm8,%xmm4
+
+       pxor    %xmm6,%xmm9
+       pxor    %xmm7,%xmm9
+       movdqa  %xmm9,%xmm10
+       psrldq  $8,%xmm9
+       pslldq  $8,%xmm10
+       pxor    %xmm9,%xmm7
+       pxor    %xmm10,%xmm6
+
+       leaq    32(%rdx),%rdx
+       subq    $32,%rcx
+       ja      .Lmod_loop
+
+.Leven_tail:
+.byte  102,65,15,58,68,192,0
+.byte  102,65,15,58,68,200,17
+.byte  102,15,58,68,220,0
+       pxor    %xmm0,%xmm3
+       pxor    %xmm1,%xmm3
+
+       movdqa  %xmm3,%xmm4
+       psrldq  $8,%xmm3
+       pslldq  $8,%xmm4
+       pxor    %xmm3,%xmm1
+       pxor    %xmm4,%xmm0
+       pxor    %xmm6,%xmm0
+       pxor    %xmm7,%xmm1
+
+       movdqa  %xmm0,%xmm3
+       psllq   $1,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $5,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $57,%xmm0
+       movdqa  %xmm0,%xmm4
+       pslldq  $8,%xmm0
+       psrldq  $8,%xmm4
+       pxor    %xmm3,%xmm0
+       pxor    %xmm4,%xmm1
+
+
+       movdqa  %xmm0,%xmm4
+       psrlq   $5,%xmm0
+       pxor    %xmm4,%xmm0
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+       pxor    %xmm1,%xmm4
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+       testq   %rcx,%rcx
+       jnz     .Ldone
+
+.Lodd_tail:
+       movdqu  (%rdx),%xmm3
+.byte  102,15,56,0,221
+       pxor    %xmm3,%xmm0
+       movdqa  %xmm0,%xmm1
+       pshufd  $78,%xmm0,%xmm3
+       pshufd  $78,%xmm2,%xmm4
+       pxor    %xmm0,%xmm3
+       pxor    %xmm2,%xmm4
+.byte  102,15,58,68,194,0
+.byte  102,15,58,68,202,17
+.byte  102,15,58,68,220,0
+       pxor    %xmm0,%xmm3
+       pxor    %xmm1,%xmm3
+
+       movdqa  %xmm3,%xmm4
+       psrldq  $8,%xmm3
+       pslldq  $8,%xmm4
+       pxor    %xmm3,%xmm1
+       pxor    %xmm4,%xmm0
+
+       movdqa  %xmm0,%xmm3
+       psllq   $1,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $5,%xmm0
+       pxor    %xmm3,%xmm0
+       psllq   $57,%xmm0
+       movdqa  %xmm0,%xmm4
+       pslldq  $8,%xmm0
+       psrldq  $8,%xmm4
+       pxor    %xmm3,%xmm0
+       pxor    %xmm4,%xmm1
+
+
+       movdqa  %xmm0,%xmm4
+       psrlq   $5,%xmm0
+       pxor    %xmm4,%xmm0
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+       pxor    %xmm1,%xmm4
+       psrlq   $1,%xmm0
+       pxor    %xmm4,%xmm0
+.Ldone:
+.byte  102,15,56,0,197
+       movdqu  %xmm0,(%rdi)
+       .byte   0xf3,0xc3
+.LSEH_end_gcm_ghash_clmul:
+.size  gcm_ghash_clmul,.-gcm_ghash_clmul
+.align 64
+.Lbswap_mask:
+.byte  15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0
+.L0x1c2_polynomial:
+.byte  1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0xc2
+.align 64
+.type  .Lrem_4bit,@object
+.Lrem_4bit:
+.long  0,0,0,471859200,0,943718400,0,610271232
+.long  0,1887436800,0,1822425088,0,1220542464,0,1423966208
+.long  0,3774873600,0,4246732800,0,3644850176,0,3311403008
+.long  0,2441084928,0,2376073216,0,2847932416,0,3051356160
+.type  .Lrem_8bit,@object
+.Lrem_8bit:
+.value 0x0000,0x01C2,0x0384,0x0246,0x0708,0x06CA,0x048C,0x054E
+.value 0x0E10,0x0FD2,0x0D94,0x0C56,0x0918,0x08DA,0x0A9C,0x0B5E
+.value 0x1C20,0x1DE2,0x1FA4,0x1E66,0x1B28,0x1AEA,0x18AC,0x196E
+.value 0x1230,0x13F2,0x11B4,0x1076,0x1538,0x14FA,0x16BC,0x177E
+.value 0x3840,0x3982,0x3BC4,0x3A06,0x3F48,0x3E8A,0x3CCC,0x3D0E
+.value 0x3650,0x3792,0x35D4,0x3416,0x3158,0x309A,0x32DC,0x331E
+.value 0x2460,0x25A2,0x27E4,0x2626,0x2368,0x22AA,0x20EC,0x212E
+.value 0x2A70,0x2BB2,0x29F4,0x2836,0x2D78,0x2CBA,0x2EFC,0x2F3E
+.value 0x7080,0x7142,0x7304,0x72C6,0x7788,0x764A,0x740C,0x75CE
+.value 0x7E90,0x7F52,0x7D14,0x7CD6,0x7998,0x785A,0x7A1C,0x7BDE
+.value 0x6CA0,0x6D62,0x6F24,0x6EE6,0x6BA8,0x6A6A,0x682C,0x69EE
+.value 0x62B0,0x6372,0x6134,0x60F6,0x65B8,0x647A,0x663C,0x67FE
+.value 0x48C0,0x4902,0x4B44,0x4A86,0x4FC8,0x4E0A,0x4C4C,0x4D8E
+.value 0x46D0,0x4712,0x4554,0x4496,0x41D8,0x401A,0x425C,0x439E
+.value 0x54E0,0x5522,0x5764,0x56A6,0x53E8,0x522A,0x506C,0x51AE
+.value 0x5AF0,0x5B32,0x5974,0x58B6,0x5DF8,0x5C3A,0x5E7C,0x5FBE
+.value 0xE100,0xE0C2,0xE284,0xE346,0xE608,0xE7CA,0xE58C,0xE44E
+.value 0xEF10,0xEED2,0xEC94,0xED56,0xE818,0xE9DA,0xEB9C,0xEA5E
+.value 0xFD20,0xFCE2,0xFEA4,0xFF66,0xFA28,0xFBEA,0xF9AC,0xF86E
+.value 0xF330,0xF2F2,0xF0B4,0xF176,0xF438,0xF5FA,0xF7BC,0xF67E
+.value 0xD940,0xD882,0xDAC4,0xDB06,0xDE48,0xDF8A,0xDDCC,0xDC0E
+.value 0xD750,0xD692,0xD4D4,0xD516,0xD058,0xD19A,0xD3DC,0xD21E
+.value 0xC560,0xC4A2,0xC6E4,0xC726,0xC268,0xC3AA,0xC1EC,0xC02E
+.value 0xCB70,0xCAB2,0xC8F4,0xC936,0xCC78,0xCDBA,0xCFFC,0xCE3E
+.value 0x9180,0x9042,0x9204,0x93C6,0x9688,0x974A,0x950C,0x94CE
+.value 0x9F90,0x9E52,0x9C14,0x9DD6,0x9898,0x995A,0x9B1C,0x9ADE
+.value 0x8DA0,0x8C62,0x8E24,0x8FE6,0x8AA8,0x8B6A,0x892C,0x88EE
+.value 0x83B0,0x8272,0x8034,0x81F6,0x84B8,0x857A,0x873C,0x86FE
+.value 0xA9C0,0xA802,0xAA44,0xAB86,0xAEC8,0xAF0A,0xAD4C,0xAC8E
+.value 0xA7D0,0xA612,0xA454,0xA596,0xA0D8,0xA11A,0xA35C,0xA29E
+.value 0xB5E0,0xB422,0xB664,0xB7A6,0xB2E8,0xB32A,0xB16C,0xB0AE
+.value 0xBBF0,0xBA32,0xB874,0xB9B6,0xBCF8,0xBD3A,0xBF7C,0xBEBE
+
+.byte  71,72,65,83,72,32,102,111,114,32,120,56,54,95,54,52,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
+.align 64
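The `.Lrem_4bit` and `.Lrem_8bit` constant tables above are reduction tables: entry n is the carry-less (GF(2)) product n · 0x1C2, where 0x1C2 is the bit pattern of x^7 + x^2 + x + 1, the low half of GCM's reduction polynomial. A sketch of how they can be regenerated (the function name is illustrative, not from the source):

```python
# Regenerate the .Lrem_8bit / .Lrem_4bit reduction tables.
# Entry n is the carry-less product n * 0x1C2, where 0x1C2 encodes
# x^7 + x^2 + x + 1, the low terms of GCM's reduction polynomial.

def clmul(a: int, b: int) -> int:
    """Carry-less (XOR-accumulating) multiplication over GF(2)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

rem_8bit = [clmul(n, 0x1C2) for n in range(256)]       # the .value entries
rem_4bit = [clmul(n, 0x1C2) << 52 for n in range(16)]  # 64-bit words; the
# .long pairs in the source are their little-endian (low, high) halves
```

For example, entry 1 of `rem_8bit` is 0x01C2 itself, and entry 1 of `rem_4bit` is 0x1C2 << 52, whose high 32-bit half is 0x1C200000 = 471859200, matching the second `.long` value in the table.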
diff --git a/lib/accelerated/intel/asm/appro-aes-gcm-x86.s b/lib/accelerated/intel/asm/appro-aes-gcm-x86.s
new file mode 100644
index 0000000..791d645
--- /dev/null
+++ b/lib/accelerated/intel/asm/appro-aes-gcm-x86.s
@@ -0,0 +1,991 @@
+# Copyright (c) 2006, Andy Polyakov by <address@hidden>
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+# 
+#     *        Redistributions of source code must retain copyright
+#      notices, this list of conditions and the following disclaimer.
+#
+#     *        Redistributions in binary form must reproduce the above
+#      copyright notice, this list of conditions and the following
+#      disclaimer in the documentation and/or other materials
+#      provided with the distribution.
+#
+#     *        Neither the name of the Andy Polyakov nor the names of its
+#      copyright holder and contributors may be used to endorse or
+#      promote products derived from this software without specific
+#      prior written permission.
+#
+# ALTERNATIVELY, provided that this notice is retained in full, this
+# product may be distributed under the terms of the GNU General Public
+# License (GPL), in which case the provisions of the GPL apply INSTEAD OF
+# those given above.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+.file  "ghash-x86.s"
+.text
+.globl gcm_gmult_4bit_x86
+.type  gcm_gmult_4bit_x86,@function
+.align 16
+gcm_gmult_4bit_x86:
+.L_gcm_gmult_4bit_x86_begin:
+       pushl   %ebp
+       pushl   %ebx
+       pushl   %esi
+       pushl   %edi
+       subl    $84,%esp
+       movl    104(%esp),%edi
+       movl    108(%esp),%esi
+       movl    (%edi),%ebp
+       movl    4(%edi),%edx
+       movl    8(%edi),%ecx
+       movl    12(%edi),%ebx
+       movl    $0,16(%esp)
+       movl    $471859200,20(%esp)
+       movl    $943718400,24(%esp)
+       movl    $610271232,28(%esp)
+       movl    $1887436800,32(%esp)
+       movl    $1822425088,36(%esp)
+       movl    $1220542464,40(%esp)
+       movl    $1423966208,44(%esp)
+       movl    $3774873600,48(%esp)
+       movl    $4246732800,52(%esp)
+       movl    $3644850176,56(%esp)
+       movl    $3311403008,60(%esp)
+       movl    $2441084928,64(%esp)
+       movl    $2376073216,68(%esp)
+       movl    $2847932416,72(%esp)
+       movl    $3051356160,76(%esp)
+       movl    %ebp,(%esp)
+       movl    %edx,4(%esp)
+       movl    %ecx,8(%esp)
+       movl    %ebx,12(%esp)
+       shrl    $20,%ebx
+       andl    $240,%ebx
+       movl    4(%esi,%ebx,1),%ebp
+       movl    (%esi,%ebx,1),%edx
+       movl    12(%esi,%ebx,1),%ecx
+       movl    8(%esi,%ebx,1),%ebx
+       xorl    %eax,%eax
+       movl    $15,%edi
+       jmp     .L000x86_loop
+.align 16
+.L000x86_loop:
+       movb    %bl,%al
+       shrdl   $4,%ecx,%ebx
+       andb    $15,%al
+       shrdl   $4,%edx,%ecx
+       shrdl   $4,%ebp,%edx
+       shrl    $4,%ebp
+       xorl    16(%esp,%eax,4),%ebp
+       movb    (%esp,%edi,1),%al
+       andb    $240,%al
+       xorl    8(%esi,%eax,1),%ebx
+       xorl    12(%esi,%eax,1),%ecx
+       xorl    (%esi,%eax,1),%edx
+       xorl    4(%esi,%eax,1),%ebp
+       decl    %edi
+       js      .L001x86_break
+       movb    %bl,%al
+       shrdl   $4,%ecx,%ebx
+       andb    $15,%al
+       shrdl   $4,%edx,%ecx
+       shrdl   $4,%ebp,%edx
+       shrl    $4,%ebp
+       xorl    16(%esp,%eax,4),%ebp
+       movb    (%esp,%edi,1),%al
+       shlb    $4,%al
+       xorl    8(%esi,%eax,1),%ebx
+       xorl    12(%esi,%eax,1),%ecx
+       xorl    (%esi,%eax,1),%edx
+       xorl    4(%esi,%eax,1),%ebp
+       jmp     .L000x86_loop
+.align 16
+.L001x86_break:
+       bswap   %ebx
+       bswap   %ecx
+       bswap   %edx
+       bswap   %ebp
+       movl    104(%esp),%edi
+       movl    %ebx,12(%edi)
+       movl    %ecx,8(%edi)
+       movl    %edx,4(%edi)
+       movl    %ebp,(%edi)
+       addl    $84,%esp
+       popl    %edi
+       popl    %esi
+       popl    %ebx
+       popl    %ebp
+       ret
+.size  gcm_gmult_4bit_x86,.-.L_gcm_gmult_4bit_x86_begin
+.globl gcm_ghash_4bit_x86
+.type  gcm_ghash_4bit_x86,@function
+.align 16
+gcm_ghash_4bit_x86:
+.L_gcm_ghash_4bit_x86_begin:
+       pushl   %ebp
+       pushl   %ebx
+       pushl   %esi
+       pushl   %edi
+       subl    $84,%esp
+       movl    104(%esp),%ebx
+       movl    108(%esp),%esi
+       movl    112(%esp),%edi
+       movl    116(%esp),%ecx
+       addl    %edi,%ecx
+       movl    %ecx,116(%esp)
+       movl    (%ebx),%ebp
+       movl    4(%ebx),%edx
+       movl    8(%ebx),%ecx
+       movl    12(%ebx),%ebx
+       movl    $0,16(%esp)
+       movl    $471859200,20(%esp)
+       movl    $943718400,24(%esp)
+       movl    $610271232,28(%esp)
+       movl    $1887436800,32(%esp)
+       movl    $1822425088,36(%esp)
+       movl    $1220542464,40(%esp)
+       movl    $1423966208,44(%esp)
+       movl    $3774873600,48(%esp)
+       movl    $4246732800,52(%esp)
+       movl    $3644850176,56(%esp)
+       movl    $3311403008,60(%esp)
+       movl    $2441084928,64(%esp)
+       movl    $2376073216,68(%esp)
+       movl    $2847932416,72(%esp)
+       movl    $3051356160,76(%esp)
+.align 16
+.L002x86_outer_loop:
+       xorl    12(%edi),%ebx
+       xorl    8(%edi),%ecx
+       xorl    4(%edi),%edx
+       xorl    (%edi),%ebp
+       movl    %ebx,12(%esp)
+       movl    %ecx,8(%esp)
+       movl    %edx,4(%esp)
+       movl    %ebp,(%esp)
+       shrl    $20,%ebx
+       andl    $240,%ebx
+       movl    4(%esi,%ebx,1),%ebp
+       movl    (%esi,%ebx,1),%edx
+       movl    12(%esi,%ebx,1),%ecx
+       movl    8(%esi,%ebx,1),%ebx
+       xorl    %eax,%eax
+       movl    $15,%edi
+       jmp     .L003x86_loop
+.align 16
+.L003x86_loop:
+       movb    %bl,%al
+       shrdl   $4,%ecx,%ebx
+       andb    $15,%al
+       shrdl   $4,%edx,%ecx
+       shrdl   $4,%ebp,%edx
+       shrl    $4,%ebp
+       xorl    16(%esp,%eax,4),%ebp
+       movb    (%esp,%edi,1),%al
+       andb    $240,%al
+       xorl    8(%esi,%eax,1),%ebx
+       xorl    12(%esi,%eax,1),%ecx
+       xorl    (%esi,%eax,1),%edx
+       xorl    4(%esi,%eax,1),%ebp
+       decl    %edi
+       js      .L004x86_break
+       movb    %bl,%al
+       shrdl   $4,%ecx,%ebx
+       andb    $15,%al
+       shrdl   $4,%edx,%ecx
+       shrdl   $4,%ebp,%edx
+       shrl    $4,%ebp
+       xorl    16(%esp,%eax,4),%ebp
+       movb    (%esp,%edi,1),%al
+       shlb    $4,%al
+       xorl    8(%esi,%eax,1),%ebx
+       xorl    12(%esi,%eax,1),%ecx
+       xorl    (%esi,%eax,1),%edx
+       xorl    4(%esi,%eax,1),%ebp
+       jmp     .L003x86_loop
+.align 16
+.L004x86_break:
+       bswap   %ebx
+       bswap   %ecx
+       bswap   %edx
+       bswap   %ebp
+       movl    112(%esp),%edi
+       leal    16(%edi),%edi
+       cmpl    116(%esp),%edi
+       movl    %edi,112(%esp)
+       jb      .L002x86_outer_loop
+       movl    104(%esp),%edi
+       movl    %ebx,12(%edi)
+       movl    %ecx,8(%edi)
+       movl    %edx,4(%edi)
+       movl    %ebp,(%edi)
+       addl    $84,%esp
+       popl    %edi
+       popl    %esi
+       popl    %ebx
+       popl    %ebp
+       ret
+.size  gcm_ghash_4bit_x86,.-.L_gcm_ghash_4bit_x86_begin
+.globl gcm_gmult_4bit_mmx
+.type  gcm_gmult_4bit_mmx,@function
+.align 16
+gcm_gmult_4bit_mmx:
+.L_gcm_gmult_4bit_mmx_begin:
+       pushl   %ebp
+       pushl   %ebx
+       pushl   %esi
+       pushl   %edi
+       movl    20(%esp),%edi
+       movl    24(%esp),%esi
+       call    .L005pic_point
+.L005pic_point:
+       popl    %eax
+       leal    .Lrem_4bit-.L005pic_point(%eax),%eax
+       movzbl  15(%edi),%ebx
+       xorl    %ecx,%ecx
+       movl    %ebx,%edx
+       movb    %dl,%cl
+       movl    $14,%ebp
+       shlb    $4,%cl
+       andl    $240,%edx
+       movq    8(%esi,%ecx,1),%mm0
+       movq    (%esi,%ecx,1),%mm1
+       movd    %mm0,%ebx
+       jmp     .L006mmx_loop
+.align 16
+.L006mmx_loop:
+       psrlq   $4,%mm0
+       andl    $15,%ebx
+       movq    %mm1,%mm2
+       psrlq   $4,%mm1
+       pxor    8(%esi,%edx,1),%mm0
+       movb    (%edi,%ebp,1),%cl
+       psllq   $60,%mm2
+       pxor    (%eax,%ebx,8),%mm1
+       decl    %ebp
+       movd    %mm0,%ebx
+       pxor    (%esi,%edx,1),%mm1
+       movl    %ecx,%edx
+       pxor    %mm2,%mm0
+       js      .L007mmx_break
+       shlb    $4,%cl
+       andl    $15,%ebx
+       psrlq   $4,%mm0
+       andl    $240,%edx
+       movq    %mm1,%mm2
+       psrlq   $4,%mm1
+       pxor    8(%esi,%ecx,1),%mm0
+       psllq   $60,%mm2
+       pxor    (%eax,%ebx,8),%mm1
+       movd    %mm0,%ebx
+       pxor    (%esi,%ecx,1),%mm1
+       pxor    %mm2,%mm0
+       jmp     .L006mmx_loop
+.align 16
+.L007mmx_break:
+       shlb    $4,%cl
+       andl    $15,%ebx
+       psrlq   $4,%mm0
+       andl    $240,%edx
+       movq    %mm1,%mm2
+       psrlq   $4,%mm1
+       pxor    8(%esi,%ecx,1),%mm0
+       psllq   $60,%mm2
+       pxor    (%eax,%ebx,8),%mm1
+       movd    %mm0,%ebx
+       pxor    (%esi,%ecx,1),%mm1
+       pxor    %mm2,%mm0
+       psrlq   $4,%mm0
+       andl    $15,%ebx
+       movq    %mm1,%mm2
+       psrlq   $4,%mm1
+       pxor    8(%esi,%edx,1),%mm0
+       psllq   $60,%mm2
+       pxor    (%eax,%ebx,8),%mm1
+       movd    %mm0,%ebx
+       pxor    (%esi,%edx,1),%mm1
+       pxor    %mm2,%mm0
+       psrlq   $32,%mm0
+       movd    %mm1,%edx
+       psrlq   $32,%mm1
+       movd    %mm0,%ecx
+       movd    %mm1,%ebp
+       bswap   %ebx
+       bswap   %edx
+       bswap   %ecx
+       bswap   %ebp
+       emms
+       movl    %ebx,12(%edi)
+       movl    %edx,4(%edi)
+       movl    %ecx,8(%edi)
+       movl    %ebp,(%edi)
+       popl    %edi
+       popl    %esi
+       popl    %ebx
+       popl    %ebp
+       ret
+.size  gcm_gmult_4bit_mmx,.-.L_gcm_gmult_4bit_mmx_begin
+.globl gcm_ghash_4bit_mmx
+.type  gcm_ghash_4bit_mmx,@function
+.align 16
+gcm_ghash_4bit_mmx:
+.L_gcm_ghash_4bit_mmx_begin:
+       pushl   %ebp
+       pushl   %ebx
+       pushl   %esi
+       pushl   %edi
+       movl    20(%esp),%eax
+       movl    24(%esp),%ebx
+       movl    28(%esp),%ecx
+       movl    32(%esp),%edx
+       movl    %esp,%ebp
+       call    .L008pic_point
+.L008pic_point:
+       popl    %esi
+       leal    .Lrem_8bit-.L008pic_point(%esi),%esi
+       subl    $544,%esp
+       andl    $-64,%esp
+       subl    $16,%esp
+       addl    %ecx,%edx
+       movl    %eax,544(%esp)
+       movl    %edx,552(%esp)
+       movl    %ebp,556(%esp)
+       addl    $128,%ebx
+       leal    144(%esp),%edi
+       leal    400(%esp),%ebp
+       movl    -120(%ebx),%edx
+       movq    -120(%ebx),%mm0
+       movq    -128(%ebx),%mm3
+       shll    $4,%edx
+       movb    %dl,(%esp)
+       movl    -104(%ebx),%edx
+       movq    -104(%ebx),%mm2
+       movq    -112(%ebx),%mm5
+       movq    %mm0,-128(%edi)
+       psrlq   $4,%mm0
+       movq    %mm3,(%edi)
+       movq    %mm3,%mm7
+       psrlq   $4,%mm3
+       shll    $4,%edx
+       movb    %dl,1(%esp)
+       movl    -88(%ebx),%edx
+       movq    -88(%ebx),%mm1
+       psllq   $60,%mm7
+       movq    -96(%ebx),%mm4
+       por     %mm7,%mm0
+       movq    %mm2,-120(%edi)
+       psrlq   $4,%mm2
+       movq    %mm5,8(%edi)
+       movq    %mm5,%mm6
+       movq    %mm0,-128(%ebp)
+       psrlq   $4,%mm5
+       movq    %mm3,(%ebp)
+       shll    $4,%edx
+       movb    %dl,2(%esp)
+       movl    -72(%ebx),%edx
+       movq    -72(%ebx),%mm0
+       psllq   $60,%mm6
+       movq    -80(%ebx),%mm3
+       por     %mm6,%mm2
+       movq    %mm1,-112(%edi)
+       psrlq   $4,%mm1
+       movq    %mm4,16(%edi)
+       movq    %mm4,%mm7
+       movq    %mm2,-120(%ebp)
+       psrlq   $4,%mm4
+       movq    %mm5,8(%ebp)
+       shll    $4,%edx
+       movb    %dl,3(%esp)
+       movl    -56(%ebx),%edx
+       movq    -56(%ebx),%mm2
+       psllq   $60,%mm7
+       movq    -64(%ebx),%mm5
+       por     %mm7,%mm1
+       movq    %mm0,-104(%edi)
+       psrlq   $4,%mm0
+       movq    %mm3,24(%edi)
+       movq    %mm3,%mm6
+       movq    %mm1,-112(%ebp)
+       psrlq   $4,%mm3
+       movq    %mm4,16(%ebp)
+       shll    $4,%edx
+       movb    %dl,4(%esp)
+       movl    -40(%ebx),%edx
+       movq    -40(%ebx),%mm1
+       psllq   $60,%mm6
+       movq    -48(%ebx),%mm4
+       por     %mm6,%mm0
+       movq    %mm2,-96(%edi)
+       psrlq   $4,%mm2
+       movq    %mm5,32(%edi)
+       movq    %mm5,%mm7
+       movq    %mm0,-104(%ebp)
+       psrlq   $4,%mm5
+       movq    %mm3,24(%ebp)
+       shll    $4,%edx
+       movb    %dl,5(%esp)
+       movl    -24(%ebx),%edx
+       movq    -24(%ebx),%mm0
+       psllq   $60,%mm7
+       movq    -32(%ebx),%mm3
+       por     %mm7,%mm2
+       movq    %mm1,-88(%edi)
+       psrlq   $4,%mm1
+       movq    %mm4,40(%edi)
+       movq    %mm4,%mm6
+       movq    %mm2,-96(%ebp)
+       psrlq   $4,%mm4
+       movq    %mm5,32(%ebp)
+       shll    $4,%edx
+       movb    %dl,6(%esp)
+       movl    -8(%ebx),%edx
+       movq    -8(%ebx),%mm2
+       psllq   $60,%mm6
+       movq    -16(%ebx),%mm5
+       por     %mm6,%mm1
+       movq    %mm0,-80(%edi)
+       psrlq   $4,%mm0
+       movq    %mm3,48(%edi)
+       movq    %mm3,%mm7
+       movq    %mm1,-88(%ebp)
+       psrlq   $4,%mm3
+       movq    %mm4,40(%ebp)
+       shll    $4,%edx
+       movb    %dl,7(%esp)
+       movl    8(%ebx),%edx
+       movq    8(%ebx),%mm1
+       psllq   $60,%mm7
+       movq    (%ebx),%mm4
+       por     %mm7,%mm0
+       movq    %mm2,-72(%edi)
+       psrlq   $4,%mm2
+       movq    %mm5,56(%edi)
+       movq    %mm5,%mm6
+       movq    %mm0,-80(%ebp)
+       psrlq   $4,%mm5
+       movq    %mm3,48(%ebp)
+       shll    $4,%edx
+       movb    %dl,8(%esp)
+       movl    24(%ebx),%edx
+       movq    24(%ebx),%mm0
+       psllq   $60,%mm6
+       movq    16(%ebx),%mm3
+       por     %mm6,%mm2
+       movq    %mm1,-64(%edi)
+       psrlq   $4,%mm1
+       movq    %mm4,64(%edi)
+       movq    %mm4,%mm7
+       movq    %mm2,-72(%ebp)
+       psrlq   $4,%mm4
+       movq    %mm5,56(%ebp)
+       shll    $4,%edx
+       movb    %dl,9(%esp)
+       movl    40(%ebx),%edx
+       movq    40(%ebx),%mm2
+       psllq   $60,%mm7
+       movq    32(%ebx),%mm5
+       por     %mm7,%mm1
+       movq    %mm0,-56(%edi)
+       psrlq   $4,%mm0
+       movq    %mm3,72(%edi)
+       movq    %mm3,%mm6
+       movq    %mm1,-64(%ebp)
+       psrlq   $4,%mm3
+       movq    %mm4,64(%ebp)
+       shll    $4,%edx
+       movb    %dl,10(%esp)
+       movl    56(%ebx),%edx
+       movq    56(%ebx),%mm1
+       psllq   $60,%mm6
+       movq    48(%ebx),%mm4
+       por     %mm6,%mm0
+       movq    %mm2,-48(%edi)
+       psrlq   $4,%mm2
+       movq    %mm5,80(%edi)
+       movq    %mm5,%mm7
+       movq    %mm0,-56(%ebp)
+       psrlq   $4,%mm5
+       movq    %mm3,72(%ebp)
+       shll    $4,%edx
+       movb    %dl,11(%esp)
+       movl    72(%ebx),%edx
+       movq    72(%ebx),%mm0
+       psllq   $60,%mm7
+       movq    64(%ebx),%mm3
+       por     %mm7,%mm2
+       movq    %mm1,-40(%edi)
+       psrlq   $4,%mm1
+       movq    %mm4,88(%edi)
+       movq    %mm4,%mm6
+       movq    %mm2,-48(%ebp)
+       psrlq   $4,%mm4
+       movq    %mm5,80(%ebp)
+       shll    $4,%edx
+       movb    %dl,12(%esp)
+       movl    88(%ebx),%edx
+       movq    88(%ebx),%mm2
+       psllq   $60,%mm6
+       movq    80(%ebx),%mm5
+       por     %mm6,%mm1
+       movq    %mm0,-32(%edi)
+       psrlq   $4,%mm0
+       movq    %mm3,96(%edi)
+       movq    %mm3,%mm7
+       movq    %mm1,-40(%ebp)
+       psrlq   $4,%mm3
+       movq    %mm4,88(%ebp)
+       shll    $4,%edx
+       movb    %dl,13(%esp)
+       movl    104(%ebx),%edx
+       movq    104(%ebx),%mm1
+       psllq   $60,%mm7
+       movq    96(%ebx),%mm4
+       por     %mm7,%mm0
+       movq    %mm2,-24(%edi)
+       psrlq   $4,%mm2
+       movq    %mm5,104(%edi)
+       movq    %mm5,%mm6
+       movq    %mm0,-32(%ebp)
+       psrlq   $4,%mm5
+       movq    %mm3,96(%ebp)
+       shll    $4,%edx
+       movb    %dl,14(%esp)
+       movl    120(%ebx),%edx
+       movq    120(%ebx),%mm0
+       psllq   $60,%mm6
+       movq    112(%ebx),%mm3
+       por     %mm6,%mm2
+       movq    %mm1,-16(%edi)
+       psrlq   $4,%mm1
+       movq    %mm4,112(%edi)
+       movq    %mm4,%mm7
+       movq    %mm2,-24(%ebp)
+       psrlq   $4,%mm4
+       movq    %mm5,104(%ebp)
+       shll    $4,%edx
+       movb    %dl,15(%esp)
+       psllq   $60,%mm7
+       por     %mm7,%mm1
+       movq    %mm0,-8(%edi)
+       psrlq   $4,%mm0
+       movq    %mm3,120(%edi)
+       movq    %mm3,%mm6
+       movq    %mm1,-16(%ebp)
+       psrlq   $4,%mm3
+       movq    %mm4,112(%ebp)
+       psllq   $60,%mm6
+       por     %mm6,%mm0
+       movq    %mm0,-8(%ebp)
+       movq    %mm3,120(%ebp)
+       movq    (%eax),%mm6
+       movl    8(%eax),%ebx
+       movl    12(%eax),%edx
+.align 16
+.L009outer:
+       xorl    12(%ecx),%edx
+       xorl    8(%ecx),%ebx
+       pxor    (%ecx),%mm6
+       leal    16(%ecx),%ecx
+       movl    %ebx,536(%esp)
+       movq    %mm6,528(%esp)
+       movl    %ecx,548(%esp)
+       xorl    %eax,%eax
+       roll    $8,%edx
+       movb    %dl,%al
+       movl    %eax,%ebp
+       andb    $15,%al
+       shrl    $4,%ebp
+       pxor    %mm0,%mm0
+       roll    $8,%edx
+       pxor    %mm1,%mm1
+       pxor    %mm2,%mm2
+       movq    16(%esp,%eax,8),%mm7
+       movq    144(%esp,%eax,8),%mm6
+       movb    %dl,%al
+       movd    %mm7,%ebx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%edi
+       psrlq   $8,%mm6
+       pxor    272(%esp,%ebp,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       shrl    $4,%edi
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%ebp,8),%mm6
+       xorb    (%esp,%ebp,1),%bl
+       movb    %dl,%al
+       movd    %mm7,%ecx
+       movzbl  %bl,%ebx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%ebp
+       psrlq   $8,%mm6
+       pxor    272(%esp,%edi,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       shrl    $4,%ebp
+       pinsrw  $2,(%esi,%ebx,2),%mm2
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%edi,8),%mm6
+       xorb    (%esp,%edi,1),%cl
+       movb    %dl,%al
+       movl    536(%esp),%edx
+       movd    %mm7,%ebx
+       movzbl  %cl,%ecx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%edi
+       psrlq   $8,%mm6
+       pxor    272(%esp,%ebp,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm2,%mm6
+       shrl    $4,%edi
+       pinsrw  $2,(%esi,%ecx,2),%mm1
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%ebp,8),%mm6
+       xorb    (%esp,%ebp,1),%bl
+       movb    %dl,%al
+       movd    %mm7,%ecx
+       movzbl  %bl,%ebx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%ebp
+       psrlq   $8,%mm6
+       pxor    272(%esp,%edi,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm1,%mm6
+       shrl    $4,%ebp
+       pinsrw  $2,(%esi,%ebx,2),%mm0
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%edi,8),%mm6
+       xorb    (%esp,%edi,1),%cl
+       movb    %dl,%al
+       movd    %mm7,%ebx
+       movzbl  %cl,%ecx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%edi
+       psrlq   $8,%mm6
+       pxor    272(%esp,%ebp,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm0,%mm6
+       shrl    $4,%edi
+       pinsrw  $2,(%esi,%ecx,2),%mm2
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%ebp,8),%mm6
+       xorb    (%esp,%ebp,1),%bl
+       movb    %dl,%al
+       movd    %mm7,%ecx
+       movzbl  %bl,%ebx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%ebp
+       psrlq   $8,%mm6
+       pxor    272(%esp,%edi,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm2,%mm6
+       shrl    $4,%ebp
+       pinsrw  $2,(%esi,%ebx,2),%mm1
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%edi,8),%mm6
+       xorb    (%esp,%edi,1),%cl
+       movb    %dl,%al
+       movl    532(%esp),%edx
+       movd    %mm7,%ebx
+       movzbl  %cl,%ecx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%edi
+       psrlq   $8,%mm6
+       pxor    272(%esp,%ebp,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm1,%mm6
+       shrl    $4,%edi
+       pinsrw  $2,(%esi,%ecx,2),%mm0
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%ebp,8),%mm6
+       xorb    (%esp,%ebp,1),%bl
+       movb    %dl,%al
+       movd    %mm7,%ecx
+       movzbl  %bl,%ebx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%ebp
+       psrlq   $8,%mm6
+       pxor    272(%esp,%edi,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm0,%mm6
+       shrl    $4,%ebp
+       pinsrw  $2,(%esi,%ebx,2),%mm2
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%edi,8),%mm6
+       xorb    (%esp,%edi,1),%cl
+       movb    %dl,%al
+       movd    %mm7,%ebx
+       movzbl  %cl,%ecx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%edi
+       psrlq   $8,%mm6
+       pxor    272(%esp,%ebp,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm2,%mm6
+       shrl    $4,%edi
+       pinsrw  $2,(%esi,%ecx,2),%mm1
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%ebp,8),%mm6
+       xorb    (%esp,%ebp,1),%bl
+       movb    %dl,%al
+       movd    %mm7,%ecx
+       movzbl  %bl,%ebx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%ebp
+       psrlq   $8,%mm6
+       pxor    272(%esp,%edi,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm1,%mm6
+       shrl    $4,%ebp
+       pinsrw  $2,(%esi,%ebx,2),%mm0
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%edi,8),%mm6
+       xorb    (%esp,%edi,1),%cl
+       movb    %dl,%al
+       movl    528(%esp),%edx
+       movd    %mm7,%ebx
+       movzbl  %cl,%ecx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%edi
+       psrlq   $8,%mm6
+       pxor    272(%esp,%ebp,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm0,%mm6
+       shrl    $4,%edi
+       pinsrw  $2,(%esi,%ecx,2),%mm2
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%ebp,8),%mm6
+       xorb    (%esp,%ebp,1),%bl
+       movb    %dl,%al
+       movd    %mm7,%ecx
+       movzbl  %bl,%ebx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%ebp
+       psrlq   $8,%mm6
+       pxor    272(%esp,%edi,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm2,%mm6
+       shrl    $4,%ebp
+       pinsrw  $2,(%esi,%ebx,2),%mm1
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%edi,8),%mm6
+       xorb    (%esp,%edi,1),%cl
+       movb    %dl,%al
+       movd    %mm7,%ebx
+       movzbl  %cl,%ecx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%edi
+       psrlq   $8,%mm6
+       pxor    272(%esp,%ebp,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm1,%mm6
+       shrl    $4,%edi
+       pinsrw  $2,(%esi,%ecx,2),%mm0
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%ebp,8),%mm6
+       xorb    (%esp,%ebp,1),%bl
+       movb    %dl,%al
+       movd    %mm7,%ecx
+       movzbl  %bl,%ebx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%ebp
+       psrlq   $8,%mm6
+       pxor    272(%esp,%edi,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm0,%mm6
+       shrl    $4,%ebp
+       pinsrw  $2,(%esi,%ebx,2),%mm2
+       pxor    16(%esp,%eax,8),%mm7
+       roll    $8,%edx
+       pxor    144(%esp,%eax,8),%mm6
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%edi,8),%mm6
+       xorb    (%esp,%edi,1),%cl
+       movb    %dl,%al
+       movl    524(%esp),%edx
+       movd    %mm7,%ebx
+       movzbl  %cl,%ecx
+       psrlq   $8,%mm7
+       movq    %mm6,%mm3
+       movl    %eax,%edi
+       psrlq   $8,%mm6
+       pxor    272(%esp,%ebp,8),%mm7
+       andb    $15,%al
+       psllq   $56,%mm3
+       pxor    %mm2,%mm6
+       shrl    $4,%edi
+       pinsrw  $2,(%esi,%ecx,2),%mm1
+       pxor    16(%esp,%eax,8),%mm7
+       pxor    144(%esp,%eax,8),%mm6
+       xorb    (%esp,%ebp,1),%bl
+       pxor    %mm3,%mm7
+       pxor    400(%esp,%ebp,8),%mm6
+       movzbl  %bl,%ebx
+       pxor    %mm2,%mm2
+       psllq   $4,%mm1
+       movd    %mm7,%ecx
+       psrlq   $4,%mm7
+       movq    %mm6,%mm3
+       psrlq   $4,%mm6
+       shll    $4,%ecx
+       pxor    16(%esp,%edi,8),%mm7
+       psllq   $60,%mm3
+       movzbl  %cl,%ecx
+       pxor    %mm3,%mm7
+       pxor    144(%esp,%edi,8),%mm6
+       pinsrw  $2,(%esi,%ebx,2),%mm0
+       pxor    %mm1,%mm6
+       movd    %mm7,%edx
+       pinsrw  $3,(%esi,%ecx,2),%mm2
+       psllq   $12,%mm0
+       pxor    %mm0,%mm6
+       psrlq   $32,%mm7
+       pxor    %mm2,%mm6
+       movl    548(%esp),%ecx
+       movd    %mm7,%ebx
+       movq    %mm6,%mm3
+       psllw   $8,%mm6
+       psrlw   $8,%mm3
+       por     %mm3,%mm6
+       bswap   %edx
+       pshufw  $27,%mm6,%mm6
+       bswap   %ebx
+       cmpl    552(%esp),%ecx
+       jne     .L009outer
+       movl    544(%esp),%eax
+       movl    %edx,12(%eax)
+       movl    %ebx,8(%eax)
+       movq    %mm6,(%eax)
+       movl    556(%esp),%esp
+       emms
+       popl    %edi
+       popl    %esi
+       popl    %ebx
+       popl    %ebp
+       ret
+.size  gcm_ghash_4bit_mmx,.-.L_gcm_ghash_4bit_mmx_begin
+.align 64
+.Lrem_4bit:
+.long  0,0,0,471859200,0,943718400,0,610271232
+.long  0,1887436800,0,1822425088,0,1220542464,0,1423966208
+.long  0,3774873600,0,4246732800,0,3644850176,0,3311403008
+.long  0,2441084928,0,2376073216,0,2847932416,0,3051356160
+.align 64
+.Lrem_8bit:
+.value 0,450,900,582,1800,1738,1164,1358
+.value 3600,4050,3476,3158,2328,2266,2716,2910
+.value 7200,7650,8100,7782,6952,6890,6316,6510
+.value 4656,5106,4532,4214,5432,5370,5820,6014
+.value 14400,14722,15300,14854,16200,16010,15564,15630
+.value 13904,14226,13780,13334,12632,12442,13020,13086
+.value 9312,9634,10212,9766,9064,8874,8428,8494
+.value 10864,11186,10740,10294,11640,11450,12028,12094
+.value 28800,28994,29444,29382,30600,30282,29708,30158
+.value 32400,32594,32020,31958,31128,30810,31260,31710
+.value 27808,28002,28452,28390,27560,27242,26668,27118
+.value 25264,25458,24884,24822,26040,25722,26172,26622
+.value 18624,18690,19268,19078,20424,19978,19532,19854
+.value 18128,18194,17748,17558,16856,16410,16988,17310
+.value 21728,21794,22372,22182,21480,21034,20588,20910
+.value 23280,23346,22900,22710,24056,23610,24188,24510
+.value 57600,57538,57988,58182,58888,59338,58764,58446
+.value 61200,61138,60564,60758,59416,59866,60316,59998
+.value 64800,64738,65188,65382,64040,64490,63916,63598
+.value 62256,62194,61620,61814,62520,62970,63420,63102
+.value 55616,55426,56004,56070,56904,57226,56780,56334
+.value 55120,54930,54484,54550,53336,53658,54236,53790
+.value 50528,50338,50916,50982,49768,50090,49644,49198
+.value 52080,51890,51444,51510,52344,52666,53244,52798
+.value 37248,36930,37380,37830,38536,38730,38156,38094
+.value 40848,40530,39956,40406,39064,39258,39708,39646
+.value 36256,35938,36388,36838,35496,35690,35116,35054
+.value 33712,33394,32820,33270,33976,34170,34620,34558
+.value 43456,43010,43588,43910,44744,44810,44364,44174
+.value 42960,42514,42068,42390,41176,41242,41820,41630
+.value 46560,46114,46692,47014,45800,45866,45420,45230
+.value 48112,47666,47220,47542,48376,48442,49020,48830
+.byte  71,72,65,83,72,32,102,111,114,32,120,56,54,44,32,67
+.byte  82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112
+.byte  112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62
+.byte  0
diff --git a/lib/crypto-api.c b/lib/crypto-api.c
index 84a2419..5e3f5c7 100644
--- a/lib/crypto-api.c
+++ b/lib/crypto-api.c
@@ -95,8 +95,8 @@ gnutls_cipher_tag (gnutls_cipher_hd_t handle, void *tag, size_t tag_size)
  *
  * This function operates on authenticated encryption with
  * associated data (AEAD) ciphers and authenticate the
- * input data. This function can only be called before
- * encryption operations.
+ * input data. This function can only be called once
+ * and before any encryption operations.
  *
  * Returns: Zero or a negative value on error.
  *
diff --git a/lib/gnutls_int.h b/lib/gnutls_int.h
index 104492a..b2bc5c2 100644
--- a/lib/gnutls_int.h
+++ b/lib/gnutls_int.h
@@ -50,6 +50,13 @@ typedef int ssize_t;
 #include <time.h>
 #include <u64.h> /* gnulib for uint64_t */
 
+#ifdef HAVE_LIBNETTLE
+# include <nettle/memxor.h>
+#else
+# include <gl/memxor.h>
+# define memxor gl_memxor
+#endif
+
 /* some systems had problems with long long int, thus,
  * it is not used.
  */
diff --git a/lib/gnutls_num.c b/lib/gnutls_num.c
index 46467d5..8829ae7 100644
--- a/lib/gnutls_num.c
+++ b/lib/gnutls_num.c
@@ -133,6 +133,15 @@ _gnutls_read_uint24 (const opaque * data)
 }
 
 void
+_gnutls_write_uint64 (uint64_t num, opaque * data)
+{
+#ifndef WORDS_BIGENDIAN
+  num = bswap_64 (num);
+#endif
+  memcpy(data, &num, 8);
+}
+
+void
 _gnutls_write_uint24 (uint32_t num, opaque * data)
 {
   uint24 tmp;
diff --git a/lib/gnutls_num.h b/lib/gnutls_num.h
index 456e34e..8deb4f1 100644
--- a/lib/gnutls_num.h
+++ b/lib/gnutls_num.h
@@ -38,6 +38,7 @@ uint16_t _gnutls_read_uint16 (const opaque * data);
 uint32_t _gnutls_conv_uint32 (uint32_t data);
 uint16_t _gnutls_conv_uint16 (uint16_t data);
 uint32_t _gnutls_read_uint24 (const opaque * data);
+void _gnutls_write_uint64 (uint64_t num, opaque * data);
 void _gnutls_write_uint24 (uint32_t num, opaque * data);
 void _gnutls_write_uint32 (uint32_t num, opaque * data);
 void _gnutls_write_uint16 (uint16_t num, opaque * data);
diff --git a/lib/gnutls_state.c b/lib/gnutls_state.c
index 93a38b1..859ed14 100644
--- a/lib/gnutls_state.c
+++ b/lib/gnutls_state.c
@@ -891,20 +891,6 @@ _gnutls_P_hash (gnutls_mac_algorithm_t algorithm,
   return 0;
 }
 
-/* Xor's two buffers and puts the output in the first one.
- */
-inline static void
-_gnutls_xor (opaque * o1, opaque * o2, int length)
-{
-  int i;
-  for (i = 0; i < length; i++)
-    {
-      o1[i] ^= o2[i];
-    }
-}
-
-
-
 #define MAX_PRF_BYTES 200
 
 /* The PRF function expands a given secret 
@@ -982,7 +968,7 @@ _gnutls_PRF (gnutls_session_t session,
           return result;
         }
 
-      _gnutls_xor (o1, o2, total_bytes);
+      memxor (o1, o2, total_bytes);
 
       memcpy (ret, o1, total_bytes);
     }
diff --git a/m4/hooks.m4 b/m4/hooks.m4
index e117eec..bf9a42a 100644
--- a/m4/hooks.m4
+++ b/m4/hooks.m4
@@ -41,7 +41,7 @@ AC_DEFUN([LIBGNUTLS_HOOKS],
   # Interfaces added:                             AGE++
   # Interfaces removed:                           AGE=0
   AC_SUBST(LT_CURRENT, 27)
-  AC_SUBST(LT_REVISION, 1)
+  AC_SUBST(LT_REVISION, 2)
   AC_SUBST(LT_AGE, 0)
 
   AC_SUBST(LT_SSL_CURRENT, 27)
diff --git a/src/benchmark-tls.c b/src/benchmark-tls.c
index fc20f2a..84c69fc 100644
--- a/src/benchmark-tls.c
+++ b/src/benchmark-tls.c
@@ -44,6 +44,8 @@
 #define PRIO_ECDH "NONE:+VERS-TLS1.0:+AES-128-CBC:+SHA1:+SIGN-ALL:+COMP-NULL:+ANON-ECDH:+CURVE-SECP224R1"
 
 #define PRIO_AES_CBC_SHA1 "NONE:+VERS-TLS1.0:+AES-128-CBC:+SHA1:+SIGN-ALL:+COMP-NULL:+ANON-DH"
+#define PRIO_ARCFOUR_128_MD5 "NONE:+VERS-TLS1.0:+ARCFOUR-128:+MD5:+SIGN-ALL:+COMP-NULL:+ANON-DH"
+#define PRIO_AES_GCM "NONE:+VERS-TLS1.2:+AES-128-GCM:+AEAD:+SIGN-ALL:+COMP-NULL:+ANON-DH"
 #define PRIO_CAMELLIA_CBC_SHA1 "NONE:+VERS-TLS1.0:+CAMELLIA-128-CBC:+SHA1:+SIGN-ALL:+COMP-NULL:+ANON-DH"

 
 /* DH of 2432 bits that is pretty equivalent to 224 bits of ECDH.
@@ -270,11 +272,17 @@ main (int argc, char **argv)
     }
   gnutls_global_init ();
 
-  printf("Testing key exchanges:\n");
-  test_ciphersuite_kx (PRIO_DH);
-  test_ciphersuite_kx (PRIO_ECDH);
+  printf("Testing throughput in cipher/MAC combinations:\n");
+  test_ciphersuite (PRIO_ARCFOUR_128_MD5, 1024);
+  test_ciphersuite (PRIO_ARCFOUR_128_MD5, 4096);
+  test_ciphersuite (PRIO_ARCFOUR_128_MD5, 8*1024);
+  test_ciphersuite (PRIO_ARCFOUR_128_MD5, 15*1024);
+
+  test_ciphersuite (PRIO_AES_GCM, 1024);
+  test_ciphersuite (PRIO_AES_GCM, 4096);
+  test_ciphersuite (PRIO_AES_GCM, 8*1024);
+  test_ciphersuite (PRIO_AES_GCM, 15*1024);
 
-  printf("\nTesting throughput in cipher/MAC combinations:\n");
   test_ciphersuite (PRIO_AES_CBC_SHA1, 1024);
   test_ciphersuite (PRIO_AES_CBC_SHA1, 4096);
   test_ciphersuite (PRIO_AES_CBC_SHA1, 8*1024);
@@ -285,6 +293,10 @@ main (int argc, char **argv)
   test_ciphersuite (PRIO_CAMELLIA_CBC_SHA1, 8*1024);
   test_ciphersuite (PRIO_CAMELLIA_CBC_SHA1, 15*1024);
 
+  printf("\nTesting key exchanges:\n");
+  test_ciphersuite_kx (PRIO_DH);
+  test_ciphersuite_kx (PRIO_ECDH);
+
 
   gnutls_global_deinit ();
 }
diff --git a/src/benchmark.c b/src/benchmark.c
index befc81f..015110f 100644
--- a/src/benchmark.c
+++ b/src/benchmark.c
@@ -62,7 +62,7 @@ value2human (unsigned long bytes, double time, double *data, double *speed,
 
 void start_benchmark(struct benchmark_st * st)
 {
-  memset(st, 0, sizeof(st));
+  memset(st, 0, sizeof(*st));
   st->old_handler = signal (SIGALRM, alarm_handler);
   gettime (&st->start);
   benchmark_must_finish = 0;
diff --git a/tests/cipher-test.c b/tests/cipher-test.c
index d9ae400..000cfde 100644
--- a/tests/cipher-test.c
+++ b/tests/cipher-test.c
@@ -11,321 +11,462 @@
 #include <gnutls/gnutls.h>
 #include <gnutls/crypto.h>
 
-struct aes_vectors_st {
-       const uint8_t *key;
-       const uint8_t *plaintext;
-       const uint8_t *ciphertext;
-} aes_vectors[] = {
-       {
-       .key =
-                   (uint8_t *)
-                   "\xc0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",.
-                   plaintext =
-                   (uint8_t *)
-                   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",.
-                   ciphertext =
-                   (uint8_t *)
-                   "\x4b\xc3\xf8\x83\x45\x0c\x11\x3c\x64\xca\x42\xe1\x11\x2a\x9e\x87",},
-       {
-       .key =
-                   (uint8_t *)
-                   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",.
-                   plaintext =
-                   (uint8_t *)
-                   "\xf3\x44\x81\xec\x3c\xc6\x27\xba\xcd\x5d\xc3\xfb\x08\xf2\x73\xe6",.
-                   ciphertext =
-                   (uint8_t *)
-                   "\x03\x36\x76\x3e\x96\x6d\x92\x59\x5a\x56\x7c\xc9\xce\x53\x7f\x5e",},
-       {
-       .key =
-                   (uint8_t *)
-                   "\x10\xa5\x88\x69\xd7\x4b\xe5\xa3\x74\xcf\x86\x7c\xfb\x47\x38\x59",.
-                   plaintext =
-                   (uint8_t *)
-                   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",.
-                   ciphertext =
-                   (uint8_t *)
-                   "\x6d\x25\x1e\x69\x44\xb0\x51\xe0\x4e\xaa\x6f\xb4\xdb\xf7\x84\x65",},
-       {
-       .key =
-                   (uint8_t *)
-                   "\xca\xea\x65\xcd\xbb\x75\xe9\x16\x9e\xcd\x22\xeb\xe6\xe5\x46\x75",.
-                   plaintext =
-                   (uint8_t *)
-                   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",.
-                   ciphertext =
-                   (uint8_t *)
-                   "\x6e\x29\x20\x11\x90\x15\x2d\xf4\xee\x05\x81\x39\xde\xf6\x10\xbb",},
-       {
-.key =
-                   (uint8_t *)
-                   "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe",.
-                   plaintext =
-                   (uint8_t *)
-                   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",.
-                   ciphertext =
-                   (uint8_t *)
-                   "\x9b\xa4\xa9\x14\x3f\x4e\x5d\x40\x48\x52\x1c\x4f\x88\x77\xd8\x8e",},};
+struct aes_vectors_st
+{
+  const uint8_t *key;
+  const uint8_t *plaintext;
+  const uint8_t *ciphertext;
+};
+
+struct aes_gcm_vectors_st
+{
+  const uint8_t *key;
+  const uint8_t *auth;
+  int auth_size;
+  const uint8_t *plaintext;
+  int plaintext_size;
+  const uint8_t *iv;
+  const uint8_t *ciphertext;
+  const uint8_t *tag;
+};
+
+struct aes_gcm_vectors_st aes_gcm_vectors[] = {
+  {
+   .key = "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .auth = NULL,
+   .auth_size = 0,
+   .plaintext = NULL,
+   .plaintext_size = 0,
+   .ciphertext = NULL,
+   .iv = "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .tag = "\x58\xe2\xfc\xce\xfa\x7e\x30\x61\x36\x7f\x1d\x57\xa4\xe7\x45\x5a"},
+  {
+   .key = "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .auth = NULL,
+   .auth_size = 0,
+   .plaintext =
+   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .plaintext_size = 16,
+   .ciphertext =
+   "\x03\x88\xda\xce\x60\xb6\xa3\x92\xf3\x28\xc2\xb9\x71\xb2\xfe\x78",
+   .iv = "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .tag = "\xab\x6e\x47\xd4\x2c\xec\x13\xbd\xf5\x3a\x67\xb2\x12\x57\xbd\xdf"},
+  {
+   .key = "\xfe\xff\xe9\x92\x86\x65\x73\x1c\x6d\x6a\x8f\x94\x67\x30\x83\x08",
+   .auth =
+   "\xfe\xed\xfa\xce\xde\xad\xbe\xef\xfe\xed\xfa\xce\xde\xad\xbe\xef\xab\xad\xda\xd2",
+   .auth_size = 20,
+   .plaintext =
+   "\xd9\x31\x32\x25\xf8\x84\x06\xe5\xa5\x59\x09\xc5\xaf\xf5\x26\x9a\x86\xa7\xa9\x53\x15\x34\xf7\xda\x2e\x4c\x30\x3d\x8a\x31\x8a\x72\x1c\x3c\x0c\x95\x95\x68\x09\x53\x2f\xcf\x0e\x24\x49\xa6\xb5\x25\xb1\x6a\xed\xf5\xaa\x0d\xe6\x57\xba\x63\x7b\x39",
+   .plaintext_size = 60,
+   .ciphertext =
+   "\x42\x83\x1e\xc2\x21\x77\x74\x24\x4b\x72\x21\xb7\x84\xd0\xd4\x9c\xe3\xaa\x21\x2f\x2c\x02\xa4\xe0\x35\xc1\x7e\x23\x29\xac\xa1\x2e\x21\xd5\x14\xb2\x54\x66\x93\x1c\x7d\x8f\x6a\x5a\xac\x84\xaa\x05\x1b\xa3\x0b\x39\x6a\x0a\xac\x97\x3d\x58\xe0\x91",
+   .iv = "\xca\xfe\xba\xbe\xfa\xce\xdb\xad\xde\xca\xf8\x88",
+   .tag = "\x5b\xc9\x4f\xbc\x32\x21\xa5\xdb\x94\xfa\xe9\x5a\xe7\x12\x1a\x47"}
+};
+
+
+struct aes_vectors_st aes_vectors[] = {
+  {
+   .key =
+   (uint8_t *)
+   "\xc0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .plaintext = (uint8_t *)
+   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .ciphertext = (uint8_t *)
+   "\x4b\xc3\xf8\x83\x45\x0c\x11\x3c\x64\xca\x42\xe1\x11\x2a\x9e\x87",
+  },
+  {
+   .key = (uint8_t *)
+   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .plaintext = (uint8_t *)
+   "\xf3\x44\x81\xec\x3c\xc6\x27\xba\xcd\x5d\xc3\xfb\x08\xf2\x73\xe6",
+   .ciphertext = (uint8_t *)
+   "\x03\x36\x76\x3e\x96\x6d\x92\x59\x5a\x56\x7c\xc9\xce\x53\x7f\x5e",
+  },
+  {
+   .key = (uint8_t *)
+   "\x10\xa5\x88\x69\xd7\x4b\xe5\xa3\x74\xcf\x86\x7c\xfb\x47\x38\x59",
+   .plaintext = (uint8_t *)
+   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .ciphertext = (uint8_t *)
+   "\x6d\x25\x1e\x69\x44\xb0\x51\xe0\x4e\xaa\x6f\xb4\xdb\xf7\x84\x65",
+  },
+  {
+   .key = (uint8_t *)
+   "\xca\xea\x65\xcd\xbb\x75\xe9\x16\x9e\xcd\x22\xeb\xe6\xe5\x46\x75",
+   .plaintext = (uint8_t *)
+   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .ciphertext = (uint8_t *)
+   "\x6e\x29\x20\x11\x90\x15\x2d\xf4\xee\x05\x81\x39\xde\xf6\x10\xbb",
+  },
+  {
+   .key = (uint8_t *)
+   "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe",
+   .plaintext = (uint8_t *)
+   "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+   .ciphertext = (uint8_t *)
+   "\x9b\xa4\xa9\x14\x3f\x4e\x5d\x40\x48\x52\x1c\x4f\x88\x77\xd8\x8e",
+  },
+};
 
 /* AES cipher */
-static int test_aes(void)
+static int
+test_aes (void)
 {
-       gnutls_cipher_hd_t hd;
-       int ret, i, j;
-       uint8_t _iv[16];
-       uint8_t tmp[16];
-       gnutls_datum_t key, iv;
-       
-       fprintf(stdout, "Tests on AES Encryption: ");
-       for (i = 0; i < sizeof(aes_vectors) / sizeof(aes_vectors[0]); i++) {
-               memset(_iv, 0, sizeof(_iv));
-               memset(tmp, 0, sizeof(tmp));
-               key.data = (void*)aes_vectors[i].key;
-               key.size = 16;
-               
-               iv.data = _iv;
-               iv.size = 16;
-
-               ret = gnutls_cipher_init( &hd, GNUTLS_CIPHER_AES_128_CBC, 
-                       &key, &iv);
-               if (ret < 0) {
-                       fprintf(stderr, "%d: AES test %d failed\n", __LINE__, i);
-                       return 1;
-               }
-               
-               ret = gnutls_cipher_encrypt2(hd, aes_vectors[i].plaintext, 16,
-                       tmp, 16);
-               if (ret < 0) {
-                       fprintf(stderr, "%d: AES test %d failed\n", __LINE__, i);
-                       return 1;
-               }
-               
-               gnutls_cipher_deinit(hd);
-
-               if (memcmp(tmp, aes_vectors[i].ciphertext, 16) != 0) {
-                       fprintf(stderr, "AES test vector %d failed!\n", i);
-
-                       fprintf(stderr, "Cipher[%d]: ", 16);
-                       for (j = 0; j < 16; j++)
-                               fprintf(stderr, "%.2x:", (int)tmp[j]);
-                       fprintf(stderr, "\n");
-
-                       fprintf(stderr, "Expected[%d]: ", 16);
-                       for (j = 0; j < 16; j++)
-                               fprintf(stderr, "%.2x:",
-                                       (int)aes_vectors[i].ciphertext[j]);
-                       fprintf(stderr, "\n");
-                       return 1;
-               }
-       }
-       fprintf(stdout, "ok\n");
-
-       fprintf(stdout, "Tests on AES Decryption: ");
-       for (i = 0; i < sizeof(aes_vectors) / sizeof(aes_vectors[0]); i++) {
-
-               memset(_iv, 0, sizeof(_iv));
-               memset(tmp, 0x33, sizeof(tmp));
-
-               key.data = (void*)aes_vectors[i].key;
-               key.size = 16;
-               
-               iv.data = _iv;
-               iv.size = 16;
-
-               ret = gnutls_cipher_init( &hd, GNUTLS_CIPHER_AES_128_CBC, 
-                       &key, &iv);
-               if (ret < 0) {
-                       fprintf(stderr, "%d: AES test %d failed\n", __LINE__, i);
-                       return 1;
-               }
-               
-               ret = gnutls_cipher_decrypt2(hd, aes_vectors[i].ciphertext, 16,
-                       tmp, 16);
-               if (ret < 0) {
-                       fprintf(stderr, "%d: AES test %d failed\n", __LINE__, i);
-                       return 1;
-               }
-               
-               gnutls_cipher_deinit(hd);
-
-               if (memcmp(tmp, aes_vectors[i].plaintext, 16) != 0) {
-                       fprintf(stderr, "AES test vector %d failed!\n", i);
-
-                       fprintf(stderr, "Plain[%d]: ", 16);
-                       for (j = 0; j < 16; j++)
-                               fprintf(stderr, "%.2x:", (int)tmp[j]);
-                       fprintf(stderr, "\n");
-
-                       fprintf(stderr, "Expected[%d]: ", 16);
-                       for (j = 0; j < 16; j++)
-                               fprintf(stderr, "%.2x:",
-                                       (int)aes_vectors[i].plaintext[j]);
-                       fprintf(stderr, "\n");
-                       return 1;
-               }
-       }
-
-       fprintf(stdout, "ok\n");
-       fprintf(stdout, "\n");
-
-       return 0;
+  gnutls_cipher_hd_t hd;
+  int ret, i, j;
+  uint8_t _iv[16];
+  uint8_t tmp[128];
+  gnutls_datum_t key, iv;
+
+  fprintf (stdout, "Tests on AES Encryption: ");
+  fflush (stdout);
+  for (i = 0; i < sizeof (aes_vectors) / sizeof (aes_vectors[0]); i++)
+    {
+      memset (_iv, 0, sizeof (_iv));
+      memset (tmp, 0, sizeof (tmp));
+      key.data = (void *) aes_vectors[i].key;
+      key.size = 16;
+
+      iv.data = _iv;
+      iv.size = 16;
+
+      ret = gnutls_cipher_init (&hd, GNUTLS_CIPHER_AES_128_CBC, &key, &iv);
+      if (ret < 0)
+        {
+          fprintf (stderr, "%d: AES test %d failed\n", __LINE__, i);
+          return 1;
+        }
+
+      ret = gnutls_cipher_encrypt2 (hd, aes_vectors[i].plaintext, 16,
+                                    tmp, 16);
+      if (ret < 0)
+        {
+          fprintf (stderr, "%d: AES test %d failed\n", __LINE__, i);
+          return 1;
+        }
+
+      gnutls_cipher_deinit (hd);
+
+      if (memcmp (tmp, aes_vectors[i].ciphertext, 16) != 0)
+        {
+          fprintf (stderr, "AES test vector %d failed!\n", i);
+
+          fprintf (stderr, "Cipher[%d]: ", 16);
+          for (j = 0; j < 16; j++)
+            fprintf (stderr, "%.2x:", (int) tmp[j]);
+          fprintf (stderr, "\n");
+
+          fprintf (stderr, "Expected[%d]: ", 16);
+          for (j = 0; j < 16; j++)
+            fprintf (stderr, "%.2x:", (int) aes_vectors[i].ciphertext[j]);
+          fprintf (stderr, "\n");
+          return 1;
+        }
+    }
+  fprintf (stdout, "ok\n");
+
+  fprintf (stdout, "Tests on AES Decryption: ");
+  fflush (stdout);
+  for (i = 0; i < sizeof (aes_vectors) / sizeof (aes_vectors[0]); i++)
+    {
+
+      memset (_iv, 0, sizeof (_iv));
+      memset (tmp, 0x33, sizeof (tmp));
+
+      key.data = (void *) aes_vectors[i].key;
+      key.size = 16;
+
+      iv.data = _iv;
+      iv.size = 16;
+
+      ret = gnutls_cipher_init (&hd, GNUTLS_CIPHER_AES_128_CBC, &key, &iv);
+      if (ret < 0)
+        {
+          fprintf (stderr, "%d: AES test %d failed\n", __LINE__, i);
+          return 1;
+        }
+
+      ret = gnutls_cipher_decrypt2 (hd, aes_vectors[i].ciphertext, 16,
+                                    tmp, 16);
+      if (ret < 0)
+        {
+          fprintf (stderr, "%d: AES test %d failed\n", __LINE__, i);
+          return 1;
+        }
+
+      gnutls_cipher_deinit (hd);
+
+      if (memcmp (tmp, aes_vectors[i].plaintext, 16) != 0)
+        {
+          fprintf (stderr, "AES test vector %d failed!\n", i);
+
+          fprintf (stderr, "Plain[%d]: ", 16);
+          for (j = 0; j < 16; j++)
+            fprintf (stderr, "%.2x:", (int) tmp[j]);
+          fprintf (stderr, "\n");
+
+          fprintf (stderr, "Expected[%d]: ", 16);
+          for (j = 0; j < 16; j++)
+            fprintf (stderr, "%.2x:", (int) aes_vectors[i].plaintext[j]);
+          fprintf (stderr, "\n");
+          return 1;
+        }
+    }
+
+  fprintf (stdout, "ok\n");
+  fprintf (stdout, "\n");
+
+  fprintf (stdout, "Tests on AES-GCM: ");
+  fflush (stdout);
+  for (i = 0; i < sizeof (aes_gcm_vectors) / sizeof (aes_gcm_vectors[0]); i++)
+    {
+      memset (tmp, 0, sizeof (tmp));
+      key.data = (void *) aes_gcm_vectors[i].key;
+      key.size = 16;
+
+      iv.data = (void *) aes_gcm_vectors[i].iv;
+      iv.size = 12;
+
+      ret = gnutls_cipher_init (&hd, GNUTLS_CIPHER_AES_128_GCM, &key, &iv);
+      if (ret < 0)
+        {
+          fprintf (stderr, "%d: AES-GCM test %d failed\n", __LINE__, i);
+          return 1;
+        }
+
+      if (aes_gcm_vectors[i].auth_size > 0)
+        {
+          ret =
+            gnutls_cipher_add_auth (hd, aes_gcm_vectors[i].auth,
+                                    aes_gcm_vectors[i].auth_size);
+
+          if (ret < 0)
+            {
+              fprintf (stderr, "%d: AES-GCM test %d failed\n", __LINE__, i);
+              return 1;
+            }
+        }
+
+      if (aes_gcm_vectors[i].plaintext_size > 0)
+        {
+          ret =
+            gnutls_cipher_encrypt2 (hd, aes_gcm_vectors[i].plaintext,
+                                    aes_gcm_vectors[i].plaintext_size, tmp,
+                                    aes_gcm_vectors[i].plaintext_size);
+          if (ret < 0)
+            {
+              fprintf (stderr, "%d: AES-GCM test %d failed\n", __LINE__, i);
+              return 1;
+            }
+        }
+
+
+      if (aes_gcm_vectors[i].plaintext_size > 0)
+        if (memcmp
+            (tmp, aes_gcm_vectors[i].ciphertext,
+             aes_gcm_vectors[i].plaintext_size) != 0)
+          {
+            fprintf (stderr, "AES-GCM test vector %d failed!\n", i);
+
+            fprintf (stderr, "Cipher[%d]: ",
+                     aes_gcm_vectors[i].plaintext_size);
+            for (j = 0; j < aes_gcm_vectors[i].plaintext_size; j++)
+              fprintf (stderr, "%.2x:", (int) tmp[j]);
+            fprintf (stderr, "\n");
+
+            fprintf (stderr, "Expected[%d]: ",
+                     aes_gcm_vectors[i].plaintext_size);
+            for (j = 0; j < aes_gcm_vectors[i].plaintext_size; j++)
+              fprintf (stderr, "%.2x:",
+                       (int) aes_gcm_vectors[i].ciphertext[j]);
+            fprintf (stderr, "\n");
+            return 1;
+          }
+
+      gnutls_cipher_tag (hd, tmp, 16);
+      if (memcmp (tmp, aes_gcm_vectors[i].tag, 16) != 0)
+        {
+          fprintf (stderr, "AES-GCM test vector %d failed (tag)!\n", i);
+
+          fprintf (stderr, "Tag[%d]: ", 16);
+          for (j = 0; j < 16; j++)
+            fprintf (stderr, "%.2x:", (int) tmp[j]);
+          fprintf (stderr, "\n");
+
+          fprintf (stderr, "Expected[%d]: ", 16);
+          for (j = 0; j < 16; j++)
+            fprintf (stderr, "%.2x:", (int) aes_gcm_vectors[i].tag[j]);
+          fprintf (stderr, "\n");
+          return 1;
+        }
+
+      gnutls_cipher_deinit (hd);
+
+    }
+  fprintf (stdout, "ok\n");
+  fprintf (stdout, "\n");
+
+
+  return 0;
 
 }
 
-struct hash_vectors_st {
-       const char * name;
-       int algorithm;
-       const uint8_t *key;     /* if hmac */
-       int key_size;
-       const uint8_t *plaintext;
-       int plaintext_size;
-       const uint8_t *output;
-       int output_size;
-} hash_vectors[] = {
-       {
-       .name = "SHA1",
-       .algorithm = GNUTLS_MAC_SHA1,.key = NULL,.plaintext =
-                   (uint8_t *) "what do ya want for nothing?",.
-                   plaintext_size =
-                   sizeof("what do ya want for nothing?") - 1,.output =
-                   (uint8_t *)
-                   "\x8f\x82\x03\x94\xf9\x53\x35\x18\x20\x45\xda\x24\xf3\x4d\xe5\x2b\xf8\xbc\x34\x32",.
-                   output_size = 20,}
-       , {
-       .name = "HMAC-MD5",
-       .algorithm = GNUTLS_MAC_MD5,.key = (uint8_t *) "Jefe",.key_size =
-                   4,.plaintext =
-                   (uint8_t *) "what do ya want for nothing?",.
-                   plaintext_size =
-                   sizeof("what do ya want for nothing?") - 1,.output =
-                   (uint8_t *)
-                   "\x75\x0c\x78\x3e\x6a\xb0\xb5\x03\xea\xa8\x6e\x31\x0a\x5d\xb7\x38",.
-                   output_size = 16,}
-       ,
-           /* from rfc4231 */
-       {
-       .name = "HMAC-SHA2-224",
-       .algorithm = GNUTLS_MAC_SHA224,.key =
-                   (uint8_t *)
-                   "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",.
-                   key_size = 20,.plaintext =
-                   (uint8_t *) "Hi There",.plaintext_size =
-                   sizeof("Hi There") - 1,.output =
-                   (uint8_t *)
-                   "\x89\x6f\xb1\x12\x8a\xbb\xdf\x19\x68\x32\x10\x7c\xd4\x9d\xf3\x3f\x47\xb4\xb1\x16\x99\x12\xba\x4f\x53\x68\x4b\x22",.
-                   output_size = 28,}
-       , {
-       .name = "HMAC-SHA2-256",
-       .algorithm = GNUTLS_MAC_SHA256,.key =
-                   (uint8_t *)
-                   "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",.
-                   key_size = 20,.plaintext =
-                   (uint8_t *) "Hi There",.plaintext_size =
-                   sizeof("Hi There") - 1,.output =
-                   (uint8_t *)
-                   "\xb0\x34\x4c\x61\xd8\xdb\x38\x53\x5c\xa8\xaf\xce\xaf\x0b\xf1\x2b\x88\x1d\xc2\x00\xc9\x83\x3d\xa7\x26\xe9\x37\x6c\x2e\x32\xcf\xf7",.
-                   output_size = 32,}
-       , {
-       .name = "HMAC-SHA2-384",
-       .algorithm = GNUTLS_MAC_SHA384,.key =
-                   (uint8_t *)
-                   "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",.
-                   key_size = 20,.plaintext =
-                   (uint8_t *) "Hi There",.plaintext_size =
-                   sizeof("Hi There") - 1,.output =
-                   (uint8_t *)
-                   "\xaf\xd0\x39\x44\xd8\x48\x95\x62\x6b\x08\x25\xf4\xab\x46\x90\x7f\x15\xf9\xda\xdb\xe4\x10\x1e\xc6\x82\xaa\x03\x4c\x7c\xeb\xc5\x9c\xfa\xea\x9e\xa9\x07\x6e\xde\x7f\x4a\xf1\x52\xe8\xb2\xfa\x9c\xb6",.
-                   output_size = 48,}
-       , {
-       .name = "HMAC-SHA2-512",
-       .algorithm = GNUTLS_MAC_SHA512,.key =
-                   (uint8_t *)
-                   "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",.
-                   key_size = 20,.plaintext =
-                   (uint8_t *) "Hi There",.plaintext_size =
-                   sizeof("Hi There") - 1,.output =
-                   (uint8_t *)
-                   "\x87\xaa\x7c\xde\xa5\xef\x61\x9d\x4f\xf0\xb4\x24\x1a\x1d\x6c\xb0\x23\x79\xf4\xe2\xce\x4e\xc2\x78\x7a\xd0\xb3\x05\x45\xe1\x7c\xde\xda\xa8\x33\xb7\xd6\xb8\xa7\x02\x03\x8b\x27\x4e\xae\xa3\xf4\xe4\xbe\x9d\x91\x4e\xeb\x61\xf1\x70\x2e\x69\x6c\x20\x3a\x12\x68\x54",.
-                   output_size = 64,}
+struct hash_vectors_st
+{
+  const char *name;
+  int algorithm;
+  const uint8_t *key;           /* if hmac */
+  int key_size;
+  const uint8_t *plaintext;
+  int plaintext_size;
+  const uint8_t *output;
+  int output_size;
+} hash_vectors[] =
+{
+  {
+  .name = "SHA1",.algorithm = GNUTLS_MAC_SHA1,.key = NULL,.plaintext =
+      (uint8_t *) "what do ya want for nothing?",.plaintext_size =
+      sizeof ("what do ya want for nothing?") - 1,.output =
+      (uint8_t *)
+      "\x8f\x82\x03\x94\xf9\x53\x35\x18\x20\x45\xda\x24\xf3\x4d\xe5\x2b\xf8\xbc\x34\x32",.
+      output_size = 20,}
+  ,
+  {
+  .name = "HMAC-MD5",.algorithm = GNUTLS_MAC_MD5,.key =
+      (uint8_t *) "Jefe",.key_size = 4,.plaintext =
+      (uint8_t *) "what do ya want for nothing?",.plaintext_size =
+      sizeof ("what do ya want for nothing?") - 1,.output =
+      (uint8_t *)
+      "\x75\x0c\x78\x3e\x6a\xb0\xb5\x03\xea\xa8\x6e\x31\x0a\x5d\xb7\x38",.
+      output_size = 16,}
+  ,
+    /* from rfc4231 */
+  {
+  .name = "HMAC-SHA2-224",.algorithm = GNUTLS_MAC_SHA224,.key =
+      (uint8_t *)
+      "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",.
+      key_size = 20,.plaintext = (uint8_t *) "Hi There",.plaintext_size =
+      sizeof ("Hi There") - 1,.output =
+      (uint8_t *)
+      "\x89\x6f\xb1\x12\x8a\xbb\xdf\x19\x68\x32\x10\x7c\xd4\x9d\xf3\x3f\x47\xb4\xb1\x16\x99\x12\xba\x4f\x53\x68\x4b\x22",.
+      output_size = 28,}
+  ,
+  {
+  .name = "HMAC-SHA2-256",.algorithm = GNUTLS_MAC_SHA256,.key =
+      (uint8_t *)
+      "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",.
+      key_size = 20,.plaintext = (uint8_t *) "Hi There",.plaintext_size =
+      sizeof ("Hi There") - 1,.output =
+      (uint8_t *)
+      "\xb0\x34\x4c\x61\xd8\xdb\x38\x53\x5c\xa8\xaf\xce\xaf\x0b\xf1\x2b\x88\x1d\xc2\x00\xc9\x83\x3d\xa7\x26\xe9\x37\x6c\x2e\x32\xcf\xf7",.
+      output_size = 32,}
+  ,
+  {
+  .name = "HMAC-SHA2-384",.algorithm = GNUTLS_MAC_SHA384,.key =
+      (uint8_t *)
+      "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",.
+      key_size = 20,.plaintext = (uint8_t *) "Hi There",.plaintext_size =
+      sizeof ("Hi There") - 1,.output =
+      (uint8_t *)
+      "\xaf\xd0\x39\x44\xd8\x48\x95\x62\x6b\x08\x25\xf4\xab\x46\x90\x7f\x15\xf9\xda\xdb\xe4\x10\x1e\xc6\x82\xaa\x03\x4c\x7c\xeb\xc5\x9c\xfa\xea\x9e\xa9\x07\x6e\xde\x7f\x4a\xf1\x52\xe8\xb2\xfa\x9c\xb6",.
+      output_size = 48,}
+  ,
+  {
+  .name = "HMAC-SHA2-512",.algorithm = GNUTLS_MAC_SHA512,.key =
+      (uint8_t *)
+      "\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b",.
+      key_size = 20,.plaintext = (uint8_t *) "Hi There",.plaintext_size =
+      sizeof ("Hi There") - 1,.output =
+      (uint8_t *)
+      "\x87\xaa\x7c\xde\xa5\xef\x61\x9d\x4f\xf0\xb4\x24\x1a\x1d\x6c\xb0\x23\x79\xf4\xe2\xce\x4e\xc2\x78\x7a\xd0\xb3\x05\x45\xe1\x7c\xde\xda\xa8\x33\xb7\xd6\xb8\xa7\x02\x03\x8b\x27\x4e\xae\xa3\xf4\xe4\xbe\x9d\x91\x4e\xeb\x61\xf1\x70\x2e\x69\x6c\x20\x3a\x12\x68\x54",.
+      output_size = 64,}
 ,};
 
 #define HASH_DATA_SIZE 64
 
 /* SHA1 and other hashes */
-static int test_hash(void)
+static int
+test_hash (void)
 {
-       uint8_t data[HASH_DATA_SIZE];
-       int i, j, ret;
-       size_t data_size;
-
-       fprintf(stdout, "Tests on Hashes\n");
-       for (i = 0; i < sizeof(hash_vectors) / sizeof(hash_vectors[0]); i++) {
-
-               fprintf(stdout, "\t%s: ", hash_vectors[i].name);
-               /* import key */
-               if (hash_vectors[i].key != NULL) {
-
-                       ret = gnutls_hmac_fast( hash_vectors[i].algorithm,
-                               hash_vectors[i].key, hash_vectors[i].key_size,
-                               hash_vectors[i].plaintext, hash_vectors[i].plaintext_size,
-                               data);
-                       data_size = gnutls_hmac_get_len(hash_vectors[i].algorithm);
-                       if (ret < 0) {
-                               fprintf(stderr, "Error: %s:%d\n", __func__,
-                                       __LINE__);
-                               return 1;
-                       }
-               } else {
-                       ret = gnutls_hash_fast( hash_vectors[i].algorithm,
-                               hash_vectors[i].plaintext, hash_vectors[i].plaintext_size,
-                               data);
-                       data_size = gnutls_hash_get_len(hash_vectors[i].algorithm);
-                       if (ret < 0) {
-                               fprintf(stderr, "Error: %s:%d\n", __func__,
-                                       __LINE__);
-                               return 1;
-                       }
-               }
-
-               if (data_size != hash_vectors[i].output_size ||
-                   memcmp(data, hash_vectors[i].output,
-                          hash_vectors[i].output_size) != 0) {
-                       fprintf(stderr, "HASH test vector %d failed!\n", i);
-
-                       fprintf(stderr, "Output[%d]: ", (int)data_size);
-                       for (j = 0; j < data_size; j++)
-                               fprintf(stderr, "%.2x:", (int)data[j]);
-                       fprintf(stderr, "\n");
-
-                       fprintf(stderr, "Expected[%d]: ",
-                               hash_vectors[i].output_size);
-                       for (j = 0; j < hash_vectors[i].output_size; j++)
-                               fprintf(stderr, "%.2x:",
-                                       (int)hash_vectors[i].output[j]);
-                       fprintf(stderr, "\n");
-                       return 1;
-               }
-               
-               fprintf(stdout, "ok\n");
-       }
-
-       fprintf(stdout, "\n");
-
-       return 0;
+  uint8_t data[HASH_DATA_SIZE];
+  int i, j, ret;
+  size_t data_size;
+
+  fprintf (stdout, "Tests on Hashes\n");
+  for (i = 0; i < sizeof (hash_vectors) / sizeof (hash_vectors[0]); i++)
+    {
+
+      fprintf (stdout, "\t%s: ", hash_vectors[i].name);
+      /* import key */
+      if (hash_vectors[i].key != NULL)
+        {
+
+          ret = gnutls_hmac_fast (hash_vectors[i].algorithm,
+                                  hash_vectors[i].key,
+                                  hash_vectors[i].key_size,
+                                  hash_vectors[i].plaintext,
+                                  hash_vectors[i].plaintext_size, data);
+          data_size = gnutls_hmac_get_len (hash_vectors[i].algorithm);
+          if (ret < 0)
+            {
+              fprintf (stderr, "Error: %s:%d\n", __func__, __LINE__);
+              return 1;
+            }
+        }
+      else
+        {
+          ret = gnutls_hash_fast (hash_vectors[i].algorithm,
+                                  hash_vectors[i].plaintext,
+                                  hash_vectors[i].plaintext_size, data);
+          data_size = gnutls_hash_get_len (hash_vectors[i].algorithm);
+          if (ret < 0)
+            {
+              fprintf (stderr, "Error: %s:%d\n", __func__, __LINE__);
+              return 1;
+            }
+        }
+
+      if (data_size != hash_vectors[i].output_size ||
+          memcmp (data, hash_vectors[i].output,
+                  hash_vectors[i].output_size) != 0)
+        {
+          fprintf (stderr, "HASH test vector %d failed!\n", i);
+
+          fprintf (stderr, "Output[%d]: ", (int) data_size);
+          for (j = 0; j < data_size; j++)
+            fprintf (stderr, "%.2x:", (int) data[j]);
+          fprintf (stderr, "\n");
+
+          fprintf (stderr, "Expected[%d]: ", hash_vectors[i].output_size);
+          for (j = 0; j < hash_vectors[i].output_size; j++)
+            fprintf (stderr, "%.2x:", (int) hash_vectors[i].output[j]);
+          fprintf (stderr, "\n");
+          return 1;
+        }
+
+      fprintf (stdout, "ok\n");
+    }
+
+  fprintf (stdout, "\n");
+
+  return 0;
 
 }
 
 
-int main(int argc, char** argv)
+int
+main (int argc, char **argv)
 {
-        gnutls_global_init();
+  gnutls_global_init ();
 
-       if (test_aes())
-               return 1;
+  if (test_aes ())
+    return 1;
 
-       if (test_hash())
-               return 1;
+  if (test_hash ())
+    return 1;
 
-        gnutls_global_deinit();
-       return 0;
+  gnutls_global_deinit ();
+  return 0;
 }


hooks/post-receive
-- 
GNU gnutls


