[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] [RFC][PATCH v2 01/17] json-lexer: make lexer error-recovery
From: |
Michael Roth |
Subject: |
[Qemu-devel] [RFC][PATCH v2 01/17] json-lexer: make lexer error-recovery more deterministic |
Date: |
Mon, 18 Apr 2011 10:02:17 -0500 |
Currently when we reach an error state we effectively flush everything
fed to the lexer, which can put us in a state where we keep feeding
tokens into the parser at arbitrary offsets in the stream. This makes it
difficult for the lexer/tokenizer/parser to get back in sync when bad
input is made by the client.
With these changes we emit an error state/token up to the tokenizer as
soon as we reach an error state, and continue processing any data passed
in rather than bailing out. The reset token will be used to reset the
tokenizer and parser, such that they'll recover state as soon as the
lexer begins generating valid token sequences again.
We also map chr(192,193,245-255) to an error state here, since they are
invalid UTF-8 characters. QMP guest proxy/agent will use chr(255) to
force a flush/reset of previous input for reliable delivery of certain
events, so also we document that thoroughly here.
Signed-off-by: Michael Roth <address@hidden>
---
json-lexer.c | 33 ++++++++++++++++++++++++++++-----
json-lexer.h | 1 +
2 files changed, 29 insertions(+), 5 deletions(-)
diff --git a/json-lexer.c b/json-lexer.c
index 3462c89..bede557 100644
--- a/json-lexer.c
+++ b/json-lexer.c
@@ -105,7 +105,8 @@ static const uint8_t json_lexer[][256] = {
['u'] = IN_DQ_UCODE0,
},
[IN_DQ_STRING] = {
- [1 ... 0xFF] = IN_DQ_STRING,
+ [1 ... 0xBF] = IN_DQ_STRING,
+ [0xC2 ... 0xF4] = IN_DQ_STRING,
['\\'] = IN_DQ_STRING_ESCAPE,
['"'] = JSON_STRING,
},
@@ -144,7 +145,8 @@ static const uint8_t json_lexer[][256] = {
['u'] = IN_SQ_UCODE0,
},
[IN_SQ_STRING] = {
- [1 ... 0xFF] = IN_SQ_STRING,
+ [1 ... 0xBF] = IN_SQ_STRING,
+ [0xC2 ... 0xF4] = IN_SQ_STRING,
['\\'] = IN_SQ_STRING_ESCAPE,
['\''] = JSON_STRING,
},
@@ -305,10 +307,25 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch)
new_state = IN_START;
break;
case ERROR:
+ /* XXX: To avoid having previous bad input leaving the parser in an
+ * unresponsive state where we consume unpredictable amounts of
+ * subsequent "good" input, percolate this error state up to the
+ * tokenizer/parser by forcing a NULL object to be emitted, then
+ * reset state.
+ *
+ * Also note that this handling is required for reliable channel
+ * negotiation between QMP and the guest agent, since chr(0xFF)
+ * is placed at the beginning of certain events to ensure proper
+ * delivery when the channel is in an unknown state. chr(0xFF) is
+ * never a valid ASCII/UTF-8 sequence, so this should reliably
+ * induce an error/flush state.
+ */
+ lexer->emit(lexer, lexer->token, JSON_ERROR, lexer->x, lexer->y);
QDECREF(lexer->token);
lexer->token = qstring_new();
new_state = IN_START;
- return -EINVAL;
+ lexer->state = new_state;
+ return 0;
default:
break;
}
@@ -334,7 +351,6 @@ int json_lexer_feed(JSONLexer *lexer, const char *buffer,
size_t size)
for (i = 0; i < size; i++) {
int err;
-
err = json_lexer_feed_char(lexer, buffer[i]);
if (err < 0) {
return err;
@@ -346,7 +362,14 @@ int json_lexer_feed(JSONLexer *lexer, const char *buffer,
size_t size)
int json_lexer_flush(JSONLexer *lexer)
{
- return lexer->state == IN_START ? 0 : json_lexer_feed_char(lexer, 0);
+ if (lexer->state != IN_START) {
+ lexer->state = IN_START;
+ QDECREF(lexer->token);
+ lexer->token = qstring_new();
+ return -EINVAL;
+ }
+
+ return 0;
}
void json_lexer_destroy(JSONLexer *lexer)
diff --git a/json-lexer.h b/json-lexer.h
index 3b50c46..10bc0a7 100644
--- a/json-lexer.h
+++ b/json-lexer.h
@@ -25,6 +25,7 @@ typedef enum json_token_type {
JSON_STRING,
JSON_ESCAPE,
JSON_SKIP,
+ JSON_ERROR,
} JSONTokenType;
typedef struct JSONLexer JSONLexer;
--
1.7.0.4
- [Qemu-devel] [RFC][PATCH v2 04/17] qapi: fix function name typo in qmp-gen.py, (continued)
- [Qemu-devel] [RFC][PATCH v2 04/17] qapi: fix function name typo in qmp-gen.py, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 09/17] qmp proxy: core code for proxying qmp requests to guest, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 10/17] qmp proxy: add qmp_proxy chardev, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 06/17] qapi: fix memory leak for async marshalling code, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 07/17] qapi: qmp-gen.py, use basename of path for guard/core prefix, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 02/17] json-streamer: add handling for JSON_ERROR token/state, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 01/17] json-lexer: make lexer error-recovery more deterministic,
Michael Roth <=
- [Qemu-devel] [RFC][PATCH v2 05/17] qapi: fix handling for null-return async callbacks, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 03/17] json-parser: add handling for NULL token list, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 08/17] qapi: fix Error usage in qemu-sockets.c, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 12/17] guest agent: worker thread class, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 16/17] guest agent: add guest agent RPCs/commands, Michael Roth, 2011/04/18
- [Qemu-devel] [RFC][PATCH v2 15/17] guest agent: qemu-ga daemon, Michael Roth, 2011/04/18