qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH v3 2/3] simpletrace-v2: Handle variable numb


From: Harsh Bora
Subject: Re: [Qemu-devel] [RFC PATCH v3 2/3] simpletrace-v2: Handle variable number/size of elements per trace record.
Date: Wed, 18 Jan 2012 16:39:16 +0530
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.10) Gecko/20100621 Fedora/3.0.5-1.fc13 Thunderbird/3.0.5

On 01/18/2012 04:29 PM, Stefan Hajnoczi wrote:
On Wed, Jan 18, 2012 at 10:52 AM, Harsh Bora<address@hidden>  wrote:
On 01/18/2012 04:11 PM, Harsh Bora wrote:

On 01/18/2012 04:01 PM, Stefan Hajnoczi wrote:

On Wed, Jan 18, 2012 at 9:14 AM, Harsh Bora<address@hidden>
wrote:

On 01/10/2012 10:11 PM, Stefan Hajnoczi wrote:

+ unused = fwrite(&record, ST_V2_REC_HDR_LEN, 1, trace_fp);
writeout_idx += num_available;
}

idx = writeout_idx % TRACE_BUF_LEN;
- while (get_trace_record(idx,&record)) {
- trace_buf[idx].event = 0; /* clear valid bit */
- unused = fwrite(&record, sizeof(record), 1, trace_fp);
- idx = ++writeout_idx % TRACE_BUF_LEN;
+ while (get_trace_record(idx,&recordptr)) {
+ unused = fwrite(recordptr, recordptr->length, 1, trace_fp);
+ writeout_idx += recordptr->length;
+ g_free(recordptr);
+ recordptr = (TraceRecord *)&trace_buf[idx];

+ recordptr->event = 0;
+ idx = writeout_idx % TRACE_BUF_LEN;
}



I'm wondering if it's worth using a different approach here. Writing
out individual records has bothered me.

If we have a second buffer, as big as trace_buf[], then a function can
copy out all records and make them available in trace_buf again:

/**
* Copy completed trace records out of the ring buffer
*
* @idx offset into trace_buf[]
* @buf destination buffer
* @len size of destination buffer
* @ret the number of bytes consumed
*/
size_t consume_trace_records(unsigned int idx, void *buf, size_t len);

That means consume gobbles up as many records as it can:
* Until it reaches an invalid record which has not been written yet
* Until it reaches the end of trace_buf[], the caller can call again
with idx wrapped to 0

After copying into buf[] it clears the valid bit in trace_buf[].

Then the loop which calls consume_trace_records() does a single
fwrite(3) and increments idx/writeout_idx.

The advantage to this is that we write many records in one go and I
think it splits up the writeout steps a little nicer than what we've
previously done.


I think this optimization can be introduced as a separate patch later.
Let me know if you think otherwise.


Yes, that could be done later. However there is something incorrect
here. Threads will continue to write trace records into the
ringbuffer while the write-out thread is doing I/O. Think about what
happens when threads overtake the write-out index modulo ringbuffer
size. Since records are variable-length the write-out thread's next
index could point into the middle of an overwritten record. And that
means the ->length field is junk - we may crash if we use it.


In case of overwritten records, the valid bit of event id will also be
overwritten, and therefore we will not consider that record. Moreover,
the writeout thread will further get to know that some events were
dropped and will start with the latest trace_idx, right ?

However, to handle the extreme rare case of having an overwritten value
such that its valid bit appears to be set, we can put a check for<
NR_TRACE_EVENTS. Even better to have a magic byte for events also ?

Harsh


Also, I just realized that we need to put buffer boundary checks while
copying a trace record:


-static bool get_trace_record(unsigned int idx, TraceRecord *record)
+static bool get_trace_record(unsigned int idx, TraceRecord **recordptr)
   {
-    if (!(trace_buf[idx].event&    TRACE_RECORD_VALID)) {
+    TraceRecord *record = (TraceRecord *)&trace_buf[idx];
+    if (!(record->event&    TRACE_RECORD_VALID)) {
          return false;
      }

      __sync_synchronize(); /* read memory barrier before accessing record
*/

-    *record = trace_buf[idx];
-    record->event&= ~TRACE_RECORD_VALID;
+    *recordptr = g_malloc(record->length);
+    /* make a copy of record to avoid being overwritten */
+    memcpy(*recordptr, record, record->length);

So, I need to use a wrapper which should copy byte by byte taking care of
boundary overlaps, and that will also save us from crashing (as we will
never be crossing buffer boundaries), right ?

It won't be enough if length has a junk value like 2^32 - 1.  We don't
want to allocate/copy 4 GB :).

So, whats the max limit that we may want to propose for a record length ? Obviously less than buffer length, or TRACE_BUF_LEN / 2 ?

Hope, a check for max buffer length with byte-wise copy will help ?

Harsh


Stefan





reply via email to

[Prev in Thread] Current Thread [Next in Thread]