
Re: [Qemu-devel] RFC: Code fetch optimisation


From: Paul Brook
Subject: Re: [Qemu-devel] RFC: Code fetch optimisation
Date: Mon, 15 Oct 2007 17:01:16 +0100
User-agent: KMail/1.9.7

> > > +    unsigned long phys_pc;
> > > +    unsigned long phys_pc_start;
> >
> > These are ram offsets, not physical addresses. I recommend naming them as
> > such to avoid confusion.
>
> Well, those are host addresses. Fabrice even suggested that I replace
> them with void * to prevent confusion, but I kept using unsigned long
> because the _p functions' API does not use pointers. As those values are
> defined as phys_ram_base + offset, they are likely to be host addresses,
> not RAM offsets, and are used directly to dereference host pointers in
> the ldxxx_p functions. Did I miss something?

You are correct, they are host addresses. I still think calling them phys_pc 
is confusing. It took me a while to convince myself that "unsigned long" was 
an appropriate type (ignoring 64-bit windows hosts for now).

How about host_pc?
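
To make the distinction concrete, here is a minimal sketch, not code from
the patch. The names ram_base, ram_offset_t, ram_offset_to_host and
host_ldub are hypothetical stand-ins. It assumes guest RAM is one
contiguous host allocation, and it uses uintptr_t rather than unsigned
long so that the integer stays pointer-sized on 64-bit Windows hosts too.

```c
#include <stdint.h>

/* Hypothetical stand-in for the base of the guest RAM block on the host. */
static uint8_t *ram_base;

/* A RAM offset is an index into guest RAM; it cannot be dereferenced. */
typedef uintptr_t ram_offset_t;

/* A host address is ram_base + offset, kept as an integer.
 * uintptr_t is pointer-sized on every host, whereas unsigned long
 * is only 32 bits wide on 64-bit Windows (LLP64). */
static uintptr_t ram_offset_to_host(ram_offset_t offset)
{
    return (uintptr_t)ram_base + offset;
}

/* Byte load taking a host address as an integer (illustrative only). */
static uint8_t host_ldub(uintptr_t host_pc)
{
    return *(const uint8_t *)host_pc;
}
```

A value returned by ram_offset_to_host() can be passed straight to
host_ldub(); a bare offset cannot, which is why the name should say which
of the two a variable holds.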

> > > +    /* Avoid softmmu access on next load */
> > > +    /* XXX: dont: phys PC is not correct anymore
> > > +     *      We could call get_phys_addr_code(env, pc); and remove the else
> > > +     *      condition, here.
> > > +     */
> > > +    //*start_pc = phys_pc;
> >
> > The commented out code is completely bogus, please remove it. The comment
> > is also somewhat misleading/incorrect. The else would still be required
> > for accesses that span a page boundary.
>
> I guess trying to optimize this case by retrieving the physical address
> would not bring any gain, as in fact only the last translated
> instruction of a TB (and so only a few code loads) may hit this case.

VLE targets (x86, m68k) can translate almost a full page of instructions, and
a page boundary can be anywhere within that block. Once we've spanned
multiple pages there's no point stopping translation immediately. We may as
well translate as many instructions as we can on the second page.

I'd guess most TBs are much smaller than a page, so on average only a few
instructions are going to come after the page boundary.
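
For illustration, the condition under which the direct host load is safe
could be sketched as below. This is not the patch's code: PAGE_BITS (4 KiB
pages assumed), fetch_spans_page, on_cached_page and can_use_fast_path are
hypothetical names, and cached_page stands for the page-aligned guest
address the host pointer was computed for.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12                          /* assumption: 4 KiB guest pages */
#define PAGE_SIZE ((uint32_t)1 << PAGE_BITS)
#define PAGE_MASK (~(PAGE_SIZE - 1))

/* True if a fetch of `size` bytes (size >= 1) at guest address `pc`
 * touches two different pages. */
static bool fetch_spans_page(uint32_t pc, uint32_t size)
{
    return ((pc ^ (pc + size - 1)) & PAGE_MASK) != 0;
}

/* True if `pc` lies on the page the cached host pointer was set up for. */
static bool on_cached_page(uint32_t pc, uint32_t cached_page)
{
    return (pc & PAGE_MASK) == cached_page;
}

/* The direct host load is only valid when both conditions hold;
 * otherwise the else branch has to go through the softmmu lookup. */
static bool can_use_fast_path(uint32_t pc, uint32_t size, uint32_t cached_page)
{
    return on_cached_page(pc, cached_page) && !fetch_spans_page(pc, size);
}
```

With 4 KiB pages, a 4-byte fetch at 0x0ffe spans two pages and has to take
the slow path, while the same fetch at 0x1000 does not.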

> I'd like to keep a comment here to show that it may not be a good idea
> (or may not be as simple as it seems at first sight) to try to do more
> optimisation here, but you're right this comment is not correct.

Agreed.

> > The code itself looks ok, though I'd be surprised if it made a
> > significant difference. We're always going to hit the fast-path TLB
> > lookup case anyway.
>
> It seems that the generated code for the code fetch is much more
> efficient than the one we get when using the softmmu
> routines. But it's true we do not get any significant performance boost.
> As it was previously mentioned, the idea of the patch is more a 'don't
> do unneeded things during code translation' than a great performance
> improvement.

OTOH it does make the code more complicated. I'm agnostic about whether
this patch should be applied.

Paul



