Re: Implementing persistence
Thu, 12 Jan 2006 14:40:21 +0100
On Thu, Jan 12, 2006 at 02:18:16PM +0100, Ludovic Courtès wrote:
> Did you look at libckpt and similar libraries (ad: I once wrote
> pego which produces /portable/ checkpoints, unlike libckpt, but I'm
> not sure it's interesting in this case ;-))?
I didn't look at anything yet, I just tried some things and thought about it.
But I'll have a look at those, thanks for the pointers.
> File descriptors and capabilities are the main issue since they are
> bound to state that is /external/ to the application (be it in the
> kernel or in a server). In Fluke, the authors argue that kernel
> state should be exportable to allow for the implementation of user-level
I didn't read the referenced article yet, but I don't agree in principle. One
of the cool results of checkpointing would IMO be that you can move your
application to some other place, where it will continue where you left off.
This is not possible if the state is merely recorded and restored as it was;
it only works when the state is rebuilt in the new place.
> However, in a multi-server system, application state is
> spread across a bunch of servers which would all have to make their
> state exportable. But from the server viewpoint, restoring complex
> state from an untrusted source is not a reasonable thing.
For the moment I'm limiting things to single process single threaded stuff.
When adding multi-process, I think I'd need transaction support to make sure I
get consistent checkpoints. I want to keep it simple at first, and I already
noticed that it's hard enough anyway.
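
A minimal sketch of the simple case, assuming a single-threaded program whose
whole state fits in a JSON-serialisable dict. The names save_checkpoint and
load_checkpoint are made up for illustration; this is not the code I am
actually writing. The checkpoint is written to a temporary file and renamed
into place, so a crash in the middle of a write leaves the previous checkpoint
intact:

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Write `state` (a JSON-serialisable dict) to `path` atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace is atomic on POSIX: a reader sees either the old
        # checkpoint or the new one, never a half-written file.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if the rename never happened.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default
```

With several processes, one atomic rename per process would no longer be
enough, which is where the transaction support would come in.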
> Furthermore, a protected capability system does not allow the disclosure
> of the "bit representation" of capabilities, so checkpointing
> capabilities themselves in a meaningful way is not something
> applications can do on their own.
Capabilities are owned by the kernel, so indeed the application cannot
checkpoint them. I will not try to do that, either. They will have to be
provided to the application when it boots. Note that for this it doesn't matter
if the OS is persistent; the only difference is that it won't boot very often
on a persistent system (except if you want to upgrade to a newer version of
the application, for example).
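
To illustrate the split, here is a rough sketch in which an output stream
stands in for a capability. The Counter class and the demo function are
hypothetical, invented for this example. Only data the application owns goes
into the checkpoint; the external resource is handed in again at restart and
the state is rebuilt around it:

```python
import io
import json


class Counter:
    """Toy application: numbers the lines it is fed and echoes them."""

    def __init__(self, out, lines_seen=0):
        self.out = out                # external resource, never checkpointed
        self.lines_seen = lines_seen  # internal state, checkpointed

    def feed(self, line):
        self.lines_seen += 1
        self.out.write(f"{self.lines_seen}: {line}\n")

    def checkpoint(self):
        # Only application-owned data goes in; the stream is left out.
        return json.dumps({"lines_seen": self.lines_seen})

    @classmethod
    def restore(cls, blob, out):
        # The caller supplies a fresh stream at startup.
        data = json.loads(blob)
        return cls(out, lines_seen=data["lines_seen"])


def demo():
    first_out = io.StringIO()
    app = Counter(first_out)
    app.feed("hello")
    blob = app.checkpoint()       # imagine the process is killed here

    second_out = io.StringIO()    # a new stream, provided at restart
    resumed = Counter.restore(blob, second_out)
    resumed.feed("world")
    return second_out.getvalue()
```

The restored instance continues counting from where the old one stopped,
although it never saw the old stream.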
> In EROS, the whole system (kernel - drivers + all the processes) is
> persistent, so there is, I think, no such problem: each checkpoint
> contains everything that's needed to restore the whole thing.
That simply means that the process is not shut down when the power goes down.
It doesn't mean the process is persistent in the sense that it can continue
where it left off after it was killed (that is, the process was destroyed and
is restarted). That's the kind of persistence I'm talking about here. I
think it would be useful on any system, persistent or not.
> The issue of restoring capabilities and their associated state arises when
> trying to make only part of the processes persistent.
It arises in any situation where the process must restart without the ability
to pass capabilities from the old to the new version.
> One solution would consist in logging all the interactions between the
> persistent world and the non-persistent world in order to replay them
> upon recovery, but that's quite ugly IMO.
Very. I'm not going to implement that. :-)
> Instead, maybe special support from the capability system could solve that.
For the moment I'm not concerning myself with capabilities. I just assume I
get all the rights I need when starting up. If that turns out not to be the
case, startup should fail. At the moment I'm writing things on GNU/Linux, and
the kind of applications I write for now use only stdin/stdout, so it isn't an
issue yet anyway. :-)
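
As a rough sketch of "startup should fail", assuming the rights in question
are plain POSIX file descriptors: the MissingRight exception and the
require_descriptors helper below are hypothetical names, used only for
illustration.

```python
import os


class MissingRight(Exception):
    """Raised when a resource the program needs was not provided."""


def require_descriptors(fds):
    """Fail early unless every descriptor in `fds` is open."""
    for fd in fds:
        try:
            # fstat fails with EBADF if the descriptor is not open.
            os.fstat(fd)
        except OSError as err:
            raise MissingRight(f"descriptor {fd} is not available") from err
```

A program that only uses stdin and stdout would call
require_descriptors([0, 1]) before loading its checkpoint, and refuse to start
if that raises.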
> Good luck, and best wishes! ;-)