[Savannah-hackers-public] Re: [Monotone-devel] Hosting multiple Monotone
From: Nathaniel Smith
Subject: [Savannah-hackers-public] Re: [Monotone-devel] Hosting multiple Monotone projects
Date: Fri, 12 Aug 2005 03:10:14 -0700
User-agent: Mutt/1.5.9i
On Thu, Aug 11, 2005 at 11:32:39PM +0200, Sylvain Beucler wrote:
> After reading this, I actually asked myself what I would like for a
> VCS's server-side part.
>
> I spotted the following issues:
>
> - Process model:
>
> > OSes have all been optimized for apache's ol' process-per-connection
> > anyway...
>
> Well, technically Apache uses a pool of threads and now has a
> multithreaded worker ;)
But the forking server is still the default on posix, because there's
not really much difference between threads and processes anyway.
> Anyway that's indeed the way CVS works now: either pserver spawned by
> xinetd for each connection, or the 'cvs server' command run
> through a remote shell connection such as ssh. In each case, one
> process per connection - though of course that's not a reason to use
> the same design :)
Yeah; lots of servers work this way, not just apache.
We already have an event-based model that can multiplex simultaneous
access to a single db; while event-based models can often outperform
ones that use OS-level parallelism, you do need some parallelism to
get good IO usage and stuff. The process-per-db model seems
quite plausible, at least at first blush :-).
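(A minimal sketch of what process-per-db could look like, in Python. Every name here is hypothetical, and the child command is a placeholder that just sleeps, not a real monotone invocation. It only shows the bookkeeping: one worker per database, started on first use and restarted if it has exited.)

```python
import subprocess
import sys


class PerDbDispatcher:
    """Lazily start one worker process per database name (hypothetical design)."""

    def __init__(self, command_for):
        # command_for: function mapping a database name to an argv list.
        self.command_for = command_for
        self.workers = {}

    def worker_for(self, db_name):
        proc = self.workers.get(db_name)
        # Start a worker on first use, or restart it if it has exited.
        if proc is None or proc.poll() is not None:
            proc = subprocess.Popen(self.command_for(db_name))
            self.workers[db_name] = proc
        return proc

    def shutdown(self):
        # Stop every worker that is still running and forget them all.
        for proc in self.workers.values():
            if proc.poll() is None:
                proc.terminate()
            proc.wait()
        self.workers.clear()


def placeholder_command(db_name):
    # Stand-in for a real server command line: a Python child that sleeps.
    return [sys.executable, "-c", "import time; time.sleep(30)"]
```

A front end would call worker_for() with whatever name identifies the project and then hand the connection to that worker; how the name reaches the front end is left open here.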
> > So my off-the-cuff suggestion for a more aesthetic setup would be to
> > put a proxy on your "monotone port", tweak the netsync protocol to
> > make sure that the first packet includes some sort of vhost parameter,
> > and then teach the proxy how to spawn monotone processes on demand and
> > forward traffic to them.
>
> That could do the trick. From what I know starting a thread rather
> than a process should be less resource consuming but I'm not an expert
> here. Fork&exec might also be better than traffic forwarding, although
> there may be issues with fork() under Windows.
>
> The process model doesn't look like a critical feature though. Also
> check the note about authentication below.
>
>
> - Authentication and access control: with CVS, we use one Unix group
> per repository to give access to different repositories but:
>
> * that's not fine-grained (unlike Monotone's per-branch read access)
>
> * Unix groups have limitations, mainly the number of groups one user
> can belong to (usually 16 or 32).
>
> * it should be possible to use ACLs and a bit of hacking to get rid of
> those limitations, though.
This is because CVS write access invariably involves giving people
logins on the server machine, and letting them run programs (i.e.,
cvs) that get full write access to the filesystem. Right?
Monotone doesn't work that way; the only process that touches the
database is the server, which can run under any uid you feel like; it
does its own public-key based access control, and mediates all remote
access.
It might make sense to give each group its own UID, to insulate them
a bit against each other if there turns out to be some sort of buffer
overflow or the like in the monotone server, but that's the best
reason I can think of.
> About granularity, I'm actually surprised that Monotone does not allow
> per-branch repository write access. That's not symmetric with the read
> access control. Is there a reason for this?
>
> I think it would be nice for a project admin to set up different
> branches in their project with different write access.
Yes; the read access control isn't actually very strict either -- if
you know the id of the revision you want, you can always get it,
assuming you have read access to any branch at all; read access being
denied just means that the server won't tell you what pieces you're
missing.
There are similar such problems with carefully restricting write
access, that are probably outside the scope of this discussion... the
basic problem is that monotone branches do not designate discrete
storage areas, as in most VCSes, but rather are some arbitrary marked
subset of the big global revision graph.
Agreed this is annoying; it's a work in progress :-).
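(To make the read-access point concrete, here is a toy model in Python. It is a sketch under assumptions: the class and method names are made up and this is not monotone's actual code. Revisions live in one global store, a branch is just a marked set of ids, and the read permission only gates enumeration.)

```python
class ToyServer:
    """Toy model: one global revision store, branches as marked subsets."""

    def __init__(self):
        self.revisions = {}  # revision id -> payload (single global store)
        self.branches = {}   # branch name -> set of revision ids
        self.readers = {}    # branch name -> set of keys allowed to read

    def add(self, rev_id, payload, branch):
        self.revisions[rev_id] = payload
        self.branches.setdefault(branch, set()).add(rev_id)

    def allow_read(self, key, branch):
        self.readers.setdefault(branch, set()).add(key)

    def can_read_anything(self, key):
        return any(key in keys for keys in self.readers.values())

    def list_branch(self, key, branch):
        # Enumeration is what per-branch read access actually gates.
        if key not in self.readers.get(branch, set()):
            raise PermissionError(branch)
        return sorted(self.branches.get(branch, set()))

    def fetch(self, key, rev_id):
        # Fetch by id only checks that the key can read *some* branch.
        if not self.can_read_anything(key):
            raise PermissionError(rev_id)
        return self.revisions[rev_id]
```

In this model a key with read access to one branch cannot list another branch, but can still fetch any revision whose id it already knows.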
> The authentication model has an impact to the process model: if the
> server maps to OS users (and maybe groups), the server will usually
> fork and setuid, one process per request. If the authentication is
> done by the server (Apache-style), then the server needs only one
> system uid, so it can use either thread or process-based request
> handling. It will be able to access all repositories with the same uid
> though.
>
> There are security issues in both cases: in the former, the OS may
> permit unexpected authentications (cf. cvs pserver's security issues,
> due to the combination of its own authentication system and the
> delegation to /etc/passwd or PAM); in the latter, if the server is
> cracked, then all repositories on the system, not just the one, will
> be at stake.
Monotone servers always do their own authentication; there are some
patches to add the ability to sync over ssh, but it's not what you
want -- only one person could sync over ssh at a time.
It would also be bad because it circumvents policy checking -- the
server will only modify the database in well-defined, valid ways.
(E.g., it will never delete things.) The same guarantee cannot be
made if you allow filesystem-level access.
> - Protecting the source code: usually, project admins don't want the
> code to be removed by (mad/jealous/...) project members, and system
> admins want to keep the history of the source code intact (in case the
> project developers decide to switch from a free to a proprietary
> software license and remove any trace from the free version).
>
> To avoid this, it would be nice to be able to prevent the 'kill'
> monotone commands. This is possible with CVS (by blocking the 'cvs
> admin' commands), but it wasn't planned properly: doing so also blocks
> other interesting features such as switching a file type
> (text/binary).
>
> Does it sound reasonable to make 'kill' commands blockable?
The 'kill' commands affect only the local database -- the one sitting
on the user's hard disk. Seeing as it's sitting on their hard disk, we
can't really prevent them doing anything to it that they feel like
:-). But as mentioned above, this doesn't give them any ability to
hurt the project's code.
(Even if someone did break into the server and kill some revisions,
all the other developers would still have local copies, and the first
one to sync up would automatically re-push the deleted pieces. Not a
very effective sort of vandalism.)
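(A toy illustration of why that works, in Python. It treats each database as a plain set of revision ids, which is a simplification, and sync() here is a hypothetical stand-in rather than the real netsync protocol.)

```python
def sync(local, remote):
    """Toy two-way sync: afterwards both sides hold the union of revisions.

    Returns (pulled, pushed): the ids copied into local and into remote.
    """
    pulled = remote - local
    pushed = local - remote
    local |= pulled    # update the local set in place
    remote |= pushed   # update the remote set in place
    return pulled, pushed


server = {"r1", "r2", "r3"}
developer = set(server)          # a developer's local copy
server.discard("r2")             # someone vandalises the server
pulled, pushed = sync(developer, server)
```

After the sync, pushed contains "r2" and the server again holds all three revisions.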
> - Configuration at two levels, system and project-wide (or at least
> just project-wide). Currently, I guess the proxy server you mentioned
> would need to set HOME for each monotone subprocess so it reads a
> different configuration file. A command-line switch might be more
> convenient. Then the proxy would set an extra rc file to read for the
> system-wide configuration.
There's already the switch --rcfile, to do basically that.
> - Administration tools: one would need a simple way to manage access
> control and per-repository hooks. That might be done using
> configuration files processed by a specific set of ~/.monotone Lua
> scripts.
Yeah, that would be good.
> - Server-side hooks: apparently Monotone does this already. Some VCSes
> do not have a server part and rely on the client's good will, for
> example for commit notifications.
Yeah. It might raise some interesting issues in this context, since
if you're not giving users login shells on the VCS server, the only
way they get to upload arbitrary code for execution is by uploading
server-side hooks. (Such hooks, of course, are executed within the
server process, and thus have the same privileges as the server.)
So one might want to restrict them to certain limited things, like
sending out notifications in some standard way.
(It's also easy to use a polling model to detect changes; that's how
monotone's current CIA (http://cia.navi.cx/stats/project/monotone)
support is implemented, since it was written before the netsync hooks
were added. So users can also set up whatever they want that way.)
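(A generic sketch of the polling idea, in Python. This is not the actual CIA script; list_revisions and notify are hypothetical callables that the user would supply, e.g. one that asks a database for its revision ids and one that sends a message.)

```python
def poll_once(list_revisions, seen, notify):
    """Report revisions that appeared since the last poll.

    list_revisions: callable returning the ids currently in the database.
    seen: set of ids already reported; updated in place.
    notify: callable invoked once per new id.
    """
    new = set(list_revisions()) - seen
    for rev_id in sorted(new):
        notify(rev_id)
    seen |= new
    return new
```

Run from cron or a sleep loop, this needs no cooperation from the server beyond read access; the cost is some latency between a push and its notification.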
-- Nathaniel
--
/* Tell the world that we're going to be the grim
* reaper of innocent orphaned children.
*/
-- Linux kernel 2.4.5, main.c