Re: Call for GSoC and Outreachy project ideas for summer 2023
From: Stefan Hajnoczi
Subject: Re: Call for GSoC and Outreachy project ideas for summer 2023
Date: Wed, 8 Feb 2023 20:25:28 -0500

On Wed, 8 Feb 2023 at 18:02, Warner Losh <imp@bsdimp.com> wrote:
> On Fri, Jan 27, 2023 at 3:02 PM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>
>> On Fri, 27 Jan 2023 at 12:10, Warner Losh <imp@bsdimp.com> wrote:
>> >
>> > [[ cc list trimmed to just qemu-devel ]]
>> >
>> > On Fri, Jan 27, 2023 at 8:18 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>> >>
>> >> Dear QEMU, KVM, and rust-vmm communities,
>> >> QEMU will apply for Google Summer of Code 2023
>> >> (https://summerofcode.withgoogle.com/) and has been accepted into
>> >> Outreachy May 2023 (https://www.outreachy.org/). You can now
>> >> submit internship project ideas for QEMU, KVM, and rust-vmm!
>> >>
>> >> Please reply to this email by February 6th with your project ideas.
>> >>
>> >> If you have experience contributing to QEMU, KVM, or rust-vmm you can
>> >> be a mentor. Mentors support interns as they work on their project. It's a
>> >> great way to give back and you get to work with people who are just
>> >> starting out in open source.
>> >>
>> >> Good project ideas are suitable for remote work by a competent
>> >> programmer who is not yet familiar with the codebase. In
>> >> addition, they are:
>> >> - Well-defined - the scope is clear
>> >> - Self-contained - there are few dependencies
>> >> - Uncontroversial - they are acceptable to the community
>> >> - Incremental - they produce deliverables along the way
>> >>
>> >> Feel free to post ideas even if you are unable to mentor the project.
>> >> It doesn't hurt to share the idea!
>> >
>> >
>> > I've been a GSoC mentor for the FreeBSD project on and off for maybe
>> > 10-15 years now. I thought I'd share this for feedback here.
>> >
>> > My project idea falls between the two projects. I've been trying
>> > to get bsd-user reviewed and upstreamed for some time now and my
>> > time available to do the upstreaming has been greatly diminished
>> > lately. It got me thinking: upstreaming is often more than just
>> > getting patches reviewed. While there is a rather mechanical aspect
>> > to it (and I could likely automate that aspect more), the real value
>> > of going through the review process is that it points out things
>> > that had been done wrong, things that need to be redone or
>> > refactored, etc. It's often these suggestions that lead to the
>> > biggest investment of time on my part: Is this idea good? If I do
>> > it, does it break things? Is the feedback right about what's wrong,
>> > but wrong about how to fix it? etc. Plus the inevitable: I thought
>> > this was a good idea, implemented it, only to find it broke other
>> > things. How do I explain that and provide feedback to the reviewer
>> > about that breakage to see if it is worth pursuing further or not?
>> >
>> > So my idea for a project is threefold: First, to create scripts to
>> > automate the upstreaming process: to break big files into bite-sized
>> > chunks for review on this list. git publish does a great job from
>> > there. The current backlog to upstream is approximately "175 files
>> > changed, 30270 insertions(+), 640 deletions(-)" which is 300-600
>> > patches at the 50-100 line patch guidance I've been given. So even
>> > at 0.1 hr (6 minutes) per patch (which is about 3x faster than I can
>> > do it by hand), that's ~60 hours just to create the patches. Writing
>> > automation should take much less time. Realistically, this is on the
>> > order of 10-20 hours to get done.
>> >
>> > Second, to take feedback from the reviews for refactoring the
>> > bsd-user code base (which will eventually land in upstream). I often
>> > spend a few hours creating my patches each quarter, then about 10 or
>> > so hours processing the review feedback for the 30ish patches that I
>> > do: refactoring other things (typically other architectures),
>> > checking details of other architectures (usually by looking at the
>> > FreeBSD kernel), looking for ways to refactor to share code with
>> > linux-user (though so far only the safe signals code is upstream:
>> > elf could be too), chatting online about the feedback to better
>> > understand it, seeing what I can mine from linux-user (since the
>> > code is derived from that, but didn't pick up all the changes
>> > linux-user has), etc. This would be on the order of 100 hours.
>> >
>> > Third, the testing infrastructure that exists for linux-user is not
>> > well leveraged to test bsd-user. I've done some tests from time to
>> > time with it, but it's not in a state that it can be used as, say,
>> > part of a CI pipeline. In addition, the FreeBSD project has some
>> > very large jobs, a subset of which could be used to further ensure
>> > that critical bits of infrastructure don't break (or are working if
>> > not in a CI pipeline). Things like building and using go, rust and
>> > the like are constantly breaking for reasons too long to enumerate
>> > here. This job could be as little as 50 hours to do a minimal but
>> > complete-enough-for-CI job, or as much as 200 hours to do a more
>> > complete job that could be used to bisect breakage more quickly and
>> > give good assurance that at any given time bsd-user is useful and
>> > working.
>> >
>> > That's in addition to growing the number of people that can work on
>> > this code and on the *-user code in general since they are quite
>> > similar.
>> >
>> > Some of these tasks are squarely in the qemu realm, while others are
>> > in the FreeBSD realm, but that's similar to linux-user which
>> > requires very heavy interfacing with the linux realm. It's just that
>> > a lot of that work is already complete so the needs are
>> > substantially less there on an ongoing basis. Since it does straddle
>> > the two projects, I'm unsure where to propose this project be
>> > housed. But since this is a call for ideas, I thought I'd float it
>> > to see what the feedback is. I'm happy to write this up in a more
>> > formal sense if it would be seriously considered, but want to get
>> > feedback as to what areas I might want to emphasize in such a
>> > proposal.
>> >
>> > Comments?
>>
>> Hi Warner,
>> Don't worry about it spanning FreeBSD and QEMU, you're welcome to list
>> the project idea through QEMU. You can have co-mentors that are not
>> part of the QEMU community in order to bring in additional FreeBSD
>> expertise.
>>
>> My main thought is that getting all code upstream sounds like a
>> sprawling project that likely won't be finished within one internship.
>> Can you pick just a subset of what you described? It should be a
>> well-defined project that depends minimally on other people finishing
>> stuff or reaching agreement on something controversial. That way the
>> intern will be able to come up with specific tasks for their project
>> plan and there is little risk that they can't complete them due to
>> outside factors.
>
>
> I like this notion of limiting the scope. There's three or maybe four
> main areas that I can call out. I got to thinking about all the details
> I have to do for how I've been upstreaming things, and realized that
> there's a lot due to the complicated history here...
>
>> One way to go about this might be for you to define a milestone that
>> involves completing, testing, and upstreaming just a subset of the
>> out-of-tree code. For example, it might implement a limited set of
>> core syscall families. The intern will then focus on delivering that
>> instead of worrying about the daunting task of getting everything
>> merged. Finishing this subset would advance bsd-user FreeBSD support
>> by a useful degree (e.g. ability to run certain applications).
>>
>> Does that sound good?
>
>
> Yes. I like this, but it's hard to know what that might be because many
> things are hidden behind the scenes... But I'll try running a quick
> build to see if I can gather enough stats to come up with a good set of
> tests... But maybe I'll start with building 'hello world' with clang on
> armv7 running on an amd64 host to see what's missing today. I also have
> an aarch64 set of patches I might try hard to get in ASAP so that might
> be the target instead (since it might be a bit more useful).

Hi Warner,

Great to hear back from you. Don't worry if you don't have the details
right now. I have created a placeholder on the ideas list that you can
fill in over the coming days:

https://wiki.qemu.org/Internships/ProjectIdeas/FreeBSDUser

You can either reply via email and I'll post your project description
on the wiki, or feel free to edit the above wiki page directly.

Thanks,
Stefan