emacs-devel

Re: Subprojects in project.el (Was: Eglot, project.el, and python virtua


From: Tim Cross
Subject: Re: Subprojects in project.el (Was: Eglot, project.el, and python virtual environments)
Date: Mon, 28 Nov 2022 15:10:55 +1100
User-agent: mu4e 1.9.3; emacs 29.0.50

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 25/11/22 00:46, Tim Cross wrote:
>> João Távora <joaotavora@gmail.com> writes:
>> 
>>> On Thu, Nov 24, 2022 at 3:01 AM Dmitry Gutov <dgutov@yandex.ru> wrote:
>>>
>>>   
>>>>   I'm imagining that traversing a directory tree with an arbitrary
>>>>   predicate is going to be slow. If the predicate is limited somehow (e.g.
>>>>   to a list of "markers" as base file name, or at least wildcards), 'git
>>>>   ls-files' can probably handle this, with certain but bounded cost.
>> I've seen references to the superior performance of git ls-files a
>> couple of times in this thread, which has me a little confused.
>> There has been lots in other threads regarding the importance of not
>> relying on and not basing development on an underlying assumption
>> regarding the VCS being used. For example, I would expect project.el to
>> be completely neutral with respect to the VCS used in a project.
>
> That's the situation where we can optimize this case: when a project is
> Git/Hg.
>
>> So how is git ls-files at all relevant when discussing performance
>> characteristics when identifying files in a project?
>
> Not files, though. Subprojects. Meaning, listing all (direct and indirect)
> subdirectories which satisfy a particular predicate. If the predicate is
> simple (has a particular project marker: file name or wildcard), it can be
> fetched in one shell command, like:
>
> git ls-files -co -- "Makefile" "package.json"
>
> (which will traverse the directory tree for you, but will also use Git's
> cache).
>
> If the predicate is arbitrary (i.e. implemented in Lisp), the story would 
> become harder.
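
For concreteness, the marker lookup described above could be sketched
roughly as follows. This is only an illustration under assumptions not
made in the thread: the function name list_subproject_dirs is
hypothetical, --exclude-standard is added so that ignored files (e.g.
under node_modules) are skipped, and the ":(glob)**/" pathspec prefix is
used because a bare pathspec such as "Makefile" matches only that path
relative to the current directory.

```shell
# Hypothetical helper: print every directory (relative to the current
# directory) that contains one of the given marker files.
list_subproject_dirs() {
    # With no markers there is nothing to look for.
    [ "$#" -gt 0 ] || return 0

    # Rewrite each marker NAME into the pathspec ":(glob)**/NAME",
    # which matches NAME at any depth, the top level included.
    for marker in "$@"; do
        set -- "$@" ":(glob)**/$marker"
        shift
    done

    # -c: tracked files, -o: untracked files; --exclude-standard honours
    # .gitignore and friends. One git invocation covers all markers.
    git ls-files -co --exclude-standard -- "$@" |
        while IFS= read -r file; do
            dirname -- "$file"
        done |
        LC_ALL=C sort -u
}
```

Running "list_subproject_dirs Makefile package.json" from the project
root would print one directory per line, with "." standing for the root
itself. File names that Git quotes in its output (unusual characters)
are not handled in this sketch.
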
>
>> I also wonder if some of the performance concerns may be premature. I've
>> seen references to poor performance in projects with 400k or even 100k
>> files. What is the expected/acceptable performance for projects of that
>> size? How common are projects of that size? When considering
>> performance, are we not better off focusing on the common case rather
>> than extreme cases, leaving the extremes for once we have a known
>> problem we can then focus in on?
>
> OT1H, large projects are relatively rare. OT2H, having a need for
> subprojects seems to be correlated with working on large projects.
>
> What is the common case, in your experience, and how is it better solved?
> Globally customizing a list of "markers", or customizing a list of
> subprojects for every "parent" project?

In my personal experience, sub-projects have been more about project
structure than size. I would agree they are more prevalent in large
projects, but they can exist in medium and even smaller projects.

I don't think I have a preference between customizing a list of markers
and a list of sub-project definitions per project. I suspect different
approaches will work better in different scenarios and neither is a
clear 'winner'. However, as pointed out by Stephan, inconsistent
terminology may well be contributing to the confusion here. Not
only am I unsure everyone is thinking of the same thing when talking
about sub-projects, I'm not sure everyone is even talking about the same
thing when referencing 'project'.

I wrote a lot about how I use projects and sub-projects in my workflow
and then realised it probably wasn't helping that much. It struck me that
perhaps the issue is that the notion of sub-projects isn't really that
useful in itself and may actually be more detrimental than useful.

When you think about it, a sub-project is really just a narrower
project focus. A project is really just a collection of files and
environment settings which can be considered, for some purpose, as a
'unit' in itself. It might define the set of files used when considering
find and replace for a symbol, when looking for symbol completion
candidates, or file/buffer switching, opening, linting, cross
referencing etc. It may correspond to a VCS repository, but it may
not. It could cut across repositories, or it could be made up of
multiple repositories or it could simply be some bespoke virtual project
concept specific to a particular use case.

I guess what I want is the ability to define arbitrary collections of
files and environment settings as a project, have a way to select/target
a project and an API which various tools can use to get the files or
environment settings to then operate on. Whether one project can be
considered a sub-project of another project is less relevant compared to
the ability to select/identify the target project. Automatic definition
of projects based on VCS repositories is great and a real time saver,
but the ability to define what makes up a project manually is also
important. The ability of the system to automatically determine which
project is 'active' (for example, based on the location of the file
being opened) is good, and having the system prompt you when it isn't
clear or when there are multiple options is useful, but just being able
to run a command to set the current project would also be
sufficient. However, how one project relates to another (sub-project,
main project, etc.) seems of limited use compared to simply being able
to select a subset of the files and environment settings of a project,
whether we call these sub-projects, nested projects or whatever.


