bug-wget
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-wget] [GSoC] Proposal: “Design and implementation of a framewo


From: Tim Rühsen
Subject: Re: [Bug-wget] [GSoC] Proposal: “Design and implementation of a framework for plugins.”
Date: Mon, 3 Apr 2017 12:50:57 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0


On 04/02/2017 09:40 PM, Akash Rawal wrote:
> 
> 
> On 04/02/2017 10:41 PM, Darshit Shah wrote:
>> * Akash Rawal <address@hidden> [170331 23:34]:
>>> Hello,
>>>
>>> First of all, thanks for the initial review of my proposal!
>>>
>>> I have made some changes in response to the comments as well as last
>>> mail on the mailing list by Darshit Shah. Kindly consider reviewing it.
>>> For reference here is the link.
>>> https://docs.google.com/document/d/1SBM-jNG5dJL85C7iK9aid2eDlByrxOsYUNDrk4049vQ/edit?usp=sharing
>>>
>>>
>>>
>>> The proposal is still not final, in the sense that I am analysing the
>>> code for further ideas of what can be customized using plugins.
>>>
>>> I was asked to elaborate on rewriting of command line option processing
>>> code. So here are my thoughts on it.
>>>
>>> The current code in src/options.c has following drawbacks:
>>>  -  Built around order-insensitive and idempotent model (Supplying
>>>     an option modifies a variable). This might be adequete for wget,
>>>     but might not be the case for plugins. Repeated options or
>>>     order of options can matter for plugins.
>> Breaking the order-insensitive and idempotent model of command line
>> processing is an absolute no-go. It will definitely break many scripts
>> out there. We do wish to make it easy to transition from Wget to Wget2
>> and this is a huge change in how options are processed.  Also, I see no
>> clear advantages to why this should be required.
> Backward compatibility will only be broken if options are changed,
> which I didn't plan to do. wget2 will support all options it
> supports now. Existing options shall remain order-insensitive
> and idempotent, only new options and plugin-side options may
> take advantage of greater flexibility.
> 
> Many programs have order-sensitive (e.g. ffmpeg, options
> apply to input or output file based on placement) and non-idempotent
> (e.g. -v (verbose) option in many program, -vv for higher verbosity)
>>>  -  The need for keeping the list of options sorted. I don't think
>>>     we can expect plugin writers to do the same.
>> No. Plugins should not be allowed to tinker with certain core
>> functionalities and that includes how options are processed.
>>>  -  And most importantly, options cannot be added at runtime.
>> Which is a very good thing. Keeping everything at compile time means we
>> can have most of that code on the stack without the need for costly
>> dynamic memory allocations on each invokation. Options processing code
>> should definitely not be dynamic.
> Okay if that is your requirement I'll remove it from my proposal.
>>>
>>> So far I think rewriting the option processing code would be easier
>>> than trying to patch existing code to support custom options.
>>> Some parts of the original code could be reused but structure
>>> would undergo a drastic change anyway.
>>
>> Easier? Have you every tried writing one? The number of use cases and
>> corner cases you'd have to handle is massive. That alone could be an
>> entire GSoC project. You can't expect to write it as a byproduct of your
>> project.
> I have written simpler ones. But even in wget's code option processor is
> not so large (~400 LOC, starting from src/option.c:853)
> Cases for long options:
>     --option
>     --option=value
>     --option value
>     --no-option
> Cases for short options:
>     -x
>     -x value
>     -xvalue
>     -xyzw
>     -xyzvalue
>     -xyz value
>>
>>>
>>> The idea I have is to expose an interface for plugins to describe
>>> command line options they want to receive. I have attached a
>>> prototype header. From this interface, wget will have a list of
>>> command line options to support, along with plugin function to
>>> be called for handling plugin specific options. In addition to
>>> using this to parse command line, wget can Use the descriptions
>>> and category information provided to generate --help output.
>>
>> First let me ask you, how exactly do you envision a user using plugins
>> with Wget2? How are plugins selected? How does the user even know if a
>> plugin exists?
> How about:
>     $ wget_plugins="foo" wget2 --foo-option http://example.com/file.html
> Or:
>     $ wget2 --plugin=foo --foo-option http://example.com/file.html
> Hypothetical example: music site parser that filters by rating
>     $ wget2 --plugin=music_rating_filter --music_min_rating=4 \
>           --recursive --level=20 \
>           http://example_music_site.com/music/1
> Multiple plugins:
>     $ wget_plugins="foo:bar" wget2 --plugin=foo --plugin=bar \
>           --foo-option --bar-option=value http://example.com/file.html
> Or:
>     $ wget2 --plugin=foo --plugin=bar --foo-option --bar-option=value \
>           http://example.com/file.html
>     $ #The above is also an example of non-idempotent option --plugin
> Loading plugins which are not installed:
>     $ wget2 --local-plugin=/path/to/libfoo.so --foo-option \
>           http://example.com/file.html
> 
> And regarding user knowing which plugins are installed:
>     $ wget2 --list-plugins
>     foo
>     bar
>     music_rating_filter
> Plugins could be installed in a directory like
>  ${PREFIX}/lib/wget/plugins/. Many software do like this.
>>>
>>> Regarding the need to support plugin side addition of command line
>>> options at the first place, here are my thoughts.
>>> Plugins might need additional data from the user. I can think
>>> of 3 ways in which it can be done:
>>>  1. Environment variables.
>>>  2. Files with well-known names.
>>>  3. Command line.
>>> Environment variables have a disadvantage that they are automatically
>>> passed to the subprocesses, creating a chance for interference.
>>> Also, name clashes cannot be discovered.
>>>
>>> Using files with well-known names has a chance of interference
>>> with parallel running instances of wget.
>>>
>>> None of these disadvantages are shared with using command line
>>> options.
>>
>> However, all three are commonly used to provide options to a program and
>> should all always be supported. Environment variables allow for all
>> instances in one subshell to share a set of commands without having to
>> pass them multiple times and without interfering with instances in other
>> subshells
>> Configuration files are useful so the system administrator can set a
>> bunch of common options for all users, maybe to enforce a policy.
>> Command line options give the user the power to modify options on a per
>> invokation level.
> Per invocation level is actually my primary concern.
>>
>>>
>>> Opinions are welcome.
>>>
>>> Regards,
>>> Akash Rawal.
>>>
>>>
>>>
> 
> So the summary is, due to the requirement of running the code on
> stack as much as possible, I'm removing the customizable option
> processing from my proposal.

Sorry when I drop in without having read/understood all...

> Multiple plugins:
>     $ wget_plugins="foo:bar" wget2 --plugin=foo --plugin=bar \
>           --foo-option --bar-option=value http://example.com/file.html

I like this. We can have this without changing much of Wget2's option
processing code. Add --plugin to Wget2 (we will have a list/array of
plugin names than).
Define a option name scheme to identify options for a certain plugin,
e.g. --plugin-foo-<option-name>=value. Make as as simple as possible,
e.g. a value is mandatory (bool is 1|0 or y|n or whatever).

When the current option processing doesn't find an option in it's list,
we special check for --plugin-<plugin-name>-<option-name> and give the
key/value pair <option-name>/<value> to a function that is provided by
the plugin (e.g. wget_plugin_set_option(const char *option_name, const
char *value).

With this you can provide CLI options to plugins without changing much
in Wget2's code.

Regards, Tim

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]