Re: The future of the file-sharing service


From: madmurphy
Subject: Re: The future of the file-sharing service
Date: Wed, 21 Sep 2022 06:04:00 +0100

Jacki,

On Wed, Sep 21, 2022 at 12:23 AM TheJackiMonster <thejackimonster@gmail.com> wrote:
> I generally just like the idea that we could use GNUnet for contribution instead of the current central servers.

I have thought about a decentralized git on top of GNUnet more than once, and it would be awesome.

David,

On Wed, Sep 21, 2022 at 1:37 AM David Barksdale via Mailinglist for GNUnet developers <gnunet-developers@gnu.org> wrote:
> > 1. Possibility of sharing a file while it is still being downloaded (parts of it, of course)

> This is already possible: the GAP routing layer can decide to cache any block of data it receives in the datastore.

Yes, you also have (and automatically share) chunks of files that you never downloaded, but, as far as I know, at the moment you cannot explicitly publish foobar.mkv while you are still downloading it (you have to wait for the whole file, and if you don't launch gnunet-publish after the download is complete, these chunks will eventually get lost).

What I am imagining is an option (say, a -s argument for gnunet-download) that lets you explicitly publish a file, with metadata that you choose on the spot, while you are still downloading it – basically a full-fledged gnunet-publish nested inside gnunet-download.
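Something like the following (a purely hypothetical invocation: -o is gnunet-download's real output option, while -s here is the new flag, and -k/-m are borrowed from gnunet-publish):

    # hypothetical: start downloading foobar.mkv and publish it at the
    # same time, under keywords and metadata chosen right now
    gnunet-download -o foobar.mkv \
        -s -k movies -k foobar -m "comment:a nice movie" \
        gnunet://fs/chk/...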

> > 2. Metadata must be editable and sharable

> You can always publish new metadata under a keyword. What does sharable mean here?

You can add new keywords to a file, but (as far as I know) you cannot see the keywords under which a file is currently indexed, nor edit them, nor remove them. If I am not mistaken, you also cannot edit a metadata field that you had previously assigned to a file. Sharable means that the metadata can be exported and imported via an exchange format (JSON? INI?) if the user wants, but also queried over the network (see next point).
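For example, an export could look something like this (a made-up JSON shape, just to give the idea):

    {
      "uri": "gnunet://fs/chk/...",
      "keywords": ["foo", "bar"],
      "metadata": {
        "title": "Foobar",
        "comment": "a nice movie"
      }
    }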

> > 3. Search keywords must be visible, editable, sharable (part of the metadata?)

> Metadata (along with the URI of the file) is published under keywords; a keyword search returns this information. Are you suggesting including all the known keywords as part of the metadata?

If I publish a file under the keywords “foo” and “bar”, these keywords should be visible, not just indirectly guessable (at the moment, if I search for “foo” and find a file, I can infer that that keyword was used for that file, but I cannot access the other keywords, if there are any).

Basically, I imagine the entire metadata associated with a URI as an open book (or better, a collection of open books). That means that I must be able to access the metadata that I assigned to that URI, but also the metadata that other users (anonymously, of course – i.e. “the network”) assigned to that same URI. For example, if I publish hamlet.pdf and my metadata contains “comment=best drama ever”, I must be able to access that metadata field and know that that is what I am using. If you publish the same pdf and your metadata contains “comment=worst drama ever”, I must also be able to know that someone is publishing the same file as “worst drama ever” (without having any info about who). The old eMule/aMule did something similar: you could “search” the network for the metadata associated with a specific file (though without any privacy and without much sophistication).
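In other words, one should be able to ask the network something like this (imaginary invocation and output – gnunet-search is real, but it has no such option today):

    $ gnunet-search --by-uri gnunet://fs/chk/...
    comment = "best drama ever"     (mine)
    comment = "worst drama ever"    (someone else, anonymous)
    keywords: hamlet, shakespeare, drama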

> > 4. Introduction of a rating mechanism for files (against spam)
> > 5. Allow reverse search (i.e. CHK-URI lookup)

> This could be done by using the URI as a keyword when publishing.

I don't know if using the URI as a keyword when publishing would be the best implementation (it sounds a bit redundant, doesn't it?), but what I am suggesting is making it a default GNUnet feature for public files.
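With the current tools this would look more or less like the following sketch (if I remember the options correctly, -s makes gnunet-publish only print the URI, without publishing anything):

    # step 1: compute the CHK URI of the file without publishing it
    gnunet-publish -s foobar.mkv
    # step 2: publish for real, adding the file's own URI as a keyword
    gnunet-publish -k "gnunet://fs/chk/..." -k foobar foobar.mkv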

> > 6. Automatically and fully auto-unindex a file when it goes missing
> > 7. Autoshare the dynamic content of a directory and update its index in real time (e.g. if I “autoshare” the content of /srv/filesharing/gnunet/madmurphy/, when I add foobar.txt to that directory it must be automatically indexed – and the opposite when I remove it)

> This is what gnunet-auto-share does.

Uh, I missed that. Is there a “libgnunetautoshare” too?
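If that is the case, something like this already covers my point 7, I guess:

    # watch a directory and automatically publish the files placed in it
    gnunet-auto-share /srv/filesharing/gnunet/madmurphy/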

> > 8. Implement file statistics (download counter? last seen? etc.) – this should allow the network to easily get rid of “lost” content

> The GTK GUI tries to download parts of search results to display how available a file is.

This info would need to be stored also for files that you don't download (i.e. it must be part of the DHT, not just of a GUI). It should act as a sort of collective consciousness of the network behind the scenes, with the aim of preventing the propagation of orphan chunks or throttling the propagation of files that are rarely downloaded.
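Just to make the idea concrete, picture every published URI carrying a small record like the following, replicated in the DHT alongside the file's blocks (the struct and its fields are completely invented – nothing like this exists in GNUnet today):

    #include <stdint.h>

    /* Hypothetical per-URI statistics record, maintained collectively
       by the network and stored in the DHT next to the file's blocks. */
    struct FileStats
    {
      uint64_t download_count; /* times the full file was fetched */
      uint64_t last_seen;      /* timestamp of the most recent request */
      uint32_t replica_hint;   /* rough estimate of peers caching it */
    };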

--madmurphy

