2010/3/31 Carlo von Loesch <address@hidden>
Dear Social-discuss, don't look for László Török's original posting.
He sent it to me in private by mistake. I'll quote it in full.
László Török typeth:
| > My "URI" is psyc://psyced.org/~lynx
although I don't consider myself a
| > resource. If you want to know who my friends are you get a list of URIs
| > in a list data structure. That's practical, because you don't have
| > to extract it from some document.
| How does this exactly work? Is this a specialized function of the system?
| (Not an expert on PSYC.)
Currently it's in the reply to _request_description, that is how a
profile is transferred within PSYC. Take https://psyced.org/~lynX
for example - if you were my friend, you would see my phone number and
my list of friends in there (the ones who agree to be shown to you -
by configuring their trust and expose levels towards me). Actually
you should be seeing https://psyced.org/~lynX?lang=en but by mistake
it shows the fields in my language, not yours. Bug filed.
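The trust-and-expose filtering described above can be sketched roughly like this. Everything here is illustrative (the field names, the numeric trust scale, the phone number) and is not taken from the PSYC specification:

```python
# Hypothetical sketch: filtering profile fields by the viewer's trust level.
# Field names, trust thresholds, and values are made up for illustration.

PROFILE = {
    # field: (value, minimum trust level a viewer needs to see it)
    "_name":    ("lynX", 0),
    "_phone":   ("+49 000 0000", 7),   # only close friends see this
    "_friends": (["psyc://psyced.org/~alice",
                  "psyc://psyced.org/~bob"], 5),
}

def visible_profile(viewer_trust: int) -> dict:
    """Return only the fields this viewer's trust level permits."""
    return {name: value
            for name, (value, needed) in PROFILE.items()
            if viewer_trust >= needed}

print(visible_profile(3))   # a stranger sees the name only
print(visible_profile(9))   # a close friend sees everything
```

The point is that the server decides per viewer which fields to emit, so the public and the friends-only views are just different slices of the same profile.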
Yes, the content could be formatted in a way ready for use by semantic
web tools, but PSYC doesn't use those - it works with native data
structures in the entire decentralized network. I don't even want my
publicly accessible profile to be spidered and semanticized. I only
want my friends to have that level of data quality.
There's also a next-generation approach to "requesting a profile":
in the future, PSYC intends to notify all your friends when you
make a new friendship, and since that information is meant to be
stored by every recipient (using the = syntax in the protocol),
everyone already *has* the information in the morning when she wakes
up. There is no need to go and check out profiles. You already have
the entire social graph on your laptop and perform computations on
who might be able to help you with your maths homework or whatever.
And it's not just some outdated cache. By definition of the protocol
you always have the current graph with current phone numbers of your
friends and whatever else they put in there, available on your hard
disk to build amazing tools upon.
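The push-then-store idea can be sketched as follows. The update handler and variable names are assumptions for illustration, not the PSYC wire format; the point is that queries run against local storage, never the network:

```python
# Sketch of "push, then store": friend updates arrive as assignments and
# are persisted locally, so later queries never hit the network.
# Variable names and the handler signature are assumed, not PSYC spec.

local_graph = {}   # uri -> dict of stored profile variables

def on_update(source_uri: str, var: str, value):
    """Handle an incoming '=' assignment: persist it in the local graph."""
    local_graph.setdefault(source_uri, {})[var] = value

# Overnight, friends push changes to us...
on_update("psyc://psyced.org/~alice", "_phone",  "+1 555 0100")
on_update("psyc://psyced.org/~alice", "_skills", ["maths"])
on_update("psyc://psyced.org/~bob",   "_skills", ["cooking"])

# In the morning, "who can help with maths homework?" is answered from
# disk, with no requests going out at all:
helpers = [uri for uri, fields in local_graph.items()
           if "maths" in fields.get("_skills", [])]
print(helpers)   # ['psyc://psyced.org/~alice']
```

Because every change is pushed as an assignment, the local copy is by construction as current as the last notification received, not a cache that goes stale.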
Occasionally we will experience leaks, like Blaine Cook described
them - some malware will be designed to harvest this information if
you're a Windows user, some people will gateway their PSYC data
into Facebook or sync it with public FOAF. But still that's less
bad than having ALL information on public websites or within
transparent virtual machinery (also known as commodity web hosting).
It's the completeness of knowledge about social links of all of
humanity which is a threat to political freedom, and as it stands
some people are in the position of having access to the complete
graphs of the Internet's social networks, and a decentralized platform
hosted on virtual machines would not change much about that. Maybe
some good ISPs in some good countries would be safe havens, but
how can you be sure? And would that still count as commodity hosting?
What's worse, it's not okay to let the people decide about this -
it doesn't solve the problem. The majority still believes they have
nothing to hide, which is a terrible fallacy. We should not write
code for the people who have nothing to hide. We should write code
for everybody, even if they think they have nothing to hide.
Did I just reply to a simple tech question with six paragraphs of politics?
| Btw you can do the same with SPARQL (you don't have to extract anything from
| a document, you formulate your query and you get exactly what you asked for)
| and actually much more. You can reuse the published linked data in ways the
| publishers have never thought of, and the system serving the data doesn't
| necessarily have to be programmed for this. (give me a list of friends, etc.)
Hmm, I have mixed feelings about these query languages... I've used them and
I know you can formulate powerful things with them, but I also noticed
that if there's a chance you can keep your data rather flat and have
your hierarchies in your naming strategy, you can access the data faster.
The PSYC approach is, instead of processing a query language to extract
data from a structured file, the data is stored in variables that have
names that resemble a structured query. So when you need that data,
accessing it takes close to zero processing power. Yes this approach
is limited and I am willing to drop it if we hit the flexibility wall. ;)
PSYC could just as well be adapted to additionally support RDF and SPARQL.
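The flat-naming idea can be shown in miniature. The variable names below are illustrative, merely in the spirit of PSYC's naming conventions:

```python
# Contrast sketch: when the hierarchy lives in the variable *name*,
# a "query" is a single hash lookup; a nested document needs a traversal
# (which is what a query engine does, in miniature).
# The names here are illustrative, not actual PSYC variables.

flat = {
    "_address_phone_mobile": "+1 555 0100",
    "_address_phone_work":   "+1 555 0199",
    "_address_city":         "Berlin",
}

nested = {"address": {"phone": {"mobile": "+1 555 0100",
                                "work":   "+1 555 0199"},
                      "city": "Berlin"}}

# Flat: the "query" is the key itself, one constant-time lookup.
print(flat["_address_phone_mobile"])

# Nested: the same datum requires walking the structure level by level.
print(nested["address"]["phone"]["mobile"])
```

Of course the flat scheme can only express lookups that were anticipated in the naming, which is exactly the flexibility wall mentioned above.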
| > PSYC isn't just a transport layer. It models multicast, subscriptions,
| > trust and has extensible message structures. That's why you wouldn't
| > typically do RDF on top of it if there are more powerful native ways
| > to do things. But ways to regenerate RDF out of PSYC can be created.
| I assume by saying powerful you mean faster? There are other measures of
| being powerful:
| expressivity(!), extensibility, ubiquity...
Yes, I mean those kinds of features, not just faster. PSYC has a semantically
pretty rich syntax for a network protocol, but unlike SGML-derived languages
it was designed from a programmer's pragmatic point of view. Easy to parse,
binary capable, rapid to extract info from...
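To make "easy to parse" concrete, here is a toy parser for a deliberately *simplified* PSYC-like packet: variable lines, a method line, a body, and a lone "|" terminator. This layout is an assumption for illustration and not the exact PSYC syntax:

```python
# Toy parser for a simplified PSYC-like text packet.
# Assumed layout (not the real spec): lines starting with ":" are
# tab-separated variables, the first bare line is the method, everything
# after that is body, and a lone "|" ends the packet.

def parse_packet(text: str):
    header, method, body = {}, None, []
    for line in text.split("\n"):
        if line == "|":                  # packet terminator
            break
        elif line.startswith(":"):       # variable: ":_name<TAB>value"
            name, _, value = line[1:].partition("\t")
            header[name] = value
        elif method is None and line:    # first non-variable line = method
            method = line
        else:
            body.append(line)            # remaining lines are payload
    return header, method, "\n".join(body)

pkt = ":_source\tpsyc://psyced.org/~lynX\n_notice_presence\nhere now\n|"
print(parse_packet(pkt))
```

One linear pass, no grammar, no entity escaping: that is the kind of parsing cost being contrasted with SGML-derived formats.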
| > There once was a working group on "binary XML." It didn't come up with
| > a format but it created a wishlist of things binary XML should be able
| > to do. PSYC does a lot of that, so you can think of it as binary XML,
| > too.
| I can see the merits of a binary serialization, but it comes at a steep
| price. You need quite a bit of tooling around it. So I'd avoid using it,
Ironically, the most obvious feature expected from "binary XML," being
encoded in binary, is what PSYC doesn't do. PSYC is a binary capable
text protocol, like HTTP. But the other features people wanted to
gain from "binary XML" are what PSYC delivers pretty well. PSYC vs
"binary XML" is discussed at http://about.psyc.eu/Syntax (the
"The W3C Feature Checklist" section).
| unless the lack of requirements on computing resources would dictate
| otherwise. (I come from the embedded world, now a bit about the lower end
The embedded world could use semantically rich protocols that are
actually fast to parse and process, right?
| I think it was not merely an accident that the web as we know it has been
| built around text-based protocols.
The web started out with a protocol that closed its socket after
each file transfer operation, defeating many useful features of TCP
sockets. TimBL is a great visionary concerning URIs and hyperlinks,
but he had no clue about network programming when he invented the web.
It took many years until HTTP/1.1 finally fixed that behaviour.
Sometimes brilliant ideas have great success, whether the
implementation is good or bad.
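The fix HTTP/1.1 brought can be sketched at the message level. This is a simplified illustration (host name and paths are made up): HTTP/1.0 closed the connection after every response, while HTTP/1.1 keeps it open by default, so several requests can share one TCP handshake:

```python
# Sketch: why HTTP/1.1 persistent connections matter.
# HTTP/1.0 closed the TCP socket after each response, paying the
# handshake and slow-start cost per file; HTTP/1.1 keeps it open.
# Host and paths below are made up for illustration.

def request(path: str, version: str = "1.1") -> bytes:
    """Build a minimal GET request for the given HTTP version."""
    head = [f"GET {path} HTTP/{version}", "Host: example.org"]
    if version == "1.0":
        head.append("Connection: close")   # explicit 1.0-era behaviour
    return ("\r\n".join(head) + "\r\n\r\n").encode()

# Under HTTP/1.1, both of these can travel over a single connection:
pipeline = request("/a") + request("/b")
print(pipeline.decode())
```

With HTTP/1.0 semantics, each of those requests would instead require its own socket, which is exactly the behaviour criticized above.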
I agree with you that efficiency is a key, if not the key, element in a
distributed network. We've only seen a fraction of what the web can do,
due to historical paths taken.
But I think the difficulty, and the importance, of getting wide adoption
can sometimes be underestimated.
When TimBL submitted the WWW to the HYPERTEXT conference, in a paper that
contained the concept, the URIs, the HTTP protocol, HTML, the first web
server, and the first browser, the reviewers said it was not noteworthy
enough for the main conference, but could be part of the poster track.
When the web started, Gopher was by far the more popular technology. But
then one day there was talk about licencing the technology, and Gopher
became uncool overnight. It is to TimBL's great credit that he pushed
CERN to release the web under a free licence, which allowed it to take off.
Right or wrong, the web is ubiquitous today, and the decisions that got
us here deserve respect; things could have turned out very differently.
Even with non-optimal technology the web has had an amazing first two
decades. Let's take what's already there and be prepared to embrace and
extend.
Counter-theory: there is no big reason why BitTorrent has to be a
binary protocol, yet it works like hell. So does DNS, except for
the historic security weaknesses.