[Gzz] Re: Content types in the URI?
From: Benja Fallenstein
Subject: [Gzz] Re: Content types in the URI?
Date: Sun, 23 Mar 2003 15:55:16 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030319 Debian/1.3-3

Hi Gordon, hi Justin,

I've been mulling over this problem some more, trying to look at it from
different sides, but my position essentially hasn't changed. I think
that we won't agree on this point; I also think that this won't do great
harm, though: our systems should still be able to interoperate without
problems. (See below.)

Gordon Mohr wrote:
>> But the point about URIs is that they do not need context to identify a
>> resource. It would be nice to be able to use our URNs in the same
>> context as a HTTP URL, for example in an <img> tag or an <a href>.
>> Basically, the idea about URIs is that you can use many different URI
>> schemes in the same context, because they do not depend on a context, no?
>
> Yep. So just do it without any type-labelling.
>
> HTTP URLs don't include content-types. The URL...
>
>   http://foobar.com/smiley
>
> ...could be a GIF or HTML or an executable. It's only the extra
> (outside-the-URI) context provided over HTTP that sets its type --

Yes, but the point is that HTTP maps this URI to a content type plus a
body, with the same 'trust level'. If I make a link to
http://foobar.com/smiley, I expect that following that link will give a
correct body *and* a correct content type. Both are in the hands of the
same person (the server operator).

> and I believe most browsers will even be tolerant of many kinds of
> mistyping by the server, once they see the data themselves, and
> coerce the file into its intended type.

Yes, but I'm sure the developers of those browsers will tell you they'd
rather be served correct content types by all servers. :)
In my opinion, content type guessing is a kludgy workaround, because it
is hard to do, error-prone, and hard to extend. Why do all those
operating systems determine the content type of a file based on its
extension, rather than 'just' looking at it and guessing its type?
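
To make the objection to guessing concrete, here is a minimal sketch of a magic-number sniffer. The signature table and the function name `guess_type` are illustrative assumptions, not part of any system discussed in this thread; a real sniffer would need far more entries and heuristics.

```python
# Hypothetical, deliberately tiny signature table: (leading bytes, media type).
MAGIC = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def guess_type(data: bytes) -> str:
    """Guess a media type from the leading bytes of the data."""
    for prefix, media_type in MAGIC:
        if data.startswith(prefix):
            return media_type
    # Formats missing from the table cannot be recognized at all.
    return "application/octet-stream"


# A plain-text note that happens to begin with the GIF signature is
# misidentified, and an HTML page is not recognized because no rule covers it.
print(guess_type(b"GIF89a is the header of a GIF file"))
print(guess_type(b"<html><body>hello</body></html>"))
```

Both failure modes shown here (a false positive, and an unknown format that needs a new rule) are the error-prone and hard-to-extend aspects mentioned above.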

>> The data URL scheme (RFC 2397). (It also puts the data in the URL; my
>> opinion is that content type plus data isn't so different from content
>> type plus cryptographic hash...)
>
> Aha. I forgot about that one.
>
> There might be a good reason you have to specify the type-interpretation,
> but so far I haven't seen a specific case where it's necessary. This
> ought to work (in a properly extended browser)...
>
>   <img src="urn:sha1:BLAH">
>
> ...even without advance knowledge of the format of the bitstream, as
> long as it turns out to be a recognizable image format.

I'm thinking that we probably won't be able to reach an agreement here. I
believe that providing a content type is essential for making
developers' lives easier; you don't think so.
(BTW, I think the analogy with data: runs deep-- why not guess the type
of the content in the data URI, instead of putting the type in the URI?)
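
For reference, a data: URL carries its media type in front of the data, following the RFC 2397 form `data:[<mediatype>][;base64],<data>`. The following is a rough sketch of a parser; the function name `parse_data_url` is an assumption, and the sketch simplifies the RFC (it does not handle a percent-encoded base64 payload or the parameter-only shorthand such as `;charset=...`).

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_url(url: str):
    """Split a data: URL into (media type, decoded bytes)."""
    if not url.startswith("data:"):
        raise ValueError("not a data: URL")
    header, comma, payload = url[len("data:"):].partition(",")
    if not comma:
        raise ValueError("data: URL has no comma separator")
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]
    # RFC 2397: an omitted media type defaults to text/plain;charset=US-ASCII.
    media_type = header or "text/plain;charset=US-ASCII"
    if is_base64:
        data = base64.b64decode(payload)
    else:
        data = unquote_to_bytes(payload)
    return media_type, data


print(parse_data_url("data:image/gif;base64,R0lGODlh"))
print(parse_data_url("data:,Hello%20World"))
```

The consumer learns the type from the identifier itself and never has to inspect the bytes, which is the property argued for here.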
To summarize the options we've discussed:
1. Give the content type in the URI. You don't like that.
2. Make an indirect reference: The URI points to a hashed block
containing a content type and the hash of the block with the actual
data. This means we wouldn't use the same hash for e.g. an MP3 as you
do, as we'd use the hash of the block *referring to* that MP3. I don't
think you liked that.
3. Guessing the content type. I don't like that, on grounds that it
makes life harder for client developers, and that it may necessitate
specifying the content type in the context (e.g. for digital signatures,
to ensure we know the correct interpretation of the bytes that were signed).
4. Getting the content type through an out-of-band mechanism, like the
Bitzi database. I don't like that, on grounds that it requires an
Internet connection, and it doesn't have the same 'trust level' as
having the content type in the URI or hashed data.
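
Option 2 can be sketched roughly as follows. The reference-block layout (two header-style lines) and the helper names are purely hypothetical, chosen only to illustrate the indirection; the sketch uses plain SHA-1 in base32, as in `urn:sha1:`, rather than a full bitprint.

```python
import base64
import hashlib


def sha1_urn(data: bytes) -> str:
    """Return a urn:sha1: identifier (base32-encoded SHA-1) for the bytes."""
    digest = hashlib.sha1(data).digest()
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")


def make_reference_block(content_type: str, data: bytes) -> bytes:
    """Build a small block naming the content type and the data block's hash.

    The layout is a made-up illustration, not an actual Storm format.
    """
    return (
        f"Content-Type: {content_type}\n"
        f"Content-Hash: {sha1_urn(data)}\n"
    ).encode("ascii")


def resolve(reference_block: bytes, data: bytes) -> str:
    """Check the data against the reference block and return its content type."""
    lines = reference_block.decode("ascii").splitlines()
    fields = dict(line.split(": ", 1) for line in lines)
    if fields["Content-Hash"] != sha1_urn(data):
        raise ValueError("data does not match the referenced hash")
    return fields["Content-Type"]


mp3 = b"stand-in bytes for an MP3 file"
ref = make_reference_block("audio/mpeg", mp3)
print(sha1_urn(mp3))   # the identifier a bitprint-style system would use
print(sha1_urn(ref))   # the identifier under option 2: a different hash
print(resolve(ref, mp3))
```

The two printed identifiers differ, which is the drawback noted in option 2: the same MP3 would be known under different hashes in the two systems.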
I still think content type in URI is the best alternative for us.
Now, if we cannot agree on this, can our system (Storm) still
interoperate with the ones you are developing?
I should think so. Given a Storm URI, we can easily create a bitprint
from it by stripping the content type. This means we can find Storm
blocks and metadata about them using bitprint-based systems. Given a
bitprint, we can generate a Storm URI by using either of the methods you
proposed-- adding the content type from the Bitzi database, or guessing
it from the content. (We could also use application/octet-stream if we
don't know what kind of data it is, for some reason.)
I would also be entirely comfortable with building the Storm storage
layer so that data is looked up by bitprint. For lookup, the content
type doesn't matter, after all. It would only be used at the higher
levels building on Storm, to interpret the data that the lower levels
have retrieved. Thus, a Storm system could be used to look up bitprints,
and a bitprint-based system could be used to look up Storm blocks.
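
The interoperability argument above might look like this in code. The Storm URI syntax used here (`urn:x-storm:<content-type>,<hashes>`) is an assumption made up for illustration, as are all function and class names; the hash strings in the usage lines are placeholders, not real bitprints.

```python
STORM_PREFIX = "urn:x-storm:"        # hypothetical syntax, for illustration only
BITPRINT_PREFIX = "urn:bitprint:"
DEFAULT_TYPE = "application/octet-stream"


def split_storm_uri(storm_uri: str):
    """Split a (hypothetical) Storm URI into (content type, hash part)."""
    if not storm_uri.startswith(STORM_PREFIX):
        raise ValueError("not a Storm URI")
    rest = storm_uri[len(STORM_PREFIX):]
    content_type, comma, hashes = rest.partition(",")
    if not comma or not content_type or not hashes:
        raise ValueError("malformed Storm URI")
    return content_type, hashes


def storm_to_bitprint(storm_uri: str) -> str:
    """Strip the content type, leaving only the bitprint."""
    _, hashes = split_storm_uri(storm_uri)
    return BITPRINT_PREFIX + hashes


def bitprint_to_storm(bitprint: str, content_type: str = DEFAULT_TYPE) -> str:
    """Attach a content type (looked up, guessed, or the generic default)."""
    if not bitprint.startswith(BITPRINT_PREFIX):
        raise ValueError("not a bitprint URN")
    return f"{STORM_PREFIX}{content_type},{bitprint[len(BITPRINT_PREFIX):]}"


class BlockStore:
    """Storage layer keyed by bitprint only; the content type plays no role."""

    def __init__(self):
        self._blocks = {}

    def put(self, bitprint: str, data: bytes) -> None:
        self._blocks[bitprint] = data

    def get_by_bitprint(self, bitprint: str) -> bytes:
        return self._blocks[bitprint]

    def get_by_storm_uri(self, storm_uri: str):
        # The type is only handed upward to whoever interprets the bytes.
        content_type, hashes = split_storm_uri(storm_uri)
        return content_type, self._blocks[BITPRINT_PREFIX + hashes]


store = BlockStore()
store.put("urn:bitprint:AAAA.BBBB", b"some bytes")   # placeholder hashes
uri = bitprint_to_storm("urn:bitprint:AAAA.BBBB", "audio/mpeg")
print(uri)
print(storm_to_bitprint(uri))
print(store.get_by_storm_uri(uri))
```

The store never consults the content type when locating a block, so either kind of identifier reaches the same bytes.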
So, given all this, I do not see great harm if Storm proceeds in the way
that seems most natural to me, and you proceed in the way that seems
most natural to you. Right?
- Benja
P.S. Gordon, is there a public domain Java version of the new bitprint
calculator, including source, already? Thanks, -b