lynx-dev

Re: lynx-dev Now I got the error from idmb trying to...


From: Heather Stern
Subject: Re: lynx-dev Now I got the error from idmb trying to...
Date: Wed, 21 Jul 1999 13:44:07 -0700 (PDT)

> On Tue, 20 Jul 1999, Klaus Weide wrote:
> >But maybe you don't want to talk to someone at IMDB about it, since
> >you are now aware that your use of the Web site is probably something
> >they like to block or discourage.  After all you're still trying to
> >fake a User-Agent after you know that it is being used by the site to
> >detect specific clients in order to block them, 

address@hidden wrote:
> No, I was trying to _set_ a user agent string because I thought it was
> blank.

In an ideal world (which ours is not), the user agent string merely indicates
what abilities are present, so the site can display correctly.  (Sites that
use this properly would never show us the javascripted version, unless or
until we ever code that up.)  Unfortunately both the clients and the sites
abuse this terribly.

Clients like you, and the authors of certain rude crawler programs, abuse
it by setting it to something they're not.  Site authors abuse it when they
want to be lazy and exclude robots, by excluding too broadly.  The whole
feature of being able to change it only showed up because we needed some
way to tell whether we were being excluded from sites by being lumped in
with lying web-crawlers.

Klaus>> AND after you should have realized that it does you no good anyway.
mattack> Why does it do no good to set the user agent string?  What are you 
       > talking about?

Your problem very likely isn't the user agent string.  It's much more likely
that their logfile analysis (see    
   Linkname: IMDb: Terms and Conditions of Use
        URL: http://us.imdb.com/terms           )
detected your cron job and has excommunicated you.
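
To make "logfile analysis" a little more concrete: I have no idea what IMDb
actually runs, so the following is only a guess at the general idea, namely
that fetches arriving at near-constant intervals look like a scheduler and
not like a person.  The function name, the thresholds, and the sample
timestamps are all made up for illustration.

```python
from statistics import mean, pstdev


def looks_scheduled(timestamps, min_requests=5, max_jitter=0.05):
    """Guess whether request times (in seconds) come from a timed job.

    Hypothetical heuristic: if the gaps between consecutive requests are
    nearly constant (standard deviation small relative to the mean gap),
    the client is probably a cron job rather than a human.
    """
    if len(timestamps) < min_requests:
        return False  # too little data to say anything
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False  # all requests at the same instant; nothing to measure
    return pstdev(gaps) / average_gap <= max_jitter


# Invented sample data: one client fetching hourly, one browsing by hand.
cron_like = [0, 3600, 7200, 10801, 14400, 18000]
human_like = [0, 40, 2000, 2100, 9000, 30000]

cron_flagged = looks_scheduled(cron_like)
human_flagged = looks_scheduled(human_like)
```

A real site would presumably weigh request rate and volume as well, but
even something this crude would pick out an hourly cron job.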

I went and looked at their policies page.  "web accelerating" is against
their policy, so fast fetching via your command-line string is almost
certainly against it too.  Doing it on a timed basis definitely is,
unless you follow robot policies, for which there are better tools to
use than lynx.
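
For what "following robot policies" can look like in practice, here is a
small sketch using Python's standard urllib.robotparser.  The robots.txt
text, the agent name, and the example.com URLs are invented for the example
and are not IMDb's actual rules; a real fetcher would load the site's own
robots.txt and also read its terms of use.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, NOT taken from any real site.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def make_parser(robots_text):
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = make_parser(SAMPLE_ROBOTS)
AGENT = "example-fetcher"  # hypothetical, honest name for the automated client

# Ask before fetching, and honor the requested delay between fetches.
allowed = parser.can_fetch(AGENT, "http://example.com/title/")
blocked = parser.can_fetch(AGENT, "http://example.com/private/x")
delay = parser.crawl_delay(AGENT)
```

A timed job that checks can_fetch() first, waits at least the crawl delay
between requests, and identifies itself truthfully is a lot less likely to
be mistaken for a burglar.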

Klaus>>Should people on lynx-dev really try to help you??

mattack> Yes, they should, because it seems like Lynx could very well be 
       > deficient in this respect, 

Your problem is not so much that lynx is "deficient" - it seems to be
doing what you told it to - as that they don't want to talk with you.
Either you need to follow their policy more closely, or you need to
talk to them about why their policy is so broken that you, one of their
customers (?), can't get to information you're entitled to.  They have a
help desk - if you feel you're in the right, tell them so.  Stand up for
the rights of lynx users everywhere!  But I suspect that you know perfectly
well that you're not - and so, slinking about, you want us to help you get
around what they want to do.  We have better fish to fry.

mattack> just like it is not doing the Referrer string properly,
       > I thought it may not be doing the user agent or something else 
       > properly (in this specific case).

The Referrer string is a pretty bogus chunk of information to base their
threaded world on, but that's my opinion.  HTTP was originally described as
a stateless protocol, and that means that they shouldn't care -- at all --
where the *H* [1] you were last, except that it might be handy if they need 
to generate an error message.  We may actually be doing it "wrong" but 
really, I seem to be able to use their database okay so far...
        [1] replace with your favorite cussword here.

If that's really your only problem with them, by all means contact their
helpdesk, and see if they can use cookies or something a bit more ordinary
to figure out who you are.

Of course all this isn't US helping YOU, today, with THEIR page.  But I think 
that at this point it has to be a problem between you and them.  You know how 
to go in the front door, and you say it works.  (You'd better try again the
normal way though, because if they blacklisted you for cron-jobbing, it
probably doesn't matter anymore what you browse with.)  Stop trying to go 
in through the window;  either use the real door, have them build a servant's 
entrance, or stop burgling and find a vendor that wants your traffic or 
money enough to give you what you've decided you really need.  Maybe IMDb has an 
auto-information subscription option, and then all the cron-jobbing can be 
on their side, maintained by real DBAs.  (That "Cool Today" feature looks
like a start in the right direction.)

Meanwhile it looks pretty nice to me - thanks for the bookmark :)  I liked
it so much I even clicked on their Doubleclick sponsor ad, so they'd know
a lynx visitor came by.  (Then laughed really hard when the destination was 
a 404.  Teehee.)

* Heather
