[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: signature method in locate
From: |
Ondrej Bilka |
Subject: |
Re: signature method in locate |
Date: |
Tue, 12 May 2009 08:51:44 +0200 |
User-agent: |
Mutt/1.5.18 (2008-05-17) |
On Tue, May 12, 2009 at 12:08:21AM +0200, James Youngman wrote:
> On Mon, May 11, 2009 at 10:42 AM, Ondrej Bilka <address@hidden> wrote:
> > On Mon, May 11, 2009 at 10:39:11AM +0200, James Youngman wrote:
> >> On Mon, May 11, 2009 at 8:14 AM, Ondrej Bilka <address@hidden> wrote:
> >> > Hello,
> >> > Was signature method discused here? search gives only pgp.
> >> > Its that you give to each letter bit create signature.
> >> > When searching you first create signature from largest continuous part
> >> > not containing /. Then you can compare signature of each directory and
> >> > filename.
> >> > There is risk it will be slowdown because you need to read more.
> >>
> >> I'm sorry, I did not understand what you were trying to express.
> >> Would you try again with an example?
> > Sure. Say you have files aaa ,baba, abcd and aaccee. Signatures are
> > 10000, 11000, 11110 and 10101
>
> I don't understand the mapping you are implying here.
>
> > If we want find *ba*, it coresponds to 11000
>
> But above, you indicated that 11000 represented baba. I don't
> understand what mapping or calculation you are trying to describe.
If a is present first bit is 1, if b present second bit is 1, z 26... and you
or it. Then you can tell what letters are present.
> Are you suggesting some kind of Bloom filter?
Yes.
But I would use this signature function
int makesignature(char *s){
int sig=0;
char c1,c2;
c1=*s++;
c2=*s++;
while (*s){
sig |= 1<<((121*c1+11*c2 +*s)%31);
c1=c2;
c2=*s++;
}
return sig;
}
int canmatch(sigstr,sigpat){return sigstr&sigpat==sigpat;}