I'm building a shell script to find and report broken links on the various domains hosted on my server.
The script is almost ready, but I'm stuck on two points.
It uses wget's --spider option to test the links, and this is the line which does all the magic:

/usr/bin/wget --header='Accept-Charset: iso-8859-2' -F --base=http://${DOMINIO} --spider -r -nd -nc http://${DOMINIO} -o spider-${DOMINIO}.log --limit-rate=20k --delete-after -b
Now the questions:
1) The output file doesn't indicate which page contains the broken link. Am I missing something?
2) The broken-link report shows some hexadecimal characters (percent-escapes such as %22 and %20). Is there a way to stop wget from converting the original characters?
Example output: http://${DOMINIO}/images/%22%20+%20imagem[x]%20+%20%22
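
For reading the report in the meantime, here is a minimal sketch of how the percent-escapes could be turned back into the original characters after the fact. It assumes bash (its printf '%b' understands \xHH escapes); url_decode is a hypothetical helper name, not something wget provides, and example.com is only a placeholder for ${DOMINIO}. It does not change what wget writes to the log.

```shell
#!/bin/bash
# url_decode: hypothetical helper, not part of wget.
# Rewrites every %HH escape as \xHH and lets printf '%b' expand it.
# A literal '+' is left untouched, since it is not an escape in a URL path.
url_decode() {
    printf '%b\n' "${1//%/\\x}"
}

# Placeholder URL modelled on the log line above.
url_decode 'http://example.com/images/%22%20+%20imagem[x]%20+%20%22'
```

The decoded form makes it easier to see that the "link" is really a fragment of JavaScript string concatenation that was picked up as a URL.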
Thanks in advance and sorry if I wrote to the wrong list.