bug-wget

Re: [Bug-wget] How to intercept wget to extract the raw requests and the raw responses?


From: Bykov Alexey
Subject: Re: [Bug-wget] How to intercept wget to extract the raw requests and the raw responses?
Date: Thu, 15 Feb 2018 21:34:22 +0200
User-agent: Mozilla/5.0 (Windows NT 6.0; rv:49.0) Gecko/20100101 SeaMonkey/2.46


    wget --warc-file=httpbin -qO- https://httpbin.org/get


How to convert the warc format to the actual header of requests and responses?
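Greetings.
WARC is plain text, gzip-compressed by default. With --no-warc-compression wget writes it uncompressed, as httpbin.warc rather than httpbin.warc.gz: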

wget --warc-file=httpbin --no-warc-compression -qO response.raw -- https://httpbin.org/get
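If a compressed WARC already exists (for example from the quoted command, which should have left httpbin.warc.gz behind), it can probably be unpacked with gzip instead of fetching the page again. A minimal sketch; warc_unpack is a made-up helper name, and it assumes the file name ends in .gz:

```shell
# warc_unpack FILE.warc.gz
# Hypothetical helper: decompress a gzipped WARC next to the original,
# dropping the .gz suffix (httpbin.warc.gz -> httpbin.warc).
# gzip -dc also copes with files made of several concatenated gzip members.
warc_unpack() {
    gzip -dc "$1" > "${1%.gz}"
}
```

Usage: warc_unpack httpbin.warc.gz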

Extract the headers with GNU sed:
sed -n -r -e "/WARC-Type: (request|response)/{s/.*: (.)/\n\L\1/;p;:a;N;s/\n$//;Ta;s/.*//;:b;N;s/\n$//;Tb;p;}" httpbin.warc > headers.txt

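Extract the headers with GNU AWK (the double quotes suit Windows cmd.exe; in a POSIX shell use single quotes so that the shell does not expand $1):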
awk "{if(/WARC-Type: (response|request)/){print n;hp=1;np=0;}if(hp){if(np){if(!$1){np=0;hp=0;}else print}if(!np&&!$1)np=1;}}" httpbin.warc > headers.txt
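For a POSIX shell, the same extraction can be sketched as a small function. This is only an illustration, not an exact rewrite of the one-liners above: the name warc_headers and the [request]/[response] labels are made up here, and it assumes an uncompressed WARC whose lines end in CRLF.

```shell
# warc_headers [FILE...]
# Hypothetical helper: print the HTTP headers of every request and response
# record in an uncompressed WARC file (reads stdin when no FILE is given).
warc_headers() {
    awk '
        # WARC lines end in CRLF; drop the CR so blank lines compare as "".
        { sub(/\r$/, "") }

        # state 0: look for the start of a request or response record.
        state == 0 && /^WARC-Type: (request|response)$/ {
            print "[" $2 "]"
            state = 1
            next
        }

        # state 1: skip the remaining WARC headers up to the first blank line.
        state == 1 && $0 == "" { state = 2; next }

        # state 2: print the HTTP headers up to the next blank line.
        state == 2 {
            if ($0 == "") { print ""; state = 0 }
            else print
        }
    ' "$@"
}
```

Usage: warc_headers httpbin.warc > headers.txt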


Best regards.


