[Lynx-dev] retrieving text from html5 page?
From: voytek
Subject: [Lynx-dev] retrieving text from html5 page?
Date: Thu, 9 Jan 2014 10:52:57 +1100
User-agent: SquirrelMail/1.5.2 [SVN]
I have a script like:
wget -O page.html url
lynx -dump page.html > page.txt
that worked until the web server was redeveloped.
Now they use HTML5 stuff: page.html has the data I want, but page.txt
only has the 'labels' and not the data contents. Any thoughts on how I
can get the data into the text file?
When displayed on screen, the data shows; in the text file, it does not.
Looking at page.html, it has something like:
/snip/
<label class="pfbc-label">Suburb</label><input type="text"
name="SYS_Addresses_e_address_i_0_e_district_tx" value="SYDNEY"
readonly="readonly" class="ro pfbc-textbox"/>
<label class="pfbc-label">State</label><input type="hidden" value="NSW"
name="SYS_Addresses_e_address_i_0_e_state_cd"><input type="text"
name="SYS_Addresses_e_address_i_0_e_state_cd_d" value="NSW"
readonly="readonly" class="ro pfbc-textbox"/>
<label class="pfbc-label">Postcode</label><input type="text"
name="SYS_Addresses_e_address_i_0_e_postcode_tx" value="2000"
readonly="readonly" class="ro pfbc-textbox"/>
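
One possible way around this, sketched here as an assumption and not
part of the original script, is to read the values straight out of
page.html instead of relying on the text dump. In the snippet above,
each piece of data sits in the value attribute of a text input that
follows its label, so a small parser can pair them up. The Python below
uses only the standard library; the names LabelValueParser and
extract_pairs are made up for illustration, and it assumes every label
is followed by the text input it describes (hidden inputs are skipped).

```python
from html.parser import HTMLParser


class LabelValueParser(HTMLParser):
    """Pair each <label> text with the value of the next text <input>."""

    def __init__(self):
        super().__init__()
        self.pairs = []          # collected (label, value) tuples
        self._in_label = False   # True while inside <label>...</label>
        self._label = None       # label text waiting for its input

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self._in_label = True
            self._label = ""
        elif tag == "input" and self._label is not None:
            # Skip hidden inputs; only visible text fields carry the data.
            if (attrs.get("type") or "").lower() == "text":
                self.pairs.append((self._label.strip(), attrs.get("value") or ""))
                self._label = None

    def handle_endtag(self, tag):
        if tag == "label":
            self._in_label = False

    def handle_data(self, data):
        if self._in_label:
            self._label += data


def extract_pairs(html):
    """Return a list of (label, value) tuples found in the HTML text."""
    parser = LabelValueParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Sample taken from the page.html excerpt quoted above.
SAMPLE = """
<label class="pfbc-label">Suburb</label><input type="text"
name="SYS_Addresses_e_address_i_0_e_district_tx" value="SYDNEY"
readonly="readonly" class="ro pfbc-textbox"/>
<label class="pfbc-label">State</label><input type="hidden" value="NSW"
name="SYS_Addresses_e_address_i_0_e_state_cd"><input type="text"
name="SYS_Addresses_e_address_i_0_e_state_cd_d" value="NSW"
readonly="readonly" class="ro pfbc-textbox"/>
"""

for label, value in extract_pairs(SAMPLE):
    print(f"{label}: {value}")
```

To run it against the downloaded file, the SAMPLE string could be
replaced with the contents of page.html, for example
extract_pairs(open("page.html").read()), and the output redirected to
page.txt.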