[bug-gawk] gawk not working properly on the first line
From: Miriam English
Subject: [bug-gawk] gawk not working properly on the first line
Date: Tue, 12 Aug 2014 01:55:42 +1000
User-agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120604 Firefox/13.0 SeaMonkey/2.10
Hi folks,
I've looked through the archives and can't see if this has been dealt
with (though my search terms could be at fault).
Yesterday I was trying to process a small table of tab-separated values
to select out 3 of the 6 columns and print them formatted. I used this
piece of awk along with some sed bits and pieces:
awk '{FS="\t" ; print $3 "- " $2 "[" $5 "]"}'
Here is a sample table:
Play 01 A Filbert is a Nut Rick Raphael Etext Steve Mattingly
Play 02 Ask a Foolish Question Robert Sheckley Etext Bellona Times
Play 03 The Beast in the Void Paul W. Fairman Etext James Rogers
Play 04 The Burning Bridge Poul William Anderson Etext Mark Nelson
Play 05 From an Amber Block Tom Curry Etext Lars Rolander
It is supposed to produce:
Rick Raphael - A Filbert is a Nut [Steve Mattingly]
Robert Sheckley - Ask a Foolish Question [Bellona Times]
Paul W. Fairman - The Beast in the Void [James Rogers]
Poul William Anderson - The Burning Bridge [Mark Nelson]
Tom Curry - From an Amber Block [Lars Rolander]
It worked perfectly on every line of the table except the first. What I
actually got was:
A- 01[is]
Robert Sheckley - Ask a Foolish Question [Bellona Times]
Paul W. Fairman - The Beast in the Void [James Rogers]
Poul William Anderson - The Burning Bridge [Mark Nelson]
Tom Curry - From an Amber Block [Lars Rolander]
When I inserted a blank line before the first line it worked on that
line too, so it wasn't a problem with the line itself. It looks to me
like awk is not setting the field separator to TAB until after
processing the first line.
Is this a genuine bug or am I misusing awk somehow?
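A minimal sketch of that suspicion, assuming awk splits each record using
whatever FS holds at the moment the record is read, so that an assignment
inside the main rule only takes effect from the next record onward. The
two-record input is made up for illustration (it is not the real playlist),
and the BEGIN variant is an assumed alternative, not something tried above.

```shell
# Hypothetical two-record, tab-separated input (not the real playlist).
input=$(printf 'a b\tc d\te f\ng h\ti j\tk l\n')

# FS assigned inside the main rule: the first record has already been
# read (and split on whitespace) by the time the assignment runs.
in_rule=$(printf '%s\n' "$input" | awk '{ FS = "\t"; print $2 }')

# FS assigned in a BEGIN block: it is in place before any record is read.
in_begin=$(printf '%s\n' "$input" | awk 'BEGIN { FS = "\t" } { print $2 }')

printf 'FS set in rule:\n%s\n' "$in_rule"
printf 'FS set in BEGIN:\n%s\n' "$in_begin"
```

If the assumption holds, the in-rule variant splits the first record on
whitespace and only the later records on tabs, which matches the odd first
line above. The BEGIN variant (or passing the separator with awk -F'\t')
should treat every record the same way.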
I abandoned using awk for this and used sed instead because it works
properly. A pity because sed is ugly and difficult to read compared to
the very clear awk line, as you can see:
sed -r 's/[^\t]*\t([^\t]*)\t([^\t]*)\t[^\t]*\t(.*)/\2 - \1 \[\3\]/'
awk '{FS="\t" ; print $3 "- " $2 "[" $5 "]"}'
Afterward I got to remembering a few times recently when I've been
confounded by awk acting weirdly on the first line of a file. Has
anybody else seen this?
I was using GNU Awk 3.1.6, but looked online and found version 4.1.1 so
compiled it and tried it. Same problem. I'm using Linux.
Incidentally, if anybody is interested in why I'm doing this, I've been
downloading from Librivox the rather cool free short science fiction
collection read by volunteers.
https://librivox.org/group/435
The mp3 files are named quite arcanely, so I wanted a way to use the
playlist table found on each of the Librivox download pages to rename
the files more sensibly as:
author - storytitle [reader].mp3
Doing this by hand would be a real pain as there are hundreds of stories
-- 50 collections, each containing 10 or more stories. Automating it
makes it a breeze.
Cheers,
- Miriam
--
If you don't have any failures then you're not trying hard enough.
- Dr. Charles Elachi, director of NASA's Jet Propulsion Laboratory
-----
Website: http://miriam-english.org
Blogs: http://miriam-e.dreamwidth.org
http://miriam-e.livejournal.com