help-bash

Re: Printing TITLE, SUBTITLE, and KEYWORDS


From: Hans Lonsdale
Subject: Re: Printing TITLE, SUBTITLE, and KEYWORDS
Date: Sat, 4 Feb 2023 04:05:13 +0100 (CET)


> ----------------------------------------
> From: Greg Wooledge <greg@wooledge.org>
> Date: Feb 4, 2023, 5:55:29 AM
> To: <help-bash@gnu.org>
> Subject: Re: Printing TITLE, SUBTITLE, and KEYWORDS
> 
> 
> On Fri, Feb 03, 2023 at 10:25:25AM -0600, Dennis Williamson wrote:
> > On Fri, Feb 3, 2023, 9:34 AM Hans Lonsdale <hanslonsdale@mailfence.com>
> > wrote:
> > 
> > > I have changed to using awk as the way to print sections defined by the
> > > following structure
> > >
> > > ## NFAML [NASMB] KEYWORDS
> > > ## BODY (can include empty lines)
> > > ## END OF NFAML [NASMB]
> > >
> > > Am trying to print TITLE, SUBTITLE and the KEYWORDS array, but not getting
> > > them printed.
> 
> The question keeps changing.  In the previous iteration, you said any
> lines that did not begin with ## should be treated as the end of the
> section, because sometimes the ## END OF ... section footer is missing.

When the ## END OF footer was missing, the script printed everything.
I wanted to ensure that each begin section has a corresponding end section.

## NFAML [NASMB] KEYWORDS
## BODY (can include empty lines)
## END OF NFAML [NASMB]    # must match the NFAML and NASMB of the begin section

A simplistic implementation could test whether each begin has an end.
Then there would be no need to stop on non-comment, non-blank lines.
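
The begin/end check described above could be sketched as follows. This is only an illustration under assumptions: the function name check_sections is hypothetical, the comment prefix is hard-coded to "##" rather than taken from pn_ere, and any comment line shaped like "## WORD [TAG]" is treated as a header.

```shell
# Report section headers without a matching footer, and footers whose
# "NAME [TAG]" differs from the open header. Reads files or stdin.
check_sections() {
  awk '
    # Footer: "## END OF NAME [TAG]"
    /^[ \t]*## END OF [^ ]+ \[[^]]+\][ \t]*$/ {
      line = $0
      sub(/^[ \t]*## END OF /, "", line)
      sub(/[ \t]*$/, "", line)
      if (open == "")
        printf "line %d: END OF %s without a begin\n", NR, line
      else if (line != open)
        printf "line %d: END OF %s does not match %s\n", NR, line, open
      open = ""
      next
    }
    # Header: "## NAME [TAG] KEYWORDS"
    /^[ \t]*## [^ ]+ \[[^]]+\]/ {
      if (open != "")
        printf "line %d: %s was never closed\n", start, open
      line = $0
      sub(/^[ \t]*## /, "", line)
      sub(/\].*$/, "]", line)      # keep only "NAME [TAG]"
      open = line
      start = NR
      next
    }
    END {
      if (open != "")
        printf "line %d: %s was never closed\n", start, open
    }
  ' "$@"
}

# Example: a header with no footer is reported.
printf '%s\n' '## Function [Tobin] bash,resource' '## body' | check_sections
```

With a check like this run first, the printing script would not need to guess where a section ends.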

I have done some more work on this, which should help clarify the intention.
 
> Now, suddenly, there can be blank lines in the section body.

The body should consist of comments, with empty lines allowed. 
 
> What are "TITLE" and "SUBTITLE"?
> 
> What "KEYWORDS array" are you talking about?  All I see is the string
> KEYWORDS on the header line.  Is the portion of the header which follows
> the ] character supposed to be parsed in some way?

The KEYWORDS string is composed of comma-separated keyword values.

Example

## Function [Tobin] bash,resource

where there are two keywords, namely bash and resource.
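
As a small illustration of the comma-separated format, the keyword string can be turned into an awk array with split(). The function name split_keywords is hypothetical, and trimming blanks around each keyword is an assumption that goes beyond what the format above states.

```shell
# Split a comma-separated keyword string into an awk array and print
# each element with its index.
split_keywords() {
  awk '{
    n = split($0, kaggr, ",")
    for (i = 1; i <= n; i++) {
      gsub(/^[ \t]+|[ \t]+$/, "", kaggr[i])   # trim surrounding blanks
      printf "%d: %s\n", i, kaggr[i]
    }
  }'
}

printf '%s\n' 'bash,resource' | split_keywords
```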
 
> It's pretty much impossible for anyone to help you if we can't pin down
> the actual requirements.
> 
> > >     pn_ere='^[[:space:]]*([#;!]+|@c|//)[[:space:]]+'
> 
> That does not AT ALL match the sample inputs you have been providing.
> That's the main problem here.  You're working with inputs that only
> you can see.

This is used to allow indentation in comments.
 
> If you're going to inject your strings into regexes, then you need to
> "escape" them as I showed in a previous message.
> 
> It sounds like you're definitely going to do this, for reasons that
> will never be clearly explained (perhaps they can be inferred by reverse
> engineering your regexes).  So be it.  Just make sure you do it correctly.
> 
> > AWK isn't Bash. Please take your questions to an AWK or other list.
> 
> I'd say this is reasonably on-topic.  Especially since he'll need to
> use bash to prepare the input strings for injection into the awk regex
> variables.
> 
> Using awk within a bash script -- even if the awk portion ends up being
> 90% of the script -- is still a reasonable thing to discuss on help-bash
> in my opinion.
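
The escaping step mentioned above (preparing strings in the shell before they become part of an awk regex) might look like the sketch below. It is not the code from the earlier message, which is not shown in this thread. The function name ere_escape and the sample value C++ are hypothetical. The escaped string is handed to awk through the environment rather than -v, because -v processes backslash escapes a second time.

```shell
# Backslash-escape every ERE metacharacter in the first argument so the
# result matches the original text literally.
ere_escape() {
  printf '%s\n' "$1" | sed 's/[][\\.^$*+?(){}|]/\\&/g'
}

faml='C++'
faml_ere=$(ere_escape "$faml")

# ENVIRON values are used verbatim as a dynamic regex.
printf '%s\n' '## C++ [Tobin] bash' |
  faml_ere="$faml_ere" awk '$0 ~ ("^## " ENVIRON["faml_ere"] " ") { print "match" }'
```

Without the escaping, a name such as C++ would be read as a regex with two quantifiers instead of a literal string.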

Following your discussion, I have switched to using awk.

The code follows:

  spc='[[:space:]]*'
  # Doubled backslashes: awk -v processes escapes, so the regex sees '\[' and '\]'.
  ebl='\\[' ; ebr='\\]'

  ## :- modifier: fall back to [[:graph:]]+ if the parameter is unset or empty.
  nfaml=${faml:-"[[:graph:]]+"}  # used if FAML is null ("" or '')
  nasmb=${asmb:-"[[:graph:]]+"}  # used if ASMB is null ("" or '')

  local kys=".*"
  # Comment prefix; leading whitespace allows indented comments.
  local pn_ere="^[[:space:]]*([#;!]+|@c|//)[[:space:]]+"
  # Groups in beg_ere: \1 comment marker, \2 NFAML, \3 NASMB, \4 KEYWORDS.
  beg_ere="${pn_ere}(${nfaml}) ${ebl}(${nasmb})${ebr}${spc}(${kys})$"
  end_ere="${pn_ere}END OF ${nfaml} ${ebl}${nasmb}${ebr}${spc}$"

  # gensub() is a GNU awk extension, so this needs gawk.
  awk -v beg_ere="$beg_ere" -v pn_ere="$pn_ere" -v end_ere="$end_ere" \
    '$0 ~ beg_ere {
       title=gensub(beg_ere, "\\2", 1, $0);
       subtitle=gensub(beg_ere, "\\3", 1, $0);
       keywords=gensub(beg_ere, "\\4", 1, $0);
       nk = split(keywords, kaggr, ",");
       display=1;
       next
     }
     $0 ~ end_ere { display=0 ; print "" ; next }
     display { sub(pn_ere, "") ; print }
    ' "$filename"
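
Note that the code above assigns title, subtitle and keywords but has no print statement for them. A minimal sketch that does print them is given below. It is an illustration under assumptions, not the final script: the function name print_sections and the output labels are hypothetical, the comment prefix is hard-coded to "##" instead of pn_ere, and index()/substr() replace gensub() so that it also runs on a non-GNU awk.

```shell
# Print TITLE, SUBTITLE and each keyword of a section header, then the
# section body with the comment prefix removed. Reads files or stdin.
print_sections() {
  awk '
    # Footer "## END OF TITLE [SUBTITLE]": stop printing.
    /^[ \t]*## END OF / { display = 0; print ""; next }
    # Header "## TITLE [SUBTITLE] KEYWORDS", only looked for outside a section.
    !display && /^[ \t]*## [^ ]+ \[[^]]+\]/ {
      rest = $0
      sub(/^[ \t]*## /, "", rest)
      lb = index(rest, " [")                 # start of " [SUBTITLE]"
      rb = index(rest, "]")                  # closing bracket
      title    = substr(rest, 1, lb - 1)
      subtitle = substr(rest, lb + 2, rb - lb - 2)
      keywords = substr(rest, rb + 1)
      sub(/^[ \t]+/, "", keywords)
      nk = split(keywords, kaggr, ",")
      print "TITLE: " title
      print "SUBTITLE: " subtitle
      for (i = 1; i <= nk; i++) print "KEYWORD: " kaggr[i]
      display = 1
      next
    }
    # Body line: strip the comment prefix; empty lines pass through.
    display { sub(/^[ \t]*##[ \t]?/, ""); print }
  ' "$@"
}

printf '%s\n' '## Function [Tobin] bash,resource' '## first line' '' \
  '## second line' '## END OF Function [Tobin]' | print_sections
```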





