
Platform is CentOS 7.

I'm working towards extracting the non-version portion of a filename and am puzzled by this result:

echo "xorg-x11-font-utils-7.5-21.el7.x86_64.rpm" | sed -nr "s/([[:alpha:]]+)-[0-9].*\.*rpm/\1/p"

which yields

xorg-x11-font-utils

hence [[:alpha:]]+ appears to match a string that includes two *1*s, which are non-alpha characters. I was expecting this not to match at all.

Explanations?

djna
  • Got to say, Walter, I think you closed the question prematurely. Effectively "go read the manual" is far from useful. I've given an explanation of what I expect and what I see, and could really use a pointer as to how my understanding is flawed. – djna Feb 21 '20 at 16:25

1 Answer


([[:alpha:]]+)-[0-9] matches utils-7 in your string. When you replace with \1, it becomes utils.

Everything before (i.e. xorg-x11-font-) remains unchanged.
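One way to see which part of the input the pattern actually consumes is to mark it in the replacement. This is a minimal sketch, assuming GNU sed (as shipped on CentOS 7); the variable names are only illustrative. In the replacement, `&` stands for the whole match and `\1` for the captured group.

```shell
name="xorg-x11-font-utils-7.5-21.el7.x86_64.rpm"

# Same pattern as in the question, but the whole match (&) is wrapped in [ ].
marked=$(echo "$name" | sed -nr 's/([[:alpha:]]+)-[0-9].*\.*rpm/[&]/p')
echo "$marked"

# Same pattern again, this time wrapping only the captured group (\1) in [ ].
group=$(echo "$name" | sed -nr 's/([[:alpha:]]+)-[0-9].*\.*rpm/[\1]/p')
echo "$group"
```

The first command prints `xorg-x11-font-[utils-7.5-21.el7.x86_64.rpm]` and the second prints `xorg-x11-font-[utils]`: the prefix sits outside the brackets because it was never part of the match.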

How it works:

\1 is a backreference to group 1; it contains whatever group 1 matched, in this case utils. -[0-9] then matches the -7 immediately after utils, and .*\.*rpm matches the rest of the string.

The substitution replaces the whole match, utils-7.5-21.el7.x86_64.rpm, with the content of group 1, utils, so in the end you get:

  • the beginning of the string xorg-x11-font-, unchanged
  • the rest of the string that is replaced with utils
  • Finally: xorg-x11-font-utils
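Because the pattern is unanchored, sed is free to start the match partway through the line. The sketch below, again assuming GNU sed, anchors the pattern with `^` and `$` to show the "no match" behaviour the question expected. The second command is one possible way to capture the full name; its broader character class is an illustrative assumption about the desired result, not part of the original answer.

```shell
name="xorg-x11-font-utils-7.5-21.el7.x86_64.rpm"

# Anchored at the start: [[:alpha:]]+ can only match "xorg", which is not
# followed by "-<digit>", so the substitution fails and nothing is printed.
anchored=$(echo "$name" | sed -nr 's/^([[:alpha:]]+)-[0-9].*\.rpm$/\1/p')
echo "anchored: '$anchored'"

# Allowing digits and hyphens inside the group lets it span the whole name,
# up to the first "-<digit>" that can still be followed by the rest of the line.
base=$(echo "$name" | sed -nr 's/^([[:alnum:]-]+)-[0-9].*\.rpm$/\1/p')
echo "base: '$base'"
```

With the anchor in place the first result is empty, while the second yields `xorg-x11-font-utils`, this time because the group itself matched the entire name.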

You'll find an explanation here

Toto
  • Having accepted the answer, for which I'm grateful, I'm now not sure I believe it :-) The replace with \1 should only print the matching group, the bit inside the (). Evidence for that: the whole tail is *not* printed. The xorg-x11-font is surely not part of the matching group? – djna Feb 21 '20 at 16:18
  • @djna: See my edit. – Toto Feb 21 '20 at 16:37
  • Yes! Thanks, I just figured it out myself. Apologies for doubting you, and thanks for taking the time to explain. – djna Feb 21 '20 at 16:43
  • @djna: You're welcome, glad it helps. – Toto Feb 21 '20 at 16:44