Hi, I am trying to remove part of a string using sed, but none of the options I came across on Stack Overflow seem to work.

sub-285345_task-WM_dir-28_epi.nii
sub-285345_task-LANGUAGE_dir-11_epi.nii.gz

I want to remove the _task-*** part, i.e. the task key-value pair. I tried:

sed 's/_task-.*//g'

This also removes the dir-** part that follows task: sub-285345_epi.nii.gz

How can I remove only the task key-value pair?

Kashi Vishwa
  • See: [The Stack Overflow Regular Expressions FAQ](http://stackoverflow.com/a/22944075/3776858) – Cyrus Aug 19 '16 at 17:50
  • You haven't told us what the `***` represents in "I want to remove `_task-***`". Is it the part up to the next `_` or `-` or `.` or something else? [edit] your question to include the expected output given that input. – Ed Morton Aug 19 '16 at 19:04

3 Answers

Do:

sed 's/_task-[^_]*//' 

[^_]* matches any run of characters other than _, so the match stops just before the next underscore.

Example:

$ sed 's/_task-[^_]*//' <<<'sub-285345_task-WM_dir-28_epi.nii'
sub-285345_dir-28_epi.nii

$ sed 's/_task-[^_]*//' <<<'sub-285345_task-LANGUAGE_dir-11_epi.nii.gz'
sub-285345_dir-11_epi.nii.gz
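
To apply the same substitution to several names, the expression can be wrapped in a small function. This is a minimal sketch; `strip_task` is a hypothetical helper name, not something from the question, and it pipes through `printf` rather than `<<<` so it also runs in plain POSIX sh.

```shell
# Hypothetical helper: drop the _task-<value> pair from one filename.
# [^_]* stops at the next underscore, so the later keys are kept.
strip_task() {
    printf '%s\n' "$1" | sed 's/_task-[^_]*//'
}

# The two filenames from the question.
for name in \
    'sub-285345_task-WM_dir-28_epi.nii' \
    'sub-285345_task-LANGUAGE_dir-11_epi.nii.gz'
do
    strip_task "$name"
done
```

One caveat of this pattern: if the task pair were the last key before the extension (say `sub-1_task-WM.nii`), `[^_]*` would also consume the extension, because no underscore follows it.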
heemayl
  • This is OK when there is only one word between task- and _dir. If the value is two words joined by an underscore, it will not work. IMO this is a safer method: `sed 's/\(_task-.*_dir-\)/_dir-/'` – Dave Grabowski Aug 19 '16 at 18:00
  • @DawidGrabowski This was to cover OP's example; if they want, we can come up with a different one. As it stands now, this covers OP's cases very well. – heemayl Aug 19 '16 at 18:02

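.* is greedy, so your pattern matches through to the end of the line. sed has no lazy quantifier (.*? is not supported in its regex syntax), so anchor the match on the following _dir instead.

Try something like

$ echo sub-285345_task-WM_dir-28_epi.nii | sed 's/_task-.*_dir/_dir/'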
sub-285345_dir-28_epi.nii
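
The effect of greediness shows up once a name contains more than one `_dir`. The sketch below compares the `_dir`-anchored pattern with a bounded `[^_]*` pattern. The function names and the sample filename with two `dir` keys are made up for illustration and do not come from the question.

```shell
# Greedy: .* runs to the LAST _dir on the line.
strip_greedy() {
    printf '%s\n' "$1" | sed 's/_task-.*_dir/_dir/'
}

# Bounded: [^_]* cannot cross an underscore, so only the task value goes.
strip_bounded() {
    printf '%s\n' "$1" | sed 's/_task-[^_]*//'
}

# Hypothetical filename with two dir keys.
name='sub-1_task-WM_dir-28_acq-x_dir-3_epi.nii'

strip_greedy "$name"    # prints sub-1_dir-3_epi.nii
strip_bounded "$name"   # prints sub-1_dir-28_acq-x_dir-3_epi.nii
```

With a single `_dir`, as in the question's filenames, both forms give the same result.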
Marc Young

The underscore seems to be the delimiter. In that case you can use cut:

echo "sub-285345_task-WM_dir-28_epi.nii" | cut -d"_" -f1,3-
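
The `cut` form assumes the task pair is always the second underscore-separated field. If its position can vary, a field-by-field filter is one alternative. This is a minimal sketch using awk; `drop_task_field` is a hypothetical name, and the filename with an extra `acq` key is invented to show the difference.

```shell
# Hypothetical helper: split on "_", skip any field starting with "task-",
# and join the remaining fields back together with "_".
drop_task_field() {
    printf '%s\n' "$1" | awk -F_ '{
        out = ""; sep = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^task-/) continue   # drop the task key-value pair
            out = out sep $i
            sep = "_"
        }
        print out
    }'
}

drop_task_field 'sub-285345_task-LANGUAGE_dir-11_epi.nii.gz'
# prints sub-285345_dir-11_epi.nii.gz

# Invented name where task is the third field, not the second.
drop_task_field 'sub-285345_acq-a_task-WM_dir-28_epi.nii'
# prints sub-285345_acq-a_dir-28_epi.nii
```

On the second name, `cut -d"_" -f1,3-` would remove `acq-a` instead of the task pair.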
Walter A