
I have the following tab delimited data:

chr1    3119713 3119728 MA05911Bach1Mafk    839 +
chr1    3119716 3119731 MA05011MAFNFE2  860 +
chr1    3120036 3120051 MA01502Nfe2l2   866 +

What I want to do is remove the first 7 characters from the 4th column, resulting in:

chr1    3119713 3119728 Bach1Mafk   839 +
chr1    3119716 3119731 MAFNFE2 860 +
chr1    3120036 3120051 Nfe2l2  866 +

How can I do that? Note that the output also needs to be tab-separated.

I'm stuck with the following code, which removes the first 7 characters of each line (starting in the first column) instead of the first 7 characters of the 4th column:

sed 's/^.\{7\}//' myfile.txt
scamander

3 Answers

awk 'BEGIN { FS = OFS = "\t" } { $4 = substr($4, 8); print }' myfile.txt
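
Because the output has to stay tab-separated, awk needs to be told to use a tab as its output field separator; by default it rejoins modified records with single spaces. The following is a minimal sketch, assuming the input really contains tab characters. `strip_prefix` is a hypothetical helper name used only for illustration, not something from the answer itself.

```shell
# Hypothetical helper: drop the first N characters of one tab-separated column.
# $1 = column number, $2 = number of characters to drop; reads stdin.
strip_prefix() {
    awk -v col="$1" -v n="$2" '
        BEGIN { FS = OFS = "\t" }                  # split and rejoin on tabs
        { $col = substr($col, n + 1); print }      # keep everything after the first n characters
    '
}

# Example: remove 7 characters from the 4th column of one sample row.
printf 'chr1\t3119713\t3119728\tMA05911Bach1Mafk\t839\t+\n' | strip_prefix 4 7
```

Passing the column and length as `-v` variables keeps the same one-liner usable for other columns or prefix lengths.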
Michael Rourke
perl -anE'$F[3] =~ s/.{7}//; say join "\t", @F' data.txt

or

perl -anE'substr $F[3],0,7,""; say join "\t", @F' data.txt
zdim

With sed

$ sed -E 's/^(([^\t]+\t){3}).{7}/\1/' myfile.txt
chr1    3119713 3119728 Bach1Mafk   839 +
chr1    3119716 3119731 MAFNFE2 860 +
chr1    3120036 3120051 Nfe2l2  866 +
  • -E use extended regular expressions, to avoid having to use \ for (){}. Some sed versions might need -r instead of -E
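  • ^(([^\t]+\t){3}) captures the first three columns together with their trailing tabs; change the 3 to skip a different number of columns. Writing the tab as \t relies on GNU sed; other sed versions may need a literal tab character instead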
  • .{7} characters to delete from 4th column
  • \1 the captured columns
  • Use -i option for in-place editing
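
If your sed does not understand `\t`, the same substitution can be written with a literal tab held in a shell variable. This is a minimal sketch, assuming a sed that accepts `-E`; `drop_col4_prefix` and `tab` are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: same substitution as above, but with a literal tab
# character instead of the \t escape, for sed versions that lack it.
drop_col4_prefix() {
    tab=$(printf '\t')   # a real tab character
    # Double quotes let the shell expand ${tab}; \1 is passed to sed unchanged.
    sed -E "s/^(([^${tab}]+${tab}){3}).{7}/\1/"
}

printf 'chr1\t3119716\t3119731\tMA05011MAFNFE2\t860\t+\n' | drop_col4_prefix
```

Lines with fewer than four columns, or with fewer than 7 characters in the 4th column followed by no further text, do not match and are printed unchanged.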


With perl, you can use \K, which works like a variable-length positive lookbehind: the text matched before \K is kept, and only what follows it is replaced

perl -pe 's/^([^\t]+\t){3}\K.{7}//' myfile.txt
Sundeep