-1

First, I need to extract a substring at a known position in file.txt using bash, counting from the start of the second line:

>header
cgatgcgctctgtgcgtgcgtgcg

So let's assume I want position 10 of the second line; the output should be:

c

Second, I want to include the surrounding ±5 characters, resulting in:

gcgctctgtgc
rororo
  • I'm voting to close this question as off-topic because "I need X" and "I want X" are not (programming) questions. – melpomene May 28 '17 at 09:54
  • @melpomene you are funny! First you post the only correct answer, then you complain for the question not being good enough. I think this is a legitimate question (how to do this in bash?), only lacking to show some effort. More funny, the OP accepted an answer in `awk`. Ah ah... at least, you got two upvotes! – linuxfan says Reinstate Monica May 28 '17 at 10:12
  • I simply LOVE awk :D and it works... – rororo May 28 '17 at 10:17

3 Answers

2
{ read -r; read -r; echo "${REPLY:9:1}"; echo "${REPLY:4:11}"; } < file.txt

Output:

c
gcgctctgtgc

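The ${parameter:offset:length} syntax for substrings is explained in https://www.gnu.org/software/bash/manual/bashref.html#Shell-Parameter-Expansion. Offsets are 0-indexed, so position 10 is offset 9, and the ±5 window starts at offset 4 and spans 11 characters.

The read command is explained in https://www.gnu.org/software/bash/manual/bashref.html#index-read. Called without a variable name, read stores the line in REPLY, so after the second read, REPLY holds line 2.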

Input redirection: https://www.gnu.org/software/bash/manual/bashref.html#Redirections.
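For a position and window size that are not hard-coded, the same read and substring expansion can be wrapped in a function. This is only a sketch: the function name window_from_line2 and its argument order are made up for illustration, and clamping a window that would start before the line is an assumption about the desired behavior, not something the question specifies.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer).
# Usage: window_from_line2 FILE POS RADIUS
# Prints the character at 1-indexed POS on line 2 of FILE,
# then the window of RADIUS characters on either side of it.
window_from_line2() {
    local file=$1 pos=$2 radius=$3 line start len
    # Discard line 1, keep line 2 verbatim in "line".
    { read -r _; IFS= read -r line; } < "$file"
    start=$(( pos - 1 - radius ))    # 0-indexed start of the window
    if (( start < 0 )); then
        start=0                      # assumption: clamp at the start of the line
    fi
    len=$(( pos + radius - start ))  # equals 2*radius+1 when nothing was clamped
    echo "${line:pos-1:1}"
    echo "${line:start:len}"
}
```

With the sample file from the question, `window_from_line2 file.txt 10 5` should print the same two lines as the one-liner above.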

melpomene
  • @linuxfan OP didn't ask for an explanation, only code ("I need to do X in bash"), and most of the other answers don't explain anything either (notable exception: the awk one, but that one doesn't use bash). – melpomene May 28 '17 at 09:44
  • Almost no one asks for explanation. But for future readers who might have a slightly different problem, the explanation is of great value. – choroba May 28 '17 at 09:50
  • So you think that every OP should ask explicitly for an explanation? I don't. Anyway, you are the only one who respected the `bash` tag. But an explanation is useful to everyone who happens to stumble on this question+answer. – linuxfan says Reinstate Monica May 28 '17 at 09:50
  • @linuxfan No, I think every OP should explicitly ask a question. "I want X" or "I need X" aren't questions. Preferably questions should also include at least some effort to solve the problem (e.g. looking at the bash manual). This question should be closed as off-topic IMHO. – melpomene May 28 '17 at 09:53
  • @linuxfan I try myself to do some research and find an explanation after a solution has been given; @melpomene nice way, when I am using it with a variable as part of a bigger script I am using it like this: `[...]${REPLY:($var-1):1}[...]` – rororo May 28 '17 at 09:59
1

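Use sed and cut:

sed -n '2p' file.txt | cut -c 5-15

sed prints only the second line, and cut extracts characters 5 through 15 of it (1-indexed, inclusive), which is position 10 with 5 characters on each side.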
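The range passed to cut can also be computed from the position and the window size. The following is a rough sketch rather than part of the original answer: the variable names pos, radius, single and window are invented here, the sample file is written to a temporary path purely for illustration, and it assumes pos is greater than radius because cut numbers characters from 1.

```shell
# Sample input matching the question, written to a temp file for illustration.
sample=$(mktemp)
printf '>header\ncgatgcgctctgtgcgtgcgtgcg\n' > "$sample"

pos=10      # 1-indexed position of interest (hypothetical variable name)
radius=5    # characters to include on each side; assumes pos > radius

# Character at the position itself.
single=$(sed -n '2p' "$sample" | cut -c "$pos")
# Window from pos-radius to pos+radius, inclusive.
window=$(sed -n '2p' "$sample" | cut -c "$((pos - radius))-$((pos + radius))")

echo "$single"
echo "$window"
rm -f "$sample"
```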

tso
1

With awk:

To get the character at position 10, 1-indexed:

awk 'NR==2 {print substr($0, 10, 1)}'
  • NR==2 is true only for the second record (line); the statements inside {} run only when it is true.

  • substr($0, 10, 1) extracts 1 character starting at position 10 of $0 (the whole record), i.e. only the 10th character. The format is substr(string, start, length), where start is 1-indexed.

Similarly, to get ±5 characters around 10-th:

awk 'NR==2 {print substr($0, (10-5), 11)}'

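Writing (10-5) instead of 5 just makes the arithmetic explicit: start 5 characters before position 10 and take 11 characters (5 before, the character itself, and 5 after).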

Example:

% cat file.txt                      
>header
cgatgcgctctgtgcgtgcgtgcg

% awk 'NR==2 {print substr($0, 10, 1)}' file.txt     
c

% awk 'NR==2 {print substr($0, (10-5), 11)}' file.txt
gcgctctgtgc
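
If the position and window size live in shell variables, they can be passed into awk with -v instead of being typed into the program text. This is a sketch under assumptions: the names pos, radius, r and result are made up for illustration, the sample file is created in a temporary location, and pos is assumed to be greater than the radius.

```shell
# Sample input matching the question, written to a temp file for illustration.
sample=$(mktemp)
printf '>header\ncgatgcgctctgtgcgtgcgtgcg\n' > "$sample"

pos=10      # 1-indexed position of interest (hypothetical variable name)
radius=5    # characters on each side; assumes pos > radius

# -v copies the shell values into awk variables before the program runs.
result=$(awk -v pos="$pos" -v r="$radius" 'NR==2 {
    print substr($0, pos, 1)              # the character at pos
    print substr($0, pos - r, 2 * r + 1)  # the window around pos
    exit                                  # no need to read further lines
}' "$sample")

echo "$result"
rm -f "$sample"
```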
heemayl