I have a text file whose contents vary (date, link and id can be anything). However, the number of dates, links and ids is always the same (n dates, then n links, then n ids, for any positive integer n). So if k is the total number of lines, then for every i from 1 to k/3 the lines i, (i + k/3) and (i + 2(k/3)) belong together.

As an example, for n=3 the lines (1 | 4 | 7), (2 | 5 | 8) and (3 | 6 | 9) belong together. A sample file with n=5 looks like this:

Today, 17:09
Yesterday, 09:44
08.09.2020
07.09.2020
06.09.2020
/s-show/Link111...
/s-show/Link211...
/s-show/Link311...
/s-show/Link411...
/s-show/Link511...
id="1222222222"
id="2222222222"
id="3222222222"
id="4222222222"
id="5222222222"

I would like to sort the text file as the following:

id="1222222222"Today, 17:09/s-show/Link111...
id="2222222222"Yesterday, 09:44/s-show/Link211
id="3222222222"08.09.2020/s-show/Link311
id="4222222222"07.09.2020/s-show/Link411
id="5222222222"06.09.2020/s-show/Link511

In a former question, I only had two categories (date and link) and was advised to do it like this:

lc=$(wc -l <Textfile); paste -d '' <(head -n $((lc/2)) Textfile) <(tail -n $((lc/2)) Textfile)

However, here I have 3 categories, and the head and tail commands on their own won't let me read only the lines in the middle. How could this be solved?

X3nion
  • You can _combine_ `head` and `tail` to get content in the middle. (It's not what I would actually choose to do, but it's certainly something you _can_ do). – Charles Duffy Sep 08 '20 at 22:58
  • ...in terms of what I'd _actually_ do, I wouldn't choose to use bash for the job at hand at all. A programming language with access to `seek()` and `tell()` calls will allow a much more efficient implementation. – Charles Duffy Sep 08 '20 at 23:00
  • (...and if you can control the file format, it'd be better to change it -- three separate files would be much easier to handle efficiently). – Charles Duffy Sep 08 '20 at 23:01
  • ...as for your immediate question, though -- do you consider [How can I extract a predetermined range of lines from a text file on UNIX](https://stackoverflow.com/questions/83329) a duplicate? As I understand it, it's narrowly addressing the part of your problem you don't currently know how to do. – Charles Duffy Sep 08 '20 at 23:03
  • @CharlesDuffy Could you maybe provide some example code for that? – X3nion Sep 08 '20 at 23:25
  • Which "that"? Leveraging the linked duplicate for a shell-native implementation, or writing an efficient implementation in a non-shell language? – Charles Duffy Sep 08 '20 at 23:39
  • Sorry @CharlesDuffy, I mean a shell script for the issue. Would that be possible? – X3nion Sep 08 '20 at 23:42
  • As another advantage of changing your input format to take three separate files -- with the current format, to add an extra line to each of the three inputs you need to _completely rewrite_ your input file past the end of its first 3rd. Make it three separate files, and it's three very short `write()` calls with just the data you want to add being appended to each in turn. – Charles Duffy Sep 08 '20 at 23:42
  • ...it's _possible_, sure. Don't expect it to be efficient, though. – Charles Duffy Sep 08 '20 at 23:43
  • Into which input format would you change the file? – X3nion Sep 08 '20 at 23:43
  • The format I would choose and recommend is multiple, distinct files. One with only dates, one with only links, one with only ids. – Charles Duffy Sep 08 '20 at 23:44
  • Well, with the head and tail commands it was easy to read the first part and the second part. But in this case, how can I read the first, second and third thirds of the file? Could you maybe provide a code example? – X3nion Sep 08 '20 at 23:46
  • Well and if I have three separate files (one containing the date, the other the Link and the last one the id), what would then be the command to merge them according to this criterion? – X3nion Sep 08 '20 at 23:53
  • `paste file1 file2 file3`. Which is to say, if you have three separate files, the whole problem becomes _completely trivial_. – Charles Duffy Sep 09 '20 at 00:00
  • ...if you _really_ don't want tabs added, add the `-d` argument to `paste`, as in, `paste -d '' file1 file2 file3`. – Charles Duffy Sep 09 '20 at 00:03
  • BTW, the "how do I use paste?" part of this question is duplicative of [how to merge two files consistently line-by-line](https://stackoverflow.com/questions/16394176/how-to-merge-two-files-consistently-line-by-line). – Charles Duffy Sep 09 '20 at 00:06
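
The `head`/`tail` combination suggested in the comments can be sketched as follows. This is only a minimal sketch, assuming the file splits into exactly three equal blocks (dates, links, ids); the function name `merge_thirds` and the sample data are hypothetical, and the column order (id, date, link) is taken from the desired output in the question.

```shell
#!/usr/bin/env bash

# merge_thirds FILE (hypothetical helper): treat FILE as three equal blocks
# of dates, links and ids, and print id + date + link for each record.
merge_thirds() {
  local file=$1 lc n
  lc=$(wc -l <"$file")
  n=$(( lc / 3 ))
  if (( n * 3 != lc )); then
    echo "ERROR: $lc lines do not split into three equal parts" >&2
    return 1
  fi
  # The middle third: head keeps the first 2n lines, tail keeps the last n of those.
  paste -d '' \
    <(tail -n "$n" "$file") \
    <(head -n "$n" "$file") \
    <(head -n "$(( 2 * n ))" "$file" | tail -n "$n")
}

# Small demonstration with made-up data (two records).
sample=$(mktemp)
printf '%s\n' 'Today, 17:09' '08.09.2020' '/s-show/a' '/s-show/b' 'id="1"' 'id="2"' >"$sample"
merge_thirds "$sample"
rm -f "$sample"
```

Note that this reads the file several times, which is why the comments describe the approach as possible but not efficient.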

1 Answer

Leveraging the techniques taught in [How can I extract a predetermined range of lines from a text file on Unix?](https://stackoverflow.com/questions/83329):

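#!/usr/bin/env bash
# Usage: pass the input file as $1 and the number of sections as $2.

input=$1
sections=$2
total_lines=$(wc -l <"$input")

# Refuse to continue unless the file divides into equal-sized sections.
lines_per_section=$(( total_lines / sections ))
if (( lines_per_section * sections != total_lines )); then
  echo "ERROR: ${total_lines} does not evenly divide into ${sections} sections" >&2
  exit 1
fi

# Build one "start:end" pair per section, where start is the number of
# lines to skip and end is the last line of the section.
start=0
ranges=( )
for (( i=0; i<sections; i++ )); do
  ranges+=( "$start:$(( start + lines_per_section ))" )
  (( start += lines_per_section ))
done

# Print lines $1+1 through $2 of the input, then quit so sed reads no further.
get_range() { sed -n "$(( $1 + 1 )),$(( $2 ))p;$(( $2 + 1 ))q" <"$input"; }

# Recursively paste the first range next to the combined remaining ranges.
consolidate_input() {
  local current=$1; shift
  if (( $# )); then
    paste <(get_range "${current%:*}" "${current#*:}") <(consolidate_input "$@")
  else
    # Last section: print it directly, so paste does not append a trailing tab.
    get_range "${current%:*}" "${current#*:}"
  fi
}

consolidate_input "${ranges[@]}"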

But don't do that. Just put your three sections in three separate files, so you can use `paste file1 file2 file3`.
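
If only the combined file is available, one way to get to the three-file layout is to cut it once with `split` and then run `paste` on the pieces. The following is a minimal sketch, assuming the line count is divisible by three; the function name `split_and_paste` and the `part.` prefix are hypothetical, and the pieces are pasted without a delimiter in id, date, link order to match the output requested in the question.

```shell
#!/usr/bin/env bash

# split_and_paste FILE (hypothetical helper): cut FILE into three equal
# pieces in a temporary directory, then paste them side by side.
split_and_paste() {
  local file=$1 lc n workdir
  lc=$(wc -l <"$file")
  n=$(( lc / 3 ))
  if (( n == 0 || n * 3 != lc )); then
    echo "ERROR: $lc lines do not split into three non-empty equal parts" >&2
    return 1
  fi
  workdir=$(mktemp -d) || return 1
  # split writes n lines per piece: part.aa (dates), part.ab (links), part.ac (ids).
  split -l "$n" "$file" "$workdir/part."
  paste -d '' "$workdir/part.ac" "$workdir/part.aa" "$workdir/part.ab"
  rm -rf "$workdir"
}

# Small demonstration with made-up data (two records).
sample=$(mktemp)
printf '%s\n' 'Today, 17:09' '08.09.2020' '/s-show/a' '/s-show/b' 'id="1"' 'id="2"' >"$sample"
split_and_paste "$sample"
rm -f "$sample"
```

Dropping `-d ''` gives the tab-separated output of plain `paste`, and reordering the three file arguments changes the column order.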

Charles Duffy