I have a log file containing multi-line entries. Each log entry starts with a date and time in the format 2016/07/15 14:20:57:642. Is it possible to use a regex to extract everything between these timestamps?

2016/07/15 14:20:57:642  Log info
2016/07/15 14:22:37:213  Log info
2016/07/15 14:34:41:286  Log info
2016/07/15 14:44:09:618  Log info
2016/07/15 15:02:40:539  Log info
2016/07/15 15:02:40:700  Uploading Activities <KBDailyDataCollection: 0x7fbe7c9b4af0> {
    data =     (
    );
}
2016/07/15 15:02:40:709  Uploading Activities <KBDailyDataCollection: 0x7fbe7c8a48e0> {
    data =     (
    );
}
2016/07/15 15:02:40:710  Uploading Activities <KBDailyDataCollection: 0x7fbe7c9c10c0> {
    data =     (
    );
}
2016/07/15 15:02:40:713  Uploading Activities <KBDailyDataCollection: 0x7fbe7c87f540> {
    data =     (
    );
}
2016/07/15 15:02:48:277  Uploading Activities <KBDailyDataCollection: 0x7fbe7c8d3e80> {
    data =     (
    );
}
2016/07/15 15:04:57:072  Log info
2016/07/15 15:04:57:216  Uploading Activities <KBDailyDataCollection: 0x7f95d2df0e60> {
    data =     (
    );
}
2016/07/15 15:04:57:219  Uploading Activities <KBDailyDataCollection: 0x7f95d2f235d0> {
    data =     (
    );
}
2016/07/15 15:04:57:221  Uploading Activities <KBDailyDataCollection: 0x7f95d51b6520> {
    data =     (
    );
}
2016/07/15 15:04:57:225  Uploading Activities <KBDailyDataCollection: 0x7f95d2f4a950> {
    data =     (
    );
}
2016/07/15 15:29:50:543  Log info
2016/07/15 15:29:50:721  Uploading Activities <KBDailyDataCollection: 0x7ff6e6158be0> {
    data =     (
    );
}
2016/07/15 15:29:50:725  Uploading Activities <KBDailyDataCollection: 0x7ff6e60ae2f0> {
    data =     (
    );
}
2016/07/15 15:29:50:728  Uploading Activities <KBDailyDataCollection: 0x7ff6e60b1790> {
    data =     (
    );
}
2016/07/15 15:29:50:732  Uploading Activities <KBDailyDataCollection: 0x7ff6e6180b50> {
    data =     (
    );
}
2016/07/15 15:33:34:046  Uploading Activities <KBDailyDataCollection: 0x7ff6e3da5840> {
    data =     (
    );
}
2016/07/15 15:33:41:379  Uploading Activities <KBDailyDataCollection: 0x7ff6e6188340> {
    data =     (
    );
}
2016/07/15 15:34:13:570  Log info
2016/07/15 15:34:13:764  Uploading Activities <KBDailyDataCollection: 0x7fa3c399ac40> {
    data =     (
    );
}
2016/07/15 15:34:13:765  Uploading Activities <KBDailyDataCollection: 0x7fa3c38bc710> {
    data =     (
    );
}
2016/07/15 15:34:13:766  Uploading Activities <KBDailyDataCollection: 0x7fa3c39a81f0> {
    data =     (
    );
}
2016/07/15 15:34:13:767  Uploading Activities <KBDailyDataCollection: 0x7fa3c3999d70> {
    data =     (
    );
}
2016/07/15 15:36:04:247  Log info
2016/07/15 15:36:04:462  Uploading Activities <KBDailyDataCollection: 0x7fa4c8c11030> {
    data =     (
    );
}
2016/07/15 15:36:04:466  Uploading Activities <KBDailyDataCollection: 0x7fa4cb04b220> {
    data =     (
    );
}
2016/07/15 15:36:04:477  Uploading Activities <KBDailyDataCollection: 0x7fa4c8f6cab0> {
    data =     (
    );
}
2016/07/15 15:36:04:484  Uploading Activities <KBDailyDataCollection: 0x7fa4cb121d00> {
    data =     (
    );
}
2016/07/15 15:41:32:582  Uploading Activities <KBDailyDataCollection: 0x7fa4cb06f160> {
    data =     (
    );
}
2016/07/15 15:45:51:920  Log info
2016/07/15 15:45:52:119  Uploading Activities <KBDailyDataCollection: 0x7fefcb417a00> {
    data =     (
    );
}
2016/07/15 15:45:52:121  Uploading Activities <KBDailyDataCollection: 0x7fefcd977670> {
    data =     (
    );
}
2016/07/15 15:45:52:122  Uploading Activities <KBDailyDataCollection: 0x7fefcda09060> {
    data =     (
    );
}
2016/07/15 15:45:52:125  Uploading Activities <KBDailyDataCollection: 0x7fefcb6995b0> {
    data =     (
    );
}
2016/07/15 15:45:56:136  Uploading Activities <KBDailyDataCollection: 0x7fefcb7f79e0> {
    data =     (
    );
}
2016/07/15 15:47:01:250  Log info
2016/07/15 15:47:01:449  Uploading Activities <KBDailyDataCollection: 0x7ff82acb6f90> {
    data =     (
    );
}
2016/07/15 15:47:01:451  Uploading Activities <KBDailyDataCollection: 0x7ff82d196ad0> {
    data =     (
    );
}
2016/07/15 15:47:01:454  Uploading Activities <KBDailyDataCollection: 0x7ff82ac32000> {
    data =     (
    );
}
2016/07/15 15:47:01:456  Uploading Activities <KBDailyDataCollection: 0x7ff82afde380> {
    data =     (
    );
}
2016/07/15 15:47:48:261  Log info
2016/07/15 15:47:48:524  Uploading Activities <KBDailyDataCollection: 0x7fc2bd0722c0> {
    data =     (
    );
}
2016/07/15 15:47:48:529  Uploading Activities <KBDailyDataCollection: 0x7fc2bad2edf0> {
    data =     (
    );
}
2016/07/15 15:47:48:533  Uploading Activities <KBDailyDataCollection: 0x7fc2bae7a540> {
    data =     (
    );
}
2016/07/15 15:47:48:539  Uploading Activities <KBDailyDataCollection: 0x7fc2bd0dfd00> {
    data =     (
    );
}
2016/07/15 15:49:01:597  Log info
2016/07/15 15:49:01:811  Uploading Activities <KBDailyDataCollection: 0x7ff2305cb860> {
    data =     (
    );
}
2016/07/15 15:49:01:812  Uploading Activities <KBDailyDataCollection: 0x7ff2328ba5b0> {
    data =     (
    );
}
2016/07/15 15:49:01:815  Uploading Activities <KBDailyDataCollection: 0x7ff23289d870> {
    data =     (
    );
}
2016/07/15 15:49:01:817  Uploading Activities <KBDailyDataCollection: 0x7ff2304a6880> {
    data =     (
    );
}
2016/07/15 15:50:32:141  Log info
2016/07/15 15:50:32:285  Uploading Activities <KBDailyDataCollection: 0x7f8eb8426e70> {
    data =     (
    );
}
2016/07/15 15:50:32:287  Uploading Activities <KBDailyDataCollection: 0x7f8eba898f40> {
    data =     (
    );
}
2016/07/15 15:50:32:291  Uploading Activities <KBDailyDataCollection: 0x7f8ebaa48be0> {
    data =     (
    );
}
2016/07/15 15:50:32:294  Uploading Activities <KBDailyDataCollection: 0x7f8eb8537090> {
    data =     (
    );
}
2016/07/15 15:54:32:171  Uploading Activities <KBDailyDataCollection: 0x7f8eba9c4980> {
    data =     (
    );
}

The result would be matches like these:

  Log info

  Log info

  Log info

  Log info

  Log info

Uploading Activities <KBDailyDataCollection: 0x7fbe7c9b4af0> {
    data =     (
    );
}

My progress so far, which still does not work:

([0-9]{4}\/[0-9]{2}\/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}:[0-9]{3})([\s\S]*?)([0-9]{4}\/[0-9]{2}\/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}:[0-9]{3})
sarunw

1 Answer

You could, for example, use RegExr to build the expression interactively, then perform a string replacement with multiline mode enabled to strip out the timestamps you don't need:

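import re
# Remove the timestamp at the start of each line. Pass flags by keyword:
# the fourth positional argument of re.sub is count, not flags.
cleaned = re.sub(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}:\d{3}", "", input_text, flags=re.M)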

\d matches a single digit. {n} means the preceding token must repeat exactly n times.
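
If you want each entry as a separate match rather than the whole text with the timestamps removed, a lookahead may be closer to what the question asks for. The pattern in the question consumes the closing timestamp, so the next match can only start after it and every other entry is skipped. The sketch below only peeks at the next timestamp instead. The names TIMESTAMP, ENTRY, and extract_entries are made up for illustration, and it assumes every entry begins at the start of a line:

```python
import re

# Timestamp that opens every entry, e.g. 2016/07/15 14:20:57:642
TIMESTAMP = r"\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}:\d{3}"

# Match a line-leading timestamp, then lazily capture everything up to
# (but not including) the next line-leading timestamp or the end of input.
# The lookahead does not consume the next timestamp, so no entry is skipped.
ENTRY = re.compile(
    rf"^{TIMESTAMP}[ \t]*(.*?)(?=^{TIMESTAMP}|\Z)",
    flags=re.MULTILINE | re.DOTALL,
)

def extract_entries(log_text):
    """Return the body of each log entry with its timestamp removed."""
    return [m.group(1).rstrip() for m in ENTRY.finditer(log_text)]

sample = """2016/07/15 15:02:40:539  Log info
2016/07/15 15:02:40:700  Uploading Activities <KBDailyDataCollection: 0x7fbe7c9b4af0> {
    data =     (
    );
}
2016/07/15 15:04:57:072  Log info
"""

for entry in extract_entries(sample):
    print(repr(entry))
```

re.DOTALL lets . cross line breaks so multi-line entries stay in one match, and re.MULTILINE makes ^ match at the start of every line.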