2

I am writing a Python program that reads C++ source files from a folder. The program then extracts only certain parts of the C++ code and writes them to a new, separate file.

This is the C++ code that I am trying to strip:

#include <stdio.h>

int max(int num1, int num2);

//selectmethod 
int main () {
   int a = 100;
   int b = 200;
   int ret;
   ret = max(a, b);
   printf( "Max value is : %d\n", ret );
   return 0;
}

int max(int num1, int num2) {
   int result;
   if (num1 > num2)
      result = num1;
   else
      result = num2;
   return result; 
}

The only parts I am interested in extracting are marked with a //selectmethod comment. In the example code above, this would be the main() method.

Right now I have Python code that reads this file and writes all of its contents to an output file. However, I want to modify my code so that the output file contains only this:

int main () {
   int a = 100;
   int b = 200;
   int ret;
   ret = max(a, b);
   printf( "Max value is : %d\n", ret );
   return 0;
}

The Python code I have so far is below:

import glob
import os.path

list_of_files = glob.glob('/my/input/files/*.cc')

def main():
    for file_name in list_of_files:
        print(file_name)

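        # Read the file, keeping both a stripped and a raw copy of each line
        lst = []
        plist = []
        with open(file_name, 'r') as f:
            for line in f:
                lst.append(line.strip())
                plist.append(line)

        print(lst)

        # Write every line unchanged to a same-named file in the output folder
        out_path = os.path.join('/my/output/files/path',
                                os.path.basename(file_name))
        with open(out_path, 'w') as f:
            for line2 in plist:
                f.write(line2)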


if __name__ == "__main__":
    main()

How can I modify my code to extract only the method that follows a line starting with //selectmethod?

Meowmere
user5455438

2 Answers

1

You can do this with a simple regular expression; sample code is below. The search uses the re.S (DOTALL) flag, which makes . match newlines as well, so the pattern can span multiple lines of the file. Note that the lazy .*?\} stops at the first closing brace, so this pattern only works for functions without nested braces. Regular expressions are a powerful way to search and replace text; for more information see https://www.w3schools.com/python/python_regex.asp

import re

# A multi-line raw string representing the C file (raw, so the \n in printf stays literal)
string = r"""
#include <stdio.h>

int max(int num1, int num2);

//selectmethod 
int main () {
   int a = 100;
   int b = 200;
   int ret;
   ret = max(a, b);
   printf( "Max value is : %d\n", ret );
   return 0;
}

int max(int num1, int num2) {
   int result;
   if (num1 > num2)
      result = num1;
   else
      result = num2;
   return result; 
}
"""

# Raw string for the pattern, so \s, \( and \{ reach the regex engine unchanged
result = re.search(r"int main\s*\(\s*\)\s*\{.*?\}", string, re.S).group()
print(result)
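
For functions other than main, or bodies that contain nested braces (the lazy pattern above stops at the first }), one alternative is to count braces starting from the line after the marker. The following is only a minimal sketch under stated assumptions: extract_marked_functions is a hypothetical helper name, the marker sits on its own line directly before the function, and braces are counted naively, so a { or } inside a string, comment, or raw string literal will confuse it.

```python
def extract_marked_functions(source, marker="//selectmethod"):
    """Return the text of each function that follows a marker line.

    Hypothetical helper: counts { and } naively, so braces inside
    strings, comments, or raw string literals will confuse it.
    """
    functions = []
    current = []        # lines of the function being collected
    depth = 0           # current brace nesting depth
    collecting = False  # True while we are inside a marked function
    seen_open = False   # True once the function's first "{" was seen

    for line in source.splitlines(keepends=True):
        if not collecting:
            # Start collecting on the line after a marker line
            if line.strip().startswith(marker):
                collecting, seen_open, depth, current = True, False, 0, []
            continue

        current.append(line)
        depth += line.count("{") - line.count("}")
        seen_open = seen_open or "{" in line
        # The function ends when the braces opened so far are all closed
        if seen_open and depth <= 0:
            functions.append("".join(current))
            collecting = False

    return functions


sample = r"""int max(int num1, int num2);

//selectmethod
int main () {
   int a = 100;
   if (a > 50) {
      a = 50;
   }
   return 0;
}

int max(int num1, int num2) {
   return num1 > num2 ? num1 : num2;
}
"""

for func in extract_marked_functions(sample):
    print(func, end="")
```

In the script from the question, the same helper could presumably be called on the whole file contents (f.read()) and the returned pieces written to the output file instead of plist.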
Mahmoud Hanafy
  • This is the sort of solution that I am looking to implement. Would this still work for different types of methods (for example, methods that are not "main" and methods that have a lot of if and else statements inside them)? – user5455438 Nov 19 '19 at 02:06
  • This can be done using recursion or subroutines; for example, this expression will match exactly what you want: `int\s*main\s*\(\s*\)\s*(\{(?:[^}{]*|(?1))*\})` – Mahmoud Hanafy Nov 19 '19 at 03:58
  • The above uses subroutines, which Python's built-in re module doesn't support; this one uses recursion (available through the third-party regex module, not re) but can't match the `int main` part: `\{[^}{]*+(?:(?R)[^}{]*)*+\}` – Mahmoud Hanafy Nov 19 '19 at 03:59
  • credit to this answer here https://stackoverflow.com/a/35271017/2136590 – Mahmoud Hanafy Nov 19 '19 at 04:01
1

In general, this task is equivalent to writing a full C++ parser, even if the code is "properly" formatted. For those who would look for a single } character on a line, here is an example of C++ code that contains two false positives inside a raw string literal:

#include <iostream>
int main()
{
    std::cout << R"(Rules: You may use any JSON string but the following three characters are forbidden by Big Brother Inc.:
$
!
}

Example of JSON string:
{
  "name":"value"
}
)";
    return 0;
}

C++ code may also contain namespaces, which usually end with a single } on a line.
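
The raw string literal problem can be reproduced with a short Python sketch of the naive heuristic "the function ends at the first line that consists only of }". The helper name naive_function_end is hypothetical, and the C++ snippet is a shortened variant of the example above, used purely for illustration.

```python
def naive_function_end(lines):
    """Return the index of the first line that is exactly "}" (naive heuristic)."""
    for index, line in enumerate(lines):
        if line.strip() == "}":
            return index
    return None


# Shortened C++ example: the "}" on its own line is inside a raw string literal
cpp_source = r'''int main()
{
    std::cout << R"(Forbidden characters:
}
)";
    return 0;
}
'''

lines = cpp_source.splitlines()
end = naive_function_end(lines)

# The heuristic stops inside the raw string literal, before the real end of main()
print("heuristic stops at line index", end, "but main() ends at", len(lines) - 1)
```

Any approach that does not track string literals and comments, including plain brace counting, shares this weakness to some degree.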

That said, you may be lucky: if your C++ code is simple enough, you don't need a full C++ parser :)

If someone marked the beginning of the functions of interest with a //selectmethod marker, ask that person to also mark the end of those functions with some other marker :)
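
With an end marker in place, extraction reduces to copying the lines between the two markers, with no brace matching at all. A minimal sketch, assuming a hypothetical //endselectmethod end marker (the question only defines //selectmethod) and a hypothetical helper name extract_between_markers:

```python
def extract_between_markers(lines, start="//selectmethod", end="//endselectmethod"):
    """Return the lines strictly between each start/end marker pair.

    The end marker name is an assumption; the source files in the
    question only contain the start marker.
    """
    selected = []
    inside = False
    for line in lines:
        stripped = line.strip()
        if not inside and stripped.startswith(start):
            inside = True       # marker line itself is not copied
        elif inside and stripped.startswith(end):
            inside = False      # end marker closes the region
        elif inside:
            selected.append(line)
    return selected


cpp_lines = [
    "//selectmethod\n",
    "int main () {\n",
    "   return 0;\n",
    "}\n",
    "//endselectmethod\n",
    "int unused() { return 1; }\n",
]

print("".join(extract_between_markers(cpp_lines)), end="")
```

Because the markers are matched as whole comment lines, braces inside string literals or namespaces no longer matter.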

4LegsDrivenCat