
I am reading a text file in Java that looks like this:

" Q1. You are given a train data set having 1000 columns and 1 million rows. The data set is based on a classification problem. Your manager has asked you to reduce the dimension of this data so that model computation time can be reduced. Your machine has memory constraints. What would you do? (You are free to make practical assumptions.)

Q2. Is rotation necessary in PCA? If yes, Why? What will happen if you don’t rotate the components?

Q3. You are given a data set. The data set has missing values which spread along 1 standard deviation from the median. What percentage of data would remain unaffected? Why? "

Now I want to read this file and store each of these questions in a String array. How can I do that in Java?

I tried this:

String mlq = new String(Files.readAllBytes(Paths.get("MLques.txt")));
String[] mlq1=mlq.split("\n\n");

But this is not working.

  • How do you know it is not working? – ThomasEdwin Dec 08 '17 at 12:06
  • Possible duplicate of [Reading a plain text file in Java](https://stackoverflow.com/questions/4716503/reading-a-plain-text-file-in-java) – red13 Dec 08 '17 at 12:08
  • @ThomasEdwin sir, because mlq1 comes out as an array of size 1. So it's basically not splitting. – Anukool Raj Dec 08 '17 at 12:11
  • Why are there two consecutive newline characters ("\n\n")? Also, try passing the encoding as the second argument to String(). For ML questions, visit Cross Validated: https://stats.stackexchange.com – Dorukhan Arslan Dec 08 '17 at 13:07

3 Answers

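// Needs java.io.BufferedReader, java.io.FileReader and java.io.IOException.
// try-with-resources closes the reader even if reading fails.
try (BufferedReader br = new BufferedReader(new FileReader("C:\\MLques.txt"))) {
    String st;
    while ((st = br.readLine()) != null) {
        System.out.println(st);
    }
}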

I think this will work: it prints the file one line at a time.

Dorukhan Arslan
Naresh
  • It prints the contents of the file, but not the whole file; the output starts about halfway through the text file. Also, I need the result in an array: the first question, which may span two lines, should be the first element of the array, and so on. – Anukool Raj Dec 08 '17 at 12:59

This is a piece of code from one of my projects.

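// IOUtils here is presumably org.apache.commons.io.IOUtils (Apache Commons IO).
// Also needs java.io.*, java.nio.charset.StandardCharsets, java.util.List
// and java.util.stream.Collectors.
public static List<String> readStreamByLines(InputStream in) throws IOException {
    return IOUtils.readLines(in, StandardCharsets.UTF_8).stream()
                  .map(String::trim)
                  .collect(Collectors.toList());
}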

That said, if the file is really big, collecting all of its content into a List is not a good idea. Instead, read the InputStream line by line and process each line as you go.
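As a rough illustration of the line-by-line approach, here is a minimal sketch that groups lines into blank-line-separated blocks (one per question) and passes each block to a callback as soon as it is complete. The class name `ParagraphReader` and the method `forEachParagraph` are made up for this example, and joining the lines of one question with a single space is an assumption about the desired output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

// Hypothetical helper; the class and method names are illustrative only.
class ParagraphReader {

    // Reads line by line and hands each blank-line-separated block to the
    // handler as soon as it is complete, so only one block is buffered at a time.
    // The caller stays responsible for closing the source reader.
    static void forEachParagraph(Reader source, Consumer<String> handler) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        StringBuilder current = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                // A blank line ends the current block, if there is one.
                if (current.length() > 0) {
                    handler.accept(current.toString());
                    current.setLength(0);
                }
            } else {
                // Assumption: lines of one question are joined with a space.
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(line.trim());
            }
        }
        // Emit the last block when the input does not end with a blank line.
        if (current.length() > 0) {
            handler.accept(current.toString());
        }
    }
}
```

It could be called with, for example, `Files.newBufferedReader(Paths.get("MLques.txt"), StandardCharsets.UTF_8)` inside a try-with-resources block and `System.out::println` as the handler. Since `readLine()` accepts both `\n` and `\r\n`, the platform's line ending should not matter here.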

oleg.cherednik

Try this:

String mlq = new String(Files.readAllBytes(Paths.get("MLQ.txt")));
String[] mlq1 = mlq.split("\r\n\r\n");
System.out.println(mlq1.length);
System.out.println(Arrays.toString(mlq1));

This splits on the blank line between questions, assuming the file uses Windows line endings (\r\n).
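If the line endings are not known in advance, one option is to normalize them before splitting. The following is only a sketch under a few assumptions: the file is UTF-8, questions are separated by at least one blank line, and the names `QuestionSplitter`, `splitOnBlankLines` and `readQuestions` are invented for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class; the names are not from the original post.
class QuestionSplitter {

    // Splits text into blocks separated by one or more blank lines.
    // Line endings are first normalized to \n so that Unix (\n),
    // Windows (\r\n) and old Mac (\r) files are treated the same way.
    static String[] splitOnBlankLines(String text) {
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');
        // A separator is a line break followed by one or more lines that
        // contain nothing but spaces or tabs.
        return Arrays.stream(normalized.trim().split("\\n(?:[ \\t]*\\n)+"))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    // Reads the whole file as UTF-8 (an assumption about the encoding) and splits it.
    static String[] readQuestions(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return splitOnBlankLines(text);
    }
}
```

Calling `QuestionSplitter.readQuestions(Paths.get("MLques.txt"))` would then return one array element per question, with any line breaks inside a question kept as `\n`. Passing the charset explicitly also avoids depending on the platform default encoding.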

A Paul