11

I am new here. I recently started working with object detection and decided to use the TensorFlow Object Detection API. But when I start training the model, it does not display the global step like it should, although it is still training in the background.

Details: I am training on a server and accessing it using OpenSSH on Windows. I trained on a custom dataset, which I built by collecting pictures and labeling them, using model_main.py. Until a couple of months ago the API was a little different, and it only recently changed to the latest version. For instance, it used to use train.py for training instead of model_main.py. All the online tutorials I can find use train.py, so it might be a problem with the latest commit. But I can't find anyone else facing this problem.

Thanks in advance!

Aditya Singh
  • Welcome to the site. It is indeed a bit difficult to help with limited information. You could try to read [How do I ask a good question?](https://stackoverflow.com/help/how-to-ask) from the [Help center](https://stackoverflow.com/help). – user2653663 Aug 25 '18 at 10:49
  • If you use a client such as PuTTY and invoke train.py from it, you will be able to see the steps and progress. – Srinivas Bringu Aug 25 '18 at 21:16
  • @SrinivasBringu Hi, thanks for the reply. I am using OpenSSH on Windows, and it's still not showing the steps. Also, it no longer uses train.py; they changed the API and it uses model_main.py now. – Aditya Singh Aug 26 '18 at 04:13
  • @user2653663 I added more details! – Aditya Singh Aug 26 '18 at 04:20
  • Thanks for this amazing question! I've been wasting quite a lot of time thinking training was not happening. – Piyal George Oct 25 '18 at 05:30
  • @PiyalGeorge I am glad it helped! – Aditya Singh Oct 28 '18 at 13:51

3 Answers

15

Add tf.logging.set_verbosity(tf.logging.INFO) after the import section of the model_main.py script. It will display a summary after every 100th step.
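
For context, here is a minimal sketch of why the step messages can stay hidden until the verbosity is lowered. It does not call TensorFlow. It assumes that, as in TensorFlow 1.x, tf.logging is backed by Python's standard logging module and that the step summaries are emitted at INFO level. The logger name demo_trainer and the helpers run_steps and capture_output are hypothetical stand-ins, not part of the API.

```python
import io
import logging


def run_steps(logger, num_steps):
    """Emit one INFO message per simulated training step."""
    for step in range(1, num_steps + 1):
        logger.info("global_step = %d", step)


def capture_output(level, num_steps=3):
    """Run the simulated loop at the given threshold and return the printed text."""
    stream = io.StringIO()
    logger = logging.getLogger("demo_trainer")  # hypothetical stand-in logger
    logger.handlers.clear()          # start from a clean logger on each call
    logger.propagate = False         # keep messages out of the root logger
    logger.addHandler(logging.StreamHandler(stream))
    logger.setLevel(level)
    run_steps(logger, num_steps)
    return stream.getvalue()


quiet = capture_output(logging.WARNING)  # INFO messages are filtered out
verbose = capture_output(logging.INFO)   # INFO messages are shown

print(repr(quiet))
print(verbose, end="")
```

With the threshold at WARNING nothing is printed even though the loop runs, which matches the symptom in the question; with the threshold at INFO each step message appears.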

Thommy257
  • It worked, thanks! I have a couple more questions: 1. Why did it not work initially? 2. Can I make it display every step? – Aditya Singh Aug 29 '18 at 20:41
  • Cool, would you mind accepting my answer as the right one, so that it helps others too? To your questions: 1. I actually don't know why they disabled this feature; maybe someone just forgot to commit it to the official repository. 2. It should be possible, but I have not figured out how. Still, having output after every 100th step seems sufficient for debugging (at least to check whether the process is still running). – Thommy257 Aug 30 '18 at 08:06
  • I figured it out! Thanks for the help, it saved me a lot of time. Great community! – Aditya Singh Sep 01 '18 at 05:54
11

As Thommy257 suggested, adding tf.logging.set_verbosity(tf.logging.INFO) after the import section of model_main.py prints the summary after every 100 steps by default.

Further, to specify the frequency of the summary, change

config = tf.estimator.RunConfig(model_dir=FLAGS.model_dir)

to

config = tf.estimator.RunConfig(model_dir=FLAGS.model_dir, log_step_count_steps=k)

where k is the number of steps between printed summaries.
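
To make the effect of k concrete, here is a plain-Python sketch with no TensorFlow involved. It assumes a summary line is printed whenever the step count is a multiple of k; the Estimator's real hooks may offset this by the starting step, so treat it as an approximation. The helper steps_that_log is hypothetical.

```python
def steps_that_log(total_steps, k):
    """Return the step numbers at which a summary line would be printed,
    assuming one line every k steps (an approximation of the real behaviour)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return [step for step in range(1, total_steps + 1) if step % k == 0]


default_schedule = steps_that_log(500, 100)  # the default of 100 steps
frequent_schedule = steps_that_log(5, 1)     # k=1 logs on every step

print(default_schedule)
print(frequent_schedule)
```

Under this assumption, setting k to 1 would give a line for every step, at the cost of much noisier output.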

Aditya Singh
2

Regarding the recent change to model_main.py, the previous version is still available in the "legacy" folder. I use train.py and eval.py from this legacy folder, and they work the same as before.

Hafplo