
I have successfully deployed an Azure ML web service that uses the Reader module to take in CSV data from my blob storage. I can see that the CSV data is correct by visualizing it in the experiment.

However, when I try to provide the SAME CSV data as input to the web service using the Batch Execution Service (BES) example from this tutorial, I get the following error:

Error 1000: AFx Library exception: table: The data set being scored must
contain all features used during training, missing feature(s): 'Col 2'.

This error makes no sense, as the SAME data is successfully accepted by the experiment.

Also note that the same problem occurs when I use TSV format.

Serge

1 Answer


Here is how I got it working.

1/ I created an experiment that looks like what you describe (see the sample experiment screenshot).

The Reader module reads the following file from blob storage:

col 1,col 2
1.32,somestring
3.34,anotherstring

The Apply SQL Transformation module has the following statement:

select sum([col 1]) from t1
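
A quick sanity check of the arithmetic this statement performs on the sample file (plain Python, just to show the expected value; not part of the experiment):

rows = [(1.32, "somestring"), (3.34, "anotherstring")]
total = sum(value for value, _ in rows)
print(round(total, 2))  # 4.66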

2/ Publish the web service.

3/ Go to the Batch Execution Service (BES) documentation and copy the Python sample code.

4/ In a text editor, replace the values documented at the beginning of the invokeBatchExecutionService method (the storage_account_name, storage_account_key, storage_container_name, and api_key values), as sketched below.
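
For illustration, the edited section should end up looking something like this (every value below is a placeholder, not a real credential; only the variable names come from the sample):

storage_account_name = "mystorageaccount"      # placeholder: your storage account name
storage_account_key = "<storage-account-key>"  # placeholder: key from the Azure portal
storage_container_name = "mycontainer"         # placeholder: container for input/output blobs
api_key = "<web-service-api-key>"              # placeholder: from the web service dashboard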

5/ Create a new Python 2 notebook in your Azure ML workspace.

In the first cell, copy and paste the following code, which writes the sample input file:

# Create the sample input file; 'w' (rather than 'a') keeps
# re-runs of this cell from appending duplicate rows
with open('input1data.csv', 'w') as myfile:
    myfile.write("col 1,col 2\n")
    myfile.write("1.32,somestring\n")
    myfile.write("3.34,anotherstring\n")

In the next cell, copy and paste the code you edited at step 4/.

In the next cell, copy and paste the following code:

# Print the results file downloaded by the code from step 4/
with open('myresults.csv', 'r') as myfile:
    for line in myfile:
        print(line)
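
If you would rather extract the value than print raw lines, a variant like this also works (it assumes the header-then-value layout shown below):

import csv

with open('myresults.csv') as f:
    rows = [row for row in csv.reader(f) if row]  # drop any empty lines
print(rows[0][0] + " = " + rows[1][0])  # sum([col 1]) = 4.66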

Execute the cells in order. Running the third cell should print the following result:

sum([col 1])

4.66
benjguin
  • I will try the SQL transformation and use my typical C# BES implementation. Thanks – Serge Dec 19 '15 at 03:41
  • The goal of the SQL transformation here is just to simplify the problem. Same with the Python notebook, since it depends on the Azure ML environment rather than on your PC environment or mine. Please let us know if that solves your issue. – benjguin Dec 19 '15 at 10:53
  • The SQL transformation leaves the training experiment with the following error on the Train Model component: Number of columns in input dataset is less than allowed minimum of 2 column(s). – Serge Dec 21 '15 at 01:06
  • I would have thought of regional settings, but I tried on my laptop with a decimal symbol of "," and a list separator of ";": the workspace and the IPython notebook work as expected. Can you replace the Reader with an "Enter Data" module and copy/paste my sample file content of 3 lines (col 1,...) into that "Enter Data" module? – benjguin Dec 21 '15 at 08:24