
I'm having trouble seeing what the threshold actually does in a single-layer perceptron. The data is usually separated no matter what the value of the threshold is. It seems a lower threshold divides the data more equally; is this what it is used for?

Hypercube

3 Answers


Actually, you only set the threshold explicitly when you aren't using a bias. Otherwise, the threshold is 0.

Remember that a single neuron divides your input space with a hyperplane.

Now imagine a neuron with 2 inputs X = [x1, x2], 2 weights W = [w1, w2], and a threshold TH. The following equation shows how this neuron works:

x1*w1 + x2*w2 = TH

which is equivalent to:

x1*w1 + x2*w2 - 1*TH = 0

That is, this is the equation of the hyperplane that divides your input space.

Notice that this neuron only works if you set the threshold manually. The solution is to turn TH into another weight, so:

x1*w1 + x2*w2 - 1*w0 = 0

where the term 1*w0 is your BIAS. Now you can still draw a hyperplane in your input space without setting a threshold manually (i.e., the threshold is always 0). And if you do set the threshold to another value, the weights will simply adapt to satisfy the equation; i.e., the weights (INCLUDING the bias) absorb the threshold's effect.
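
To make that last point concrete, here is a minimal Python sketch (the function name and the numbers are mine, for illustration only):

    import numpy as np

    # Two-input neuron with the threshold folded into a bias weight w0,
    # which multiplies a constant input of 1.
    def neuron_fires(x, w, w0):
        # fires when x1*w1 + x2*w2 - 1*w0 >= 0
        return np.dot(x, w) - w0 >= 0

    # Weights [1.0, 1.0] with bias weight w0 = 0.5 define the hyperplane
    # x1 + x2 - 0.5 = 0, the same boundary as a fixed threshold of 0.5.
    print(neuron_fires([1.0, 0.0], [1.0, 1.0], 0.5))  # True  (1.0 >= 0.5)
    print(neuron_fires([0.1, 0.1], [1.0, 1.0], 0.5))  # False (0.2 <  0.5)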

renatopp

The sum of the products of the weights and the inputs is calculated in each node, and if the value is above some threshold (typically 0) the neuron fires and takes the activated value (typically 1); otherwise it takes the deactivated value (typically -1). Neurons with this kind of activation function are also called artificial neurons or linear threshold units.
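
For illustration, here is a minimal Python sketch of such a linear threshold unit (the function and variable names are mine, not from any particular library):

    def linear_threshold_unit(inputs, weights, threshold=0.0):
        # weighted sum of inputs, as described above
        weighted_sum = sum(i * w for i, w in zip(inputs, weights))
        # fire (activated value 1) above the threshold, else -1
        return 1 if weighted_sum > threshold else -1

    print(linear_threshold_unit([1.0, 2.0], [0.5, 0.5]))       # 1.5 > 0.0 -> 1
    print(linear_threshold_unit([1.0, 2.0], [0.5, 0.5], 2.0))  # 1.5 <= 2.0 -> -1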

Patrick Desjardins
  • I understand this, but what is the point of changing the threshold value? The training set allows for the separation of points anyway. Is the threshold value arbitrary? – Hypercube Jul 02 '11 at 03:11
  • Well, you can "train" the threshold so that it becomes dynamic rather than hand-picked. At the beginning you can set it almost randomly, or to zero. – Patrick Desjardins Jul 02 '11 at 03:15
  • Oh, I see. So essentially, changing the threshold is equivalent to changing the bias, but negatively (since they are on opposite sides of the equation)? – Hypercube Jul 02 '11 at 04:29
  • Changing the bias resonates with me. So essentially you are shifting the separating line? How about the slope of that line? – John Strong Jun 26 '17 at 18:21
  • So when we use a threshold other than 0 in training, we also use the same threshold for classifying new, unseen samples, right? For example, if we choose our threshold to be 10, then we use the same threshold of 10 to decide which class an unseen sample belongs to, right? – André Yuhai Dec 05 '20 at 15:44

I think I understand now, with help from Daok. I just wanted to add information for other people to find.

The equation for the separator for a single-layer perceptron is

Σ w_j x_j + bias = threshold

This means that if the weighted sum is above the threshold, i.e.

Σ w_j x_j + bias > threshold, the input gets classified into one category, and if

Σ w_j x_j + bias < threshold, it gets classified into the other.

The bias and the threshold really serve the same purpose: to translate the separating line (see Role of Bias in Neural Networks). Being on opposite sides of the equation, though, they are "negatively proportional".

For example, if the bias was 0 and the threshold 0.5, this would be equivalent to a bias of -0.5 and a threshold of 0.
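
A quick Python check of this equivalence (the helper function is hypothetical; the bias and threshold values come from the example above):

    def classify(weighted_sum, bias, threshold):
        return 1 if weighted_sum + bias > threshold else -1

    # bias 0 / threshold 0.5 classifies exactly like bias -0.5 / threshold 0
    for s in [-1.0, 0.25, 1.0]:
        assert classify(s, 0.0, 0.5) == classify(s, -0.5, 0.0)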

Hypercube
  • This video helped me quite a lot: https://youtu.be/dXuNAkHsos4. I interpret it to be saying that in addition to rotating the weight vector based on its angle from the vector of input data, we also want to make a binary determination of whether the weight vector should be flipped 180 degrees into a mirror quadrant on the opposite side of the decision boundary. Thus thresholding helps us determine the sign of the angle we are rotating by. – John Strong Jun 27 '17 at 00:50