If one bias value is high and the other is low, what does that indicate?
I am working with a fully connected neural network in which I initialize the biases to zero. During training, one bias adopts a high positive value and the other a negative value. I want to classify my data into two classes. What do these bias values tell us, and how can they help in a classification problem?
machine-learning
migrated from stackoverflow.com Nov 10 at 10:01
This question came from our site for professional and enthusiast programmers.
asked Nov 10 at 4:22
R.joe
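For context, here is a minimal sketch of the setup the question describes: no hidden layer, two softmax output nodes, and biases initialized to zero. The data, learning rate, and iteration count below are hypothetical stand-ins, not details from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-class data: two Gaussian blobs in 2-D.
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(3.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)   # class labels
Y = np.eye(2)[y]                      # one-hot targets

W = rng.normal(0.0, 0.01, (2, 2))     # small random weights
b = np.zeros(2)                       # biases initialized to zero, as in the question

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr = 0.1
for _ in range(500):                  # plain batch gradient descent on cross-entropy
    P = softmax(X @ W + b)            # predicted class probabilities
    G = (P - Y) / len(X)              # gradient of mean cross-entropy w.r.t. logits
    W -= lr * X.T @ G
    b -= lr * G.sum(axis=0)

acc = ((X @ W + b).argmax(axis=1) == y).mean()
print("biases:", np.round(b, 3), "train accuracy:", acc)
```

With zero-initialized biases and plain gradient descent on cross-entropy, the two biases typically come out with opposite signs, matching the behaviour the question reports; the comments under the answer explain why.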
1 Answer
The biases cannot be interpreted independently of the weights. Large biases could simply mean your weights are growing large as well; with proper weight decay this shouldn't happen. Assuming you do use weight decay, then at first glance it suggests your data can be well separated into two classes. Does your test accuracy reflect this? Can you get high classification accuracy on the test set?
However, in general it is not a good idea to over-interpret the weights and biases of a neural network. They are produced by gradient descent on a highly non-convex objective from a random initialization: if you ran the optimization again, you would get different weights and biases, even at the same level of accuracy. It is better to run experiments on the network's outputs to see whether it does what you want, rather than to interpret the individual weights.
answered Nov 10 at 4:39
Stephen Phillips
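To make this point concrete, here is a hedged sketch reusing the hypothetical data recipe above: two training runs that differ only in the random seed end up with noticeably different parameters at essentially the same accuracy, so the individual values of W and b carry little meaning on their own.

```python
import numpy as np

def train(seed, X, Y, steps=500, lr=0.1):
    """Softmax linear classifier trained from a seed-dependent initialization."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.5, (X.shape[1], Y.shape[1]))
    b = rng.normal(0.0, 0.5, Y.shape[1])
    for _ in range(steps):
        Z = X @ W + b
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)    # softmax probabilities
        G = (P - Y) / len(X)                 # mean cross-entropy gradient
        W -= lr * X.T @ G
        b -= lr * G.sum(axis=0)
    return W, b

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(3.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
Y = np.eye(2)[y]

for seed in (1, 2):
    W, b = train(seed, X, Y)
    acc = ((X @ W + b).argmax(axis=1) == y).mean()
    print(f"seed={seed}  b={np.round(b, 3)}  accuracy={acc:.2f}")
# Different seeds give different (W, b) — for instance, softmax is unchanged
# if a constant is added to every logit — yet accuracy is essentially the same.
```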
Actually I'm classifying data into two classes. I'm not using any hidden layer, just input nodes and 2 output nodes. I'm not sure whether I should attach 2 biases to the 2 output nodes or not. My model shows 98 percent accuracy in this case, and I want to figure out why the network performs so well. If I remove the bias, accuracy drops sharply. Also, I'm initializing the biases to zero.
– R.joe
Nov 10 at 4:46
Without a hidden layer, isn't it just a linear classifier? How do you apply the non-linearity? And the bias initialization shouldn't matter much, since if you train with SGD it will converge to high-performing values.
– Stephen Phillips
Nov 10 at 4:51
Yes, I'm not adding any hidden layer. My network consists of input nodes and 2 output nodes, with softmax as the activation function on the outputs. But I also add two biases. Should I? Without the biases the network's performance is very poor, but with them it reaches 98 percent accuracy. I'm curious what's going on here and why these biases are important.
– R.joe
Nov 10 at 4:56
Secondly, after training both biases have the same magnitude but opposite signs: one is positive and the other negative. What does that tell us?
– R.joe
Nov 10 at 4:57
What you are doing is not a neural network; it is called logistic regression. I assume you are using cross-entropy as your loss? Either way, the biases are appropriate. Again, assuming you have weight decay and your weights are not too big, the large difference in biases just means that your variables are well separated. If that is indeed the case, you should expect one set of weights to be the negative of the other, and thus one bias to be the negative of the other (up to noise).
– Stephen Phillips
Nov 10 at 5:00
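Two small checks support this comment; the parameter values below are hypothetical. First, a two-output softmax reduces to a sigmoid of the logit difference, which is exactly logistic regression. Second, the cross-entropy gradient with respect to the two biases sums to zero for every sample, so plain gradient descent conserves b0 + b1; biases initialized at zero therefore remain exact negatives of each other, which explains the opposite-sign observation above.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # stable softmax for a single logit vector
    return e / e.sum()

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Hypothetical trained parameters for a 2-output linear layer on 2-D inputs.
w0, b0 = np.array([0.5, -1.2]), 2.3
w1, b1 = np.array([-0.5, 1.2]), -2.3
x = np.array([0.7, 0.1])

p1_softmax = softmax(np.array([w0 @ x + b0, w1 @ x + b1]))[1]
p1_sigmoid = sigmoid((w1 - w0) @ x + (b1 - b0))
print(p1_softmax, p1_sigmoid)   # identical: 2-class softmax == logistic regression
```

Only the differences w1 − w0 and b1 − b0 affect the predictions, which is one more reason not to interpret either bias in isolation.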