If one bias value is high and the other is low, what does that indicate?
I am working with a fully connected neural network in which I initialize the biases to zero. During training, one bias adopts a high positive value and the other a negative value. I want to classify my data into two classes. What do these bias values tell us, and how can they help in a classification problem?
machine-learning
migrated from stackoverflow.com Nov 10 at 10:01
This question came from our site for professional and enthusiast programmers.
asked Nov 10 at 4:22
R.joe
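For context, here is a minimal sketch of the setup the question describes: no hidden layer, two softmax output nodes, and biases initialized to zero. The data, learning rate, and iteration count below are hypothetical stand-ins, not details from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-class data: two Gaussian blobs in 2-D.
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(3.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)   # class labels
Y = np.eye(2)[y]                      # one-hot targets

W = rng.normal(0.0, 0.01, (2, 2))     # small random weights
b = np.zeros(2)                       # biases initialized to zero, as in the question

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr = 0.1
for _ in range(500):                  # plain batch gradient descent on cross-entropy
    P = softmax(X @ W + b)            # predicted class probabilities
    G = (P - Y) / len(X)              # gradient of mean cross-entropy w.r.t. logits
    W -= lr * X.T @ G
    b -= lr * G.sum(axis=0)

acc = ((X @ W + b).argmax(axis=1) == y).mean()
print("biases:", np.round(b, 3), "train accuracy:", acc)
```

With zero-initialized biases and plain gradient descent on cross-entropy, the two biases typically come out with opposite signs, matching the behaviour the question reports; the comments under the answer explain why.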
1 Answer
The biases cannot be interpreted independently of the weights. Large biases could simply mean your weights are growing large as well; with proper weight decay this shouldn't happen. Assuming you do use weight decay, then at first glance it suggests your data can be well separated into two classes. Does your test accuracy reflect this? Can you get high classification accuracy on the test set?
However, in general it is not a good idea to over-interpret the weights and biases of a neural network. They are produced by gradient descent on a highly non-convex objective from a random initialization: if you ran the optimization again, you would get different weights and biases, even at the same level of accuracy. It is better to run experiments on the network's outputs to see whether it does what you want, rather than to interpret the individual weights.
answered Nov 10 at 4:39
Stephen Phillips
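To make this point concrete, here is a hedged sketch reusing the hypothetical data recipe above: two training runs that differ only in the random seed end up with noticeably different parameters at essentially the same accuracy, so the individual values of W and b carry little meaning on their own.

```python
import numpy as np

def train(seed, X, Y, steps=500, lr=0.1):
    """Softmax linear classifier trained from a seed-dependent initialization."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.5, (X.shape[1], Y.shape[1]))
    b = rng.normal(0.0, 0.5, Y.shape[1])
    for _ in range(steps):
        Z = X @ W + b
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)    # softmax probabilities
        G = (P - Y) / len(X)                 # mean cross-entropy gradient
        W -= lr * X.T @ G
        b -= lr * G.sum(axis=0)
    return W, b

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(3.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
Y = np.eye(2)[y]

for seed in (1, 2):
    W, b = train(seed, X, Y)
    acc = ((X @ W + b).argmax(axis=1) == y).mean()
    print(f"seed={seed}  b={np.round(b, 3)}  accuracy={acc:.2f}")
# Different seeds give different (W, b) — for instance, softmax is unchanged
# if a constant is added to every logit — yet accuracy is essentially the same.
```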
Actually I'm classifying data into two classes. I'm not using any hidden layer, just input nodes and 2 output nodes. I'm not sure whether I should attach 2 biases to the 2 output nodes or not. My model shows 98 percent accuracy in this case, and I want to figure out why the network performs so well. If I remove the bias, accuracy drops sharply. Also, I'm initializing the biases to zero.
– R.joe
Nov 10 at 4:46
Without a hidden layer, isn't it just a linear classifier? How do you apply the non-linearity? And the bias initialization shouldn't matter much, since if you train with SGD it will converge to high-performing values.
– Stephen Phillips
Nov 10 at 4:51
Yes, I'm not adding any hidden layer. My network consists of input nodes and 2 output nodes, with softmax as the activation function on the outputs. But I also add two biases. Should I? Without the biases the network's performance is very poor, but with them it reaches 98 percent accuracy. I'm curious what's going on here and why these biases are important.
– R.joe
Nov 10 at 4:56
Secondly, after training both biases have the same magnitude but opposite signs: one is positive and the other negative. What does that tell us?
– R.joe
Nov 10 at 4:57
What you are doing is not a neural network; it is called logistic regression. I assume you are using cross-entropy as your loss? Either way, the biases are appropriate. Again, assuming you have weight decay and your weights are not too big, the large difference in biases just means that your variables are well separated. If that is indeed the case, you should expect one set of weights to be the negative of the other, and thus one bias to be the negative of the other (up to noise).
– Stephen Phillips
Nov 10 at 5:00
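Two small checks support this comment; the parameter values below are hypothetical. First, a two-output softmax reduces to a sigmoid of the logit difference, which is exactly logistic regression. Second, the cross-entropy gradient with respect to the two biases sums to zero for every sample, so plain gradient descent conserves b0 + b1; biases initialized at zero therefore remain exact negatives of each other, which explains the opposite-sign observation above.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # stable softmax for a single logit vector
    return e / e.sum()

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Hypothetical trained parameters for a 2-output linear layer on 2-D inputs.
w0, b0 = np.array([0.5, -1.2]), 2.3
w1, b1 = np.array([-0.5, 1.2]), -2.3
x = np.array([0.7, 0.1])

p1_softmax = softmax(np.array([w0 @ x + b0, w1 @ x + b1]))[1]
p1_sigmoid = sigmoid((w1 - w0) @ x + (b1 - b0))
print(p1_softmax, p1_sigmoid)   # identical: 2-class softmax == logistic regression
```

Only the differences w1 − w0 and b1 − b0 affect the predictions, which is one more reason not to interpret either bias in isolation.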