C# - Changing activation function from sigmoid to tanh?
I'm trying to change my neural net from using the sigmoid activation for the hidden and output layers to the tanh function. I'm confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation? This is the output calculation:
    public void calcOutput() {
        if (!isBias) {
            float sum = 0;
            float bias = 0;
            //System.out.println("Looking through " + connections.size() + " connections");
            for (int i = 0; i < connections.Count; i++) {
                Connection c = (Connection) connections[i];
                Node from = c.getFrom();
                Node to = c.getTo();
                // Is this connection moving forward to us?
                // Ignore connections that we send our output to
                if (to == this) {
                    // This isn't really necessary, but I am treating the bias
                    // individually in case I need to at some point
                    if (from.isBias) bias = from.getOutput()*c.getWeight();
                    else sum += from.getOutput()*c.getWeight();
                }
            }
            // Output is the result of the activation function (originally sigmoid, now tanh)
            output = tanh(bias+sum);
        }
    }
It works great for how I trained it before, but now I want to train it to give 1 or -1 as output. When I change output = sigmoid(bias+sum); to output = tanh(bias+sum); the result is all messed up...
sigmoid:
    public static float sigmoid(float x) {
        return 1.0f / (1.0f + (float) Mathf.Exp(-x));
    }
tanh:
    public float tanh(float x) {
        //return (float)(Mathf.Exp(x) - Mathf.Exp(-x)) / (Mathf.Exp(x) + Mathf.Exp(-x));
        // Note: 2/3 in the next line is integer division in C# and evaluates to 0; 2f/3f would be needed
        //return (float)(1.7159f * System.Math.Tanh(2/3 * x));
        return (float)System.Math.Tanh(x);
    }
As you can see, I tried different formulas I found for tanh, but none of the outputs make sense: I get -1 where I ask for 0, or 0.76159 where I ask for 1, or it keeps flipping between a positive and a negative number when I ask for -1, and other mismatches...
-Edit- Updated with the currently working code (the calcOutput above has also been changed to what I use now):
    public float[] train(float[] inputs, float[] answer) {
        float[] result = feedForward(inputs);
        deltaOutput = new float[result.Length];
        for (int ii = 0; ii < result.Length; ii++) {
            deltaOutput[ii] = 0.66666667f * (1.7159f - (result[ii]*result[ii])) * (answer[ii]-result[ii]);
        }

        // Backpropagation
        for (int ii = 0; ii < output.Length; ii++) {
            ArrayList connections = output[ii].getConnections();
            for (int i = 0; i < connections.Count; i++) {
                Connection c = (Connection) connections[i];
                Node node = c.getFrom();
                float o = node.getOutput();
                float deltaWeight = o*deltaOutput[ii];
                c.adjustWeight(LEARNING_CONSTANT*deltaWeight);
            }
        }

        // Adjust hidden weights
        for (int i = 0; i < hidden.Length; i++) {
            ArrayList connections = hidden[i].getConnections();
            //Debug.Log(connections.Count);
            float sum = 0;
            // Sum output delta * hidden layer connections (just 1 output)
            for (int j = 0; j < connections.Count; j++) {
                Connection c = (Connection) connections[j];
                // Is this a connection from the hidden layer to the next layer (output)?
                if (c.getFrom() == hidden[i]) {
                    for (int k = 0; k < deltaOutput.Length; k++)
                        sum += c.getWeight()*deltaOutput[k];
                }
            }
            // Adjust the weights coming in based on:
            // the above sum * derivative of the sigmoid output function for hidden neurons
            for (int j = 0; j < connections.Count; j++) {
                Connection c = (Connection) connections[j];
                // Is this a connection from the previous layer (input) to the hidden layer?
                if (c.getTo() == hidden[i]) {
                    float o = hidden[i].getOutput();
                    float deltaHidden = o * (1 - o); // derivative of sigmoid(x)
                    deltaHidden *= sum;
                    Node node = c.getFrom();
                    float deltaWeight = node.getOutput()*deltaHidden;
                    c.adjustWeight(LEARNING_CONSTANT*deltaWeight);
                }
            }
        }
        return result;
    }
"I'm confused about what I should change: just the output calculation for the neurons, or also the error calculation for backpropagation?"
You should be using the derivative of the sigmoid function somewhere in your backpropagation code. You also need to replace that with the derivative of the tanh function, which is 1 - (tanh(x))^2.
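For reference, here are the derivatives written in terms of the neuron output o, which is the form backpropagation code typically uses. The scaled variant is included only because the constants 1.7159 and 2/3 appear in the question; the symbols a and b are introduced here for illustration and are not names from the original code.

```latex
% Logistic sigmoid, output o = \sigma(x) in (0, 1)
\sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr) = o\,(1 - o)

% Plain tanh, output o = \tanh(x) in (-1, 1)
\frac{d}{dx}\tanh(x) = 1 - \tanh^2(x) = 1 - o^2

% Scaled tanh f(x) = a\,\tanh(b x), output o = f(x) in (-a, a)
f'(x) = a\,b\,\bigl(1 - \tanh^2(b x)\bigr) = \frac{b}{a}\,\bigl(a^2 - o^2\bigr),
\qquad a = 1.7159,\; b = \tfrac{2}{3},\; \tfrac{b}{a} \approx 0.3885
```

Note that the output delta in the question's train method, 0.66666667 * (1.7159 - o^2), matches neither the plain nor the scaled form exactly; it agrees with the scaled form only at o = 0.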
Your code looks like C#. I get this:
    Console.WriteLine(Math.Tanh(0));     // prints 0
    Console.WriteLine(Math.Tanh(-1));    // prints -0.761594155955765
    Console.WriteLine(Math.Tanh(1));     // prints 0.761594155955765
    Console.WriteLine(Math.Tanh(0.234)); // prints 0.229820548214317
    Console.WriteLine(Math.Tanh(-4));    // prints -0.999329299739067
This is in line with the plot of tanh.
I think you're reading the results wrong: I get the correct answer for 1. Are you sure you get -1 for tanh(0)?
If you're sure there's a problem, please post more code.
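To make the suggested derivative swap concrete, here is a minimal sketch of the two derivative helpers and an output-layer delta for a plain tanh neuron. The class name ActivationSketch and all method names are hypothetical and do not come from the question's code; it assumes the neuron output is exactly tanh(x) with no scaling.

```csharp
using System;

// Hypothetical helper class for illustration; not part of the question's code.
public static class ActivationSketch
{
    // Plain tanh activation; the output lies in (-1, 1).
    public static float Tanh(float x)
    {
        return (float)Math.Tanh(x);
    }

    // Derivative of tanh expressed through the neuron output o = tanh(x).
    public static float TanhDerivativeFromOutput(float o)
    {
        return 1f - o * o;
    }

    // Derivative of the logistic sigmoid through its output, for comparison.
    public static float SigmoidDerivativeFromOutput(float o)
    {
        return o * (1f - o);
    }

    // Output-layer delta for a tanh neuron: derivative times error.
    public static float OutputDelta(float output, float target)
    {
        return TanhDerivativeFromOutput(output) * (target - output);
    }
}
```

One thing this makes visible: for a negative output o, the sigmoid form o * (1 - o) is negative, so using it with tanh outputs reverses the direction of the weight update, which may be related to the sign flipping described in the question. Also, tanh only approaches 1 and -1 asymptotically, so targets of exactly 1 or -1 can never be reached precisely.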