Dietmar (Author) Posted April 1, 2023

@Mark-XP Here, the neurons always fire. So with only positive values it is not possible to make a neuron shut its mouth. But with negative values it can be done: one neuron can cancel the output of another down to zero. When you search for patterns as intensively as possible, you may need this. It does not matter whether your input itself is always positive.

Dietmar
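A minimal sketch of the inhibition idea (my illustration, not code from the thread): with a negative weight, one neuron's positive activity can exactly cancel another's contribution.

    public class Inhibition {
        public static void main(String[] args) {
            double a = Math.tanh(0.8);               // first neuron, firing positively
            double b = Math.tanh(0.8);               // second neuron, same activity
            double wExcite = 1.0;                    // excitatory (positive) weight
            double wInhibit = -1.0;                  // inhibitory (negative) weight
            double net = a * wExcite + b * wInhibit; // the two contributions cancel
            System.out.println("net input to the next neuron: " + net); // prints 0.0
        }
    }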
Mark-XP Posted April 1, 2023

@Dietmar Then what about a feedForward like ... = 2.0 * sigmoid(sum) - 1.0; ? The output will be centered around 0, and negative half the time.
Dietmar (Author) Posted April 1, 2023

@Mark-XP This is tanh, which I use now.

Dietmar
Mark-XP Posted April 1, 2023

@Dietmar Not exactly tanh, it's scaled. Let sSig be the above function: sSig(x) = 2 * sig(x) - 1. Then sSig'(0) = 0.5, but tanh'(0) = 1. Anyway, with sigmoid/tanh you're on the right track.
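For what it's worth, the scaled sigmoid is itself a tanh in disguise: 2*sigmoid(x) - 1 = tanh(x/2), which is exactly why its slope at 0 is 0.5 instead of 1. A quick numeric check (my addition):

    public class ScaledSigmoid {
        static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }
        public static void main(String[] args) {
            // 2*sigmoid(x) - 1 and tanh(x/2) agree everywhere
            for (double x = -2.0; x <= 2.0; x += 0.5) {
                double sSig = 2.0 * sigmoid(x) - 1.0;
                System.out.printf("x=%5.2f  sSig=%9.6f  tanh(x/2)=%9.6f%n",
                        x, sSig, Math.tanh(x / 2.0));
            }
        }
    }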
Dietmar (Author) Posted April 2, 2023

@Mark-XP This is my first try at using the tanh network from above to train for prime numbers. This is really crazy hard work. A prime is signalled when the output is near +0.5; an output near -0.5 means "no prime". For now I have to teach it by hand whether the input is a prime or not. I start with 3 input neurons, so that I can check primes up to 9 9 9; here I use 0 0 7. Splitting the number into its decimal digits is important, so as not to run into normalisation problems with tanh for big numbers.

Dietmar

    package neuralnetwork;

    import java.util.Arrays;
    import java.util.concurrent.ThreadLocalRandom;

    public class NeuralNetwork {

        private int numInputNodes;
        private int numHiddenNodes1;
        private int numHiddenNodes2;
        private int numOutputNodes;
        private double[][] weights1;
        private double[][] weights2;
        private double[][] weights3;
        private double[] bias1;
        private double[] bias2;
        private double[] bias3;

        public NeuralNetwork(int numInputNodes, int numHiddenNodes1, int numHiddenNodes2, int numOutputNodes) {
            this.numInputNodes = numInputNodes;
            this.numHiddenNodes1 = numHiddenNodes1;
            this.numHiddenNodes2 = numHiddenNodes2;
            this.numOutputNodes = numOutputNodes;

            // All weights and biases start as uniform random values in (-1, 1)
            this.weights1 = new double[numInputNodes][numHiddenNodes1];
            for (int i = 0; i < numInputNodes; i++)
                for (int j = 0; j < numHiddenNodes1; j++)
                    this.weights1[i][j] = ThreadLocalRandom.current().nextDouble(-1, 1);

            this.weights2 = new double[numHiddenNodes1][numHiddenNodes2];
            for (int i = 0; i < numHiddenNodes1; i++)
                for (int j = 0; j < numHiddenNodes2; j++)
                    this.weights2[i][j] = ThreadLocalRandom.current().nextDouble(-1, 1);

            this.weights3 = new double[numHiddenNodes2][numOutputNodes];
            for (int i = 0; i < numHiddenNodes2; i++)
                for (int j = 0; j < numOutputNodes; j++)
                    this.weights3[i][j] = ThreadLocalRandom.current().nextDouble(-1, 1);

            this.bias1 = new double[numHiddenNodes1];
            for (int i = 0; i < numHiddenNodes1; i++)
                this.bias1[i] = ThreadLocalRandom.current().nextDouble(-1, 1);

            this.bias2 = new double[numHiddenNodes2];
            for (int i = 0; i < numHiddenNodes2; i++)
                this.bias2[i] = ThreadLocalRandom.current().nextDouble(-1, 1);

            this.bias3 = new double[numOutputNodes];
            for (int i = 0; i < numOutputNodes; i++)
                this.bias3[i] = ThreadLocalRandom.current().nextDouble(-1, 1);
        }

        public double[] feedForward(double[] inputs) {
            double[] hidden1 = new double[numHiddenNodes1];
            double[] hidden2 = new double[numHiddenNodes2];
            double[] outputs = new double[numOutputNodes];

            // Calculate the outputs of hidden layer 1
            for (int j = 0; j < numHiddenNodes1; j++) {
                double sum = 0;
                for (int i = 0; i < numInputNodes; i++) {
                    sum += inputs[i] * weights1[i][j];
                }
                sum += bias1[j];
                hidden1[j] = Math.tanh(sum);
            }

            // Calculate the outputs of hidden layer 2
            for (int j = 0; j < numHiddenNodes2; j++) {
                double sum = 0;
                for (int i = 0; i < numHiddenNodes1; i++) {
                    sum += hidden1[i] * weights2[i][j];
                }
                sum += bias2[j];
                hidden2[j] = Math.tanh(sum);
            }

            // Calculate the outputs
            for (int j = 0; j < numOutputNodes; j++) {
                double sum = 0;
                for (int i = 0; i < numHiddenNodes2; i++) {
                    sum += hidden2[i] * weights3[i][j];
                }
                sum += bias3[j];
                outputs[j] = Math.tanh(sum);
            }
            return outputs;
        }

        public double[][] getWeights1() { return this.weights1; }
        public double[][] getWeights2() { return this.weights2; }
        public double[][] getWeights3() { return this.weights3; }
        public double[] getBias1() { return this.bias1; }
        public double[] getBias2() { return this.bias2; }
        public double[] getBias3() { return this.bias3; }

        // Backward propagation
        public void backPropagate(double[] inputs, double[] expectedOutputs, double learningRate) {
            // Feed forward to get the outputs.
            // (Note: hidden1 and hidden2 remain all-zero here, because
            // feedForward() computes its own local hidden activations and
            // does not expose them.)
            double[] hidden1 = new double[numHiddenNodes1];
            double[] hidden2 = new double[numHiddenNodes2];
            double[] outputs = feedForward(inputs);

            // Calculate the error in the output layer
            double[] outputErrors = new double[numOutputNodes];
            for (int i = 0; i < numOutputNodes; i++) {
                outputErrors[i] = expectedOutputs[i] - outputs[i];
            }

            // Calculate the error in hidden layer 2
            double[] hidden2Errors = new double[numHiddenNodes2];
            for (int i = 0; i < numHiddenNodes2; i++) {
                double error = 0;
                for (int j = 0; j < numOutputNodes; j++) {
                    error += outputErrors[j] * weights3[i][j];
                }
                hidden2Errors[i] = (1 - Math.pow(Math.tanh(hidden2[i]), 2)) * error;
            }

            // Calculate the error in hidden layer 1
            double[] hidden1Errors = new double[numHiddenNodes1];
            for (int i = 0; i < numHiddenNodes1; i++) {
                double error = 0;
                for (int j = 0; j < numHiddenNodes2; j++) {
                    error += hidden2Errors[j] * weights2[i][j];
                }
                hidden1Errors[i] = (1 - Math.pow(Math.tanh(hidden1[i]), 2)) * error;
            }

            // Update weights and biases of the output layer
            for (int i = 0; i < numHiddenNodes2; i++) {
                for (int j = 0; j < numOutputNodes; j++) {
                    double delta = outputErrors[j] * tanhDerivative(outputs[j]) * hidden2[i];
                    weights3[i][j] += learningRate * delta;
                }
            }
            for (int i = 0; i < numOutputNodes; i++) {
                bias3[i] += learningRate * outputErrors[i] * tanhDerivative(outputs[i]);
            }

            // Calculate the error in hidden layer 1 a second time
            // (this overwrites the values computed above, applying the
            // derivative factor once more via tanhDerivative)
            for (int i = 0; i < numHiddenNodes1; i++) {
                double error = 0;
                for (int j = 0; j < numHiddenNodes2; j++) {
                    error += hidden2Errors[j] * weights2[i][j];
                }
                hidden1Errors[i] = error * tanhDerivative(hidden1[i]);
            }

            // Update weights and biases of hidden layer 2
            for (int i = 0; i < numHiddenNodes1; i++) {
                for (int j = 0; j < numHiddenNodes2; j++) {
                    double delta = hidden2Errors[j] * tanhDerivative(hidden2[j]) * hidden1[i];
                    weights2[i][j] += learningRate * delta;
                }
            }
            for (int i = 0; i < numHiddenNodes2; i++) {
                bias2[i] += learningRate * hidden2Errors[i] * tanhDerivative(hidden2[i]);
            }

            // Update weights and biases of hidden layer 1
            for (int i = 0; i < numInputNodes; i++) {
                for (int j = 0; j < numHiddenNodes1; j++) {
                    double delta = hidden1Errors[j] * tanhDerivative(hidden1[j]) * inputs[i];
                    weights1[i][j] += learningRate * delta;
                }
            }
            for (int i = 0; i < numHiddenNodes1; i++) {
                bias1[i] += learningRate * hidden1Errors[i] * tanhDerivative(hidden1[i]);
            }
        }

        // Helper method: derivative of the hyperbolic tangent
        private double tanhDerivative(double x) {
            double tanh = Math.tanh(x);
            return 1 - tanh * tanh;
        }

        public static void main(String[] args) {
            NeuralNetwork nn = new NeuralNetwork(3, 4, 5, 1);
            double[] inputs = {0, 0, 7};
            double[] expectedOutputs = {0.5};
            for (int i = 0; i < 10000; i++) {
                nn.backPropagate(inputs, expectedOutputs, 0.01);
            }
            double[] outputs = nn.feedForward(inputs);
            System.out.println(Arrays.toString(outputs));
        }
    }
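Since the setup leans on splitting a number into its decimal digits, here is the kind of helper one might use (my sketch; in the thread the digits are entered by hand):

    // Hypothetical helper: split a number in 0..999 into the three
    // decimal-digit inputs the network expects, e.g. 7 -> {0, 0, 7}.
    static double[] toDigits(int n) {
        if (n < 0 || n > 999) throw new IllegalArgumentException("expected 0..999");
        return new double[] { n / 100, (n / 10) % 10, n % 10 };
    }
    // Usage: nn.feedForward(toDigits(7)) instead of new double[] {0, 0, 7}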
Dietmar (Author) Posted April 2, 2023

@Mark-XP Here is the next one, this is really nice!!! You can see how the program "learns" what a prime is.

Dietmar

    package neuralnetwork;

    import java.util.Arrays;
    import java.util.concurrent.ThreadLocalRandom;
    import java.util.Scanner;

    public class NeuralNetwork {

        // Fields, constructor, feedForward(), the getters, backPropagate()
        // and tanhDerivative() are the same as in the previous post.

        public static void main(String[] args) {
            NeuralNetwork nn = new NeuralNetwork(3, 2, 2, 1);
            double[] inputs = {0, 0, 7};
            double[] expectedOutputs = {0.5};
            for (int i = 0; i < 20; i++) {
                nn.backPropagate(inputs, expectedOutputs, 0.2);
                double[] outputs = nn.feedForward(inputs);
                System.out.println("\nIteration " + (i + 1));
                System.out.println("Weights1:");
                System.out.println(Arrays.deepToString(nn.getWeights1()));
                System.out.println("Bias1:");
                System.out.println(Arrays.toString(nn.getBias1()));
                System.out.println("Weights2:");
                System.out.println(Arrays.deepToString(nn.getWeights2()));
                System.out.println("Bias2:");
                System.out.println(Arrays.toString(nn.getBias2()));
                System.out.println("Weights3:");
                System.out.println(Arrays.deepToString(nn.getWeights3()));
                System.out.println("Bias3:");
                System.out.println(Arrays.toString(nn.getBias3()));
                System.out.println("Output:");
                System.out.println(Arrays.toString(outputs));
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }

            Scanner scanner = new Scanner(System.in);
            System.out.println("\nEnter any key to continue:");
            scanner.nextLine();
            System.out.println("Initial weights and biases:");
            System.out.println(Arrays.deepToString(nn.getWeights1()));
            System.out.println(Arrays.toString(nn.getBias1()));
            System.out.println(Arrays.deepToString(nn.getWeights2()));
            System.out.println(Arrays.toString(nn.getBias2()));
            System.out.println(Arrays.deepToString(nn.getWeights3()));
            System.out.println(Arrays.toString(nn.getBias3()));

            while (true) {
                System.out.println("\nEnter 'q' to quit or any other key to continue:");
                String input = scanner.nextLine();
                if (input.equals("q")) {
                    break;
                }
                // Nudge some weight and bias values by hand
                // (weights1 is 3x2 here, so only indices [0..2][0..1] are valid)
                nn.getWeights1()[1][1] += 0.5;
                nn.getBias2()[1] -= 0.5;
                nn.getWeights3()[0][0] *= 2;
                // Output the modified weights and biases
                System.out.println("\nModified weights and biases:");
                System.out.println(Arrays.deepToString(nn.getWeights1()));
                System.out.println(Arrays.toString(nn.getBias1()));
                System.out.println(Arrays.deepToString(nn.getWeights2()));
                System.out.println(Arrays.toString(nn.getBias2()));
                System.out.println(Arrays.deepToString(nn.getWeights3()));
                System.out.println(Arrays.toString(nn.getBias3()));
                // Calculate and output the new output with the modified weights and biases
                double[] outputs = nn.feedForward(inputs);
                System.out.println("New output:");
                System.out.println(Arrays.toString(outputs));
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }
    }
Dietmar (Author) Posted April 2, 2023

@Mark-XP Here is another example. As you can see, it also runs stably with 1000 neurons in each of the 2 hidden layers. Right now it is more stable than Neuroph. And I also understand what the BIG BIG amount of memory is needed for: ALL the biases, weights and outputs have to be stored for each iteration. Only then can the network learn.

Dietmar

PS: The question is whether most of the used memory can be freed after training, because only the last biases, weights and outputs need to be kept. But in this example you can see how many resources are really needed during training. Gigabytes.

    package neuralnetwork;

    // Imports and the whole NeuralNetwork class are identical to the
    // previous post; only the network size, the inputs and the learning
    // rate in main() changed:

    public static void main(String[] args) {
        NeuralNetwork nn = new NeuralNetwork(3, 1000, 1000, 1);
        double[] inputs = {7, 3, 9};
        double[] expectedOutputs = {0.5};
        for (int i = 0; i < 20; i++) {
            nn.backPropagate(inputs, expectedOutputs, 0.00001);
            // ... followed by the same per-iteration printing of all weights,
            // biases and outputs, and afterwards the same interactive
            // weight-nudging loop, as in the previous post
        }
    }
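For scale, a back-of-the-envelope estimate (my addition, not a measurement from the thread): the parameters themselves are modest; most of the footprint in this example plausibly comes from building the giant Arrays.deepToString strings every iteration.

    public class MemoryEstimate {
        public static void main(String[] args) {
            // Weight counts for the (3, 1000, 1000, 1) network above
            long weights = 3L * 1000 + 1000L * 1000 + 1000L * 1;
            long biases = 1000 + 1000 + 1;
            System.out.printf("parameters: %,d (~%.1f MB as doubles)%n",
                    weights + biases, (weights + biases) * 8 / 1e6);
            // Rough estimate: ~22 characters per printed double, 2 bytes per
            // char in a Java String, for the 1000x1000 middle layer alone:
            System.out.printf("one deepToString of the middle layer: ~%.0f MB%n",
                    1000L * 1000 * 22 * 2 / 1e6);
        }
    }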
Dietmar (Author) Posted April 2, 2023

@Mark-XP A very interesting thing is to compare all the biases and weights after training, i.e. to compare the final biases and weights between 2 different runs of the program. Within one and the same run, the values change steadily after some iterations. But I think the neural network only "understands" the problem IF, at the end of every run, the weights all come out nearly identical. If not, there exist many different solutions for the same(!) task, and the neural network has no idea which way to choose.

Dietmar

Edited April 2, 2023 by Dietmar
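A minimal sketch of that comparison (my addition; it assumes two trained networks nn1 and nn2 of identical shape):

    // Hypothetical helper: the largest absolute difference between the
    // corresponding weights of two networks with the same layer sizes.
    static double maxWeightGap(double[][] a, double[][] b) {
        double max = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[i].length; j++)
                max = Math.max(max, Math.abs(a[i][j] - b[i][j]));
        return max;
    }
    // Usage: maxWeightGap(nn1.getWeights2(), nn2.getWeights2())

One caveat to the "nearly identical weights" criterion: two networks can compute exactly the same function with completely different weight matrices, because hidden neurons can be permuted without changing the output. Weight-level agreement is therefore a much stricter condition than functional agreement.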
Dietmar (Author) Posted April 2, 2023

@Mark-XP I notice 2 things: the weights and biases always come out different in different runs. And weights and biases > 1 and < -1 appear??? ChatGPT says this is normal; I would say it is a big mistake. I really start to wonder what an AI is actually learning. It can learn facts. But then? As long as there are 2 possible ways, with completely different weights and biases, to reach the same(!) result, the AI has no chance and picks one at random. Any "understanding" in such a situation is impossible.

Dietmar

PS: Those values > 1 or < -1 are produced by the backpropagation algorithm, brr.. So I think much more theoretical examination of neural networks is necessary.

Edited April 2, 2023 by Dietmar
Mark-XP Posted April 2, 2023

Hi @Dietmar, the AI learns from experience, simply by refining the weights and biases of its neurons. This is what made me think: "For (0 and 0) it is 10 times more sure about the result than for (1 and 0)... it hasn't understood the essence of the matter (logical AND) at all" (here).

The bias values outside [-1, 1] do not surprise me a priori; it is your backpropagation that has to be examined carefully:

    bias1[i] += learningRate * hidden1Errors[i] * tanhDerivative(hidden1[i]);

Btw.: on my current system, an Ivy Bridge with 4 GB RAM (built years ago to host Win-XP 32), it is not possible to run the last example with 1000 nodes per hidden layer! I had to reduce it to 100 nodes...

Edited April 2, 2023 by Mark-XP
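The suspicion is well founded: in the class above, hidden1Errors[i] already contains a tanh-derivative factor, so multiplying by tanhDerivative(hidden1[i]) applies the derivative twice; moreover hidden1[i] is itself an activation a = tanh(sum), so the conventional factor is 1 - a*a rather than 1 - tanh(a)^2. A self-contained toy showing the usual update with the derivative applied exactly once (my reconstruction, not code from the thread):

    public class TanhUpdateDemo {
        public static void main(String[] args) {
            double w = 0.3, b = -0.1, x = 0.7, target = 0.5, lr = 0.1;
            for (int step = 0; step < 200; step++) {
                double a = Math.tanh(w * x + b);    // activation of the neuron
                double error = target - a;          // output error
                double grad = error * (1 - a * a);  // tanh derivative, applied once
                w += lr * grad * x;                 // weight update
                b += lr * grad;                     // bias update: no extra derivative
            }
            System.out.println(Math.tanh(w * 0.7 + b)); // converges to ~0.5
        }
    }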
Dietmar (Author) Posted April 2, 2023

@Mark-XP Here is a new version. It shows all my helplessness (or the helplessness of the AI). First I use the input 0 0 0. The program adjusts all biases and weights until the output reaches about 0.5. After 20 iterations, the last biases and weights are stored. Now, without any backpropagation, I use this set of biases and weights for the new input 0 0 1. And now the crazy things happen: after each run, the output for 0 0 0 is about 0.5, but the output for 0 0 1 changes a lot from run to run. ChatGPT tells me to take the average of all the used biases and weights, but I think that is as helpless as can be. So I have no idea how to really train my network for primes.

Dietmar

    package neuralnetwork;

    import java.util.Arrays;
    import java.util.concurrent.ThreadLocalRandom;
    import java.util.Scanner;

    public class NeuralNetwork {

        // Fields, constructor, feedForward(), the getters, backPropagate()
        // and tanhDerivative() are the same as in the previous posts.
        // New in this version: setters for all weights and biases.

        public void setWeights1(double[][] weights) { this.weights1 = weights; }
        public void setBias1(double[] bias) { this.bias1 = bias; }
        public void setWeights2(double[][] weights) { this.weights2 = weights; }
        public void setBias2(double[] bias) { this.bias2 = bias; }
        public void setWeights3(double[][] weights) { this.weights3 = weights; }
        public void setBias3(double[] bias) { this.bias3 = bias; }

        public static void main(String[] args) {
            NeuralNetwork nn = new NeuralNetwork(3, 4, 4, 1);
            double[] inputs = {0, 0, 0};
            double[] expectedOutputs = {0.5};
            for (int i = 0; i < 20; i++) {
                nn.backPropagate(inputs, expectedOutputs, 0.1);
                double[] outputs = nn.feedForward(inputs);
                System.out.println("\nIteration " + (i + 1));
                System.out.println("Weights1:");
                System.out.println(Arrays.deepToString(nn.getWeights1()));
                System.out.println("Bias1:");
                System.out.println(Arrays.toString(nn.getBias1()));
                System.out.println("Weights2:");
                System.out.println(Arrays.deepToString(nn.getWeights2()));
                System.out.println("Bias2:");
                System.out.println(Arrays.toString(nn.getBias2()));
                System.out.println("Weights3:");
                System.out.println(Arrays.deepToString(nn.getWeights3()));
                System.out.println("Bias3:");
                System.out.println(Arrays.toString(nn.getBias3()));
                System.out.println("Output:");
                System.out.println(Arrays.toString(outputs));
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }

            Scanner scanner = new Scanner(System.in);
            System.out.println("\nEnter any key to continue:");
            scanner.nextLine();
            System.out.println("Initial weights and biases:");
            System.out.println(Arrays.deepToString(nn.getWeights1()));
            System.out.println(Arrays.toString(nn.getBias1()));
            System.out.println(Arrays.deepToString(nn.getWeights2()));
            System.out.println(Arrays.toString(nn.getBias2()));
            System.out.println(Arrays.deepToString(nn.getWeights3()));
            System.out.println(Arrays.toString(nn.getBias3()));

            while (true) {
                System.out.println("\nEnter 'q' to quit or any other key to continue:");
                String input = scanner.nextLine();
                if (input.equals("q")) {
                    break;
                }
                // Change some weight and bias values
                nn.getWeights1()[1][2] += 0.5;
                nn.getBias2()[1] -= 0.5;
                nn.getWeights3()[0][0] *= 2;
                // Output the modified weights and biases
                System.out.println("\nModified weights and biases:");
                System.out.println(Arrays.deepToString(nn.getWeights1()));
                System.out.println(Arrays.toString(nn.getBias1()));
                System.out.println(Arrays.deepToString(nn.getWeights2()));
                System.out.println(Arrays.toString(nn.getBias2()));
                System.out.println(Arrays.deepToString(nn.getWeights3()));
                System.out.println(Arrays.toString(nn.getBias3()));
                // Feed forward with new inputs, using the last trained weights
                // and biases. (The set-calls below are no-ops: they store back
                // the very arrays the getters return.)
                double[] newInputs = {0, 0, 1};
                nn.setWeights1(nn.getWeights1());
                nn.setBias1(nn.getBias1());
                nn.setWeights2(nn.getWeights2());
                nn.setBias2(nn.getBias2());
                nn.setWeights3(nn.getWeights3());
                nn.setBias3(nn.getBias3());
                double[] newOutputs = nn.feedForward(newInputs);
                System.out.println("New output:");
                System.out.println(Arrays.toString(newOutputs));
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }
    }

Edited April 2, 2023 by Dietmar
Dietmar (Author) Posted April 4, 2023

@Mark-XP I wrote a new prime number program with Neuroph. It is the best one so far, but it cannot find primes. If there is anything that neural networks can find, it is a pattern. So even with really BIG learning, no pattern can be found in primes via a neural network.

Dietmar

    package geradeungeradenetzwerk;

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Scanner;
    import org.neuroph.core.data.DataSet;
    import org.neuroph.core.data.DataSetRow;
    import org.neuroph.nnet.MultiLayerPerceptron;
    import org.neuroph.nnet.learning.BackPropagation;

    public class GeradeUngeradeNetzwerk {

        public static void main(String[] args) throws IOException {
            String logFileName = "input_log.txt";
            Map<Integer, Double> userInputMap = new HashMap<>();

            // Load input-output pairs from the log file
            DataSet dataset = new DataSet(1, 1);
            if (Files.exists(Paths.get(logFileName))) {
                List<String> lines = Files.readAllLines(Paths.get(logFileName));
                for (String line : lines) {
                    String[] parts = line.split(",");
                    int input = Integer.parseInt(parts[0]);
                    double output = Double.parseDouble(parts[1]);
                    double[] inputArr = new double[] { input };
                    double[] outputArr = new double[] { output };
                    dataset.add(new DataSetRow(inputArr, outputArr));
                    userInputMap.put(input, output);
                }
            }

            // Train the neural network on the input-output pairs
            MultiLayerPerceptron neuralNet =
                    new MultiLayerPerceptron(1, 300, 300, 300, 300, 300, 300, 300, 1);
            BackPropagation learningRule = neuralNet.getLearningRule();
            learningRule.setMaxError(0.1);
            learningRule.setMaxIterations(1000);
            neuralNet.learn(dataset);

            // Use the trained network to classify new integers as prime or not prime
            Scanner scanner = new Scanner(System.in);
            while (true) {
                System.out.print("Enter an integer (or 'exit' to quit): ");
                String inputStr = scanner.nextLine();
                if (inputStr.equals("exit")) {
                    break;
                }
                int input;
                try {
                    input = Integer.parseInt(inputStr);
                } catch (NumberFormatException e) {
                    System.out.println("Invalid input. Please enter an integer.");
                    continue;
                }
                if (userInputMap.containsKey(input)) {
                    double result = userInputMap.get(input);
                    if (result >= 0.5) {
                        System.out.println(input + " is prime (according to the neural network)");
                    } else {
                        System.out.println(input + " is not prime (according to the neural network)");
                    }
                } else {
                    double[] inputArr = new double[] { input };
                    neuralNet.setInput(inputArr);
                    neuralNet.calculate();
                    double[] output = neuralNet.getOutput();
                    double result = output[0];
                    if (result >= 0.5) {
                        System.out.println(input + " is prime (according to the neural network)");
                    } else {
                        System.out.println(input + " is not prime (according to the neural network)");
                    }
                    // Ask whether the result is correct and store the pair in the log file
                    System.out.print("Is this result correct? (y/n): ");
                    String answer = scanner.nextLine().toLowerCase();
                    if (answer.equals("y")) {
                        double[] outputArr = new double[] { result };
                        dataset.add(new DataSetRow(inputArr, outputArr));
                        BufferedWriter writer = new BufferedWriter(new FileWriter(logFileName, true));
                        writer.write(input + "," + result + "\n");
                        writer.close();
                        userInputMap.put(input, result);
                    } else if (answer.equals("n")) {
                        // Record the corrected result and add the pair to the log file
                        System.out.print("What is the correct result? ");
                        String correctResultStr = scanner.nextLine();
                        try {
                            double correctResult = Double.parseDouble(correctResultStr);
                            double[] outputArr = new double[] { correctResult };
                            dataset.add(new DataSetRow(inputArr, outputArr));
                            BufferedWriter writer = new BufferedWriter(new FileWriter(logFileName, true));
                            writer.write(input + "," + correctResult + "\n");
                            writer.close();
                            userInputMap.put(input, correctResult);
                        } catch (NumberFormatException e) {
                            System.out.println("Invalid input, the result was not recorded");
                        }
                    }
                }
            }
        }
    }
Dietmar (Author) Posted April 4, 2023

The prime program produces a file "input_log.txt", and this file is persisted. So I wrote another program that puts correct information about all primes and non-primes into this file. It works: all the numbers in this file are now correctly identified as prime or not prime by the prime program above. To teach the prime program, you answer with "0" for not prime and "1" for prime. Neural networks can find patterns; I checked with even and odd numbers, and that works. Here is the generator program that fills "input_log.txt" up to 1000 with correct information, cool.

Dietmar

    package generateinputlog;

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;

    public class GenerateInputLog {

        public static void main(String[] args) throws IOException {
            String logFileName = "input_log.txt";
            BufferedWriter writer = new BufferedWriter(new FileWriter(logFileName));
            for (int i = 0; i <= 1000; i++) {
                boolean isPrime = isPrime(i);
                int output = isPrime ? 1 : 0;
                writer.write(i + "," + output + "\n");
            }
            writer.close();
        }

        // Trial division up to sqrt(num)
        private static boolean isPrime(int num) {
            if (num <= 1) {
                return false;
            }
            for (int i = 2; i <= Math.sqrt(num); i++) {
                if (num % i == 0) {
                    return false;
                }
            }
            return true;
        }
    }

Edited April 4, 2023 by Dietmar
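The same generator pattern extends directly to the divisibility experiments described in the next post. A hypothetical variant (my sketch, not posted in the thread) that labels multiples of 3 instead of primes:

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;

    public class GenerateMultiplesOf3Log {
        public static void main(String[] args) throws IOException {
            // Fill input_log.txt with multiples-of-3 labels: 1 = multiple, 0 = not
            try (BufferedWriter writer = new BufferedWriter(new FileWriter("input_log.txt"))) {
                for (int i = 0; i <= 1000; i++) {
                    writer.write(i + "," + (i % 3 == 0 ? 1 : 0) + "\n");
                }
            }
        }
    }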
Dietmar (Author) Posted April 4, 2023

I made an interesting observation: so far I have not succeeded in training a neural network to recognise the multiples of 3. Hm, interesting: ChatGPT likewise cannot find all multiples of 7. So some mathematics seems to be a good intelligence test for any artificial intelligence. Soon I will understand most pupils at school..

The task for the AI was: I train with the numbers up to 1000. It found all the multiples of 2 correctly. But even after long training that 1002 is a multiple of 3, I gave up at 1005^^, oh my.

Dietmar

PS: No tools like "modulo 3 == 0" are allowed. Only the naked input data.
Dietmar (Author) Posted April 4, 2023

Today I found for myself a very precise definition of what intelligence really is:

1.) You find a pattern entirely on your own.
2.) You can then describe this pattern to somebody else, very carefully and exactly, and explain why it always works.

The maximum that any kind of artificial intelligence can do today, April 2023, is to find a pattern, but not to describe it or explain why it works. A few months ago an AI found a new, better way to multiply matrices, but mathematicians needed months to understand what the AI was doing, because the AI understands nothing and so cannot help them. Roentgen did not understand the nature of X-rays; Einstein understood his theory. Here you see the big, big difference at once. SkyNet with Terminator is, on this path, still very far in the future, I think >>40 years.

Dietmar