
Artificial neural networks benchmark

  • Are there any benchmarks that can be used to check whether an implementation of an ANN is correct?
    I want some input and output data, along with information like:
    - A feedforward neural network with 3 layers should classify at least 90% of the test data correctly.
    I need this information to be sure that this kind of ANN is able to handle such a problem.
      October 13, 2021 1:43 PM IST
  • You can use the MNIST database of handwritten digits (60,000 training and 10,000 test images) to compare the error rate of your implementation against various other machine learning algorithms such as k-NN, SVMs, convolutional networks (deep learning), and, of course, different ANN configurations.
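     The comparison workflow is simple to set up: score every candidate model by its error rate on the same held-out test set. A minimal sketch, where the `predict` callables are placeholders for your own models rather than anything MNIST-specific:

```python
def error_rate(predict, test_inputs, test_labels):
    """Fraction of test examples the classifier gets wrong."""
    wrong = sum(1 for x, y in zip(test_inputs, test_labels) if predict(x) != y)
    return wrong / len(test_labels)

# Toy illustration with a trivial "classifier" on made-up labels:
labels = [0, 1, 2, 1, 0]
inputs = list(range(5))
always_zero = lambda x: 0
print(error_rate(always_zero, inputs, labels))  # → 0.6 (3 of 5 wrong)
```

     The same `error_rate` call can then be applied to your ANN, a k-NN baseline, an SVM, and so on, making the comparison across implementations direct.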

     
      October 15, 2021 1:51 PM IST
  • Probably the best thing you can do is design a neural network that learns the XOR function. Here is a web site that shows sample runs: http://www.generation5.org/content/2001/xornet.asp
    I had a homework assignment in which our teacher gave us the first few runs of a neural network with given weights; if you set your neural network up with the same weights, you should get the same results (with straight backpropagation).
    If you have a neural network with 1 input layer (2 input neurons + 1 constant), 1 hidden layer (2 neurons + 1 constant), and 1 output layer, initialize all your weights to 0.6, and make your constant neurons always return -1, then you should get exactly the same results in your first 10 runs:
     Data File: xor.csv
     Number of examples: 4
     Number of input units: 2
     Number of hidden units: 2
     Maximum Epochs: 10
     Learning Rate: 0.100000
     Error Margin: 0.100000

     ==== Initial Weights ====
     Input (3) --> Hidden (3):  0: 0.600000 0.600000   1: 0.600000 0.600000   2: 0.600000 0.600000
     Hidden (3) --> Output:  0: 0.600000   1: 0.600000   2: 0.600000

     ***** Epoch 1 *****
     Maximum RMSE: 0.5435466682137927   Average RMSE: 0.4999991292217466   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.599691 0.599691   1: 0.599987 0.599987   2: 0.599985 0.599985
     Hidden (3) --> Output:  0: 0.599864   1: 0.599712   2: 0.599712

     ***** Epoch 2 *****
     Maximum RMSE: 0.5435080531724404   Average RMSE: 0.4999982558452263   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.599382 0.599382   1: 0.599973 0.599973   2: 0.599970 0.599970
     Hidden (3) --> Output:  0: 0.599726   1: 0.599425   2: 0.599425

     ***** Epoch 3 *****
     Maximum RMSE: 0.5434701135827593   Average RMSE: 0.4999973799942081   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.599072 0.599072   1: 0.599960 0.599960   2: 0.599956 0.599956
     Hidden (3) --> Output:  0: 0.599587   1: 0.599139   2: 0.599139

     ***** Epoch 4 *****
     Maximum RMSE: 0.5434328258833577   Average RMSE: 0.49999650178769495   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.598763 0.598763   1: 0.599948 0.599948   2: 0.599941 0.599941
     Hidden (3) --> Output:  0: 0.599446   1: 0.598854   2: 0.598854

     ***** Epoch 5 *****
     Maximum RMSE: 0.5433961673713259   Average RMSE: 0.49999562134010495   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.598454 0.598454   1: 0.599936 0.599936   2: 0.599927 0.599927
     Hidden (3) --> Output:  0: 0.599304   1: 0.598570   2: 0.598570

     ***** Epoch 6 *****
     Maximum RMSE: 0.5433601161709642   Average RMSE: 0.49999473876144657   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.598144 0.598144   1: 0.599924 0.599924   2: 0.599914 0.599914
     Hidden (3) --> Output:  0: 0.599161   1: 0.598287   2: 0.598287

     ***** Epoch 7 *****
     Maximum RMSE: 0.5433246512036478   Average RMSE: 0.49999385415748615   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.597835 0.597835   1: 0.599912 0.599912   2: 0.599900 0.599900
     Hidden (3) --> Output:  0: 0.599017   1: 0.598005   2: 0.598005

     ***** Epoch 8 *****
     Maximum RMSE: 0.5432897521587884   Average RMSE: 0.49999296762990975   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.597526 0.597526   1: 0.599901 0.599901   2: 0.599887 0.599887
     Hidden (3) --> Output:  0: 0.598872   1: 0.597723   2: 0.597723

     ***** Epoch 9 *****
     Maximum RMSE: 0.5432553994658493   Average RMSE: 0.49999207927647754   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.597216 0.597216   1: 0.599889 0.599889   2: 0.599874 0.599874
     Hidden (3) --> Output:  0: 0.598726   1: 0.597443   2: 0.597443

     ***** Epoch 10 *****
     Maximum RMSE: 0.5432215742673802   Average RMSE: 0.4999911891911738   Percent Correct: 0%
     Input (3) --> Hidden (3):  0: 0.596907 0.596907   1: 0.599879 0.599879   2: 0.599862 0.599862
     Hidden (3) --> Output:  0: 0.598579   1: 0.597163   2: 0.597163
     xor.csv contains the following data:
     0.000000,0.000000,0
     0.000000,1.000000,1
     1.000000,0.000000,1
     1.000000,1.000000,0
     Your neural network should look like this (disregard the weights; yellow is the constant input neuron): [diagram image not preserved]
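     The setup above can be sketched in a few lines. This is a hedged reconstruction, not the original homework code: exact per-epoch numbers depend on details such as update order and how RMSE is defined, so expect the same qualitative behaviour (weights drifting slowly from 0.6, RMSE near 0.5, 0% correct early on) rather than bit-identical output.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# XOR training data: two inputs and a target, as in xor.csv
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
LR = 0.1  # Learning Rate: 0.100000

# Every weight starts at 0.6; index 2 is the constant (-1) unit in each layer.
w_ih = [[0.6, 0.6], [0.6, 0.6], [0.6, 0.6]]  # w_ih[i][h]: input i -> hidden h
w_ho = [0.6, 0.6, 0.6]                       # w_ho[h]: hidden h -> output

def forward(x):
    xin = x + [-1.0]  # constant input neuron always returns -1
    hid = [sigmoid(sum(xin[i] * w_ih[i][h] for i in range(3))) for h in range(2)]
    hid.append(-1.0)  # constant hidden neuron
    out = sigmoid(sum(hid[h] * w_ho[h] for h in range(3)))
    return xin, hid, out

for epoch in range(10):
    sq_err = 0.0
    for x, target in DATA:
        xin, hid, out = forward(x)
        sq_err += (target - out) ** 2
        # Straight backpropagation with the sigmoid derivative out*(1-out)
        d_out = (target - out) * out * (1.0 - out)
        d_hid = [d_out * w_ho[h] * hid[h] * (1.0 - hid[h]) for h in range(2)]
        for h in range(3):
            w_ho[h] += LR * d_out * hid[h]
        for i in range(3):
            for h in range(2):
                w_ih[i][h] += LR * d_hid[h] * xin[i]
    rmse = math.sqrt(sq_err / len(DATA))
    print(f"Epoch {epoch + 1}: RMSE {rmse:.6f}")
```

     If your implementation matches the reference run above weight-for-weight for the first 10 epochs, your backpropagation is almost certainly correct.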
      October 16, 2021 1:31 PM IST
  • Another option is to generate your own benchmark with orbital-mechanics simulation software. When the software models an orbit without interference, we can automatically label that orbit ‘Correct’. To generate ‘Incorrect’ orbits, we introduce a few perturbations: at some point in the simulation, we randomly nudge a few of the ‘planets’ so that they no longer follow the laws of physics. The neural network’s task is to distinguish ‘Correct’ orbital mechanics from ‘Incorrect’ orbits. This is the benchmark that allows us to compare the performance of different neural architectures.

    Because we are generating our data, we can adjust how much perturbation occurs. If a single ‘planet’ is nudged only slightly, we classify that as ‘Slightly Incorrect’. Meanwhile, when many ‘planets’ are nudged by large amounts, we can classify their orbits as ‘Very Incorrect’. In this fashion, we form a continuum: ‘Perfectly Correct’ → ‘Wildly Incorrect’. This is a crucial capability, one that cat photos and handwritten digits lack.
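     The data-generation idea could be sketched roughly as follows. Everything here is hypothetical illustration: the function name, the Kepler-like circular orbits, and the labelling thresholds are all invented, not taken from any particular simulator.

```python
import math
import random

def generate_orbit(n_planets, n_steps=100, nudged=0, nudge_size=0.0, seed=None):
    """Simulate ideal circular orbits, optionally nudging planets mid-run,
    and label the run by how much perturbation was injected (invented thresholds)."""
    rng = random.Random(seed)
    radii = [1.0 + i for i in range(n_planets)]
    speeds = [1.0 / math.sqrt(r) for r in radii]  # Kepler-like: farther = slower
    nudge_at = {rng.randrange(n_steps) for _ in range(nudged)}
    offsets = [0.0] * n_planets
    trajectory = []
    for t in range(n_steps):
        if t in nudge_at:
            # Displace a random planet so it stops following the physics
            p = rng.randrange(n_planets)
            offsets[p] += rng.uniform(-nudge_size, nudge_size)
        frame = [((radii[p] + offsets[p]) * math.cos(speeds[p] * t),
                  (radii[p] + offsets[p]) * math.sin(speeds[p] * t))
                 for p in range(n_planets)]
        trajectory.append(frame)
    # Graded label along the Correct -> Incorrect continuum
    total = nudged * nudge_size
    if total == 0:
        label = "Correct"
    elif total < 0.1:
        label = "Slightly Incorrect"
    else:
        label = "Very Incorrect"
    return trajectory, label
```

     Calling `generate_orbit(3, nudged=0)` yields a clean ‘Correct’ orbit; increasing `nudged` and `nudge_size` moves the label along the continuum.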

    Comparing Networks’ Sensitivity and Complexity

    Suppose that you want to benchmark a new neural network against the existing state-of-the-art. Using the orbit mechanics software, you feed each network billions of orbits, and measure their respective accuracy. After equal training, both networks identify when an orbit is ‘Wildly Incorrect’ — they both spot large perturbations. However, your new network is better at identifying the ‘Slightly Incorrect’ orbits! This demonstrates that your new network has greater sensitivity.

    You can also train both networks on orbits with increasing numbers of ‘planets’. When there are only 3 or 4 ‘planets’, both networks perform well. Yet when the number of ‘planets’ grows to 7 or 8, your new network is still accurate, while the other network begins to fail. This demonstrates that your new network handles greater complexity.

    This allows us to measure the value of network depth explicitly. If a 4-layer convolutional neural network handles 3 ‘planets’ well but fails when given 4, then that 4-layer network has ‘3-planet complexity’. To diagnose orbits of 4 or more ‘planets’, we would need to increase the network’s depth.
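     The ‘n-planet complexity’ measurement amounts to a simple search over problem sizes. In this sketch, `train_and_score` is a hypothetical stand-in for training a network of the given depth on generated orbits and returning its test accuracy; the 0.9 threshold is likewise an assumption.

```python
def planet_complexity(train_and_score, depth, max_planets=10, threshold=0.9):
    """Largest n such that a `depth`-layer net scores >= threshold on n planets."""
    best = 0
    for n in range(2, max_planets + 1):
        if train_and_score(depth=depth, n_planets=n) >= threshold:
            best = n
        else:
            break  # accuracy has fallen off; complexity found
    return best

# Toy stand-in: pretend accuracy collapses once n_planets exceeds depth - 1.
fake_score = lambda depth, n_planets: 1.0 if n_planets <= depth - 1 else 0.5
print(planet_complexity(fake_score, depth=4))  # → 3, i.e. '3-planet complexity'
```

     Running this for increasing depths produces exactly the depth-vs-complexity curve discussed below.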

    The Value of Deep Networks

    By successively increasing the depth of a neural network and testing how many ‘planets’ it can handle, we get a metric of network complexity as a function of depth. We can answer the structural question: “If I double the network’s depth, can I double the number of planets?” Perhaps deeper networks handle complexity at an increasing rate: if a 4-layer network handles ‘3-planet complexity’, an 8-layer network might succeed at ‘7-planet complexity’. If that is the case, it is the strongest argument for building insanely deep networks.

    However, if deeper networks show diminishing returns (e.g. 8-layer networks only reach ‘5-planet complexity’), that is an argument for letting many shallow networks operate in tandem. This is currently an unsolved problem. Cat photos will never be able to answer it. Generating data sets along a continuum of correctness and complexity offers us a path to the answer.

      October 23, 2021 2:02 PM IST