I work in an environment in which computational resources are shared, i.e., we have a few server machines equipped with a few Nvidia Titan X GPUs each.
For small to moderate size models, the 12 GB of the Titan X is usually enough for 2–3 people to run training concurrently on the same GPU. If the models are small enough that a single model does not take full advantage of all the computational units of the GPU, this can actually result in a speedup compared with running one training process after the other. Even in cases where the concurrent access to the GPU does slow down the individual training time, it is still nice to have the flexibility of having multiple users simultaneously train on the GPU.
The problem with TensorFlow is that, by default, it allocates the full amount of available GPU memory when it is launched. Even for a small two-layer neural network, I see that all 12 GB of the GPU memory is used up.
Is there a way to make TensorFlow only allocate, say, 4 GB of GPU memory, if one knows that this amount is enough for a given model?
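In TensorFlow 1.x (the contrib-era API used elsewhere on this page), the usual approach is to pass GPU options through the session config. A minimal sketch, assuming a 12 GB card where a fraction of roughly one third corresponds to about 4 GB:

```python
import tensorflow as tf

# Cap this process at roughly a third of the GPU's memory (~4 GB of 12 GB)
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

# Alternatively, start small and let the allocation grow on demand,
# which also leaves room for other users on the same GPU:
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
```

The fraction approach gives a hard cap; `allow_growth` only allocates what the model actually needs, which tends to suit shared machines better but gives no upper bound.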
From your experience, which is the most effective approach for implementing artificial neural network prototypes? There is a lot of hype about R (free, but I haven't worked with it) and MATLAB (not free); another possible choice is a language like C++/Java/C#. The question mainly targets people who have tried to test some neural network architectures or learning algorithms.
If you would choose a programming language other than the three mentioned above, can you name it and give some explanation of your choice (excepting: "this is the only/most used language known by me")?
Perhaps too general a question, but can anyone explain what would cause a Convolutional Neural Network to diverge?

Specifics: I am using TensorFlow's iris_training model with some of my own data and keep getting

ERROR:tensorflow: Model diverged with loss = NaN.
Traceback...
tensorflow.contrib.learn.python.learn.monitors.NanLossDuringTrainingError: NaN loss during training.

The traceback originated with this line:
tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                               hidden_units=,
                               #optimizer=tf.train.ProximalAdagradOptimizer(learning_rate=0.001, l1_regularization_strength=0.00001),
                               n_classes=11,
                               model_dir="/tmp/iris_model")
I've tried adjusting the optimizer, using a learning rate of zero, and using no optimizer. Any insights into network layers, data size, etc. are appreciated.
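Commonly cited causes of a NaN loss include a learning rate that is too high, labels outside the expected class range, and a cross-entropy term taking the log of zero; a zero learning rate does not fix the last two, since the very first loss evaluation is already NaN. A framework-free sketch of the log-of-zero failure mode, using made-up predictions:

```python
import numpy as np

# One common source of "loss = NaN": a prediction saturates to exactly 0 or 1,
# and the cross-entropy evaluates 0 * log(0), which is NaN.
p = np.array([0.0, 1.0])              # saturated predictions (hypothetical)
target = np.array([0.0, 1.0])
naive = -np.sum(target * np.log(p))   # 0 * log(0) -> nan
print(naive)                          # nan

# Clipping predictions away from exactly 0 and 1 keeps the loss finite
eps = 1e-7
safe = -np.sum(target * np.log(np.clip(p, eps, 1 - eps)))
print(np.isfinite(safe))              # True
```

Numerically stable loss ops (e.g. those that work on logits rather than probabilities) apply this kind of guard internally, which is one reason to prefer them over hand-rolled cross-entropy.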
In much of the available neural network code implemented in TensorFlow, I found that regularization terms are often implemented by manually adding an additional term to the loss value.

My questions are:

1. Is there a more elegant or recommended way of regularization than doing it manually?

2. I also find that get_variable has an argument regularizer. How should it be used? From my observation, if we pass a regularizer to it (such as tf.contrib.layers.l2_regularizer), a tensor representing the regularization term will be computed and added to a graph collection named tf.GraphKeys.REGULARIZATION_LOSSES. Will that collection be used automatically by TensorFlow (e.g., by optimizers during training), or am I expected to use that collection myself?
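As far as I know, the core TF 1.x optimizers do not consume that collection automatically; the usual pattern is to gather it yourself, roughly `loss = base_loss + tf.add_n(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))`. A framework-free sketch of what that bookkeeping amounts to, with made-up weights and losses:

```python
import numpy as np

# Hypothetical weights of two layers
weights = [np.array([[0.5, -1.0], [2.0, 0.0]]), np.array([1.0, -2.0])]

def l2_term(w, scale):
    # Mirrors an L2 regularizer of the tf.contrib.layers.l2_regularizer
    # flavor: scale * sum(w**2) / 2, one term per regularized variable.
    return scale * np.sum(w ** 2) / 2.0

base_loss = 0.25                                        # data-fitting loss (made up)
reg_losses = [l2_term(w, scale=0.01) for w in weights]  # plays the role of the collection
total_loss = base_loss + np.sum(reg_losses)             # what you actually minimize
print(total_loss)
```

The collection mechanism is essentially this: each variable creation appends its penalty term to a shared list, and you sum that list into the loss yourself before handing it to the optimizer.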
I have built a 3-layer neural network to perform a binary mapping (2016 inputs, 288 outputs). I am getting decent results with mean squared error and stochastic gradient descent. My question is: is there a more appropriate loss function for regression when the output is binary?
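For binary targets, per-output binary cross-entropy is the loss most often suggested instead of MSE, because it penalizes confidently wrong predictions much more sharply. A small sketch with made-up predictions illustrating the difference:

```python
import numpy as np

def mse(p, y):
    return np.mean((p - y) ** 2)

def bce(p, y, eps=1e-7):
    # Per-output binary cross-entropy, clipped for numerical safety
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

y = np.array([1.0, 0.0, 1.0])                 # binary targets
confident_wrong = np.array([0.01, 0.99, 0.99])
slightly_wrong  = np.array([0.4, 0.6, 0.9])

# Cross-entropy punishes confident mistakes far more sharply than MSE does
print(mse(confident_wrong, y), bce(confident_wrong, y))
print(mse(slightly_wrong, y), bce(slightly_wrong, y))
```

MSE saturates near 1 per output, while cross-entropy grows without bound as a wrong prediction approaches certainty, which typically yields stronger gradients early in training on binary targets.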
I am confused about the difference between batch and growing-batch Q-learning. Also, if I only have historical data, can I implement growing-batch Q-learning? Thank you!
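As I understand the distinction, pure batch Q-learning fits Q to a fixed logged dataset, while growing-batch alternates between acting with the current Q to collect new transitions and re-fitting, so with only historical data the pure batch setting is the one that applies. A tabular sketch of the pure batch case, on a made-up log of transitions:

```python
import numpy as np

# Hypothetical logged transitions: (state, action, reward, next_state, done)
transitions = [
    (0, 1, 0.0, 1, False),
    (1, 1, 0.0, 2, False),
    (1, 0, 0.0, 0, False),
    (2, 0, 1.0, 2, True),
]

Q = np.zeros((3, 2))          # 3 states, 2 actions
gamma, alpha = 0.9, 0.5

# Pure batch: sweep the *fixed* dataset repeatedly; no new data is collected.
# Growing-batch would interleave: act with the current Q, append the new
# transitions to the log, then repeat these sweeps.
for _ in range(100):
    for s, a, r, s2, done in transitions:
        target = r if done else r + gamma * Q[s2].max()
        Q[s, a] += alpha * (target - Q[s, a])

print(Q[2, 0])   # converges toward the logged terminal reward of 1.0
```

The caveat with a purely historical log is coverage: Q can only be trusted for state-action pairs the logged behavior actually visited.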
We are only using the RNN decoder (without an encoder) for text generation. How is the RNN decoder different from a pure RNN operation?

RNN decoder in TensorFlow: https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/dynamic_rnn_decoder
Pure RNN in TensorFlow: https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn

Thanks for your time.
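The core difference is the input feed: a pure RNN consumes a sequence of inputs that is given in advance, while a decoder at generation time feeds each step's own output back in as the next input. A toy sketch of both loops, with made-up sizes and a hypothetical stand-in cell:

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 5, 4                          # vocab size, hidden size (made up)
E = rng.normal(size=(V, H))          # embedding table
W = rng.normal(size=(H, H))          # recurrent weights
Out = rng.normal(size=(H, V))        # projection to vocab logits

def cell(state, tok):
    # Stand-in RNN cell: combine previous state with the token's embedding
    return np.tanh(state @ W + E[tok])

# Pure RNN (dynamic_rnn-style): the input tokens are given in advance
given_tokens = [1, 3, 2]
state = np.zeros(H)
for tok in given_tokens:
    state = cell(state, tok)

# Decoder-style generation: each step's input is the previous step's output
state, tok, generated = np.zeros(H), 0, []
for _ in range(3):
    state = cell(state, tok)
    tok = int(np.argmax(state @ Out))  # greedy pick feeds the next step
    generated.append(tok)

print(generated)
```

During training the decoder is often run like the first loop (teacher forcing, with ground-truth tokens as inputs); the output-feeding loop is what distinguishes it at inference time.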
What does it mean to "unroll an RNN dynamically"? I've seen this specifically mentioned in the TensorFlow source code, but I'm looking for a conceptual explanation that extends to RNNs in general.

In the TensorFlow rnn method, it is documented:

"If the sequence_length vector is provided, dynamic calculation is performed. This method of calculation does not compute the RNN steps past the maximum sequence length of the minibatch (thus saving computational time)."

But the dynamic_rnn method mentions:

"The parameter sequence_length is optional and is used to copy-through state and zero-out outputs when past a batch element's sequence length. So it's more for correctness than performance, unlike in rnn()."

So does this mean rnn is more performant for variable-length sequences? What is the conceptual difference between dynamic_rnn and rnn?
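The "copy-through state and zero-out outputs" behavior described for dynamic_rnn can be illustrated without TensorFlow. A sketch with a hypothetical one-number-per-element cell, where the second batch element is only 2 steps long:

```python
import numpy as np

def step(state, x):
    # Stand-in RNN cell operating elementwise over the batch (hypothetical)
    return np.tanh(state + x)

T = 4
xs = np.ones((T, 2))                # inputs for a batch of 2, T time steps
seq_len = np.array([4, 2])          # second element's true length is 2

state = np.zeros(2)
outputs = np.zeros((T, 2))
for t in range(T):
    new_state = step(state, xs[t])
    valid = (t < seq_len).astype(float)
    outputs[t] = valid * new_state                     # zero outputs past the length
    state = valid * new_state + (1 - valid) * state    # copy state through unchanged

print(outputs[3])   # second element's output is 0 past step 2
print(state)        # its state is frozen at its step-2 value
```

So every step is still computed for the whole batch, but past an element's length the results are masked out, which is the "correctness rather than performance" point in the dynamic_rnn docstring.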
I had a tough evening today trying to convince one of my colleagues that NLP, or Natural Language Processing, is the superset and Text Analytics is a subset of it. At best, perhaps the two are synonymous and can be used interchangeably.
Is that correct? Does anybody have crystal clarity on whether these terms have a well-defined boundary, or whether they can be used interchangeably?