Implementing Gradient Descent in TensorFlow with tf.gradients

Posted by ckllj on 2019-07-30 15:10 / 557 views

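Abstract: Before implementing gradient descent with tf.gradients, we first solve the MNIST classification problem with one of TensorFlow's built-in optimizers, GradientDescentOptimizer. We then run softmax regression on MNIST with tf.gradients: to implement gradient descent, I drop the optimizer code and write the weight update myself, following the gradient descent update rule.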

Author: chen_h
WeChat & QQ: 862251340
WeChat official account: coderpai
Jianshu: http://www.jianshu.com/p/13e0...

One of the reasons I like TensorFlow is that it can compute the gradient of a function automatically. We only need to define the function and then call tf.gradients. It really is that simple.

Let's walk through a concrete example.

Softmax regression on MNIST with TensorFlow's built-in optimizer

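Before implementing gradient descent with tf.gradients, let's first solve the MNIST classification problem with one of TensorFlow's built-in optimizers, GradientDescentOptimizer. The listings in this article use the TensorFlow 1.x graph API (tf.placeholder, tf.Session).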

import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 10
batch_size = 100
display_step = 1


# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Start training
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            _, c = sess.run([optimizer, cost],
                            feed_dict={x: batch_xs, y: batch_ys})

            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", "%04d" % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print ("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy for 3000 examples
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print ("Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))
    
    
# Output
    
# Extracting /tmp/data/train-images-idx3-ubyte.gz
# Extracting /tmp/data/train-labels-idx1-ubyte.gz
# Extracting /tmp/data/t10k-images-idx3-ubyte.gz
# Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
# Epoch: 0001 cost= 1.184285608
# Epoch: 0002 cost= 0.665428013
# Epoch: 0003 cost= 0.552858426
# Epoch: 0004 cost= 0.498728328
# Epoch: 0005 cost= 0.465593693
# Epoch: 0006 cost= 0.442609185
# Epoch: 0007 cost= 0.425552949
# Epoch: 0008 cost= 0.412188290
# Epoch: 0009 cost= 0.401390140
# Epoch: 0010 cost= 0.392354651
# Optimization Finished!
# Accuracy: 0.873333
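To make the cost expression in the listing above more concrete, here is a minimal pure-Python sketch of softmax followed by mean cross-entropy, without TensorFlow. The helper names (softmax, cross_entropy, mean_cost) are my own and only for illustration. It also shows why the first-epoch cost starts near ln(10): with W and b initialised to zero, every class gets probability 0.1.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs):
    """-sum(y * log(pred)) for one example; zero labels contribute nothing."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)

def mean_cost(batch_logits, batch_labels):
    """Average the per-example cross-entropy over the batch."""
    losses = [cross_entropy(y, softmax(z))
              for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

# Zero weights and bias give all-zero logits for the 10 digit classes.
zero_logits = [[0.0] * 10]
label = [[0.0] * 9 + [1.0]]  # made-up one-hot label for the digit 9
initial_cost = mean_cost(zero_logits, label)
print(initial_cost)  # close to ln(10), about 2.3026
```

The TensorFlow expression does the same thing for a whole batch at once: reduce_sum over axis 1 is the per-example sum, and reduce_mean averages over the batch.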

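In the listing above, the built-in optimizer computes the gradients and updates the weights to minimize the loss for us. What if we want to compute the gradients and update the weights ourselves? That is where tf.gradients comes in.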

Softmax regression on MNIST with tf.gradients

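According to the gradient descent rule, the parameters are updated as follows, where α is the learning rate:

    W ← W − α · ∂cost/∂W
    b ← b − α · ∂cost/∂b

To implement gradient descent, I will drop the optimizer code and write the weight update myself.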
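The update rule does not depend on TensorFlow. As a minimal sketch, here it is applied to a made-up one-dimensional cost, cost(w) = (w − 3)², whose gradient 2(w − 3) is written by hand. The function names and the toy cost are assumptions for illustration only.

```python
def gradient_descent(grad_fn, w0, learning_rate, steps):
    """Repeatedly apply w <- w - learning_rate * grad_fn(w)."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad_fn(w)
    return w

def toy_grad(w):
    """Gradient of the hypothetical cost (w - 3)^2."""
    return 2.0 * (w - 3.0)

w_final = gradient_descent(toy_grad, w0=0.0, learning_rate=0.1, steps=100)
print(w_final)  # approaches the minimiser w = 3
```

In the TensorFlow version the only differences are that the gradient comes from tf.gradients instead of a hand-written function, and the update is written back into the variable with assign.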

Since the model has a weight matrix W and a bias vector b, we need the gradient of the cost with respect to both. The code looks like this:

# Computing the gradient of cost with respect to W and b
grad_W, grad_b = tf.gradients(xs=[W, b], ys=cost)

# Gradient Step
new_W = W.assign(W - learning_rate * grad_W)
new_b = b.assign(b - learning_rate * grad_b)

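These three lines merely replace the single optimizer line from before, so why go to the trouble? Because if you need the gradient of your own loss function and don't want to derive it by hand, TensorFlow can do it for you.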

The computation graph is now built, so all that remains is to run it in a session. Let's try it.

# Fit training using batch data
_, _, c = sess.run([new_W, new_b, cost], feed_dict={x: batch_xs, y: batch_ys})

We don't need the values returned by new_W and new_b, so I discard them.

The full code is as follows:

import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 10
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))

# Computing the gradient of cost with respect to W and b
grad_W, grad_b = tf.gradients(xs=[W, b], ys=cost)

# Gradient step
new_W = W.assign(W - learning_rate * grad_W)
new_b = b.assign(b - learning_rate * grad_b)

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            _, _, c = sess.run([new_W, new_b, cost],
                               feed_dict={x: batch_xs, y: batch_ys})

            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", "%04d" % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print ("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy for 3000 examples
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print ("Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))
    
    
# Output
# Epoch: 0001 cost= 1.183741399
# Epoch: 0002 cost= 0.665312284
# Epoch: 0003 cost= 0.552796521
# Epoch: 0004 cost= 0.498697014
# Epoch: 0005 cost= 0.465521633
# Epoch: 0006 cost= 0.442611256
# Epoch: 0007 cost= 0.425528946
# Epoch: 0008 cost= 0.412203073
# Epoch: 0009 cost= 0.401364554
# Epoch: 0010 cost= 0.392398663
# Optimization Finished!
# Accuracy: 0.874

Softmax regression with the explicit gradient formula

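For the weight matrix W, the gradient of the mean cross-entropy cost over a batch of N examples is:

    ∂cost/∂W = −(1/N) · xᵀ (y − pred)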

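As shown above, we can implement this gradient equation directly, without tf.gradients or TensorFlow's built-in optimizers. Note that the listing uses the batch sum xᵀ (y − pred) without the 1/N averaging factor, so with the same learning_rate each step is effectively batch_size times larger than in the previous two listings. That is consistent with the faster drop in cost in its output. The full code is as follows: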

import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 10
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

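# Construct model (note: the bias b is left out of the prediction here)
pred = tf.nn.softmax(tf.matmul(x, W)) # Softmax

# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))

# Hand-written gradient of the cross-entropy with respect to W,
# summed over the batch (no 1/batch_size factor).
W_grad = -tf.matmul(tf.transpose(x), y - pred)
# b does not enter pred in this listing, so this update has no effect on the
# predictions. If b were added to the logits, its summed gradient would be
# -tf.reduce_sum(y - pred, axis=0).
b_grad = -tf.reduce_mean(tf.matmul(tf.transpose(x), y - pred), reduction_indices=0)

# Gradient step
new_W = W.assign(W - learning_rate * W_grad)
new_b = b.assign(b - learning_rate * b_grad)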

init = tf.global_variables_initializer()


with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            _, _, c = sess.run([new_W, new_b, cost], feed_dict={x: batch_xs, y: batch_ys})
            
        
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print ("Epoch:", "%04d" % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print ("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy for 3000 examples
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print ("Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))
    
    
# Output
# Extracting /tmp/data/train-images-idx3-ubyte.gz
# Extracting /tmp/data/train-labels-idx1-ubyte.gz
# Extracting /tmp/data/t10k-images-idx3-ubyte.gz
# Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
# Epoch: 0001 cost= 0.432943137
# Epoch: 0002 cost= 0.330031527
# Epoch: 0003 cost= 0.313661941
# Epoch: 0004 cost= 0.306443773
# Epoch: 0005 cost= 0.300219418
# Epoch: 0006 cost= 0.298976618
# Epoch: 0007 cost= 0.293222957
# Epoch: 0008 cost= 0.291407861
# Epoch: 0009 cost= 0.288372261
# Epoch: 0010 cost= 0.286749691
# Optimization Finished!
# Accuracy: 0.898
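A hand-derived gradient is easy to get subtly wrong, so it is worth checking against finite differences. The sketch below does that in pure Python for a bias-free softmax regression with the mean cross-entropy cost, comparing −(1/N) · xᵀ (y − pred) with a numerical estimate. The tiny data set, the weights and all function names are made up for illustration.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(X, W):
    """Row-wise softmax(X @ W); X is N x D, W is D x K."""
    K = len(W[0])
    return [softmax([sum(x[d] * W[d][k] for d in range(len(x)))
                     for k in range(K)]) for x in X]

def cost(X, Y, W):
    """Mean cross-entropy over the batch."""
    P = predict(X, W)
    total = sum(yk * math.log(pk)
                for y, p in zip(Y, P) for yk, pk in zip(y, p))
    return -total / len(X)

def analytic_grad(X, Y, W):
    """Closed form: -(1/N) * X^T (Y - P)."""
    P = predict(X, W)
    N, D, K = len(X), len(W), len(W[0])
    return [[-sum(X[n][d] * (Y[n][k] - P[n][k]) for n in range(N)) / N
             for k in range(K)] for d in range(D)]

def numeric_grad(X, Y, W, eps=1e-6):
    """Central finite differences, perturbing one weight at a time."""
    G = [[0.0] * len(W[0]) for _ in W]
    for d in range(len(W)):
        for k in range(len(W[0])):
            old = W[d][k]
            W[d][k] = old + eps
            up = cost(X, Y, W)
            W[d][k] = old - eps
            down = cost(X, Y, W)
            W[d][k] = old  # restore the weight
            G[d][k] = (up - down) / (2 * eps)
    return G

# Made-up toy problem: 3 examples, 2 features, 3 classes.
X = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3]]
Y = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
W = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]

ga = analytic_grad(X, Y, W)
gn = numeric_grad(X, Y, W)
max_diff = max(abs(a - n) for ra, rn in zip(ga, gn) for a, n in zip(ra, rn))
print(max_diff)  # should be tiny if the closed form is right
```

Dropping the division by N in analytic_grad reproduces the summed form used in the TensorFlow listing, which differs from the gradient of the mean cost only by the factor N.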

How does TensorFlow compute gradients?

You may be wondering how TensorFlow computes the gradient of a function.

TensorFlow uses a technique called automatic differentiation; see the Wikipedia article on it for details.
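To give a feel for the idea, here is a toy reverse-mode automatic differentiation sketch in pure Python. It is not TensorFlow's implementation; the Var class and backward function are hypothetical names, and only scalar addition and multiplication are supported. Each operation records its inputs together with the local derivative, and backward applies the chain rule from the output back to the inputs.

```python
class Var:
    """A scalar that remembers which values it was computed from."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (input Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value,
                   ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node."""
    order, seen = [], set()

    def visit(node):
        # Depth-first walk that lists inputs before the nodes using them.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Go from the output back to the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad

x = Var(2.0)
y = Var(5.0)
f = x * x + x * y  # f = x^2 + x*y
backward(f)
print(x.grad, y.grad)  # df/dx = 2x + y = 9.0, df/dy = x = 2.0
```

Conceptually, tf.gradients does the analogous thing on the TensorFlow graph: it walks back from ys to xs and adds the gradient operations for each node it passes.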

I hope you found this article helpful.
