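Abstract: This article trains a multiple linear regression model with tensorflow and compares the result with scikit-learn. The two input variables, area and number of rooms, differ greatly in magnitude. Without normalization, the area term dominates the objective function and its gradient, and convergence becomes extremely slow.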
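Preface
This article trains a multiple linear regression model with tensorflow and compares it with scikit-learn. The data set comes from Andrew Ng's online open course Deep Learning.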
Code
    #!/usr/bin/env python
    # -*- coding=utf-8 -*-
    # @author: 陳水平
    # @date: 2016-12-30
    # @description: compare multiple linear regression in tensorflow with scikit-learn,
    #               based on data from the deep learning course of Andrew Ng
    # @ref: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex3/ex3.html
    #
    from __future__ import print_function  # same print output on Python 2 and 3

    import numpy as np
    import tensorflow as tf
    from sklearn import linear_model
    from sklearn import preprocessing

    # Read x and y
    x_data = np.loadtxt("ex3x.dat").astype(np.float32)
    y_data = np.loadtxt("ex3y.dat").astype(np.float32)

    # Fit x and y with sklearn first to get a sense of the coefficients.
    reg = linear_model.LinearRegression()
    reg.fit(x_data, y_data)
    print("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_))

    # Now use tensorflow to get similar results.
    # Before feeding x_data into tensorflow, standardize it so that
    # gradient descent performs well; without standardization,
    # convergence would be intolerably slow.
    # Reason: if a feature has a variance that is orders of magnitude larger
    # than the others, it may dominate the objective function and keep the
    # estimator from learning from the other features as expected.
    scaler = preprocessing.StandardScaler().fit(x_data)
    print(scaler.mean_, scaler.scale_)
    x_data_standard = scaler.transform(x_data)

    # Model: y = x W + b, with W of shape (2, 1)
    W = tf.Variable(tf.zeros([2, 1]))
    b = tf.Variable(tf.zeros([1, 1]))
    y = tf.matmul(x_data_standard, W) + b

    # Loss: half the mean squared error
    loss = tf.reduce_mean(tf.square(y - y_data.reshape(-1, 1))) / 2
    optimizer = tf.train.GradientDescentOptimizer(0.3)
    train = optimizer.minimize(loss)

    init = tf.initialize_all_variables()

    sess = tf.Session()
    sess.run(init)

    for step in range(100):
        sess.run(train)
        if step % 10 == 0:
            print(step, sess.run(W).flatten(), sess.run(b).flatten())

    print("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
    # Map the coefficients back to the scale of the raw (unstandardized) input
    print("Coefficients of tensorflow (raw input): K=%s, b=%s" % (sess.run(W).flatten() / scaler.scale_, sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W))))
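The final print statement of the script converts the coefficients learned on standardized inputs back to the raw feature scale. A short derivation of that conversion follows. The symbols \mu_j and \sigma_j are notation introduced here for the entries of scaler.mean_ and scaler.scale_, and z_j denotes the standardized feature; none of these symbols appear in the original script.

```latex
z_j = \frac{x_j - \mu_j}{\sigma_j}, \qquad
\hat{y} = \sum_j W_j z_j + b
        = \sum_j \frac{W_j}{\sigma_j}\, x_j
          + \Bigl( b - \sum_j \frac{\mu_j W_j}{\sigma_j} \Bigr)
```

Reading off the terms, the raw-scale slope is K_j = W_j / \sigma_j and the raw-scale intercept is b - \sum_j \mu_j W_j / \sigma_j, which is what W / scaler.scale_ and b - np.dot(scaler.mean_ / scaler.scale_, W) compute.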
The output of the tensorflow script is as follows:
    Coefficients of sklearn: K=[ 139.21066284 -8738.02148438], b=89597.927966
    [ 2000.6809082 3.17021275] [ 7.86202576e+02 7.52842903e-01]
    0 [ 31729.23632812 16412.6484375 ] [ 102123.7890625]
    10 [ 97174.78125 5595.25585938] [ 333681.59375]
    20 [ 106480.5703125 -3611.31201172] [ 340222.53125]
    30 [ 108727.5390625 -5858.10302734] [ 340407.28125]
    40 [ 109272.953125 -6403.52148438] [ 340412.5]
    50 [ 109405.3515625 -6535.91503906] [ 340412.625]
    60 [ 109437.4921875 -6568.05371094] [ 340412.625]
    70 [ 109445.296875 -6575.85644531] [ 340412.625]
    80 [ 109447.1875 -6577.75097656] [ 340412.625]
    90 [ 109447.640625 -6578.20654297] [ 340412.625]
    Coefficients of tensorflow (input should be standardized): K=[ 109447.7421875 -6578.31152344], b=[ 340412.625]
    Coefficients of tensorflow (raw input): K=[ 139.21061707 -8737.9609375 ], b=[ 89597.78125]
Thoughts
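For gradient descent, whether the variables are standardized matters a great deal. In this example one variable is the area and the other is the number of rooms, and their magnitudes differ greatly: the scaler reports means of about 2000.68 and 3.17, and standard deviations of about 786.2 and 0.75. Without standardization, the area term dominates the objective function and its gradient, and convergence becomes extremely slow.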
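The effect can be reproduced without tensorflow. The following is a minimal NumPy sketch, not the article's code: the data is synthetic (a made-up stand-in for ex3x.dat and ex3y.dat, with coefficients chosen only to resemble the output above), and the helper gradient_descent and the raw-input step size of 1e-7 are assumptions made for illustration. It runs 100 steps of batch gradient descent on the same half mean squared error loss, once on standardized features and once on raw features.

```python
import numpy as np


def gradient_descent(X, y, lr, steps):
    """Batch gradient descent on the loss mean((X w + b - y)^2) / 2.

    Hypothetical helper for this sketch; returns (w, b, final loss).
    """
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(steps):
        err = X @ w + b - y          # residuals of the current model
        w -= lr * (X.T @ err) / m    # gradient of the loss w.r.t. w
        b -= lr * err.mean()         # gradient of the loss w.r.t. b
    loss = np.mean((X @ w + b - y) ** 2) / 2
    return w, b, loss


# Synthetic data (assumption): area in the thousands, rooms in single digits.
rng = np.random.default_rng(0)
area = rng.uniform(800, 4500, size=50)
rooms = rng.integers(1, 6, size=50).astype(float)
X = np.column_stack([area, rooms])
y = 140.0 * area - 8700.0 * rooms + 90000.0 + rng.normal(0, 1000, size=50)

# Standardize each column: zero mean, unit (population) standard deviation.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Reference: the best achievable loss, from an exact least-squares fit.
A = np.column_stack([X, np.ones(len(X))])
sol, *_ = np.linalg.lstsq(A, y, rcond=None)
loss_ols = np.mean((A @ sol - y) ** 2) / 2

# Standardized features: the article's step size of 0.3 works.
_, _, loss_std = gradient_descent(Z, y, lr=0.3, steps=100)

# Raw features: a step size of 0.3 would diverge on this data, so a much
# smaller one is used, and the rooms weight and intercept barely move.
_, _, loss_raw = gradient_descent(X, y, lr=1e-7, steps=100)

print("least-squares loss:       ", loss_ols)
print("standardized, 100 steps:  ", loss_std)
print("raw features, 100 steps:  ", loss_raw)
```

With standardized features, every direction of the loss surface has a similar curvature, so a single step size suits all parameters. With raw features, the step size has to be small enough for the steep area direction, which leaves almost no progress along the rooms and intercept directions.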