OCT Image Recognition with Transfer Learning

olle, published 2019-07-30 18:36 / 3,563 reads

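Abstract: Transfer learning means taking a model that someone else has already trained, such as Inception or ResNet, and using it as a pre-trained model to extract features. With transfer learning we can train a reasonably good model from a small amount of data, and Estimator simplifies machine learning programming, especially in distributed environments. The code is partly based on work by Kashif Rasul, to whom thanks are due.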

Transfer Learning

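Transfer learning means taking a model that someone else has already trained, such as an Inception or ResNet model, and using it as a pre-trained model to extract features for us. A common approach is to remove the last layer of the pre-trained model, replace it with one that fits our own task, and then train on our training set.
Because the pre-trained model has usually been trained for a long time on a large dataset, transfer learning lets us reach fairly good results with a much smaller dataset.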

Because this project uses Estimator, we first give a brief introduction to it.

TF Estimator

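The following is quoted from the official Estimator documentation.

You can run Estimator-based models on a local host or in a distributed multi-server environment without changing your model. Furthermore, you can run Estimator-based models on CPUs, GPUs, or TPUs without recoding your model.

Estimators simplify sharing implementations between model developers.

You can develop a state-of-the-art model with high-level, intuitive code. In short, it is generally much easier to create models with Estimators than with the low-level TensorFlow APIs.

Estimators are themselves built on tf.layers, which simplifies customization.

Estimators build the graph for you.

Estimators provide a safe distributed training loop that controls how and when to: build the graph, initialize variables, start queues, handle exceptions, create checkpoint files and recover from failures, and save summaries for TensorBoard.

When writing an application with Estimators, you must separate the data input pipeline from the model. This separation simplifies experiments with different datasets.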

Example

We can use "tf.keras.estimator.model_to_estimator" to convert a Keras model into an Estimator. The dataset used here is Fashion-MNIST.

Fashion-MNIST labels:
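For reference, the ten standard Fashion-MNIST classes can be written as a plain Python list indexed by label id. This is only a small sketch; the names CLASS_NAMES and label_name are illustrative and do not appear in the article's own code.

```python
# Standard Fashion-MNIST label ids (0-9) and their class names.
# CLASS_NAMES and label_name are illustrative names, not part of the article's code.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_name(label_id):
    """Return the class name for an integer label in the range 0-9."""
    return CLASS_NAMES[label_id]


print(label_name(9))
```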

Loading the data:

import os
import time
import tensorflow as tf
import numpy as np
import tensorflow.contrib as tcon

(train_image,train_lables),(test_image,test_labels)=tf.keras.datasets.fashion_mnist.load_data()
TRAINING_SIZE=len(train_image)
TEST_SIZE=len(test_image)

# Scale pixel values from 0-255 to the range 0-1
train_image=np.asarray(train_image,dtype=np.float32)/255
# 4-D tensor [batch_size,height,width,channels]
# (the shape is passed to ndarray.reshape positionally)
train_image=train_image.reshape((TRAINING_SIZE,28,28,1))
test_image=np.asarray(test_image,dtype=np.float32)/255
test_image=test_image.reshape((TEST_SIZE,28,28,1))

Use tf.keras.utils.to_categorical to convert the labels to a one-hot representation:

# Convert the labels to one-hot vectors
# Number of classes
LABEL_DIMENSIONS=10
train_lables_onehot=tf.keras.utils.to_categorical(
    y=train_lables,num_classes=LABEL_DIMENSIONS
)
test_labels_onehot=tf.keras.utils.to_categorical(
    y=test_labels,num_classes=LABEL_DIMENSIONS
)
train_lables_onehot=train_lables_onehot.astype(np.float32)
test_labels_onehot=test_labels_onehot.astype(np.float32)
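To make the one-hot step concrete, here is a plain-Python sketch of the same transformation. It is not the Keras implementation; the helper name one_hot is an assumption introduced for illustration only.

```python
def one_hot(labels, num_classes):
    """Return one row per integer label, with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Three example labels out of the ten Fashion-MNIST classes.
encoded = one_hot([0, 2, 9], num_classes=10)
print(encoded[1])
```

Each row sums to 1, which is the target format that the categorical_crossentropy loss expects.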

Creating the Keras model:

# Three convolutional layers and two pooling layers, then flatten and
# add fully connected layers with a softmax classifier
inputs=tf.keras.Input(shape=(28,28,1))
conv_1=tf.keras.layers.Conv2D(
    filters=32,
    kernel_size=3,
    # ReLU outputs 0 for negative inputs; LeakyReLU is an alternative in that case
    activation=tf.nn.relu
)(inputs)
pool_1=tf.keras.layers.MaxPooling2D(
    pool_size=2,
    strides=2
)(conv_1)
conv_2=tf.keras.layers.Conv2D(
    filters=64,
    kernel_size=3,
    activation=tf.nn.relu
)(pool_1)
pool_2=tf.keras.layers.MaxPooling2D(
    pool_size=2,
    strides=2
)(conv_2)
conv_3=tf.keras.layers.Conv2D(
    filters=64,
    kernel_size=3,
    activation=tf.nn.relu
)(pool_2)

conv_flat=tf.keras.layers.Flatten()(conv_3)
dense_64=tf.keras.layers.Dense(
    units=64,
    activation=tf.nn.relu
)(conv_flat)

predictions=tf.keras.layers.Dense(
    units=LABEL_DIMENSIONS,
    activation=tf.nn.softmax
)(dense_64)
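The spatial size of the feature maps in this network can be checked with a little arithmetic. The sketch below assumes the Keras defaults that the code above relies on (no padding, convolution stride 1); the helper name valid_out is illustrative.

```python
def valid_out(size, kernel, stride=1):
    """Output size of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


size = 28          # Fashion-MNIST images are 28x28
trace = [size]
# (kernel, stride) for conv_1, pool_1, conv_2, pool_2, conv_3
for kernel, stride in [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]:
    size = valid_out(size, kernel, stride)
    trace.append(size)

# conv_3 has 64 filters, so Flatten yields size * size * 64 values.
flat_units = size * size * 64
print(trace, flat_units)
```

Under these assumptions the feature map shrinks from 28 to 3 pixels per side, so the first Dense layer receives 576 inputs.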

Compiling the model:

model=tf.keras.Model(
    inputs=inputs,
    outputs=predictions
)
model.compile(
    loss="categorical_crossentropy",
    optimizer=tf.train.AdamOptimizer(learning_rate=0.001),
    metrics=["accuracy"]
)

Creating the Estimator

Specify the number of GPUs, then convert the Keras model to an Estimator:

NUM_GPUS=2
strategy=tcon.distribute.MirroredStrategy(num_gpus=NUM_GPUS)
config=tf.estimator.RunConfig(train_distribute=strategy)

estimator=tf.keras.estimator.model_to_estimator(
    keras_model=model,config=config
)

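As noted above, when writing an application with Estimator you must separate the data input pipeline from the model, so we first create an input function. Using prefetch to buffer data ahead of time speeds up reading. Because the transfer learning run below uses a larger dataset, it is worth covering how to optimize the data input pipeline here.

Optimizing the data input pipeline

A TensorFlow data input pipeline consists of three stages:

Extract: read the data, for example from local disk or a remote server

Transform: process the data on the CPU, for example flipping or cropping images and shuffling

Load: hand the data to the GPU for computation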

Reading data:

Typically, while the CPU prepares data the GPU/TPU sits idle, and while the GPU/TPU computes the CPU sits idle, so the hardware is clearly underused.

tf.data.Dataset.prefetch overlaps these two phases: while the GPU/TPU runs training step N, the CPU prepares the data for step N+1, so the otherwise idle time is put to use.
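The overlap that prefetching provides can be illustrated without TensorFlow. The following is a minimal sketch using a background thread and a bounded queue; it is not how tf.data is implemented, and the names prefetch and make_batches are hypothetical.

```python
import queue
import threading


def prefetch(generator, buffer_size):
    """Yield items from generator while a background thread prepares up to buffer_size ahead."""
    buf = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for item in generator:
            buf.put(item)  # blocks while the buffer is full
        buf.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is done:
            return
        yield item


def make_batches(n):
    """Stand-in for CPU-side batch preparation."""
    for step in range(n):
        yield [step] * 4


# The consumer (standing in for the training step) works on batch N
# while the producer thread is already preparing batch N+1.
consumed = [sum(batch) for batch in prefetch(make_batches(5), buffer_size=2)]
print(consumed)
```

The buffer size bounds how far ahead the producer may run, which is the same trade-off between memory and overlap that the prefetch buffer size controls.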

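tf.contrib.data.parallel_interleave reads from multiple files in parallel; the number of files read concurrently is set by cycle_length.

Transforming data:

Use tf.data.Dataset.map to process the elements of the dataset. Because the elements are independent of each other, they can be processed in parallel. The files are read in a deterministic order; if the order does not affect training, the deterministic ordering can be dropped to speed up training.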
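The difference between ordered and unordered parallel processing can be sketched with the standard library. This is an analogy only, not the tf.data mechanism; preprocess is a hypothetical stand-in for a per-element transformation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def preprocess(x):
    """Stand-in for an independent per-element transformation."""
    return x * x


items = list(range(8))

with ThreadPoolExecutor(max_workers=4) as pool:
    # Deterministic: results come back in input order, even if a later
    # element finishes first, so fast elements may wait for slow ones.
    ordered = list(pool.map(preprocess, items))

    # Relaxed: results are taken in completion order, so no element
    # waits for another, but the order can change between runs.
    futures = [pool.submit(preprocess, x) for x in items]
    unordered = [f.result() for f in as_completed(futures)]

print(ordered)
print(sorted(unordered) == ordered)
```

Both variants produce the same set of results; only the relaxed one gives up the ordering guarantee in exchange for not stalling on slow elements.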

def input_fn(images,labels,epochs,batch_size):
    ds=tf.data.Dataset.from_tensor_slices((images,labels))
    # repeat(None) or repeat(-1) repeats the dataset indefinitely
    ds=ds.shuffle(500).repeat(epochs).batch(batch_size).prefetch(batch_size)

    return ds
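Because repeat is applied before batch, batches can straddle epoch boundaries and only the very last batch may be partial. As a rough sketch, the number of batches this pipeline yields for the 60,000 Fashion-MNIST training images, using the batch size of 512 and 5 epochs from the training step, can be computed as follows. The helper name num_batches is illustrative, and how many batches each training step consumes under a distribution strategy is not modelled here.

```python
def num_batches(num_examples, epochs, batch_size):
    """Batches yielded by shuffle -> repeat(epochs) -> batch(batch_size)."""
    total = num_examples * epochs
    full, remainder = divmod(total, batch_size)
    return full + (1 if remainder else 0), remainder


steps, last_batch = num_batches(60000, epochs=5, batch_size=512)
print(steps, last_batch)
```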

Training the model

# Hook that records the time taken by each iteration
class TimeHistory(tf.train.SessionRunHook):
    def begin(self):
        self.times = []
    def before_run(self, run_context):
        self.iter_time_start = time.time()
    def after_run(self, run_context, run_values):
        self.times.append(time.time() - self.iter_time_start)

time_hist = TimeHistory()
BATCH_SIZE = 512
EPOCHS = 5
# The lambda binds the arguments so that train() receives a zero-argument input function
estimator.train(lambda:input_fn(train_image,
                                train_lables_onehot,
                                epochs=EPOCHS,
                                batch_size=BATCH_SIZE),
                hooks=[time_hist])

# Total training time
total_time = sum(time_hist.times)
print(f"total time with {NUM_GPUS} GPU(s): {total_time} seconds")

# Training throughput
avg_time_per_batch = np.mean(time_hist.times)
print(f"{BATCH_SIZE*NUM_GPUS/avg_time_per_batch} images/second with "
      f"{NUM_GPUS} GPU(s)")

The result is shown in the figure:

Because Estimator separates the data input from the model, evaluation is simple.

estimator.evaluate(lambda:input_fn(test_image,
                                   test_labels_onehot,
                                   epochs=1,
                                   batch_size=BATCH_SIZE))

Training a new model with transfer learning

We use the Retinal OCT Images dataset for transfer learning. It has four labels (NORMAL, CNV, DME, DRUSEN) and contains 84,495 images at a resolution of 512*296.

Reading the data and defining input_fn:

labels = ["CNV", "DME", "DRUSEN", "NORMAL"]
train_folder = os.path.join("OCT2017", "train", "**", "*.jpeg")
test_folder = os.path.join("OCT2017", "test", "**", "*.jpeg")

def input_fn(file_pattern, labels,
             image_size=(224,224),
             shuffle=False,
             batch_size=64, 
             num_epochs=None, 
             buffer_size=4096,
             prefetch_buffer_size=None):
    # Build a lookup table that maps each label string to an int64 id
    table = tcon.lookup.index_table_from_tensor(mapping=tf.constant(labels))
    num_classes = len(labels)

    def _map_func(filename):
        # The label is the name of the file's parent directory; os.sep is "/" on Linux
        label = tf.string_split([filename], delimiter=os.sep).values[-2]
        image = tf.image.decode_jpeg(tf.read_file(filename), channels=3)
        image = tf.image.convert_image_dtype(image, dtype=tf.float32)
        image = tf.image.resize_images(image, size=image_size)
        # tf.one_hot returns a one-hot tensor of the given depth, e.g.
        # indices = [0, 1, 2]
        # depth = 3
        # tf.one_hot(indices, depth) returns:
        # [[1., 0., 0.],
        #  [0., 1., 0.],
        #  [0., 0., 1.]]
        return (image, tf.one_hot(table.lookup(label), num_classes))
    
    dataset = tf.data.Dataset.list_files(file_pattern, shuffle=shuffle)

    if num_epochs is not None and shuffle:
        dataset = dataset.apply(
            tcon.data.shuffle_and_repeat(buffer_size, num_epochs))
    elif shuffle:
        dataset = dataset.shuffle(buffer_size)
    elif num_epochs is not None:
        dataset = dataset.repeat(num_epochs)
   
    dataset = dataset.apply(
        tcon.data.map_and_batch(map_func=_map_func,
                                      batch_size=batch_size,
                                      num_parallel_calls=os.cpu_count()))
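    # Only prefetch when a buffer size was given (the default is None)
    if prefetch_buffer_size is not None:
        dataset = dataset.prefetch(buffer_size=prefetch_buffer_size)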
    
    return dataset
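The label handling inside _map_func relies on the directory layout OCT2017/train/LABEL/file.jpeg. Here is a plain-Python sketch of the same idea, assuming "/" as the path separator; the file name in the example is hypothetical, and label_from_path and encode are illustrative helpers rather than part of the article's code.

```python
labels = ["CNV", "DME", "DRUSEN", "NORMAL"]
# Plays the role of the lookup table: label string -> integer id.
label_to_id = {name: i for i, name in enumerate(labels)}


def label_from_path(path, sep="/"):
    """Return the parent directory name, which is the class label."""
    return path.split(sep)[-2]


def encode(path):
    """Return the one-hot vector for the label encoded in the path."""
    idx = label_to_id[label_from_path(path)]
    return [1.0 if i == idx else 0.0 for i in range(len(labels))]


path = "OCT2017/train/DME/example-0001.jpeg"  # hypothetical file name
print(label_from_path(path), encode(path))
```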

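Using the VGG16 network

We load the pre-trained VGG16 network through Keras and retrain only the last few layers (the final VGG16 block plus a new classification layer):

# include_top=False: leave out the three fully connected layers at the top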
keras_vgg16 = tf.keras.applications.VGG16(input_shape=(224,224,3),
                                          include_top=False)
output = keras_vgg16.output
output = tf.keras.layers.Flatten()(output)
prediction = tf.keras.layers.Dense(len(labels),
                                   activation=tf.nn.softmax)(output)
model = tf.keras.Model(inputs=keras_vgg16.input,
                       outputs=prediction)
# Freeze every VGG16 layer except the last four, which stay trainable
for layer in keras_vgg16.layers[:-4]:
    layer.trainable = False

Retraining the model:

# Compile the transfer learning model
model.compile(loss="categorical_crossentropy",
              # Use the default learning rate
              optimizer=tf.train.AdamOptimizer(),
              metrics=["accuracy"])
NUM_GPUS = 2
strategy = tf.contrib.distribute.MirroredStrategy(num_gpus=NUM_GPUS)
config = tf.estimator.RunConfig(train_distribute=strategy)

# Convert to an Estimator
estimator = tf.keras.estimator.model_to_estimator(model,
                                                  config=config)
BATCH_SIZE = 64
EPOCHS = 1
estimator.train(input_fn=lambda:input_fn(train_folder,
                                         labels,
                                         shuffle=True,
                                         batch_size=BATCH_SIZE,
                                         buffer_size=2048,
                                         num_epochs=EPOCHS,
                                         prefetch_buffer_size=4),
                hooks=[time_hist])
# Evaluate the model:
estimator.evaluate(input_fn=lambda:input_fn(test_folder,
                                            labels, 
                                            shuffle=False,
                                            batch_size=BATCH_SIZE,
                                            buffer_size=1024,
                                            num_epochs=1))

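The VGG16 network

As the figure shows, VGG16 has 13 convolutional layers and 3 fully connected layers. Its input is [224,224,3], the convolution kernels are (3,3), and pooling uses a (2,2) window with stride 2. Detailed per-layer parameters are listed in "VGG ILSVRC 16 layers". Because the diagram is large, only part of it is shown here; follow the link for the full version.

VGG16 has a regular, simple structure. Stacking several small (3,3) convolutions works better than one large kernel such as (7,7), because the stack has more non-linearities and fewer parameters. It also confirmed that making the network deeper improves performance (conv + conv + conv + pool in place of conv + pool, which reduces the weights W while still fitting more complex data). The downside is that VGG16 has a very large number of parameters and uses a lot of memory.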
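The parameter argument can be checked with simple arithmetic. The sketch below compares three stacked (3,3) convolutions with a single (7,7) convolution, assuming the same number of input and output channels C in every layer and ignoring biases; the helper names are illustrative.

```python
def conv_weights(kernel, channels, layers=1):
    """Weights in `layers` stacked kernel x kernel convolutions with `channels` in and out."""
    return layers * kernel * kernel * channels * channels


def receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


c = 64
stacked_3x3 = conv_weights(3, c, layers=3)  # 27 * C^2
single_7x7 = conv_weights(7, c)             # 49 * C^2

print(receptive_field(3, layers=3), receptive_field(7, layers=1))
print(stacked_3x3, single_7x7)
```

Both options see a 7x7 region of the input, but the stacked version needs 27C^2 weights instead of 49C^2 and applies three activations instead of one.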

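Summary

With transfer learning we can train a reasonably good model from a small amount of data, and Estimator simplifies machine learning programming, especially in distributed environments. When the input data is large, we should consider optimizing all three stages: Extract, Transform, and Load. Besides VGG16 there are many other choices, such as Inception and ResNet models.

The code is partly based on work by Kashif Rasul, to whom thanks are due.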

When reposting, please cite this article's address: http://m.specialneedsforspecialkids.com/yun/42743.html
