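Web Application: Part 3 - Collecting Images with flickr

Hello, this is Keita_Nakamori (´・ω・`).

I am going to pull images from the flickr site and use them as sample data to feed into TensorFlow.

Signing up and getting an API key

Click Developer at the very bottom of the top page to request an API key. First, sign up by registering your email address and so on.

Once you have signed up, you can request an API key, so go ahead and get one.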
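
Once you have the key and secret, one option is to keep them out of the script itself. The following is only a minimal sketch under my own assumptions: the function name load_credentials and the variable names FLICKR_API_KEY and FLICKR_API_SECRET are hypothetical, not something flickr or flickrapi defines.

```python
import os


def load_credentials(env=os.environ):
    """Read the API key and secret from environment variables."""
    key = env.get('FLICKR_API_KEY')        # hypothetical variable name
    secret = env.get('FLICKR_API_SECRET')  # hypothetical variable name
    if not key or not secret:
        raise RuntimeError('Set FLICKR_API_KEY and FLICKR_API_SECRET first.')
    return key, secret
```

With both variables set in the Anaconda Prompt, key, secret = load_credentials() would supply the two values instead of hard-coded strings.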

Installing flickrapi

Install flickrapi, a module for accessing the flickr API.

In the terminal inside VS Code, running

PS C:\Users\keita\anaconda_projects\djangoai> pip install flickrapi

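installed flickrapi-2.4.0. However, that copy cannot be used from the project, presumably because the terminal was not running inside the djangoai virtual environment.

Instead, open the Anaconda Prompt, enter the virtual environment with conda activate djangoai, and run the install there:

(djangoai) C:\Users\keita>pip install flickrapi

This installed flickrapi-2.4.0 into the environment.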
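
To confirm that the module is visible from the environment you are actually running, a quick check like the one below may help. It is just a sketch using the standard library; the helper name is_installed is my own.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the running interpreter can find a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# True only when flickrapi is installed for this interpreter
print('flickrapi installed:', is_installed('flickrapi'))
```

If this prints False inside the djangoai environment, the package most likely went into a different Python installation.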

A script to download images

Now let's fetch images with flickrapi. Create a new file, download_images.py, in the djangoai folder.

from flickrapi import FlickrAPI
from urllib.request import urlretrieve
import os
import time
import sys

key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
secret = 'xxxxxxxxxxxxxxxx'
wait_time = 1  # wait 1 second between downloads

# the first command-line argument (sys.argv[1]) is the search keyword
keyword = sys.argv[1]

# directory for saving image files
save_dir = './' + keyword

# client object for accessing the API
flickr = FlickrAPI(key, secret, format='parsed-json')

# result of the API call
result = flickr.photos.search(
    text = keyword,
    per_page = 400,           # number of images to request
    media = 'photos',         # collect photos only
    sort = 'relevance',       # most relevant results first
    safe_search = 1,          # 1 = safe content only
    extras = 'url_q,license'  # include the image URL and license data
)

# take the 'photos' entry out of the result object
photos = result['photos']

# go through each photo in turn
for photo in photos['photo']:

    # extract the URL
    url_q = photo['url_q']

    # build the file path: directory / photo id .jpg
    filepath = save_dir + '/' + photo['id'] + '.jpg'

    # the save directory must be created (mkdir)
    # before running this script
    # skip files that have already been downloaded
    if os.path.exists(filepath):
        continue

    # download and save the data
    # arg1 = download URL, arg2 = save directory/file name
    urlretrieve(url_q, filepath)
    print('url_q, filepath : ', url_q, '  ', filepath)

    # pause between downloads
    time.sleep(wait_time)

print('==== Script is done. ====')
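
The script above expects the save folder to exist and every entry to carry a url_q field. As a hedged sketch of how those two assumptions could be relaxed, the helpers below create the folder on demand and skip entries without a URL. The names build_filepath and pending_downloads are hypothetical, and the sample entries in the usage note are made up for illustration.

```python
import os


def build_filepath(save_dir, photo_id):
    """Create save_dir if it is missing and return the path for one photo."""
    os.makedirs(save_dir, exist_ok=True)
    return os.path.join(save_dir, photo_id + '.jpg')


def pending_downloads(photo_list, save_dir):
    """Return (url, filepath) pairs for photos that still need downloading."""
    pending = []
    for photo in photo_list:
        url = photo.get('url_q')
        if url is None:
            # assumption: an entry may come back without a url_q field
            continue
        filepath = build_filepath(save_dir, photo['id'])
        if os.path.exists(filepath):
            # already downloaded on an earlier run
            continue
        pending.append((url, filepath))
    return pending
```

For example, pending_downloads([{'id': '1', 'url_q': 'https://example.com/1.jpg'}, {'id': '2'}], './motorbike') returns one pair for photo 1 only, and each pair can then be passed to urlretrieve.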

Creating folders to save the images

Create folders in advance for the car and motorbike images.

(djangoai) C:\Users\keita\anaconda_projects\djangoai>mkdir car
(djangoai) C:\Users\keita\anaconda_projects\djangoai>mkdir motorbike

Running the script

Enter the virtual environment (conda activate djangoai) and start by collecting motorbike images.

(djangoai) C:\Users\keita\anaconda_projects\djangoai>python download_images.py motorbike

The images came pouring in.
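
To see how many images actually arrived, a small check along these lines could be run from the djangoai folder. It is a sketch only, and count_images is a name I made up.

```python
import os


def count_images(folder, extension='.jpg'):
    """Count the files in folder whose names end with extension."""
    if not os.path.isdir(folder):
        return 0
    return sum(1 for name in os.listdir(folder)
               if name.lower().endswith(extension))


# report the number of images saved so far in each folder
for folder in ('car', 'motorbike'):
    print(folder, count_images(folder))
```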

Next time

Web Application: Part 4 - My First Web App

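Web Application: Part 2 - Installing and Setting Up VS Code

Hello, this is Keita_Nakamori (´・ω・`).

Last time, the TensorFlow test run succeeded.

From here on I will use VS Code for app development. Goodbye, PyCharm...

Installing VS Code

Go to https://code.visualstudio.com/, choose the Windows version, download the installer, and run it.

Make sure to add it to PATH. After restarting, click the Extensions button on the left side of the screen, type python in the search box, and install the Python extension.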

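Trying out a data collection program

Now click the Explorer button in the upper left and select the keita > anaconda_projects > djangoai folder created in Part 1.

It contains tensorflow_test.ipynb, which was created in Jupyter Notebook.

Let's do Hello World for practice

Click the New File button to the right of the djangoai folder and create hello_world.py.

You can now write a script in the right-hand window, so just print hello world.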
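
For reference, hello_world.py can be as small as the sketch below. The variable name message is just my choice.

```python
# hello_world.py
message = 'Hello, World!'
print(message)
```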

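How to run it

Choosing which Python version to use

Go to Menu > View > Command Palette > Python: Select Interpreter and

choose Python 3.7.4 64-bit from djangoai:conda, the virtual environment created in Part 1.

The blue bar at the bottom of the screen shows that the interpreter has changed.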
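
If you want to double-check from inside a script which interpreter is in use, something like the following should work. This is a minimal sketch: describe_interpreter is a hypothetical helper, and CONDA_DEFAULT_ENV is the variable conda sets on activation, so it will be None when the script runs outside an activated conda environment.

```python
import os
import sys


def describe_interpreter():
    """Collect the path, version, and conda environment of the running Python."""
    return {
        'executable': sys.executable,
        'version': '{}.{}.{}'.format(*sys.version_info[:3]),
        'conda_env': os.environ.get('CONDA_DEFAULT_ENV'),  # None outside conda
    }


for name, value in describe_interpreter().items():
    print(name, ':', value)
```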

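VS Code also warns that pylint is not installed, so install it. It asks whether to use conda or pip; I chose conda.

Running

Right-click inside the script editor and choose Run Python File in Terminal.

The script then runs in the terminal panel.

Next time

Web Application: Part 3 - My First Web App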

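Web Application: Part 1 - Installing Anaconda3 and Building a Virtual Environment

Hello, this is Keita_Nakamori (´・ω・`).

I would like to try building a small web app.

Language: Python
Machine learning: TensorFlow
Framework: Django
Database: MySQL
Server: Xserver

These are roughly the tools I will use.

Installing Anaconda

I installed Anaconda 2019.07 for Windows Installer.

I checked both the Path and Register options.

Create the virtual environment djangoai and install tensorflow while we are at it.

Open the Anaconda Prompt and run: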

(base) C:\Users\keita>conda create -n djangoai tensorflow

What is -n? It is short for --name and gives the name of the environment to create.

# To activate this environment, use
#
# $ conda activate djangoai
#
# To deactivate an active environment, use
#
# $ conda deactivate

So, following that message, run

$ conda activate djangoai

and try the environment out. Let's check the installed modules.

(djangoai) C:\Users\keita>pip list

Package Version
-------------------- ---------
absl-py 0.7.1
astor 0.8.0
certifi 2019.6.16
gast 0.2.2
grpcio 1.16.1
h5py 2.9.0
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.0
Markdown 3.1.1
mkl-fft 1.0.14
mkl-random 1.0.2
mkl-service 2.3.0
numpy 1.16.5
pip 19.2.2
protobuf 3.8.0
pyreadline 2.1
scipy 1.3.1
setuptools 41.0.1
six 1.12.0
tensorboard 1.14.0
tensorflow 1.14.0
tensorflow-estimator 1.14.0
termcolor 1.1.0
Werkzeug 0.15.5
wheel 0.33.4
wincertstore 0.2
wrapt 1.11.2

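tensorflow 1.14.0 is indeed installed. OK.

The Keras-related packages (Keras-Applications and Keras-Preprocessing) were pulled in automatically as well.

Let's leave the virtual environment.

(djangoai) C:\Users\keita>conda deactivate

(base) C:\Users\keita>

The (djangoai) prefix switches to (base), showing that we have left the virtual environment.

Anaconda Navigator

Next, let's try Anaconda Navigator.

Start Anaconda Navigator and switch the "Applications on (base)" selector to djangoai. Jupyter Notebook is not yet installed in this environment, so press the Install button. When it finishes, the button changes to Launch, so press it.

This starts the usual Jupyter Notebook.

For the djangoai app we are going to build, create an anaconda_projects folder under the user folder keita, create a djangoai folder under that, and put the scripts there.

Once Jupyter Notebook is running, move to the anaconda_projects > djangoai folder and choose New > Python3 to create a notebook.

TensorFlow test run

I found a test script for beginners, so let's run it.

Get the sample data.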

import tensorflow as tf

# download the MNIST data as sample data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# check the shape of the input data
print(x_train.shape) # (60000, 28, 28)
# 60000 samples, each a 28-row x 28-column image of a digit

# look at a single sample
print(x_train[0].shape) # (28, 28)
print(x_train[0])       # holds values from 0 to 255

# check the shape of the label data
print(y_train.shape) # (60000,) a 1-D array of 60000 labels
print(y_train) # [5 0 4 ... 5 6 8]

Preprocess the data so it is easier to train on

# normalize the input values from 0-255 to 0-1 by dividing by 255
x_train, x_test = x_train / 255.0, x_test / 255.0
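
To make the arithmetic concrete, here is a tiny sketch of the same scaling on a made-up 2x3 "image". Plain Python lists stand in for the NumPy arrays so it runs without TensorFlow. The names normalize and sample are mine, and the pixel values are invented.

```python
def normalize(image, max_value=255.0):
    """Scale every pixel of a 2-D list from 0-max_value down to 0-1."""
    return [[pixel / max_value for pixel in row] for row in image]


# a tiny 2x3 stand-in for one 28x28 MNIST image
sample = [[0, 128, 255],
          [64, 32, 16]]
scaled = normalize(sample)
print(scaled)
```

Every value ends up between 0 and 1, with 0 staying 0 and 255 becoming 1.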

Machine learning model, training, and evaluation

# build the neural network model
model = tf.keras.models.Sequential([
    # input layer: declare the input shape and flatten each image into one row
    tf.keras.layers.Flatten(input_shape=(28, 28)),

    # hidden layer: 128 nodes, ReLU activation
    tf.keras.layers.Dense(128, activation='relu'),

    # randomly zero out 20% of the units during training to reduce overfitting
    tf.keras.layers.Dropout(0.2),

    # output layer: 10 nodes, one for each digit 0-9,
    # with softmax activation
    tf.keras.layers.Dense(10, activation='softmax')
])

# compile the model
model.compile(optimizer='adam',
              # loss function: the quantity to optimize (here, minimize)
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# train for 5 epochs
model.fit(x_train, y_train, epochs=5)
'''
Epoch 1/5
60000/60000 [==============================] - 4s 68us/sample - loss: 0.2970 - acc: 0.9146
Epoch 2/5
60000/60000 [==============================] - 4s 65us/sample - loss: 0.1440 - acc: 0.9567
Epoch 3/5
60000/60000 [==============================] - 4s 75us/sample - loss: 0.1077 - acc: 0.9667
Epoch 4/5
60000/60000 [==============================] - 5s 84us/sample - loss: 0.0884 - acc: 0.9731
Epoch 5/5
60000/60000 [==============================] - 4s 74us/sample - loss: 0.0747 - acc: 0.9768
'''

# evaluate on the test data
model.evaluate(x_test, y_test)
'''
10000/10000 [==============================] - 0s 41us/sample - loss: 0.0709 - acc: 0.9780
loss and accuracy
[0.07094512566379271, 0.978]
'''
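
For anyone curious what softmax and sparse_categorical_crossentropy compute for a single sample, here is a rough sketch in plain Python. It is not how Keras implements them internally, only the textbook formulas, and the input scores are made up.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: negative log of the probability of the true class."""
    return -math.log(probs[label])


probs = softmax([2.0, 1.0, 0.1])   # made-up scores for three classes
print(probs)
print(sparse_categorical_crossentropy(probs, 0))  # true class is index 0
```

The loss shrinks as the probability given to the correct class approaches 1, which is what training pushes toward.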

Yes, it ran properly.

Coming up next

Web Application: Part 2 - My First Web App
