こんにちはKeita_Nakamori(´・ω・`)です。
前回flickrから取得した画像データをTensor Flowが読めるように数値データに変換していきます。
必要なモジュールをインストール
- pip install Pillow :
- pip install scikit-learn :
Pillow-6.1.0とscikit-learn-0.21.3が入りました
スクリプト:generate_inputdata.py
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 |
from PIL import Image # image operation import os # get file list import glob # image treatment import numpy as np # culculation from sklearn.model_selection import train_test_split # split data # initialize parameters classes = ['car', 'motorbike'] # define 2 classses num_classes = len(classes) # the number of classes image_size = 150 # pixels of width or height # read images and cnvert numpy array X = [] # ready to make list Y = [] # ready to make list # numbering and for index, class_label in enumerate(classes): # create class_label directories (car or motorbike) photos_dir ='./' + class_label # search jpeg fileb and create files object files = glob.glob(photos_dir + '/*.jpg') for i, file in enumerate(files): # open image files as instance image = Image.open(file) # convert image into RGB value data image = image.convert('RGB') # align to the same size (just in case) image = image.resize((image_size, image_size)) # convert RGB value into numpy array data = np.asarray(image) # normalize data = data / 255.0 # append value to list X.append(data) Y.append(index) # convert list into numpy array X = np.array(X) Y = np.array(Y) # split data (training and test) X_train, X_test, y_train, y_test = train_test_split(X, Y) # replace 1 variable xy =(X_train, X_test, y_train, y_test) # save values as npy file np.save('./image_files.npy', xy) |
実行しましょう
(djangoai) C:\Users\keita\anaconda_projects\djangoai>python generate_inputdata.py
フォルダを見ると新しく「image_files.npy」 609,516,317 (609MB?)のデータが生成されました。
次回
Web Application: 第5回 はじめてのwebアプリ