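Problem description
While using a fine-tuned image classification model to extract features from a batch of images, I noticed that the longer the job ran, the longer each image took to process and the more memory the process used. Similar issues have been reported for TensorFlow.
Problematic code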
    ...
    with tf.Graph().as_default():
        with slim.arg_scope(inception_resnet_v2.inception_resnet_v2_arg_scope()):
            preprocessed_image = tf.placeholder(tf.float32, shape=(image_size, image_size, 3), name="preprocessed_image")
            processed_image = tf.expand_dims(preprocessed_image, 0)
            logits, end_points = inception_resnet_v2.inception_resnet_v2(processed_image, num_classes=_NUM_CLASSES, is_training=False)
            probabilities = logits
            init_fn = slim.assign_from_checkpoint_fn(sys.argv[1], slim.get_model_variables('InceptionResnetV2'))

        with tf.Session() as sess:
            init_fn(sess)

            for line in sys.stdin:
                start_time = time.time()
                line = line.strip(" \r\n")
                if len(line) == 0:
                    continue
                try:
                    image_string_tmp = tf.gfile.FastGFile(line, 'rb').read()
                    # Problem: the next two calls add new ops to the graph on every iteration.
                    image_decode_tmp = tf.image.decode_image(image_string_tmp, channels=3)
                    preprocessed_image_tmp = inception_preprocessing.preprocess_image(image_decode_tmp, image_size, image_size, is_training=False)
                    preprocessed_image_tmp_val = sess.run([preprocessed_image_tmp])
                    np_probabilities = sess.run(probabilities, {preprocessed_image: preprocessed_image_tmp_val[0]})
                    np_probabilities = np_probabilities[0, 0:]
                    imgfea = np_probabilities.tolist()
                    sys.stdout.write("%s\t%s\n" % (line, " ".join(["%.17f" % x for x in imgfea])))
                except Exception:
                    # Errors are silently ignored here.
                    pass
                # Per-image time in milliseconds.
                sys.stderr.write("%f\n" % ((time.time() - start_time) * 1000))
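Troubleshooting
TensorFlow expects the graph to be built up front, with placeholders standing in for the inputs, and then run: build once, run many times. Intuitively, the code above has one likely problem: inception_preprocessing.preprocess_image, which builds graph ops, is called in the run phase. So my first attempt was to move inception_preprocessing.preprocess_image from the run phase into the graph-building phase, but that did not solve the problem. I then read through related reports and, following the approach described in the issue, logged the time and memory usage of every step. Specifically, I used time.time() for the time cost of each step and resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024 for memory (ru_maxrss is the peak resident set size, reported in kilobytes on Linux, so dividing by 1024 gives megabytes). For example: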
    ...
    import time
    import resource
    ...
    end_read_time = time.time()
    image_decode_tmp = tf.image.decode_image(image_string_tmp, channels=3)
    end_decode_time = time.time()
    # Time spent in this step (seconds) and peak memory so far (MB on Linux).
    sys.stderr.write("[decode image] timecost=%f memory_usage=%f\n" % (end_decode_time - end_read_time, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0))
    ...
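The per-step logging above gets repetitive when every step needs it, so here is a minimal sketch of how it could be wrapped into a helper. The names `peak_rss_mb` and `step_timer` are my own and not part of any library; the sketch only uses the standard `time` and `resource` modules, so it assumes a Unix-like system. Note that `ru_maxrss` is a peak value: it shows whether memory keeps climbing, but it never goes back down.

```python
import resource
import sys
import time
from contextlib import contextmanager


def peak_rss_mb():
    """Return the peak resident set size of this process in megabytes.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


@contextmanager
def step_timer(name, log=sys.stderr):
    """Log wall-clock time and peak memory for the enclosed step.

    The yielded dict is filled in when the block exits, so the caller
    can also inspect the numbers afterwards.
    """
    stats = {}
    start = time.time()
    try:
        yield stats
    finally:
        stats["timecost"] = time.time() - start
        stats["peak_rss_mb"] = peak_rss_mb()
        log.write("[%s] timecost=%f memory_usage=%f\n"
                  % (name, stats["timecost"], stats["peak_rss_mb"]))


# Usage: wrap each step of the loop in its own timer.
with step_timer("allocate list") as stats:
    data = [0] * 1000000
```

Wrapping the read, decode, preprocess and inference steps in separate `step_timer` blocks gives one log line per step, which makes it easy to see which step grows over time.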
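According to the logs, it was mainly the tf.image.decode_image step whose time and memory kept growing, so this step also has to be moved into the graph-building phase.
Solution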
    # Setup (imports, image_size, _NUM_CLASSES) is elided as in the problematic code;
    # this version also needs `import traceback`.
    with tf.Graph().as_default():
        with slim.arg_scope(inception_resnet_v2.inception_resnet_v2_arg_scope()):
            # Build every op once, before the loop.
            image_str = tf.placeholder(tf.string)
            image_decode = tf.image.decode_image(image_str, channels=3)

            # The decoded image is fed back in through a second, rank-3 placeholder.
            image_tensor = tf.placeholder(tf.uint8, shape=[None, None, 3])
            preprocessed_image = inception_preprocessing.preprocess_image(image_tensor, image_size, image_size, is_training=False)
            processed_image = tf.expand_dims(preprocessed_image, 0)
            logits, end_points = inception_resnet_v2.inception_resnet_v2(processed_image, num_classes=_NUM_CLASSES, is_training=False)
            init_fn = slim.assign_from_checkpoint_fn(sys.argv[1], slim.get_model_variables('InceptionResnetV2'))

        with tf.Session() as sess:
            init_fn(sess)

            for line in sys.stdin:
                start_time = time.time()
                line = line.strip(" \r\n")
                if len(line) == 0:
                    continue
                try:
                    # Read the raw image bytes; the loop only feeds data, it builds no ops.
                    with open(line, "rb") as f:
                        image_string_tmp = f.read()
                    image_decode_tmp = sess.run([image_decode], {image_str: image_string_tmp})
                    image_feature = sess.run(logits, {image_tensor: image_decode_tmp[0]})
                    image_feature = image_feature[0, 0:]
                    imgfea = image_feature.tolist()
                    sys.stdout.write("%s\t%s\n" % (line, " ".join(["%.17f" % x for x in imgfea])))
                except Exception:
                    sys.stderr.write("%s" % traceback.format_exc())
                # Log to stderr so the timing (milliseconds) does not mix with the features on stdout.
                sys.stderr.write("cost: %f\n" % ((time.time() - start_time) * 1000))
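Summary
tf.image.decode_image merely decodes an image (it turns the encoded image bytes into a tensor) and looks harmless, but it hides a trap. My guess is that memory is allocated for tensors every time graph ops are built, so building graph ops over and over at run time makes memory climb sharply; why the time also increases remains to be investigated. So when using TensorFlow, define all tensor-related operations in the graph once, up front, and avoid building the graph during the run phase.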
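To make the "build once, run many times" point concrete, here is a toy model in plain Python. It is only an illustration under my own assumptions: `ToyGraph` and the two `decode_*` functions are hypothetical stand-ins, not TensorFlow's API. The sketch assumes that, in graph mode, every op created in the loop stays in the graph and that raw bytes passed to an op are stored in the graph as a constant.

```python
class ToyGraph(object):
    """Hypothetical stand-in for a static computation graph (not TensorFlow)."""

    def __init__(self):
        self.nodes = []        # every node ever added stays in this list
        self.finalized = False

    def add_node(self, kind, payload=b""):
        if self.finalized:
            raise RuntimeError("graph is finalized and cannot be modified")
        self.nodes.append((kind, payload))
        return len(self.nodes) - 1  # id of the new node

    def finalize(self):
        """Make the graph read-only."""
        self.finalized = True

    def run(self, node_id, feed):
        """Pretend to execute a node: returns the size of the fed data
        and never adds anything to the graph."""
        return len(feed)

    def stored_bytes(self):
        return sum(len(payload) for _, payload in self.nodes)


def decode_in_loop(graph, images):
    """Anti-pattern: each image adds a constant node and a decode node."""
    for raw in images:
        const_id = graph.add_node("const", raw)
        graph.add_node("decode(%d)" % const_id)


def decode_built_once(graph, images):
    """Build a placeholder and one decode node, then only feed data."""
    placeholder_id = graph.add_node("placeholder")
    decode_id = graph.add_node("decode(%d)" % placeholder_id)
    graph.finalize()  # any later add_node call would now raise
    return [graph.run(decode_id, raw) for raw in images]


images = [b"x" * 1000 for _ in range(50)]

growing = ToyGraph()
decode_in_loop(growing, images)

fixed = ToyGraph()
sizes = decode_built_once(fixed, images)

print("in loop:    %d nodes, %d bytes held" % (len(growing.nodes), growing.stored_bytes()))
print("built once: %d nodes, %d bytes held" % (len(fixed.nodes), fixed.stored_bytes()))
```

In the toy model the first graph grows with every image while the second stays at two nodes. TensorFlow 1.x offers a real counterpart to the toy `finalize()`: calling `sess.graph.finalize()` before the loop makes any later attempt to add an op raise an error instead of silently growing the graph.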