Android Firebase custom model inference speed vs Tensorflow Lite


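I trained an object detection model (based on SSD MobileNet v1) and converted it to tflite (no quantization; the model is still float). I run the model on the camera feed (resized to 300x300) using the Firebase ML model interpreter (v20.0.1). The code is in Kotlin.

I then did the same thing with Tensorflow Lite (v1.14.0). The code is in Java.

Compared with TF, inference in Firebase is slower and shows some unpleasant spikes, and I am trying to understand why. I would really like to use Firebase, since it is already included in my project.

Here is the inference speed comparison (100 frames each, all the same frame because the phone is stationary). The Firebase app uses a camera component, while the TF app uses the camera API directly.

Here is the relevant code for each project: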
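To put numbers on "slower" and "spikes" over a 100-frame run, the per-frame timings can be summarized with a mean, a median, a high percentile, and the maximum. The following is a minimal sketch in plain Java; the class name `LatencyStats`, its methods, and the demo values are illustrative assumptions and do not come from either project.

```java
import java.util.Arrays;

class LatencyStats {
    /** Returns {mean, median, p95, max} in milliseconds for the given samples. */
    static double[] summarize(long[] samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        double sum = 0;
        for (long s : sorted) {
            sum += s;
        }
        double mean = sum / sorted.length;
        double median = percentile(sorted, 50);
        double p95 = percentile(sorted, 95);
        double max = sorted[sorted.length - 1];
        return new double[] {mean, median, p95, max};
    }

    /** Nearest-rank percentile on an already sorted array. */
    static double percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        // Made-up timings with one spike, only to exercise the code.
        long[] demo = {40, 42, 41, 43, 120, 40, 41, 42, 44, 41};
        double[] s = summarize(demo);
        System.out.printf("mean=%.1f median=%.1f p95=%.1f max=%.1f%n",
                s[0], s[1], s[2], s[3]);
    }
}
```

A large gap between the median and the p95 or maximum points to occasional spikes, while a shifted median points to a uniformly slower path.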

Firebase with Kotlin; Tensorflow Lite with Java


Any idea why this happens? Thanks.

fun loadModel() {
    ...
    // Load local model
    interpreter = FirebaseModelInterpreter.getInstance(modelOptions)

    dataOptions = FirebaseModelInputOutputOptions.Builder()
        .setInputFormat(0, dataType, intArrayOf(1, 300, 300, 3))
        .setOutputFormat(0, FirebaseModelDataType.FLOAT32, intArrayOf(1, 10, 4)) // Boxes
        .setOutputFormat(1, FirebaseModelDataType.FLOAT32, intArrayOf(1, 10)) // Classes
        .setOutputFormat(2, FirebaseModelDataType.FLOAT32, intArrayOf(1, 10)) // Scores
        .setOutputFormat(3, FirebaseModelDataType.FLOAT32, intArrayOf(1)) // Num detections
        .build()

    val numBytesPerChannel = 4 // Float
    imgData = ByteBuffer.allocateDirect(1 * 300 * 300 * 3 * numBytesPerChannel)
    imgData.order(ByteOrder.nativeOrder())
    ...
}

fun runModel() {
    ...
    imgData.rewind()
    for (x in 0 until 300) {
        for (y in 0 until 300) {
            imgData.putFloat(...) // R
            imgData.putFloat(...) // G
            imgData.putFloat(...) // B
        }
    }

    val inputs = FirebaseModelInputs.Builder().add(imgData).build()
    val start = System.currentTimeMillis()
    interpreter.run(inputs, dataOptions)
        .addOnSuccessListener { result ->
            val elapsed = System.currentTimeMillis() - start
            Log.i(TAG, "Detection finished $elapsed msec - MLKIT")
        }
        .addOnFailureListener {
            // Handle Error
        }
}
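One detail in the Kotlin listing above may affect the comparison: `elapsed` is read inside the success listener, so it covers the model run plus any time the task spends waiting before the listener is invoked. A measurement taken around a blocking call covers only the run itself. Whether this explains the observed gap is not confirmed by the post. The sketch below illustrates the difference with plain `java.util.concurrent` classes as stand-ins; it uses no Firebase API, and `fakeInference` and `measure` are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncTimingSketch {
    /** Hypothetical stand-in for a model run; it only sleeps for workMs. */
    static void fakeInference(long workMs) {
        try {
            Thread.sleep(workMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Returns {workNanos, endToEndNanos}: the time spent inside the work itself
     * and the time from submission until the completion callback runs.
     */
    static long[] measure(long workMs) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        ExecutorService callbackThread = Executors.newSingleThreadExecutor();
        try {
            long submitted = System.nanoTime();
            CompletableFuture<long[]> result = CompletableFuture
                    .supplyAsync(() -> {
                        long t0 = System.nanoTime();
                        fakeInference(workMs);
                        return System.nanoTime() - t0;
                    }, worker)
                    // The callback is handed to another thread, as a listener would be.
                    .thenApplyAsync(workNanos ->
                            new long[] {workNanos, System.nanoTime() - submitted},
                            callbackThread);
            return result.join();
        } finally {
            worker.shutdown();
            callbackThread.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] r = measure(20);
        System.out.printf("work=%d us, end-to-end=%d us%n", r[0] / 1000, r[1] / 1000);
    }
}
```

The end-to-end figure is never smaller than the work figure, and the difference grows when the thread that delivers callbacks is busy, for example with camera preview or UI work.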

void loadModel() {
    ...
    tfLite = new Interpreter(modelFile);
    tfLite.setNumThreads(4);

    int numBytesPerChannel = 4; // Float
    imgData = ByteBuffer.allocateDirect(1 * 300 * 300 * 3 * numBytesPerChannel);
    imgData.order(ByteOrder.nativeOrder());
    ...
}

void runModel() {
    ...
    imgData.rewind();
    for (int i = 0; i < 300; ++i) {
        for (int j = 0; j < 300; ++j) {
            imgData.putFloat(...) // R
            imgData.putFloat(...) // G
            imgData.putFloat(...) // B
        }
    }

    outputLocations = new float[1][10][4];
    outputClasses = new float[1][10];
    outputScores = new float[1][10];
    numDetections = new float[1];

    Object[] inputArray = {imgData};
    Map<Integer, Object> outputMap = new HashMap<>();
    outputMap.put(0, outputLocations);
    outputMap.put(1, outputClasses);
    outputMap.put(2, outputScores);
    outputMap.put(3, numDetections);

    long start = System.currentTimeMillis();
    tfLite.runForMultipleInputsOutputs(inputArray, outputMap);
    long elapsed = System.currentTimeMillis() - start;
    LOGGER.w("Detection finished %d msec - TF", elapsed);
}
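Both listings elide the pixel conversion inside the `putFloat(...)` calls, and the post does not say how the values are normalized. As a rough illustration only, the sketch below fills a float input buffer of the same size from packed ARGB pixels. The normalization constants `MEAN` and `STD`, the class name `InputBufferSketch`, and the packed-ARGB input layout are assumptions, not the author's code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class InputBufferSketch {
    static final int SIZE = 300;       // input width and height, as in the post
    static final float MEAN = 127.5f;  // assumption, not taken from the post
    static final float STD = 127.5f;   // assumption, not taken from the post

    /** Allocates a direct buffer for one SIZE x SIZE RGB image of 4-byte floats. */
    static ByteBuffer allocate() {
        ByteBuffer buf = ByteBuffer.allocateDirect(1 * SIZE * SIZE * 3 * 4);
        buf.order(ByteOrder.nativeOrder());
        return buf;
    }

    /** Writes row-major ARGB pixels as normalized R, G, B floats. */
    static void fill(ByteBuffer buf, int[] argbPixels) {
        if (argbPixels.length != SIZE * SIZE) {
            throw new IllegalArgumentException("expected " + (SIZE * SIZE) + " pixels");
        }
        buf.rewind();
        for (int pixel : argbPixels) {
            buf.putFloat((((pixel >> 16) & 0xFF) - MEAN) / STD); // R
            buf.putFloat((((pixel >> 8) & 0xFF) - MEAN) / STD);  // G
            buf.putFloat(((pixel & 0xFF) - MEAN) / STD);         // B
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocate();
        int[] pixels = new int[SIZE * SIZE]; // all-black test image
        fill(buf, pixels);
        System.out.println("bytes written: " + buf.position());
    }
}
```

Keeping this conversion identical in both apps, and outside the timed region as in the listings, helps ensure that the comparison measures only the interpreter call.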