In Java, is it more efficient to use byte or short instead of int, and float instead of double?


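I've noticed that I always use int and double, no matter how small or large the number needs to be. So in Java, is it more efficient to use byte or short instead of int, and float instead of double?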

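Assume I have a program with plenty of ints and doubles. Would it be worth going through and changing my ints to bytes or shorts if I knew the values would fit?

I know Java doesn't have unsigned types, but is there anything extra I could do if I knew the numbers would only ever be positive?

By efficient I mostly mean processing. I'd assume the garbage collector would be a lot faster if all the variables were half the size, and that calculations would probably be somewhat faster too. (Since I'm working on Android, I guess I need to worry about RAM as well.)

(I'd assume the garbage collector only deals with objects and not primitives, but still reclaims all the primitives inside abandoned objects, right?)

I tried it with a small Android app but didn't notice any difference. (Though I didn't "scientifically" measure anything.)

Am I wrong in assuming it should be faster and more efficient? I'd hate to go through and change everything in a massive program only to find out I wasted my time.

Would it be worth doing from the beginning when I start a new project? (I mean, I think every little bit would help, but if so, why doesn't it seem like anyone does it?)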

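The difference is hardly noticeable! It's more a question of design, appropriateness, uniformity, habit, and so on. Sometimes it's just a matter of taste. When all you care about is that your program gets up and running, and substituting a float for an int would not harm correctness, I see no advantage in choosing one over the other unless you can demonstrate that the choice changes performance. Tuning performance based on types that differ by 2 or 3 bytes is really the last thing you should care about. Donald Knuth once said: "Premature optimization is the root of all evil" (not sure it was him; edit if you have the answer).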

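"Am I wrong in assuming it should be faster and more efficient? I'd hate to go through and change everything in a massive program only to find out I wasted my time."

Short answer: yes, you are wrong. In most cases, it makes little difference in terms of space used.

It is not worth trying to optimize this ... unless you have clear evidence that optimization is needed. And if you do need to optimize the memory usage of object fields in particular, you will probably need to take other (more effective) measures.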

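Longer answer: The Java Virtual Machine models stacks and object fields using offsets that are (in effect) multiples of a 32-bit primitive cell size. So when you declare a local variable or object field as (say) a byte, the variable or field will be stored in a 32-bit cell, just like an int.

There are two exceptions to this:

long and double values require two primitive 32-bit cells.
Arrays of primitive types are represented in packed form, so that (for example) an array of bytes holds 4 bytes per 32-bit word.

So it might be worth optimizing the use of long and double ... and of large arrays of primitives. But in general, no.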
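
To put rough numbers on the array exception, here is a minimal sketch that estimates only the element payload of a primitive array (element count times element width). It deliberately ignores the array header and alignment padding, which vary by JVM. The class and method names are made up for illustration.

```java
// Hypothetical helper: estimates the element payload of primitive arrays.
// Array headers and padding are JVM-specific and are not counted here.
class ArrayPayload {

    /** Payload in bytes of an array with the given length and element width. */
    static long payloadBytes(int length, int elementBytes) {
        return (long) length * elementBytes;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("long[]  " + payloadBytes(n, Long.BYTES) + " bytes");
        System.out.println("int[]   " + payloadBytes(n, Integer.BYTES) + " bytes");
        System.out.println("short[] " + payloadBytes(n, Short.BYTES) + " bytes");
        System.out.println("byte[]  " + payloadBytes(n, Byte.BYTES) + " bytes");
    }
}
```

For a million elements the payload estimate ranges from 8,000,000 bytes for long[] down to 1,000,000 bytes for byte[], which is why large arrays are the case where a narrower element type can pay off.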
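
In theory, a JIT might be able to optimize this, but in practice I've never heard of a JIT that does. One impediment is that the JIT typically cannot run until after instances of the class being compiled have been created. If the JIT optimized the memory layout, you could have two (or more) "flavors" of object of the same class ... and that would present huge difficulties.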

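Revisitation

Looking at the benchmark results in @meriton's answer, it appears that using short and byte instead of int incurs a performance penalty for multiplication. Indeed, if you consider the operations in isolation, the penalty is significant. (You shouldn't consider them in isolation ... but that's another topic.)

I think the explanation is that the JIT is probably doing the multiplications using 32-bit multiply instructions in each case. But in the byte and short cases, it executes extra instructions to convert the intermediate 32-bit value to a byte or short in each loop iteration. (In theory, that conversion could be done once at the end of the loop ... but I doubt that the optimizer would be able to figure that out.)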
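
The per-iteration conversion is visible at the source level too. As a small sketch (class and method names are illustrative only): for a short, `result *= 3` behaves like `result = (short) (result * 3)`, because the multiplication is carried out in int and the result is narrowed back on each iteration. What machine code a given JIT actually emits for this is not shown here.

```java
// Illustrative only: compares compound assignment on a short with the
// equivalent explicit int multiplication followed by a narrowing cast.
class NarrowingDemo {

    /** Multiplies by 3 repeatedly using compound assignment on a short. */
    static short compound(int iterations) {
        short result = 1;
        for (int i = 0; i < iterations; i++) {
            result *= 3; // implicit narrowing back to short
        }
        return result;
    }

    /** Same loop with the int multiplication and the cast spelled out. */
    static short explicit(int iterations) {
        short result = 1;
        for (int i = 0; i < iterations; i++) {
            int wide = result * 3;  // operands are promoted to int
            result = (short) wide;  // narrowed on every iteration
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(compound(20) == explicit(20));
    }
}
```

Both loops produce the same value for any iteration count, since the language defines the compound form in terms of the cast.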
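
Anyway, this does point to another problem with switching to short and byte as an optimization. It could make performance worse ... in algorithms that are arithmetic and compute intensive.

A byte is generally considered to be 8 bits. A short is generally considered to be 16 bits.

In a "pure" environment, which Java is not, since the implementation of bytes, longs, shorts, and other fun things is generally hidden from you, a byte makes better use of space.

However, your computer is probably not 8-bit, and it is probably not 16-bit. This means that to obtain 16 or 8 bits in particular, it would need to resort to time-wasting "trickery" in order to pretend that it can access those types when needed.

At this point, it depends on how the hardware is implemented. However, from what I've been taught, the best speed is achieved by storing things in chunks that are comfortable for your CPU to use. A 64-bit processor likes dealing with 64-bit elements, and anything smaller than that often requires "engineering magic" to pretend that it likes dealing with them.

That depends on the implementation of the JVM, as well as the underlying hardware.
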
import java.math.BigDecimal;

public class Benchmark {

    public static void benchmark(String label, Code code) {
        print(25, label);
        
        try {
            for (int iterations = 1; ; iterations *= 2) { // detect reasonable iteration count and warm up the code under test
                System.gc(); // clean up previous runs, so we don't benchmark their cleanup
                long previouslyUsedMemory = usedMemory();
                long start = System.nanoTime();
                code.execute(iterations);
                long duration = System.nanoTime() - start;
                long memoryUsed = usedMemory() - previouslyUsedMemory;
                
                if (iterations > 1E8 || duration > 1E9) { 
                    print(25, new BigDecimal(duration * 1000 / iterations).movePointLeft(3) + " ns / iteration");
                    print(30, new BigDecimal(memoryUsed * 1000 / iterations).movePointLeft(3) + " bytes / iteration\n");
                    return;
                }
            }
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }
    
    private static void print(int desiredLength, String message) {
        System.out.print(" ".repeat(Math.max(1, desiredLength - message.length())) + message);
    }
    
    private static long usedMemory() {
        return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
    }

    @FunctionalInterface
    interface Code {
        /**
         * Executes the code under test.
         * 
         * @param iterations
         *            number of iterations to perform
         * @return any value that requires the entire code to be executed (to
         *         prevent dead code elimination by the just in time compiler)
         * @throws Throwable
         *             if the test could not complete successfully
         */
        Object execute(int iterations) throws Throwable;
    }

    public static void main(String[] args) {
        benchmark("long[] traversal", (iterations) -> {
            long[] array = new long[iterations];
            for (int i = 0; i < iterations; i++) {
                array[i] = i;
            }
            return array;
        });
        benchmark("int[] traversal", (iterations) -> {
            int[] array = new int[iterations];
            for (int i = 0; i < iterations; i++) {
                array[i] = i;
            }
            return array;
        });
        benchmark("short[] traversal", (iterations) -> {
            short[] array = new short[iterations];
            for (int i = 0; i < iterations; i++) {
                array[i] = (short) i;
            }
            return array;
        });
        benchmark("byte[] traversal", (iterations) -> {
            byte[] array = new byte[iterations];
            for (int i = 0; i < iterations; i++) {
                array[i] = (byte) i;
            }
            return array;
        });
        
        benchmark("long fields", (iterations) -> {
            class C {
                long a = 1;
                long b = 2;
            }
            
            C[] array = new C[iterations];
            for (int i = 0; i < iterations; i++) {
                array[i] = new C();
            }
            return array;
        });
        benchmark("int fields", (iterations) -> {
            class C {
                int a = 1;
                int b = 2;
            }
            
            C[] array = new C[iterations];
            for (int i = 0; i < iterations; i++) {
                array[i] = new C();
            }
            return array;
        });
        benchmark("short fields", (iterations) -> {
            class C {
                short a = 1;
                short b = 2;
            }
            
            C[] array = new C[iterations];
            for (int i = 0; i < iterations; i++) {
                array[i] = new C();
            }
            return array;
        });
        benchmark("byte fields", (iterations) -> {
            class C {
                byte a = 1;
                byte b = 2;
            }
            
            C[] array = new C[iterations];
            for (int i = 0; i < iterations; i++) {
                array[i] = new C();
            }
            return array;
        });

        benchmark("long multiplication", (iterations) -> {
            long result = 1;
            for (int i = 0; i < iterations; i++) {
                result *= 3;
            }
            return result;
        });
        benchmark("int multiplication", (iterations) -> {
            int result = 1;
            for (int i = 0; i < iterations; i++) {
                result *= 3;
            }
            return result;
        });
        benchmark("short multiplication", (iterations) -> {
            short result = 1;
            for (int i = 0; i < iterations; i++) {
                result *= 3;
            }
            return result;
        });
        benchmark("byte multiplication", (iterations) -> {
            byte result = 1;
            for (int i = 0; i < iterations; i++) {
                result *= 3;
            }
            return result;
        });
    }
}

     long[] traversal     3.206 ns / iteration      8.007 bytes / iteration
      int[] traversal     1.557 ns / iteration      4.007 bytes / iteration
    short[] traversal     0.881 ns / iteration      2.007 bytes / iteration
     byte[] traversal     0.584 ns / iteration      1.007 bytes / iteration
          long fields    25.485 ns / iteration     36.359 bytes / iteration
           int fields    23.126 ns / iteration     28.304 bytes / iteration
         short fields    21.717 ns / iteration     20.296 bytes / iteration
          byte fields    21.767 ns / iteration     20.273 bytes / iteration
  long multiplication     0.538 ns / iteration      0.000 bytes / iteration
   int multiplication     0.526 ns / iteration      0.000 bytes / iteration
 short multiplication     0.786 ns / iteration      0.000 bytes / iteration
  byte multiplication     0.784 ns / iteration      0.000 bytes / iteration
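
One possible reading of the "fields" rows, as a rough sketch: if we assume a 12-byte object header, 8-byte object alignment, and 4-byte references (typical of a 64-bit HotSpot JVM with compressed references, but an assumption here, not something the benchmark states), then each iteration costs one padded object plus one array slot. All names and constants below are hypothetical.

```java
// Back-of-the-envelope estimate of bytes per iteration for the "fields" tests.
class FieldSizeEstimate {
    // Assumed layout parameters; they are NOT stated in the benchmark above.
    static final int HEADER_BYTES = 12;
    static final int ALIGNMENT = 8;
    static final int REFERENCE_BYTES = 4;

    /** Rounds size up to the next multiple of ALIGNMENT. */
    static int align(int size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** One object with two fields of the given width, plus its array slot. */
    static int perIteration(int fieldBytes) {
        return align(HEADER_BYTES + 2 * fieldBytes) + REFERENCE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("long fields:  " + perIteration(Long.BYTES));
        System.out.println("int fields:   " + perIteration(Integer.BYTES));
        System.out.println("short fields: " + perIteration(Short.BYTES));
        System.out.println("byte fields:  " + perIteration(Byte.BYTES));
    }
}
```

Under those assumptions the estimates come out at 36, 28, 20, and 20 bytes, which lines up with the measured 36.359, 28.304, 20.296, and 20.273. It would also explain why byte fields save nothing over short fields in this test: padding absorbs the difference.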

import java.lang.management.*;

public class SpeedTest {

/** Get CPU time in nanoseconds. */
public static long getCpuTime() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    return bean.isCurrentThreadCpuTimeSupported() ? bean
            .getCurrentThreadCpuTime() : 0L;
}

public static void main(String[] args) {
    long durationTotal = 0;
    int numberOfTests=0;

    for (int j = 1; j < 51; j++) {
        long beforeTask = getCpuTime();
        // MEASURES THIS AREA------------------------------------------
        long x = 20000000;// 20 millions
        for (long i = 0; i < x; i++) {
                           TestClass s = new TestClass(); 

        }
        // MEASURES THIS AREA------------------------------------------
        long duration = getCpuTime() - beforeTask;
        // Divide as long first; casting duration to int before dividing can overflow
        System.out.println("TEST " + j + ": duration = " + duration + " ns = "
                + (duration / 1000000) + " ms");
        durationTotal += duration;
        numberOfTests++;
    }
    // Cast before dividing so the average is not truncated by integer division
    double average = (double) durationTotal / numberOfTests;
    System.out.println("-----------------------------------");
    System.out.println("Average Duration = " + average + " ns = "
            + (long) (average / 1000000) + " ms (Approximately)");


}
}

 // Package-private so that it can live in the same file as SpeedTest
 class TestClass {
     int a1= 5;
     int a2= 5; 
     int a3= 5;
     int a4= 5; 
     int a5= 5;
     int a6= 5; 
     int a7= 5;
     int a8= 5; 
     int a9= 5;
     int a10= 5; 
     int a11= 5;
     int a12=5; 
     int a13= 5;
     int a14= 5; 
 }

 Average Duration = 8.9625E8 ns = 896 ms (Approximately)

 Average Duration = 6.94375E8 ns = 694 ms (Approximately)

void spin() {
 int i;
 for (i = 0; i < 100; i++) {
 ; // Loop body is empty
 }
}

0 iconst_0 // Push int constant 0
1 istore_1 // Store into local variable 1 (i=0)
2 goto 8 // First time through don't increment
5 iinc 1 1 // Increment local variable 1 by 1 (i++)
8 iload_1 // Push local variable 1 (i)
9 bipush 100 // Push int constant 100
11 if_icmplt 5 // Compare and loop if less than (i < 100)
14 return // Return void when done

void sspin() {
 short i;
 for (i = 0; i < 100; i++) {
 ; // Loop body is empty
 }
}

0 iconst_0
1 istore_1
2 goto 10
5 iload_1 // The short is treated as though an int
6 iconst_1
7 iadd
8 i2s // Truncate int to short
9 istore_1
10 iload_1
11 bipush 100
13 if_icmplt 5
16 return
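
The i2s instruction in the second listing truncates the int sum back to 16 bits. A minimal sketch of the same effect in Java source, with illustrative class and method names:

```java
// Illustrative only: mirrors the iload / iconst_1 / iadd / i2s sequence
// from sspin() as a plain Java expression.
class ShortWrapDemo {

    /** Adds 1 as an int, then narrows the sum back to short. */
    static short increment(short value) {
        return (short) (value + 1); // int addition, then truncation to 16 bits
    }

    public static void main(String[] args) {
        System.out.println(increment((short) 99));
        System.out.println(increment(Short.MAX_VALUE));
    }
}
```

For ordinary values the truncation changes nothing, but at Short.MAX_VALUE the result wraps around to Short.MIN_VALUE, which is why the extra step has to happen on every increment of a short counter.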