The problems of code obfuscation

Code obfuscation was the earliest technique applied to Java code protection, and it remains the most direct one.

Code obfuscation typically relies on the following four techniques:

  1. Renaming packages, classes, and variables
  2. Altering control structures, for example by control flow flattening or by inserting opaque predicates
  3. Obfuscating or encrypting strings
  4. Inserting useless (dead) code
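
The four techniques listed above can be illustrated with a minimal sketch. It is not the output of any real obfuscator: the class name ObfuscationDemo, the method names, the XOR key 0x5A, and the string "license-ok" are all assumptions made for this example. The method a is a hand-written "obfuscated" version of sumTo that uses renaming, control flow flattening, an opaque predicate, and a dead branch.

```java
import java.nio.charset.StandardCharsets;

public class ObfuscationDemo {

    // Original, readable method: sum of 1..n.
    static int sumTo(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Hypothetical obfuscated form of sumTo: meaningless names, a flattened
    // loop driven by a dispatcher variable, an opaque predicate, dead code.
    static int a(int b) {
        int c = 0, d = 1, e = 0; // e is the dispatcher state
        while (true) {
            switch (e) {
                case 0: // loop condition
                    e = (d <= b) ? 1 : 3;
                    break;
                case 1: // loop body
                    // Opaque predicate: d * (d + 1) is always even,
                    // so the else branch can never run.
                    if ((d * (d + 1)) % 2 == 0) {
                        c += d;
                    } else {
                        c -= 42; // useless code
                    }
                    e = 2;
                    break;
                case 2: // loop increment
                    d++;
                    e = 0;
                    break;
                default: // loop exit
                    return c;
            }
        }
    }

    // Simple XOR transform standing in for string encryption.
    // Applying it twice with the same key restores the input.
    static byte[] xor(byte[] data, int key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    public static void main(String[] args) {
        // Both versions compute the same value for the same input.
        System.out.println(sumTo(10) == a(10));

        // "Build time": hide the literal. "Run time": recover it.
        byte[] hidden = xor("license-ok".getBytes(StandardCharsets.UTF_8), 0x5A);
        System.out.println(new String(xor(hidden, 0x5A), StandardCharsets.UTF_8));
    }
}
```

The obfuscated method is harder to read, but for every input it returns exactly what the original returns, and the hidden string has to be decoded in memory before it can be used.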

Obfuscation can greatly reduce the readability of decompiled code and make static analysis harder. However, no matter how the code is obfuscated, the program's execution logic does not change.

JVM bytecode is an intermediate code with clear, explicit semantics, which makes it highly readable. Even when an obfuscated class file cannot be restored to readable Java source code, it can still be analyzed at the bytecode level. Because Java bytecode retains so much high-level semantic information, this analysis is actually quite easy.
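
To give a feel for why bytecode-level analysis is approachable, here is a minimal sketch of a stack-machine interpreter that traces each step. It is a toy written for this article's argument, not the engine described below: the class TinyStackVm, the Insn type, and the five opcodes are assumptions, loosely modeled on the real JVM instructions iload, bipush, iadd, imul, and ireturn. A real engine must also handle frames, the constant pool, objects, exceptions, and the rest of the instruction set.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyStackVm {

    // Simplified, hypothetical opcode set modeled on JVM int instructions.
    enum Op { ILOAD, ICONST, IADD, IMUL, IRETURN }

    // One instruction: an opcode plus a single int operand (unused by some ops).
    static final class Insn {
        final Op op;
        final int operand;

        Insn(Op op, int operand) {
            this.op = op;
            this.operand = operand;
        }
    }

    // Executes straight-line code, printing the operand stack before each
    // instruction, which is roughly what a bytecode-level debugger exposes.
    static int run(Insn[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Insn insn : code) {
            System.out.println(insn.op + " " + insn.operand + "  stack=" + stack);
            switch (insn.op) {
                case ILOAD:
                    stack.push(locals[insn.operand]);
                    break;
                case ICONST:
                    stack.push(insn.operand);
                    break;
                case IADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case IMUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case IRETURN:
                    return stack.pop();
            }
        }
        throw new IllegalStateException("code ended without IRETURN");
    }

    // Instruction sequence for: return (a + b) * 2, with a and b in locals 0 and 1.
    static Insn[] sample() {
        return new Insn[] {
            new Insn(Op.ILOAD, 0),
            new Insn(Op.ILOAD, 1),
            new Insn(Op.IADD, 0),
            new Insn(Op.ICONST, 2),
            new Insn(Op.IMUL, 0),
            new Insn(Op.IRETURN, 0),
        };
    }

    public static void main(String[] args) {
        System.out.println("result=" + run(sample(), new int[] {3, 4}));
    }
}
```

Each instruction has one small, well-defined effect on the stack, so the trace shows what the code computes regardless of how the method or its variables were named.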

We have developed a JVM bytecode execution engine in Java and Kotlin. With this project, users can dynamically debug Java programs at the bytecode level in IntelliJ IDEA. Please refer to the following article for details.

https://protector4j.com/articles/jvm-bytecode-engine-written-with-java-and-kotlin/

We also used this engine to try to crack code protected by a well-known obfuscator. The specific process is described in the article below.

http://protector4j.com/articles/deobfuscate-with-vlx-vmengine/

Conclusion

The analysis above shows that, because JVM bytecode retains so much semantic information, it is very easy to read and analyze, and dynamic debugging makes its runtime logic easy to follow. Building such dynamic debugging tools is not a particularly complex task, so obfuscation is not a reliable protection solution.